Oumi

2 Jobs

29 Employees

About the Company

Oumi is a community of researchers, developers, and institutions united in their mission to make frontier AI more open, collaborative, and accessible. Oumi offers an unconditionally open-source AI platform that empowers the broader community to push the boundaries of AI while enabling enterprises to enhance and truly own their AI destiny. The Oumi platform allows users to build, evaluate, and deploy cutting-edge AI models at any scale through an all-in-one, fully open-source solution.

Listed Jobs

Company Name: Oumi
Job Title: ML Engineer
Job Description:

Role Summary: Design, build, and maintain scalable platform infrastructure that supports end-to-end AI model development, training, and deployment. Integrate machine learning pipelines, automate provisioning and monitoring, and collaborate closely with research and engineering teams to ensure high reliability and performance.

Expectations:
• Deliver production-grade infrastructure for thousands of users.
• Champion code quality and open-source best practices.
• Own the end-to-end lifecycle of ML pipelines, from data prep to deployment.
• Drive continuous improvement of performance and scalability.
• Actively contribute to and guide open-source platform and model development.

Key Responsibilities:
• Architect and implement robust training infrastructure on cloud platforms (AWS, GCP, Azure).
• Build, optimize, and maintain ML pipelines (data ingestion, preprocessing, model training, evaluation, deployment).
• Design distributed compute solutions to handle large datasets and models at scale.
• Identify and resolve performance bottlenecks in platforms and pipelines.
• Automate infrastructure provisioning, deployment, and monitoring using IaC and CI/CD pipelines.
• Work with research and product teams to align platform capabilities with their workflows.
• Contribute to the open-source codebase, maintain documentation, and guide community contributions.

Required Skills:
• Platform engineering / DevOps experience; strong IaC (Terraform, CloudFormation) expertise.
• Deep understanding of machine learning concepts and end-to-end workflows.
• Proficiency in Python; solid software engineering practices.
• Hands-on experience with cloud services (AWS, GCP, or Azure).
• Proven experience designing scalable, distributed systems and working with container orchestration (Kubernetes).
• Familiarity with open-source ecosystems and community contribution practices.
• Strong analytical, debugging, and performance tuning skills.

Required Education & Certifications:
• Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
• Relevant certifications (e.g., AWS Certified Solutions Architect, GCP Professional Cloud Architect, Azure Solutions Architect) are a plus.

Seattle, United States

Hybrid

22-10-2025

Company Name: Oumi
Job Title: ML Research Engineer
Job Description:

**Role Summary**
Research-engineering hybrid role focused on advancing open-source generative AI and large language models (LLMs) through scalable infrastructure development, experimental collaboration, and community-driven innovation.

**Expectations**
- Advanced expertise in ML, deep learning, or NLP, with a focus on generative AI/LLMs.
- Strong Python proficiency and ML framework experience (e.g., PyTorch).
- Demonstrated ability to design and maintain scalable ML infrastructure (distributed systems, cloud-based training).
- Collaborative mindset for cross-functional research and engineering teamwork.

**Key Responsibilities**
- Architect and implement systems for LLM training, fine-tuning, and evaluation.
- Partner with researchers to design experiments, develop reusable code, and analyze results.
- Apply techniques such as reinforcement learning from human feedback (RLHF), supervised fine-tuning, and prompt optimization to improve model alignment and performance.
- Build distributed training pipelines for multi-GPU/multi-node environments.
- Contribute open-source tools and models to foster community transparency and collaboration.
- Optimize ML system performance across data processing, training, and deployment stages.

**Required Skills**
- Python programming; PyTorch (or equivalent ML framework) expertise.
- ML infrastructure design (distributed training, cloud systems).
- Strong quantitative analysis and problem-solving capabilities.

**Required Education & Certifications**
- Bachelor's degree in Computer Science, AI/Machine Learning, or a related field.
- Advanced degree (Master's/Ph.D.) preferred for research depth.

Palo Alto, United States

Hybrid

22-10-2025