Together AI

together.ai

3 Jobs

247 Employees

About the Company

Together AI is a research-driven AI cloud infrastructure provider. Our purpose-built GPU cloud platform empowers AI engineers and researchers to train, fine-tune, and run frontier-class AI models. Our customers include leading SaaS companies such as Salesforce, Zoom, and Zomato, as well as pioneering AI startups like ElevenLabs, Hedra, and Cartesia. We advocate for open-source AI and believe that transparent AI systems will drive innovation and create the best outcomes for society.

Listed Jobs

Company Name
Together AI
Job Title
Machine Learning Engineer - Inference
Job Description
**Job Title:** Machine Learning Engineer - Inference

**Role Summary:** Design and optimize high-performance AI inference systems for large language models, collaborating with researchers and engineers to deliver scalable, production-ready solutions.

**Expectations:**
- 3+ years of production-quality code experience.
- Proficiency in Python, PyTorch, and high-performance system design.
- Strong understanding of low-level OS concepts (threading, memory, networking).

**Key Responsibilities:**
- Develop and optimize AI inference engine systems for reliability and scalability.
- Build runtime services for large-scale AI applications.
- Collaborate with cross-functional teams to turn research into production features.
- Conduct design and code reviews to maintain high quality standards.
- Create tools, documentation, and infrastructure for data ingestion and processing.

**Required Skills:**
- Python, PyTorch, and high-performance library/tooling development.
- Low-level OS expertise: multi-threading, memory management, networking.
- Prior experience with AI inference systems (e.g., TGI, vLLM) preferred (see the sketch after this listing).
- Knowledge of inference techniques (e.g., speculative decoding) and CUDA/Triton programming.
- Familiarity with Rust, Cython, or compilers is a bonus.

**Required Education & Certifications:** Not specified.
San Francisco, United States
Hybrid
Junior
14-12-2025
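
The listing above names vLLM and TGI as representative inference stacks. As a point of reference only, here is a minimal offline-generation sketch using vLLM's Python API; the model choice and sampling settings are illustrative placeholders, not requirements from the posting.

```python
# Minimal vLLM offline-inference sketch (illustrative only; the model name is a placeholder).
from vllm import LLM, SamplingParams

# Load a model into the vLLM engine; any compatible Hugging Face checkpoint could be used here.
llm = LLM(model="facebook/opt-125m")

# Sampling settings are arbitrary example values.
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = ["Explain what an inference engine does in one sentence."]

# generate() runs batched decoding and returns one result per prompt.
outputs = llm.generate(prompts, params)
for out in outputs:
    print(out.outputs[0].text)
```
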
Company Name
Together AI
Job Title
LLM Inference Frameworks and Optimization Engineer
Job Description
**Job Title**
LLM Inference Frameworks and Optimization Engineer

**Role Summary**
Design, develop, and optimize large-scale, low-latency inference engines for text, image, and multimodal models. Focus on distributed parallelism, GPU/accelerator efficiency, and software-hardware co-design to deliver high-throughput, fault-tolerant AI deployment.

**Expectations**
- Lead end-to-end development of inference pipelines for LLMs and vision models at scale.
- Demonstrate measurable improvements in latency, throughput, or cost per inference.
- Collaborate cross-functionally with hardware, research, and infrastructure teams.
- Deliver production-ready, maintainable code in Python/C++ with CUDA.
- Communicate technical trade-offs to stakeholders.

**Key Responsibilities**
- Build fault-tolerant, high-concurrency distributed inference engines for multimodal generation.
- Engineer parallelism strategies (Mixture of Experts, tensor, and pipeline parallelism).
- Apply CUDA graph, TensorRT/TRT-LLM, and PyTorch compilation (torch.compile) optimizations (see the sketch after this listing).
- Tune cache systems (e.g., Mooncake, PagedAttention).
- Analyze performance bottlenecks and co-optimize GPU/TPU/custom accelerator workloads.
- Integrate model execution plans into end-to-end serving pipelines.
- Maintain code quality, documentation, and automated testing.

**Required Skills**
- 3+ years of experience in deep-learning inference, distributed systems, or HPC.
- Proficiency in Python and C++/CUDA; familiarity with GPU programming (CUDA/Triton/TensorRT).
- Deep knowledge of transformer, large-language, vision, and diffusion model optimization.
- Experience with LLM inference frameworks (TensorRT-LLM, vLLM, SGLang, TGI).
- Knowledge of model quantization, KV cache systems, and distributed scheduling.
- Strong analytical, problem-solving, and performance-driven mindset.
- Excellent collaboration and communication skills.

**Nice-to-Have**
- RDMA/RoCE, distributed filesystems (HDFS, Ceph), Kubernetes experience.
- Contributions to open-source inference projects.

**Required Education & Certifications**
- Bachelor's degree (or higher) in Computer Science, Electrical Engineering, or a related field.
- Certifications in GPU programming or distributed systems are a plus.
San Francisco, United States
On site
Junior
14-12-2025
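
The listing above mentions torch.compile as one of the optimization paths. As a rough illustration only, the sketch below compiles a small PyTorch module for inference; the module, mode, and shapes are made-up placeholders rather than anything specified in the posting.

```python
# Minimal torch.compile inference sketch (illustrative only; the model is a toy stand-in).
import torch
import torch.nn as nn

# A stand-in module; in practice this would be a transformer decoder or similar.
model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512)).eval()

# torch.compile traces and fuses the forward pass; "reduce-overhead" targets small-batch latency.
compiled = torch.compile(model, mode="reduce-overhead")

with torch.inference_mode():
    x = torch.randn(1, 512)
    y = compiled(x)  # first call triggers compilation; later calls reuse the compiled graph
    print(y.shape)
```
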
Company Name
Together AI
Job Title
Security Engineer Intern (Summer 2026)
Job Description
Job Title: Security Engineer Intern

Role Summary: Develop and implement secure AI systems by designing enterprise-wide security solutions, building AI-driven security models, and collaborating with IT teams to enforce IAM best practices. Focus on safeguarding corporate assets through proactive and reactive security measures.

Expectations: Write maintainable code, lead IAM policy implementation, and contribute to AI-assisted security applications that enhance threat detection and response.

Key Responsibilities:
- Design and deploy security controls to protect AI infrastructure
- Develop clean, efficient code for security tools and automation
- Collaborate with IT to establish identity and access management (IAM) policies
- Build AI models for data classification and security operations
- Support cross-functional teams in maintaining security standards

Required Skills:
- Proficiency in Python or Bash
- Experience with AI-assisted application development
- Strong understanding of security frameworks and threat mitigation

Required Education & Certifications: Bachelor's degree (or equivalent) in Computer Science, Software Engineering, or a related field, with graduation by Summer 2027. No certifications required.
San Francisco, United States
Hybrid
Fresher
06-01-2026