Top MLOps / DevOps Engineer Job Openings in 2025

Looking for MLOps / DevOps Engineer opportunities? This curated list features the latest MLOps / DevOps Engineer job openings from AI-native companies. Whether you're an experienced professional or just entering the field, find roles that match your expertise, from startups to global tech leaders. Updated every day.


Customer Support Engineer

Together AI
USD 180,000 - 260,000
United States
Full-time
Remote: No
Location: San Francisco, CA (Hybrid)

About the role:
As a Customer Support Engineer at a pioneering AI company, you'll be the first line of defense to support customers as they build out training, fine-tuning, and inference solutions with Together AI. You'll dive deep into complex technical challenges, providing swift and effective solutions while serving as a product expert. As part of the Customer Experience organization, you will collaborate closely with product and sales, driving continuous improvement of our offerings. This is an exciting opportunity for a deeply technical professional passionate about AI and customer success to make a significant impact in a fast-paced, innovative environment.

Responsibilities
Engage directly with customers to tackle and resolve complex technical challenges involving our cutting-edge GPU clusters and our inference and fine-tuning services; ensure swift and effective solutions every time.
Become a product expert in all of our Gen AI solutions, serving as the last line of technical defense before issues are escalated to Engineering and Product teams.
Collaborate seamlessly across Engineering, Research, and Product teams to address customer concerns; collaborate with senior leaders both internally and externally to ensure the highest levels of customer satisfaction.
Transform customer insights into action by identifying patterns in support cases and working with Engineering and Go-To-Market teams to drive Together’s roadmap (e.g., future models to support).
Maintain detailed documentation of system configurations, procedures, troubleshooting guides, and FAQs to facilitate knowledge sharing with the team and customers.
Be flexible in providing support coverage during holidays, nights, and weekends as required by business needs to ensure consistent and reliable service for our customers.

Qualifications
5+ years of experience in a customer-facing technical role, with at least 1 year in a support function in AI.
Strong technical background, with knowledge of AI, ML, GPU technologies and their integration into high-performance computing (HPC) environments.
Familiarity with infrastructure services (e.g., Kubernetes, SLURM), infrastructure as code solutions (e.g., Ansible), high-performance network fabrics, NFS-based storage management, container infrastructure, and scripting and programming languages.
Familiarity with operating storage systems in HPC environments such as Vast and Weka.
Familiarity with inspecting and resolving network-related errors.
Strong knowledge of Python, TypeScript, and/or JavaScript, with testing/debugging experience using curl and Postman-like tools.
Foundational understanding of the installation, configuration, administration, troubleshooting, and securing of compute clusters.
Complex technical problem solving and troubleshooting, with a proactive approach to issue resolution.
Ability to work cross-functionally with teams such as Sales, Engineering, Support, Product, and Research to drive customer success.
Strong sense of ownership and willingness to learn new skills to ensure both team and customer success.
Excellent communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders.
Ability to operate in dynamic environments, adept at managing multiple projects, and comfortable with frequent context switching and prioritization.

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $180K-260K + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering

Engineering Manager, API Experience

Anthropic
United States
Full-time
Remote: No
MLOps / DevOps Engineer
Data Science & Analytics

Senior Engineering Manager, Multimodal AI Editors

Labelbox
United States
Full-time
Remote: No
MLOps / DevOps Engineer
Data Science & Analytics

Senior Applied AI Engineer

Horizon3.ai
United States
Full-time
Remote: No
MLOps / DevOps Engineer
Data Science & Analytics

Member of Technical Staff, Search

Cohere
United States
Full-time
Remote: No
MLOps / DevOps Engineer
Data Science & Analytics

TPU Kernel Engineer

Anthropic
United States
Full-time
Remote: No
MLOps / DevOps Engineer
Data Science & Analytics

Platform Engineer

Norm AI
United States
Full-time
Remote: Yes
MLOps / DevOps Engineer
Data Science & Analytics

Senior Platform Engineer

Abridge
United States
Full-time
Remote: No
MLOps / DevOps Engineer
Data Science & Analytics

Senior Site Reliability Engineer - Fleet Reliability

Lambda AI
United States
Full-time
Remote: Yes
MLOps / DevOps Engineer
Data Science & Analytics

AI Engineer — LLM Infra

Yutori
United States
Full-time
Remote: No
Yutori is reimagining how people interact with the web by building AI agents that can reliably do everyday digital tasks. We are building the entire stack to be agent-first, from training our own models to generative product interfaces.

Towards this goal, we are looking for a member of the AI technical staff to join the founding team. Someone technically strong, and excited about building superhuman AI agents that take actions on the web.

Our founders (Devi Parikh, Abhishek Das, Dhruv Batra) have decades of experience in AI research and product spanning generative, multimodal and embodied AI at Meta. Our team combines AI experience with design-minded product thinking to build and deliver on Yutori’s mission.

Yutori is backed by a stellar set of visionary investors: Elad Gil, Sarah Guo, Jeff Dean, Fei-Fei Li, Amjad Masad, Guillermo Rauch, Akshay Kothari, Soleio, Oliver Cameron, Julien Chaumond, Logan Kilpatrick, Bryan McCann, Vladlen Koltun, Jamie Cuffe, Michele Catasta, etc.

Responsibilities:
Scale infra for post-training of multimodal LLMs (CPT, SFT, RL, search, reward models)
Scale infra for agentic inference (throughput and latency of perception-planning-action loops)
Build the foundations of a superhuman generalist web-agent
Work closely with product engineers to translate cutting-edge AI capabilities into reliable product experiences.

What we’re looking for:
Experience with ML infrastructure (GPU clusters) and supporting networking (NCCL)
Experience optimizing post-training and inference performance of multimodal LLMs (data/tensor/pipeline/context/expert parallelism, optimizing MFU, throughput, latency)
Low level systems experience (Triton, CUDA)
High IQ, high EQ, high agency, high craftsmanship, low ego. Proactive, clear communication.

Benefits and perks:
Competitive salary and equity
Visa sponsorship and relocation stipend to bring you to SF
Generous health, dental, vision insurance for you and your dependents
20 days of paid time off per year
Work laptop and budget to set up your work office
Daily team lunches
Commuter benefits
Small, focused team of high-potential individuals. In-person in SF.
MLOps / DevOps Engineer
Data Science & Analytics