Machine Learning Engineer Jobs

Discover the latest remote and onsite Machine Learning Engineer roles across top active AI companies. Updated hourly.

Check out 1,871 new Machine Learning Engineer opportunities posted on The Homebase

Software Development in Test Intern

New
Top rated
Together AI
Full-time

Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines such as SGLang- or vLLM-style systems and Together's inference stack, including kernel backends, speculative decoding (e.g., ATLAS), and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate reinforcement learning (RL) and post-training pipelines (including RLHF, RLAIF, GRPO, DPO-style methods, and reward modeling) where most of the cost is inference, jointly optimizing algorithms and systems. Make RL and post-training workloads more efficient with inference-aware training loops such as asynchronous RL rollouts and speculative decoding techniques. Use these pipelines to train, evaluate, and iterate on frontier models based on the inference stack. Co-design algorithms and infrastructure to tightly couple objectives, rollout collection, and evaluation with efficient inference, identifying bottlenecks in training engines, inference engines, data pipelines, and user-facing layers. Run ablations and scale-up experiments to study trade-offs between model quality, latency, throughput, and cost and integrate findings into model, RL, and system design. Profile, debug, and optimize inference and post-training services under production workloads. Drive roadmap items requiring real engine modifications, including changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to rigorously validate improvements. Provide technical leadership by setting technical direction for cross-team efforts at the intersection of inference, RL, and post-training. Mentor engineers and researchers on full-stack ML systems work and performance engineering.

$200,000 – $280,000 per year (USD)

San Francisco
Maybe global
Onsite

Machine Learning Operations Engineer

New
Top rated
Haydenai
Full-time

Optimize orchestration processes to ensure efficient deployment and management of AI models. Implement cost-saving strategies to minimize infrastructure expenses while maximizing performance. Upgrade throughput to enhance scalability and responsiveness of AI systems. Collaborate with cross-functional teams to identify bottlenecks and implement solutions to improve workflow efficiency. Ship new features and updates rapidly while maintaining high levels of quality and reliability. Deploy and monitor machine learning models produced by deep learning engineers. Design, deploy, and maintain performant and scalable processes for data acquisition and manipulation to enhance dataset accessibility. Participate actively in the team's software development process, including design reviews, code reviews, and brainstorming sessions. Maintain accurate and updated software development documentation.

$135,699 – $190,000 per year (USD)

San Francisco, United States
Maybe global
Remote

Lead AI/ML Engineer

New
Top rated
ASAPP
Full-time

Lead the design and implementation of scalable ML/AI systems focused on large language models, vector databases, and retrieval-based architectures. Integrate and apply foundation models from providers like OpenAI, AWS Bedrock, and Anthropic for prototyping and production use cases. Adapt, evaluate, and optimize large language models for domain-specific enterprise applications. Build and maintain infrastructure for AI model experimentation, deployment, and monitoring in production. Improve model performance and inference workflows addressing latency, cost, and reliability. Provide technical leadership by mentoring engineers and promoting best ML engineering practices. Partner with product and cross-functional stakeholders to translate requirements into scalable ML solutions. Contribute to the evolution of internal standards for AI experimentation, evaluation, and deployment. Lead the design and delivery of end-to-end voice AI solutions combining large language models with speech technologies including speech-to-text, text-to-speech, and real-time streaming audio pipelines, architecting low-latency, highly reliable conversational voice systems and guiding a team through ambiguity toward production excellence. Understand and apply constraints of voice experiences such as latency, turn-taking, interruption handling, streaming inference, and audio quality to create scalable, enterprise-grade systems.

$170,000 – $190,000 per year (USD)

New York or Mountain View, United States
Maybe global
Hybrid

Forward Deployed Engineer - ML

New
Top rated
Modal
Full-time

As a Forward Deployed ML Engineer at Modal, you will work hands-on with companies like Suno, Lovable, Cognition, and Meta to architect and optimize production AI workloads on Modal. You will contribute to open-source projects, publish technical content demonstrating Modal's capabilities across the AI stack, and collaborate with Modal's product and sales teams as both an engineer and a product stakeholder. Additionally, you will build trusted relationships with technical leaders at companies doing frontier AI work and conduct technical demos, experiments, and proof-of-concepts that highlight Modal's performance advantages.

Undisclosed

Stockholm, Sweden
Maybe global
Onsite

Global Hardware Sourcing & Supply Manager

New
Top rated
Together AI
Full-time

Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines including kernel backends, speculative decoding, and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate RL and post-training pipelines optimizing algorithms and systems where most cost is inference. Make RL and post-training workloads more efficient with inference-aware training loops, async RL rollouts, speculative decoding, and other techniques to reduce rollout collection and evaluation costs. Use these pipelines to train, evaluate, and iterate on frontier models. Co-design algorithms and infrastructure tightly coupling objectives, rollout collection, and evaluation to efficient inference, and identify bottlenecks across training engine, inference engine, data pipeline, and user-facing layers. Run experiments to understand trade-offs between model quality, latency, throughput, and cost, feeding insights back into design. Profile, debug, and optimize inference and post-training services under production workloads. Drive roadmap items requiring engine modifications such as kernel, memory layout, scheduling logic, and API changes. Establish metrics, benchmarks, and experimentation frameworks for rigorous validation of improvements. Provide technical leadership by setting technical direction for cross-team efforts in inference, RL, and post-training; mentor engineers and researchers on full-stack ML systems and performance engineering.

$200,000 – $280,000 per year (USD)

San Francisco
Maybe global
Onsite

Staff Software Engineer, Model LifeCycle

New
Top rated
Crusoe
Full-time

The Staff Software Engineer for the Model LifeCycle team is responsible for building a comprehensive managed platform for the entire application development lifecycle focused on Machine Learning models, including Large Language Models (LLMs). Responsibilities include contributing to fine-tuning systems for large foundation models, including multi-node orchestration, checkpointing, failure recovery, and cost-efficient scaling, implementing and maintaining end-to-end training pipelines for LLMs, contributing to distillation and reinforcement learning pipelines, developing and maintaining agent execution infrastructure, and implementing features for dataset, model, and experiment management such as versioning, lineage, evaluation, and reproducible fine-tuning at scale. The role involves close collaboration with Principal Engineers, product, business, and platform teams to implement core abstractions and APIs, contributing to architectural decisions around training runtimes, scheduling, storage, and model lifecycle management, and engaging with the open-source LLM ecosystem. The role offers significant scope for ownership in implementing and contributing to the design of core systems.

$208,725 – $253,000 per year (USD)

San Francisco, United States
Maybe global
Onsite

Senior Performance Engineer - Pretraining

New
Top rated
AlephAlpha
Full-time

Engineer the systems required to train foundation models at scale, with the objective of maximizing hardware utilization and training throughput on large-scale GPU clusters. Profile training loops using tools such as PyTorch Profiler, Nsight Systems, and Nsight Compute to identify system- and kernel-level bottlenecks and maximize model throughput. Configure and tune composite parallelism strategies, optimizing load balance, minimizing critical-path bottlenecks, and managing communication-to-computation trade-offs for large language model training at scale. Partner with AI Researchers to define model architectures for hardware efficiency without compromising convergence.

Undisclosed

Heidelberg, Germany
Maybe global
Hybrid

System Software Engineer

New
Top rated
HP IQ
Full-time

As a modeling lead for the AI Lab, you will be responsible for defining the technical roadmap for the team and supporting the modeling needs across the organization. You will define and establish best practices to manage the model lifecycle from data acquisition to deployment, and build tools and platforms to facilitate building and deploying ML models on different devices with specific constraints. You will work closely with different teams across the organization to support their modeling needs, translating high-level user needs into specific modeling requirements, creating plans, and technically driving the team to execute on them. You will define and drive the AI Lab technical strategy in support of HP’s AI roadmap, owning decisions across models, runtimes, inference engines, and optimization. You will lead on-device AI strategy, including model compression, quantization, distillation, and hardware-aware optimization across CPUs, GPUs, NPUs, and TPUs. You will architect and evolve tooling and platforms that support the full model lifecycle from data and training through evaluation, deployment, and monitoring. You will establish standards and evaluation frameworks to ensure high-quality, safe, and performant Gen AI models in production. You will partner closely with cross-functional leaders and teams to align technical direction with product and hardware strategy. Additionally, you will mentor a small group of senior engineers while operating as a hands-on technical leader who sets direction and moves quickly.

$200,000 – $340,000 per year (USD)

San Francisco, United States
Maybe global
Onsite

AceUp - Lead ML Engineer (Generative AI & LLM Focus)

New
Top rated
Silver.dev
Full-time

The Lead ML Engineer is responsible for architecting conversational agents that maintain long-running, coherent dialogues and handle complex reasoning tasks. They develop low-latency retrieval-augmented generation (RAG) pipelines that ground LLM responses in proprietary data to ensure accuracy and minimize hallucinations. The role leads the development of NLP pipelines to extract structured insights from various unstructured data sources, implements advanced personalization layers that adapt model behavior based on user history and context, and owns the deployment lifecycle of models, including prompt architecture, evaluation frameworks, latency optimization, and cost management on Vertex AI. Additionally, the engineer acts as a technical mentor by reviewing code, setting architectural standards, and guiding technical decisions without managing personnel.

$66,000 – $120,000 per year (USD)

Argentina
Maybe global
Remote

Member of Technical Staff - ML Training Systems

New
Top rated
Modal
Full-time

The role calls for engineers with experience training production machine learning models and contributing to open-source projects, who will evolve Modal's infrastructure to train the next generation of language models.

$150,000 – $350,000 per year (USD)

New York, United States
Maybe global
Hybrid

Want to see more Machine Learning Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Have questions about roles, locations, or requirements for Machine Learning Engineer jobs?

What does a Machine Learning Engineer do?

Machine Learning Engineers design, build, and deploy AI systems that solve real-world problems. They transform research prototypes into production-ready solutions by creating scalable ML pipelines, optimizing model performance, and handling data preprocessing workflows. They integrate models with applications via APIs, implement monitoring systems, and ensure models perform reliably in production environments. Daily tasks include collaborating with data scientists, fine-tuning algorithms, building deployment infrastructure, and maintaining data privacy. They work across diverse applications like recommendation engines, fraud detection systems, and computer vision tools while ensuring models remain accurate and efficient.

What skills are required for Machine Learning Engineer jobs?

Strong programming skills in Python are fundamental, alongside proficiency with ML frameworks like TensorFlow and PyTorch. Machine Learning Engineers need solid mathematics and statistics knowledge, particularly in linear algebra, calculus, and probability theory. Experience with cloud platforms (AWS, GCP, Azure) is essential for deploying models at scale. Skills in data preprocessing, feature engineering, and model evaluation are critical for building effective systems. Engineers should understand MLOps practices, RESTful APIs, containerization tools like Docker, and version control systems. Practical experience with deep learning architectures and natural language processing is valuable for specialized roles.

What qualifications are needed for Machine Learning Engineer jobs?

Most Machine Learning Engineer positions require a bachelor's degree in computer science, mathematics, or a related field, with many employers preferring advanced degrees for senior roles. Beyond formal education, employers value demonstrated experience building and deploying machine learning models. A strong portfolio showcasing completed projects is often more important than academic credentials alone. Relevant certifications from cloud providers or in specific ML frameworks can strengthen applications. Employers look for candidates with verifiable experience in model deployment, optimization, and maintenance. Knowledge of software engineering best practices like testing, version control, and documentation is increasingly essential in this hybrid role.

What is the salary range for Machine Learning Engineer jobs?

Machine Learning Engineer salaries vary based on several key factors. Geographic location significantly impacts compensation, with tech hubs like San Francisco, Seattle, and New York typically offering higher wages. Experience level creates substantial differences, with senior engineers earning considerably more than entry-level positions. Specialized expertise in areas like computer vision, reinforcement learning, or NLP can command premium compensation. Company size and industry also influence pay scales, with large tech companies and finance firms often offering higher salaries than startups or non-profits. Educational background, portfolio quality, and demonstrated impact on previous business outcomes further affect earning potential.

How long does it take to get hired as a Machine Learning Engineer?

The hiring timeline for Machine Learning Engineer positions typically ranges from 4 to 12 weeks, depending on the company's hiring process and your qualifications. The interview process often includes technical screenings, coding challenges, system design discussions, and model implementation exercises. Candidates with strong portfolios demonstrating deployed ML projects may progress more quickly through initial screens. Specialized roles requiring expertise in deep learning or specific domain knowledge might have longer evaluation periods. Companies often test both theoretical understanding and practical implementation skills through multi-stage interviews. Building relationships with hiring managers through professional networks can sometimes accelerate the process.

Are Machine Learning Engineer jobs in demand?

Machine Learning Engineer jobs remain in high demand across industries as organizations implement AI solutions to solve complex problems. Companies actively recruit ML Engineers for applications in recommendation systems, fraud detection, computer vision, natural language processing, and autonomous technologies. The role's hybrid nature, combining software engineering and data science expertise, makes qualified candidates particularly valuable. Organizations need specialists who can both develop models and deploy them in production environments. While the field is competitive, professionals with demonstrated experience building and maintaining ML systems at scale continue to find strong opportunities, especially those with specialized knowledge in emerging areas like reinforcement learning.

What is the difference between Machine Learning Engineer and Data Scientist?

Machine Learning Engineers focus on implementing and deploying models in production environments, while Data Scientists concentrate on research, analysis, and prototype development. ML Engineers build scalable pipelines, optimize model performance, and create deployment infrastructure using software engineering practices. Data Scientists explore data, develop statistical insights, and experiment with algorithms to solve business problems. ML Engineers work extensively with frameworks like TensorFlow and deployment tools, whereas Data Scientists may spend more time with analytical tools and statistical methods. While Data Scientists uncover patterns and build proofs of concept, ML Engineers transform these prototypes into robust, production-ready systems that can operate at scale.