AI Infrastructure Engineer Jobs

Discover the latest remote and onsite AI Infrastructure Engineer roles across top active AI companies. Updated hourly.

Check out 13 new AI Infrastructure Engineer opportunities posted on The Homebase

Principal Engineer, AI Model LifeCycle

New
Top rated
Crusoe
Full-time

The Principal Software Engineer for the Model LifeCycle team is responsible for managing fine-tuning systems for large foundation models, including multi-node orchestration, checkpointing, failure recovery, and cost-efficient scaling. They implement and maintain end-to-end training pipelines for Large Language Models, distillation and reinforcement learning pipelines, and agent execution infrastructure. They also own dataset, model, and experiment management, including versioning, lineage, evaluation, and reproducible fine-tuning at scale. The role involves close collaboration with product, business, and platform teams to shape core abstractions and APIs, influence long-term architectural decisions around training runtimes, scheduling, storage, and model lifecycle management, and contribute to the open-source LLM ecosystem. This role offers significant ownership in designing and building core systems from first principles.

$260,000 – $326,000 per year (USD)

San Francisco, United States
Maybe global
Onsite

Tech Lead, Android Core Product - Córdoba, Argentina

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for various use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture to improve the performance, latency, throughput, and efficiency of deployed models. Build tools to identify bottlenecks and sources of instability, and design and implement solutions to address the highest priority issues.

$140,000 – $200,000 per year (USD)

Córdoba, Argentina
Maybe global
Remote

Tech Lead, Android Core Product - Buenos Aires, Argentina

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for a diverse range of use cases; deploy and operate the core ML inference workloads for the AI Voices serving pipeline; introduce new techniques, tools, and architecture to improve the performance, latency, throughput, and efficiency of deployed models; build tools to identify bottlenecks and sources of instability and design and implement solutions to address the highest priority issues.

$140,000 – $200,000 per year (USD)

Buenos Aires, Argentina
Maybe global
Remote

Tech Lead, Android Core Product - Medellín, Colombia

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for a diverse range of use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of deployed models. Build tools to provide visibility into bottlenecks and sources of instability, and design and implement solutions to address the highest-priority issues.

$140,000 – $200,000 per year (USD)

Medellín, Colombia
Maybe global
Remote

Tech Lead, Android Core Product - Lagos, Nigeria

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for diverse use cases. Deploy and operate the core machine learning inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture to improve the performance, latency, throughput, and efficiency of deployed models. Build tools to provide visibility into bottlenecks and sources of instability and design and implement solutions to address the highest priority issues.

$140,000 – $200,000 per year (USD)

Lagos, Nigeria
Maybe global
Remote

Tech Lead, Android Core Product - Kolkata, India

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for various use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture to improve the performance, latency, throughput, and efficiency of deployed models. Build tools to identify bottlenecks and sources of instability, and design and implement solutions to address high-priority issues.

$140,000 – $200,000 per year (USD)

Kolkata, India
Maybe global
Remote

Tech Lead, Android Core Product - Noida, India

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for diverse use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of deployed models. Build tools to identify bottlenecks and sources of instability, and design and implement solutions to address the highest-priority issues.

$140,000 – $200,000 per year (USD)

Noida, India
Maybe global
Remote

Tech Lead, Android Core Product - Luxembourg City, Luxembourg

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for a diverse range of use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture to improve performance, latency, throughput, and efficiency of deployed models. Build tools to identify bottlenecks and sources of instability and design and implement solutions to address the highest priority issues.

$140,000 – $200,000 per year (USD)

Luxembourg City, Luxembourg
Maybe global
Remote

Tech Lead, Android Core Product - Coimbra, Portugal

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for a diverse range of use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of deployed models. Build tools to identify bottlenecks and sources of instability and design and implement solutions to address the highest priority issues.

$140,000 – $200,000 per year (USD)

Coimbra, Portugal
Maybe global
Remote

Tech Lead, Android Core Product - Cardiff, United Kingdom

New
Top rated
Speechify
Full-time

Work alongside machine learning researchers, engineers, and product managers to bring AI Voices to customers for a diverse range of use cases. Deploy and operate the core ML inference workloads for the AI Voices serving pipeline. Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of deployed models. Build tools to gain visibility into bottlenecks and sources of instability and design and implement solutions to address the highest priority issues.

$140,000 – $200,000 per year (USD)

Cardiff, United Kingdom
Maybe global
Remote

Want to see more AI Infrastructure Engineer jobs?

View all jobs

Access all 4,256 remote & onsite AI jobs.

Join our private AI community to unlock full job access, and connect with founders, hiring managers, and top AI professionals.
(Yes, it’s still free—your best contributions are the price of admission.)

Frequently Asked Questions

Have questions about roles, locations, or requirements for AI Infrastructure Engineer jobs?

[{"question":"What does an AI Infrastructure Engineer do?","answer":"AI Infrastructure Engineers design and build the systems that power machine learning workloads. They optimize performance by resolving bottlenecks, implement scaling solutions through load balancing and redundancy, and deploy cloud infrastructure specifically for AI applications. These specialists build fault-tolerant systems for serving large language models, maintain continuous integration pipelines, and collaborate with AI teams to translate research needs into production-ready infrastructure."},{"question":"What skills are required for an AI Infrastructure Engineer?","answer":"Key skills for this role include proficiency with cloud platforms (AWS SageMaker, Azure ML, Vertex AI), infrastructure as code tools like Terraform, and containerization technologies such as Docker and Kubernetes. Strong programming abilities in Python, Go, or C++ are essential, with CUDA knowledge for GPU optimization. Experience with monitoring tools (Prometheus, Grafana), distributed systems, deep learning frameworks, and Linux/UNIX environments is highly valued in candidates."},{"question":"What qualifications are needed for an AI Infrastructure Engineer role?","answer":"Employers typically require a bachelor's degree in Computer Science, AI, Machine Learning, or a related technical field. Most positions demand 4+ years of experience in cloud infrastructure, large-scale systems, or software engineering with an infrastructure focus. Practical expertise in cloud computing, Linux administration, network architecture, and container technologies is essential. Specialized knowledge in GPU programming, distributed systems, and LLM serving strengthens applications considerably."},{"question":"What is the salary range for an AI Infrastructure Engineer job?","answer":"Specific salary figures for AI Infrastructure Engineers are not available here. Compensation typically varies based on location, experience level, company size, and the specific technical skills required. As this role combines specialized AI knowledge with infrastructure expertise, salaries generally reflect the high demand for professionals who can effectively build and optimize systems for machine learning workloads at scale."},{"question":"How long does it take to get hired as an AI Infrastructure Engineer?","answer":"No specific hiring timeline is available. The length of the hiring process varies by company and often includes technical assessments of cloud architecture knowledge, infrastructure as code experience, and machine learning operations skills. Given the specialized nature of AI infrastructure roles and their typical requirement of 4+ years of relevant experience, candidates should expect thorough evaluation of their technical capabilities and problem-solving abilities."},{"question":"Are AI Infrastructure Engineer jobs in demand?","answer":"Yes, AI Infrastructure Engineer positions show strong demand signals. Major companies like Accenture, Scale AI, and Zoom are actively recruiting for these specialized roles. The increasing deployment of large language models and AI applications across industries creates consistent need for professionals who can build optimized infrastructure. The specialized skill intersection of cloud platforms, containerization, GPU optimization, and machine learning operations makes qualified candidates particularly valuable in today's job market."}]