Top MLOps / DevOps Engineer Job Openings in 2025
Looking for MLOps / DevOps Engineer opportunities? This curated list features the latest MLOps / DevOps Engineer job openings from AI-native companies. Whether you're an experienced professional or just entering the field, find roles that match your expertise, from startups to global tech leaders. Updated every day.
Engineering Manager, Core Infrastructure
Harvey
501-1000
USD
0
250000
-
300000
United States
Full-time
Remote
false
Why Harvey
Harvey is a secure AI platform for legal and professional services that augments productivity and automates complex workflows. Harvey uses algorithms with reasoning-adept LLMs that have been customized and developed by our expert team of lawyers, engineers and research scientists. We’ve found product market fit and are scaling our team very quickly. Some reasons to join Harvey are:
Exceptional product market fit: We have partnered with the largest law firms and professional service providers in the world, including Paul Weiss, A&O Shearman, Ashurst, O'Melveny & Myers, PwC, KKR, and many others.
Strategic investors: Raised over $500 million from strategic investors including Sequoia, Google Ventures, Kleiner Perkins, and OpenAI.
World-class team: Harvey is hiring the best talent from DeepMind, Google Brain, Stripe, FAIR, Tesla Autopilot, Glean, Superhuman, Figma, and more.
Partnerships: Our engineers and researchers work directly with OpenAI to build the future of generative AI and redefine professional services.
Performance: 4x ARR in 2024.
Competitive compensation.
Role Overview
Our infrastructure is the foundation that powers every user interaction with Harvey. We’re looking for an Engineering Manager to lead our Core Infrastructure team, the group responsible for building reliable, scalable, and secure systems that support our legal AI platform globally. This role will own cloud infrastructure, observability, container orchestration, and core platform reliability. You’ll be guiding a team of high-agency engineers and partnering closely with security and product teams to ensure our infra is an accelerant, not a constraint.
At Harvey, we value Decisiveness, Simplicity, and the mindset that Job's Not Finished. We move fast, prioritize clarity, and are always striving for excellence. If this resonates with you, we'd love to hear from you.
What You’ll Do
Lead and grow a team of engineers focused on infrastructure, networking, and platform reliability.
Own cloud operations, scaling worldwide while ensuring high availability and performance.
Drive key initiatives around observability, cost optimization, disaster recovery, and infrastructure security.
Oversee infrastructure-as-code practices across the engineering org.
Guide technical decision-making for container orchestration, service meshes, data infrastructure, and more.
Hire, grow, and retain exceptional engineers who thrive in a high-trust, high-impact environment.
Collaborate cross-functionally to align infrastructure work with product roadmap and company goals.
Foster a culture of operational excellence, blameless incident response, and continuous improvement.
What You Have
2+ years of engineering management experience and 5+ years of hands-on infrastructure engineering.
Deep expertise in cloud platforms (AWS, GCP, or Azure), Kubernetes, and infrastructure-as-code tooling (Terraform, Pulumi).
Strong understanding of observability stacks (e.g. Datadog, Sentry) and incident response workflows.
Experience with CI/CD, networking, and security principles at scale.
A track record of designing and scaling systems with reliability and performance in mind.
Excellent communication and collaboration skills, with a bias toward clarity and action.
A systems mindset and passion for simplifying complex infrastructure.
A track record of leading complex cross-functional projects and delivering measurable impact.
Compensation Range
$250,000 - $300,000 USD
Please find our CA applicant privacy notice here.
Harvey is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.
We are in the early innings of a generational company. Joining early at a hypergrowth startup has proven to lead to exponential growth in responsibility, access, and ability. Apply here today!
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
June 20, 2025
Detection Engineer
ElevenLabs
201-500
-
Anywhere
Full-time
Remote
true
This role is remote and can be executed globally. However, to facilitate working with the Security team, we prefer candidates based in timezones that allow overlap with Europe.
About ElevenLabs
ElevenLabs is a research and product company defining the frontier of Audio AI. Millions of individuals use ElevenLabs to read articles, voice over their videos, and reclaim voices lost from disability. And the leading developers and enterprises use ElevenLabs to create Conversational AI agents for support, sales, and education.
ElevenLabs launched in January 2023 with the first AI model to cross the threshold of human-like speech. In January 2025, we raised a $180 million Series C round, valuing ElevenLabs at $3.3 billion. The round was co-led by Andreessen Horowitz and ICONIQ Growth, with continued support from the leading names in tech, including Nat Friedman, Daniel Gross, Instagram co-founder Mike Krieger, Oculus VR co-founder Brendan Iribe, DeepMind and Inflection co-founder Mustafa Suleyman, and many others.
ElevenLabs is only 2 years old and scaling rapidly. We are just getting started. If you want to work hard and have an incredible impact, we would love to hear from you.
How we work
High-velocity: Rapid experimentation, lean autonomous teams, and minimal bureaucracy.
Impact not job titles: We don’t have job titles. Instead, it’s about the impact you have. No task is above or beneath you.
AI first: We use AI to move faster with higher-quality results. We do this across the whole company, from engineering to growth to operations.
Excellence everywhere: Everything we do should match the quality of our AI models.
Global team: We prioritize your talent, not your location.
What we offer
Learning & development: Annual discretionary stipend towards professional development.
Social travel: Annual discretionary stipend to meet up with colleagues each year, however you choose.
Annual company offsite: We bring the entire company together at a new location every year.
Co-working: If you’re not located near one of our main hubs, we offer a monthly coworking stipend.
About the role
As a Detection Engineer at ElevenLabs, you'll be on the front lines of our security operations, playing a critical role in building and maintaining our detection and incident response capabilities. You'll have an automation mindset, constantly looking for ways to scale our security efforts and reduce manual work. This role is perfect for someone passionate about security frameworks and best practices, driven by ownership, and eager to continuously improve our security posture. You’ll be instrumental in developing best-in-class security practices as we scale.
Requirements
Proven experience in incident response and security operations, including triaging security alerts, conducting investigations, and leading response efforts.
Strong background in detection engineering, including developing, tuning, and maintaining security detection rules and alerts.
Hands-on experience with SIEM infrastructure, specifically with Google SecOps (Chronicle). This includes data onboarding, parsing, rule creation, and dashboarding.
Proficiency in security monitoring across various platforms, including JAMF MDM for macOS endpoints, Google Workspace, Okta and general SaaS applications.
Experience with cloud security monitoring, particularly in Google Cloud (GCP) with familiarity in GCP Security Command Center (SCC).
Solid scripting skills (e.g., Python, Bash) for automating detection and response tasks, data parsing, and security tooling integration.
Deep understanding of common attack techniques, threat intelligence, and the ability to translate them into actionable detections.
Familiarity with security frameworks and best practices (e.g., MITRE ATT&CK, NIST Cybersecurity Framework).
Excellent analytical and problem-solving skills, with a keen eye for detail and the ability to connect disparate pieces of information during investigations.
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 20, 2025
Engineering Manager, Machine Learning
Captions
101-200
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 19, 2025
AI Platform Engineer (f/m/d)
AlephAlpha
201-500
-
Germany
Full-time
Remote
false
Overview
We are seeking a skilled and motivated AI Platform Engineer (f/m/d) to join our team at PhariaAI. In this role, you will play a crucial part in helping our customers successfully deploy, operate, and scale the PhariaAI stack across on-premise and cloud environments. You will work directly with customers to understand their infrastructure requirements, drive secure and scalable operations, and ensure the reliable performance of AI workloads in production settings.
Your responsibilities
Help our customers deploy and operate the PhariaAI stack in on-premise and cloud environments.
Gain a deep understanding of customer infrastructure requirements to ensure a fast time-to-value.
Help our customers secure the PhariaAI stack for critical production use cases.
Ensure scalability and performance of PhariaAI operations by taking a holistic perspective across multiple layers of the solution.
Enable technical experts at our customers to deploy and operate PhariaAI self-sufficiently towards defined SLOs.
Work closely with our customers’ technology experts, adopting a hands-on and solution-oriented mindset.
Your Profile
Basic Qualifications
You care about making something people want. You want to ship something that will bring value to users. You want to deliver AI solutions end-to-end and not only build a prototype.
Degree in Computer Science or a related field.
Experience with the Kubernetes ecosystem and tooling for package management (including Helm), containerization, monitoring and security.
Experience with deploying and operating LLMs for inference, including managing compute constraints and working with LLM APIs. Familiarity with NVIDIA GPU Operator and NVIDIA hardware preferred.
Solid expertise in networking technologies, including HTTP proxies, routing mechanisms, and certificate management.
Solid experience with computing or cloud infrastructure providers like GCP, Azure, AWS, OpenStack, or VMWare, particularly in managing GPU-enabled compute resources.
Drive to implement AI innovations into real-world applications.
Excellent communication skills in English and German (preferred).
Preferred Qualifications
Experience with infrastructure-as-code tools like Terraform and cluster management tools.
Experience with fast-paced work environments and organizational growth.
What you can expect from us
Become part of an AI revolution!
30 days of paid vacation
Access to a variety of fitness & wellness offerings via Wellhub
Mental health support through nilo.health
Substantially subsidized company pension plan for your future security
Subsidized Germany-wide transportation ticket
Budget for additional technical equipment
Regular team events to stay connected
Flexible working hours for better work-life balance and hybrid working model
Virtual Stock Option Plan
JobRad® Bike Lease
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
June 19, 2025
Senior Systems Administrator
Together AI
201-500
USD
160000
-
230000
United States
Full-time
Remote
false
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure.
As the Research Systems Engineer, you will partner with research professionals to design, implement, and maintain high-performance computing (HPC) clusters and cloud environments to support research and development activities. You will collaborate with research professionals to ensure seamless operation of research environments, including job scheduling, resource allocation, and data management.
Responsibilities:
Lead the installation and upgrades of system hardware and software, including computational systems, clusters, standalone machines, storage systems and a variety of network fabrics including Ethernet, InfiniBand, and Fibre Channel.
Provide expertise and guidance in HPC infrastructure, design, implementation, and optimization.
Serve as the primary technical point of contact for our Research team.
Troubleshoot and resolve any system-related problems to ensure the Research team’s success in using the environments.
Coordinate across multi-vendor resources, manage escalations effectively, handle complex issues, and ensure timely and satisfactory resolutions.
Maintain detailed documentation of system configurations, procedures, and troubleshooting guides to facilitate knowledge sharing within the Research team.
Contribute to the creation of training materials to enable the Research team’s success and platform adoption.
Research new and emerging technologies, evaluate workflows and plans, and make recommendations for future improvements to the HPC environment.
Qualifications:
5+ years of Linux system administration experience
Strong understanding of HPC architectures with GPU management
Experience with job schedulers and resource managers (e.g. Slurm)
Knowledge of Linux operating systems (e.g., Ubuntu, Red Hat, CentOS)
Working experience with programming languages (e.g., Go, Python, Bash)
Experience with network protocols (e.g., TCP/IP, InfiniBand)
Experience with containerization and virtualization technologies (e.g., Docker, Kubernetes)
Knowledge of cloud computing platforms (e.g., AWS, Azure, Google Cloud)
Familiarity with machine learning and artificial intelligence frameworks (e.g., TensorFlow, PyTorch)
Experience with data analytics, visualization and observability tools (e.g., Grafana, Tableau, Power BI)
Strong problem-solving and analytical skills
Excellent communication and collaboration skills
Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 18, 2025
Senior DevOps Engineer
Together AI
201-500
USD
160000
-
230000
United States
Full-time
Remote
false
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure.
We are hiring a talented Senior DevOps Engineer to develop the software and processes for orchestration of AI workloads over large fleets of distributed GPU hardware. In this role, you'll be part of a cloud engineering organization that aims to automate everything and build failure-resistant and horizontally scalable cloud infrastructure for GPU-resident applications. As a Senior DevOps Engineer, you'll build deep understanding of Together AI’s services and use that knowledge to optimize and evolve our infrastructure's reliability, availability, serviceability, and profitability.
The best applicants for this role are deeply technical, enthusiastic, great collaborators, and intrinsically motivated to deliver high quality infrastructure. You have experience practicing infrastructure-as-code, including the use of tools like Terraform and Ansible. You also have strong software development fundamentals, systems knowledge, troubleshooting abilities, and a deep sense of responsibility.
Requirements
Minimum of 5 years of prior relevant experience in DevOps, cloud computing, data center operations and Linux systems administration
Experience in programming in at least one of the following languages: Go, Python, Java, or C++
Experience designing and building advanced CI/CD pipeline frameworks
Experience with cloud computing toolsets like Terraform, Vault, and Packer
Experience with configuration management tools like Ansible, Pulumi, Chef and Puppet
Experience with Kubernetes and containerization
Strong sense of ownership and desire to build great tools for others
Responsibilities
Introduce tools to facilitate greater automation and operability of services
Design, build, and maintain CI/CD infrastructure
Architect, deploy, and scale observability infrastructure
Create runtime tools/processes that optimize cloud triaging and limit downtime
Define best practices to make our systems and services measurable
Work closely with internal teams to ensure best practices are appropriately applied
Build tools to help engineering and research teams measure and improve their velocity
Analyze and decompose complex software systems
Collaborate with and influence others to improve the overall design
Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our privacy policy at https://www.together.ai/privacy
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 18, 2025
Senior Site Reliability Engineer - Networking
Lambda AI
501-1000
-
United Kingdom
Full-time
Remote
true
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 18, 2025
Information Security Engineer - Generalist
X AI
5000+
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 17, 2025
Model Designer
OpenAI
5000+
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 17, 2025
Cloud and DevOps Engineer - NYC
Distyl
51-100
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 16, 2025
Systems Engineer - Air
helsing
201-500
-
Germany
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 15, 2025
Engineering Manager, Agent Software Engineering
Decagon
101-200
-
United States
Full-time
Remote
false
About Decagon
Decagon is building the most advanced conversational AI agents for the enterprise. Since starting the company, we've been on a tear, winning over customers like Duolingo, Notion, Rippling, Eventbrite, Webflow, BILT and many more. Our AI agents provide a human-like customer support experience that enables enterprises to better serve their customers and efficiently manage their customer experience organizations.
We've raised $100M in total funding from Bain Capital Ventures, Accel, a16z, BOND Capital, A*, Elad Gil, and notable angels, including the founders of Box, Airtable, Rippling, Okta, Lattice, and Klaviyo.
About the Team
The Agent SWE team at Decagon deploys the most advanced conversational AI agents to our customers that impact millions of users and directly drive Decagon’s growth. You will guide a team to build on our industry-leading AI agent platform, collaborate directly with customers and use your own creativity to devise long-term, scalable solutions that support their needs.
Our mission is to deliver magical support experiences: AI agents working alongside human agents to help users resolve their issues.
About the Role
As a leader on the Agent Software Engineering team, you’ll have complete ownership and autonomy in shipping best-in-class AI agents, from initial implementation through continuous iteration. You’ll work directly with leaders across industries like finance, healthcare and hospitality, solving their users’ needs with reliable and intuitive AI agents.
Engineers here own their work end-to-end and are trusted to make a real impact. This role is for someone who is excited to mentor a team of junior engineers, dives deep into complex system challenges and builds elegant solutions that scale to millions of users.
In this role, you will
Lead a team to design and build AI agents that outperform human agents in managing complex customer interactions and driving customer retention
Collaborate closely with enterprise customers across a number of verticals, understand their needs and transform these pain points into magical AI agents
Partner with product, design and research to identify cross-customer trends that guide the evolution of Decagon’s agent building platform and research efforts
Contribute to team strategy and help define the future of AI customer support agents
Your background looks something like this
1+ years of engineering management experience
5+ years of industry experience in software engineering
Proficiency with Python, TypeScript and asynchronous programming
A high degree of comfort digging into systems failures within deep technology stacks using any tool necessary
Even better
Prior experience working with multi-modal models
Benefits
Medical, dental, and vision benefits
Take-what-you-need vacation policy
Daily lunches, dinners and snacks in the office to keep you at your best
Compensation
$300K - $375K + equity
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 15, 2025
Research Empowerment Infrastructure Engineer
OpenAI
5000+
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 14, 2025
Signal Integrity Engineer
OpenAI
5000+
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 13, 2025
Lead Engineer, Interactive Avatar
HeyGen
201-500
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 13, 2025
Information Security Officer
helsing
201-500
USD
0
140000
-
180000
United States
Full-time
Remote
false
Who we are
Helsing is a defense AI company. Our mission is to protect our democracies. We aim to achieve technological leadership, so that open societies can continue to make sovereign decisions and control their ethical standards. As democracies, we believe we have a special responsibility to be thoughtful about the development and deployment of powerful technologies like AI. We take this responsibility seriously. We are an ambitious and committed team of engineers, AI specialists and customer-facing program managers. We are looking for mission-driven people to join our European teams and apply their skills to solve the most complex and impactful problems. We embrace an open and transparent culture that welcomes healthy debates on the use of technology in defense, its benefits, and its ethical implications.
The role
As our first US-based Information Security Officer, you will be responsible for establishing and managing Helsing’s US IT and information security infrastructure. You will work across teams and geographies to establish secure and trusted infrastructure for collaborative work efforts focused on the transfer, development, and delivery of defense technologies in alignment with applicable regulations, standards, and industry best practices. You will be essential to Helsing’s ability to deliver complex systems that answer the challenges of tomorrow’s battlefields.
The day-to-day
Procure and manage IT and information security systems and associated budgets
Architect and routinely assess IT and information security systems for compliance and risk posture in alignment with applicable regulations, standards, and best practices
Source, implement, and manage foundational IT infrastructure to enable Helsing’s US operations across business and technical working teams
Collaborate with Helsing’s Central IT team to architect a collaborative environment for code releases and technology transfer
Establish and enforce security policies and protocols to maintain compliance with US government and industry standards
Build the US business’s IT organization in partnership with Helsing’s central engineering leadership based on current and future business needs
Address day-to-day business IT needs as they arise
You should apply if you
Have demonstrable experience in IT infrastructure and information security in classified environments (ideally in the defense industry)
Have experience leading integrated teams and working across organizations to manage IT infrastructure and security requirements
Have experience managing and implementing systems that meet NIST, CMMC, and NISPOM requirements, including safeguarding information designated ITAR, CUI, and other sensitive designations
Thrive on architecting systems, decomposing requirements, and ensuring your peers have the IT resources they need to execute their work
Hold personal values that match ours: ownership, initiative, dedication to mission, speed and inclusiveness
Are a high performer who thrives in a fast-paced environment
Are collaborative, humble, intellectually curious, and driven to solve hard problems
Hold a current security clearance (ideally Top Secret)
Feel strongly about the right of democracies to defend their sovereignty through the fielding of capabilities that bolster deterrence and decisive action
Join Helsing and work with world-leading experts in their fields
Helsing’s work is important. You’ll be directly contributing to the protection of democratic countries while balancing both ethical and geopolitical concerns.
The work is unique. We operate in a domain that has highly unusual technical requirements and constraints, and where robustness, safety, and ethical considerations are vital. You will face unique Engineering and AI challenges that make a meaningful impact in the world.
Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure.
The defense industry is entering the most exciting phase of the technological development curve. Advances in our field of work are not incremental: Helsing is part of, and often leading, historic leaps forward.
In our domain, success is a matter of order-of-magnitude improvements and novel capabilities. This means we take bets, aim high, and focus on big opportunities. Despite being a relatively young company, Helsing has already been selected for multiple significant government contracts.
We actively encourage healthy, proactive, and diverse debate internally about what we do and how we choose to do it.
Teams and individual engineers are trusted (and encouraged) to practice responsible autonomy and critical thinking, and to focus on outcomes, not conformity. At Helsing you will have a say in how we (and you!) work, the opportunity to engage on what does and doesn’t work, and to take ownership of aspects of our culture that you care deeply about.
What we offer
A focus on outcomes, not time-tracking
A generous compensation and benefits package (in addition to base salary) that includes, but may not be limited to, insurance coverage (medical and travel), flexible paid time off, paid holidays, and remote and/or hybrid work available depending on position. All compensation and benefits are subject to the terms and conditions of the underlying plans or programs, as applicable and as may be amended, terminated or superseded from time to time.
The annual base salary range for this full-time position in the location listed is $140,000 to $180,000 USD. The actual base salary offered to the successful candidate will be determined by a variety of factors including relevant experience, qualifications, education, skill level, interview performance, and the level and scope of the position.
Helsing is an Equal Opportunity Employer. We will consider all qualified applicants without regard to race, color, sex, sexual orientation, gender identity, national origin, age, disability, protected veteran status, genetics, or any other characteristic protected by applicable federal, state, or local law.
Helsing's Candidate Privacy and Confidentiality Regime can be found here.
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 13, 2025
Tech Lead - AI Engagement
Perplexity
1001-5000
-
United States
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 13, 2025
Tech Lead, AI Video Agent
HeyGen
201-500
-
United States
Full-time
Remote
false
About HeyGen
At HeyGen, our mission is to make visual storytelling accessible to all. Over the last decade, visual content has become the preferred method of information creation, consumption, and retention. But the ability to create such content, in particular videos, continues to be costly and challenging to scale. Our ambition is to build technology that equips more people with the power to reach, captivate, and inspire audiences.
Learn more at www.heygen.com. Visit our Mission and Culture doc here.
The opportunity
This Tech Lead, AI Video Agent role is ideal for someone with deep expertise in large-scale system optimization, ranking/search infrastructure, and real-time quality tuning. You’ll lead the development of our AI agent system for video generation, an intelligent assistant powered by large language models (LLMs) and multi-modal understanding, enabling users to create videos through natural interaction and automation.
Key Responsibilities
System Architecture & Optimization: Design and scale end-to-end systems that power intelligent video agents, with a focus on search, ranking, ads optimization, and personalized content delivery.
ML & LLM Integration: Work closely with research teams to deploy and fine-tune large language models (LLMs) and multimodal AI systems. Integrate these models into production pipelines for reasoning, planning, and content generation tasks.
Team Leadership: Lead and mentor a cross-functional team of engineers specializing in infrastructure, ML systems, and product development. Foster a high-performance and collaborative engineering culture.
AI Agent Development: Drive the development of AI agents capable of planning, reasoning, and executing video creation tasks through prompt orchestration, tool use, and feedback loops.
Infrastructure & Quality Engineering: Build robust backend services optimized for inference speed, system reliability, and content quality, including real-time evaluation and feedback mechanisms.
Cross-functional Collaboration: Partner with AI researchers, designers, and product managers to transform cutting-edge models into user-facing product features.
Qualifications
7+ years of engineering experience, including 2+ years in a technical leadership or managerial role.
Proven experience building and optimizing large-scale systems in areas such as ads ranking, search, or recommendation engines.
Strong background in machine learning system design, including experience deploying and scaling LLMs or generative models in production.
Familiarity with modern LLMs (e.g., GPT, Claude, PaLM), multimodal models (e.g., Flamingo, Gemini, LLaVA), and open-source frameworks like LangChain, LlamaIndex, or similar agentic toolkits.
Excellent programming and architectural design skills; experience with backend technologies such as Python, Go, or C++ is preferred.
Experience building infrastructure for real-time systems, with an emphasis on performance, observability, and scalability.
Strong collaboration and communication skills, with the ability to translate research innovation into production-ready systems.
What HeyGen Offers
Competitive salary and benefits package.
Dynamic and inclusive work environment focused on innovation and creativity.
Opportunities for professional growth and leadership development.
A fast-paced, impact-driven culture with direct access to cutting-edge AI applications.
Access to advanced tools, compute infrastructure, and cross-disciplinary talent.
Base Salary Range
$220,000 to $300,000 annually. Please note that the salary information is a general guideline only. HeyGen considers factors such as scope and responsibilities of the position, candidate's work experience, education/training, key skills, and internal equity, as well as location, market, and business considerations when extending an offer. As part of our total rewards package, HeyGen offers comprehensive benefits including a 401k plan, health benefits, generous PTO, a parental leave program and emotional health resources.
HeyGen is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. Join us at HeyGen and be part of a team that's reshaping the world of video creation through innovative technology!
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 13, 2025
Member of Technical Staff, Post training team
Cohere
501-1000
-
United Kingdom
Full-time
Remote
false
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 13, 2025
Red Team Specialist, Safeguards
Anthropic
1001-5000
-
United States
Full-time
Remote
true
MLOps / DevOps Engineer
Data Science & Analytics
Apply
June 12, 2025