Top MLOps / DevOps Engineer Jobs Openings in 2025
Looking for MLOps / DevOps Engineer opportunities? This curated list features the latest MLOps / DevOps Engineer job openings at AI-native companies. Whether you're an experienced professional or just entering the field, find roles that match your expertise, from startups to global tech leaders. Updated every day.
Kubernetes Platform Engineer
TensorWave
51-100
-
United States
Full-time
Remote
false
At TensorWave, we're leading the charge in AI compute, building a versatile cloud platform that's driving the next generation of AI innovation. We're focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what's possible in the AI landscape.
About the Role:
As a Kubernetes Platform Engineer focused on support and operations, you'll play a critical role in maintaining the stability and reliability of our bare-metal Kubernetes infrastructure. You will work closely with senior engineers, taking point on troubleshooting, incident response, and day-to-day cluster operations across multi-tenant workloads. This is a great opportunity for engineers ready to deepen their Kubernetes expertise while supporting cutting-edge AI environments in real time.

Responsibilities:
- Own and troubleshoot operational issues within Kubernetes environments
- Maintain and monitor core services (e.g., Cilium, HAProxy, Prometheus)
- Ensure uptime, performance, and reliability of multi-tenant clusters
- Assist with Ingress/Egress connectivity and network debugging
- Support internal and customer teams in secure, isolated VPC environments
- Collaborate with senior engineers on automation and cluster lifecycle improvements

Required Skills & Experience:
- 2–4 years of experience in DevOps, SRE, or Linux infrastructure roles
- 1+ years of hands-on experience with Kubernetes in production
- Familiarity with networking, CNI plugins, and core Linux troubleshooting
- Strong infrastructure-as-code mindset using tools like Helm, Terraform, or Ansible
- Solid experience with monitoring and logging tools (e.g., Prometheus, Grafana, Loki)
- Understanding of secure infrastructure design principles and least-privilege access
- Comfortable working in a team-oriented, fast-paced operational environment

Nice to Have:
- Experience with RKE2, Rancher, or similar platforms
- Experience troubleshooting or supporting AI or GPU-based workloads
- Familiarity with HAProxy, Cilium, or other Kubernetes ingress/networking tools

What We Bring:
In addition to a competitive salary, we offer a variety of benefits to support your needs, including:
- Stock Options
- 100% paid Medical, Dental, and Vision insurance
- Life and Voluntary Supplemental Insurance
- Short Term Disability Insurance
- Flexible Spending Account
- 401(k)
- Flexible PTO
- Paid Holidays
- Parental Leave
- Mental Health Benefits through Spring Health
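Day-to-day cluster operations like the ones this role describes usually start with pod-level triage across tenant namespaces. A minimal Python sketch of that first step (the pod records and the `flag_unready_pods` helper are illustrative stand-ins, not anything from the posting or the Kubernetes API schema):

```python
def flag_unready_pods(pods):
    """Return (namespace, name, phase) for pods not in a healthy phase.

    `pods` is a list of dicts shaped like the summaries you might extract
    from `kubectl get pods -A -o json` -- a simplified stand-in.
    """
    unhealthy = []
    for pod in pods:
        phase = pod.get("phase", "Unknown")
        if phase not in ("Running", "Succeeded"):
            unhealthy.append((pod["namespace"], pod["name"], phase))
    return unhealthy

pods = [
    {"namespace": "tenant-a", "name": "api-0", "phase": "Running"},
    {"namespace": "tenant-b", "name": "worker-1", "phase": "CrashLoopBackOff"},
    {"namespace": "kube-system", "name": "cilium-x", "phase": "Pending"},
]
print(flag_unready_pods(pods))
```

In practice this logic would sit behind a dashboard or alert rule rather than a script, but the triage step, separating healthy from unhealthy workloads per namespace, is the same.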
MLOps / DevOps Engineer
Data Science & Analytics
Apply
September 5, 2025
Senior Kubernetes Platform Engineer
TensorWave
51-100
-
United States
Full-time
Remote
false
At TensorWave, we're leading the charge in AI compute, building a versatile cloud platform that's driving the next generation of AI innovation. We're focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what's possible in the AI landscape.

Responsibilities:
- Design and deploy bare-metal Kubernetes clusters at scale using RKE2
- Collaborate with senior engineers on architectural improvements, infrastructure planning, and automation
- Lead the design and implementation of Ingress and Egress traffic solutions, leveraging HAProxy, Cilium, and other components
- Contribute to multi-tenant environment designs, including VPC-level isolation, network policy enforcement, and secure shared services
- Drive continuous improvement around observability using Prometheus and related tooling
- Serve as a subject matter expert in core Linux, networking, and Kubernetes internals
- Collaborate cross-functionally with AI platform teams and internal/external customers

Required Skills & Experience:
- 5+ years of experience in infrastructure or DevOps engineering roles
- 3+ years of hands-on experience managing Kubernetes in bare-metal environments
- Proven expertise in designing multi-tenant Kubernetes clusters with strong network isolation
- Deep understanding of Linux systems internals, networking (IPTables, CNI plugins, BGP), and DNS
- Experience with ingress controllers, load balancing, and service mesh (e.g., HAProxy, Cilium, Envoy)
- Strong infrastructure-as-code mindset using tools like Helm, Terraform, or Ansible
- Experience monitoring Kubernetes workloads with Prometheus and related observability tools

Nice to Have:
- Familiarity with RKE2, Rancher, or other downstream Kubernetes distributions
- Exposure to AI/ML infrastructure workloads or GPU resource scheduling
- Experience in infrastructure compliance or secure multi-tenancy (e.g., PCI, SOC2)

What We Bring:
In addition to a competitive salary, we offer a variety of benefits to support your needs, including:
- Stock Options
- 100% paid Medical, Dental, and Vision insurance
- Life and Voluntary Supplemental Insurance
- Short Term Disability Insurance
- Flexible Spending Account
- 401(k)
- Flexible PTO
- Paid Holidays
- Parental Leave
- Mental Health Benefits through Spring Health
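The multi-tenant isolation work this role calls for, network policy enforcement between tenants, boils down to label-selector matching at its core. A toy Python sketch of the equality-based selector semantics (the `traffic_allowed` helper and the policy shape are simplified illustrations, not real NetworkPolicy objects):

```python
def selector_matches(selector, labels):
    """True if every key/value in `selector` appears in `labels`.

    Mirrors the equality-based part of Kubernetes label selectors;
    a simplified illustration, not full NetworkPolicy semantics.
    """
    return all(labels.get(k) == v for k, v in selector.items())

def traffic_allowed(policy_selectors, src_labels):
    # Allow traffic if any ingress rule's pod selector matches the source pod.
    return any(selector_matches(sel, src_labels) for sel in policy_selectors)

policy = [{"tenant": "a"}]  # only pods labeled tenant=a may connect
print(traffic_allowed(policy, {"tenant": "a"}))
print(traffic_allowed(policy, {"tenant": "b"}))
```

Real enforcement happens in the CNI dataplane (Cilium in this stack), but the policy model engineers reason about when debugging cross-tenant connectivity looks much like this.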
MLOps / DevOps Engineer
Data Science & Analytics
Apply
September 5, 2025
Senior Support Engineer
Thoughtful
101-200
USD
0
120000
-
160000
United States
Full-time
Remote
false
Join Our Mission to Revolutionize Healthcare
Thoughtful is pioneering a new approach to automation for all healthcare providers! Our AI-powered Revenue Cycle Automation platform enables the healthcare industry to automate and improve its core business operations.

We're looking for an exceptional Principal Support Engineer to transform our support operations. As a critical member of our support organization, you will serve as the diagnostic expert and escalation point between our Customer Support Agents and Solutions Engineers. You'll bring enterprise-level troubleshooting expertise to complex customer issues, mentor the team on diagnostic methodologies, and help us scale our support capabilities to match our rapid growth. Your work will directly impact customer satisfaction and enable healthcare organizations to operate more efficiently.

Your Role:
- Diagnose: Lead comprehensive technical investigations using advanced troubleshooting methodologies (log analysis, network diagnostics, performance profiling)
- Resolve: Handle all support activities from Tier 1 through Tier 3, excluding code changes, serving as the primary escalation point for complex issues
- Mentor: Develop and train the support team on diagnostic best practices, creating playbooks and standardized troubleshooting procedures
- Collaborate: Work closely with CSAs, Solutions Engineers, and Forward Deployed Engineers to ensure smooth handoffs and efficient issue resolution
- Optimize: Identify patterns in support issues to drive process improvements and preventive measures
- Document: Create and maintain comprehensive troubleshooting guides, knowledge base articles, and root cause analyses

Your Qualifications:
- 5+ years of enterprise technical support experience in complex, distributed systems environments
- Expert-level diagnostic skills, including system and application log analysis, performance troubleshooting and profiling, network diagnostics (packet analysis, latency troubleshooting), and API and integration troubleshooting
- Proven track record of handling critical escalations and reducing resolution times
- Understanding of distributed systems, microservices architectures, and API integrations
- Excellent communication skills: ability to explain complex technical issues to both technical and non-technical audiences
- Familiarity with monitoring and observability tools (DataDog, New Relic, ELK stack, or similar)
- Working knowledge of scripting for automation (Python, Bash, or similar); not required to write production code
- Mentorship experience: demonstrated success in upskilling technical teams

What Sets You Apart:
- Healthcare IT experience: understanding of healthcare systems, HIPAA regulations, and healthcare data standards (HL7, FHIR)
- AI/ML exposure: familiarity with AI/ML systems and their unique troubleshooting requirements
- Python familiarity: ability to read and understand Python code for diagnostic purposes
- Process improvement mindset: track record of implementing scalable support processes
- Data-driven approach: experience using metrics to drive support improvements
- Crisis management: proven ability to lead during high-pressure customer escalations

Why Thoughtful?
- Competitive compensation aligned with senior support engineering market rates
- Equity participation: Employee Stock Options
- Health benefits: comprehensive medical, dental, and vision insurance
- Time off: generous leave policies and paid company holidays
- Impact: direct influence on support strategy and team development
- Growth: opportunity to build and lead as we scale

California Salary Range: $120,000–$160,000 USD
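The "identify patterns in support issues" part of this role is often just log-signature clustering: mask the variable bits of error lines and count what remains. A small hedged sketch of that idea (the log lines and the `claims-api` service name are invented examples, not from the posting):

```python
import re
from collections import Counter

def error_signature(line):
    """Collapse a log line to a coarse signature by masking IDs and numbers."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<hex>", line)
    return re.sub(r"\d+", "<n>", line)

def top_error_patterns(lines, k=3):
    """Count distinct error signatures and return the k most frequent."""
    counts = Counter(error_signature(l) for l in lines if "ERROR" in l)
    return counts.most_common(k)

logs = [
    "ERROR timeout after 30s calling claims-api",
    "ERROR timeout after 45s calling claims-api",
    "INFO request ok",
    "ERROR db connection 12 refused",
]
print(top_error_patterns(logs))
```

The two timeout lines collapse to one signature, surfacing the recurring issue; tools like the ELK stack do this at scale, but the underlying normalization step is the same.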
MLOps / DevOps Engineer
Data Science & Analytics
Apply
September 3, 2025
AI Infrastructure Engineer
TensorWave
51-100
USD
-
United States
Full-time
Remote
false
At TensorWave, we're leading the charge in AI compute, building a versatile cloud platform that's driving the next generation of AI innovation. We're focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what's possible in the AI landscape.

About the Role:
We are looking for an AI Infrastructure Engineer with a passion for high-performance computing and distributed systems. The ideal candidate will support our vision by developing and managing the compute infrastructure that underpins our innovative AI cloud services. This role involves building and maintaining robust AI clusters, ensuring optimal performance and reliability for our clients' most demanding workloads.

Responsibilities:
- Collaborate with a dynamic IT team to design, deploy, and maintain high-performance AI compute clusters supporting both AMD and NVIDIA GPU technologies.
- Lead initiatives to optimize cluster performance, resource utilization, and job scheduling to maximize efficiency across diverse AI workloads.
- Ensure system reliability, performance, and security for cloud services, implementing monitoring solutions and automated recovery systems.
- Work closely with the AI development team to align infrastructure capabilities with the evolving needs of TensorWave's cloud platform.
- Troubleshoot and resolve complex infrastructure issues across Linux systems, networking, and distributed computing environments, providing expert guidance to maintain high service levels.
- Implement and maintain configuration management, deployment automation, and infrastructure-as-code practices.

Essential Skills & Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- At least 5 years of relevant experience in infrastructure engineering, with a focus on supporting high-performance computing (HPC) and AI applications.
- Expert-level Linux system administration skills across multiple distributions.
- Strong experience with clustered computing environments (GPU, CPU, or hybrid clusters).
- Solid understanding of networking fundamentals, including TCP/IP, routing protocols, and high-speed interconnects.
- Experience with container technologies (Docker, Kubernetes), job schedulers (Slurm, PBS), and configuration management tools.
- Familiarity with AMD and NVIDIA GPU ecosystems, CUDA, ROCm, and their infrastructure requirements.
- Exceptional debugging and problem-solving abilities with a methodical approach to complex system issues.
- Demonstrated ability to learn new technologies quickly and adapt to rapidly evolving infrastructure needs.

We're looking for resilient, adaptable people to join our team: folks who enjoy collaborating and tackling tough challenges. We're all about offering real opportunities for growth, letting you dive into complex problems and make a meaningful impact through creative solutions. If you're a driven contributor, we encourage you to explore opportunities to make an impact at TensorWave. Join us as we redefine the possibilities of intelligent computing.

What We Bring:
In addition to a competitive salary, we offer a variety of benefits to support your needs, including:
- Stock Options
- 100% paid Medical, Dental, and Vision insurance
- Life and Voluntary Supplemental Insurance
- Short Term Disability Insurance
- Flexible Spending Account
- 401(k)
- Flexible PTO
- Paid Holidays
- Parental Leave
- Mental Health Benefits through Spring Health
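The job-scheduling optimization mentioned in this posting is, at its simplest, a bin-packing problem: place GPU jobs onto nodes with free capacity. A toy first-fit sketch (a deliberately simplified model of what Slurm-style schedulers do; the node and job names are invented):

```python
def first_fit(jobs, nodes):
    """Assign each job (name, gpus_needed) to the first node with capacity.

    `nodes` maps node name -> free GPU count and is mutated as jobs land.
    Jobs that fit nowhere map to None. Real schedulers like Slurm also
    weigh priority, fairness, and topology -- this shows only placement.
    """
    placement = {}
    for name, need in jobs:
        for node, free in nodes.items():
            if free >= need:
                nodes[node] = free - need
                placement[name] = node
                break
        else:
            placement[name] = None
    return placement

nodes = {"node-1": 8, "node-2": 8}
jobs = [("train-a", 8), ("train-b", 4), ("infer-c", 6)]
print(first_fit(jobs, nodes))
```

Note how the 6-GPU job is stranded even though 4 GPUs remain free across the cluster: fragmentation like this is exactly why utilization tuning is called out as a responsibility.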
MLOps / DevOps Engineer
Data Science & Analytics
Apply
September 1, 2025
Member of Technical Staff - GPU Infrastructure
Prime Intellect
11-50
-
United States
Full-time
Remote
false
Building the Future of Decentralized AI Development
At Prime Intellect, we're enabling the next generation of AI breakthroughs by helping our customers deploy and optimize massive GPU clusters. As our Solutions Architect for GPU Infrastructure, you'll be the technical expert who transforms customer requirements into production-ready systems capable of training the world's most advanced AI models.

We recently raised $15mm in funding (a total of $20mm raised) led by Founders Fund, with participation from Menlo Ventures and prominent angels including Andrej Karpathy (Eureka AI, Tesla, OpenAI), Tri Dao (Chief Scientific Officer of Together AI), Dylan Patel (SemiAnalysis), Clem Delangue (Hugging Face), Emad Mostaque (Stability AI), and many others.

Core Technical Responsibilities
This customer-facing role combines deep technical expertise with hands-on implementation. You'll be instrumental in:

Customer Architecture & Design
- Partner with clients to understand workload requirements and design optimal GPU cluster architectures
- Create technical proposals and capacity planning for clusters ranging from 100 to 10,000+ GPUs
- Develop deployment strategies for LLM training, inference, and HPC workloads
- Present architectural recommendations to technical and executive stakeholders

Infrastructure Deployment & Optimization
- Deploy and configure orchestration systems including SLURM and Kubernetes for distributed workloads
- Implement high-performance networking with InfiniBand, RoCE, and NVLink interconnects
- Optimize GPU utilization, memory management, and inter-node communication
- Configure parallel filesystems (Lustre, BeeGFS, GPFS) for optimal I/O performance
- Tune system performance from kernel parameters to CUDA configurations

Production Operations & Support
- Serve as the primary technical escalation point for customer infrastructure issues
- Diagnose and resolve complex problems across the full stack: hardware, drivers, networking, and software
- Implement monitoring, alerting, and automated remediation systems
- Provide 24/7 on-call support for critical customer deployments
- Create runbooks and documentation for customer operations teams

Technical Requirements

Required Experience
- 3+ years of hands-on experience with GPU clusters and HPC environments
- Deep expertise with SLURM and Kubernetes in production GPU settings
- Proven experience with InfiniBand configuration and troubleshooting
- Strong understanding of NVIDIA GPU architecture, the CUDA ecosystem, and the driver stack
- Experience with infrastructure automation tools (Ansible, Terraform)
- Proficiency in Python, Bash, and systems programming
- Track record of customer-facing technical leadership

Infrastructure Skills
- NVIDIA driver installation and troubleshooting (CUDA, Fabric Manager, DCGM)
- Container runtime configuration for GPUs (Docker, containerd, Enroot)
- Linux kernel tuning and performance optimization
- Network topology design for AI workloads
- Power and cooling requirements for high-density GPU deployments

Nice to Have
- Experience with 1000+ GPU deployments
- NVIDIA DGX, HGX, or SuperPOD certification
- Distributed training frameworks (PyTorch FSDP, DeepSpeed, Megatron-LM)
- ML framework optimization and profiling
- Experience with AMD MI300 or Intel Gaudi accelerators
- Contributions to open-source HPC/AI infrastructure projects

Growth Opportunity
You'll work directly with customers pushing the boundaries of AI, from startups training foundation models to enterprises deploying massive inference infrastructure. You'll collaborate with our world-class engineering team while having a direct impact on systems powering the next generation of AI breakthroughs.

We value expertise and customer obsession. If you're passionate about building reliable, high-performance GPU infrastructure and have a track record of successful large-scale deployments, we want to talk to you.

Apply now and join us in our mission to democratize access to planetary-scale computing.
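The capacity-planning duty above (clusters from 100 to 10,000+ GPUs) starts with back-of-envelope arithmetic: GPUs to nodes to racks. A hedged Python sketch (the 8-GPU-per-node default matches common HGX-style servers, but both defaults here are illustrative assumptions, not vendor specs or anything from the posting):

```python
import math

def cluster_plan(total_gpus, gpus_per_node=8, nodes_per_rack=4):
    """Back-of-envelope node and rack counts for a GPU cluster.

    gpus_per_node=8 reflects common 8-GPU servers; nodes_per_rack=4 is a
    rough power/cooling density assumption. Both are illustrative defaults.
    """
    nodes = math.ceil(total_gpus / gpus_per_node)
    racks = math.ceil(nodes / nodes_per_rack)
    return {"nodes": nodes, "racks": racks}

# A 1,000-GPU request: 125 nodes, 32 racks under these assumptions.
print(cluster_plan(1000))
```

Real proposals layer in interconnect topology, power budgets, and spares on top of this, but every sizing conversation begins with this division.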
MLOps / DevOps Engineer
Data Science & Analytics
Solutions Architect
Software Engineering
Apply
August 29, 2025
Backline Manager (Apache Spark™)
Databricks
5000+
-
Netherlands
Remote
false
P-1455 Job Description
At Databricks, we are passionate about enabling data teams to solve the world's toughest problems, from making the next mode of transportation a reality to accelerating the development of medical breakthroughs. We do this by building and running the world's best data and AI infrastructure platform so our customers can use deep data insights to improve their business. Founded by engineers, and customer-obsessed, we leap at every opportunity to tackle technical challenges, from designing next-gen UI/UX for interfacing with data to scaling our services and infrastructure across millions of virtual machines. And we're only getting started.

About the Team
The Backline Engineering Team serves as the critical bridge between Engineering and Frontline Support. We handle complex technical issues and escalations across the Apache Spark™ ecosystem and the Databricks Platform stack. With a strong focus on customer success, we are committed to delivering exceptional customer satisfaction by providing deep technical expertise, proactive issue resolution, and continuous improvements to the platform. We emphasize automation and tooling to enhance troubleshooting efficiency, reduce manual effort, and improve the overall supportability of the platform. By developing smart solutions and streamlining workflows, we drive operational excellence and ensure a seamless experience for both customers and internal teams.

The impact you will have:
- Hire and develop top talent to build an outstanding team.
- Mentor engineers, provide clear feedback, and develop future leaders in the team.
- Establish and maintain high standards in troubleshooting, automation, and tooling to improve efficiency.
- Work closely with Engineering to enhance observability, debugging tools, and automation, reducing escalations.
- Collaborate with Frontline Support, Engineering, and Product teams to improve customer escalations and support processes.
- Define a long-term roadmap for Backline, focusing on automation, tool development, bug fixing, and proactive issue resolution.
- Take ownership of high-impact customer escalations by leading critical incident response during Databricks runtime outages and major incidents.
- Participate in weekday and weekend on-call rotations, ensuring fast and effective resolution of urgent issues.
- Balance real-time escalations with day-to-day planning, multitasking efficiently to drive operational excellence and provide top-tier support for mission-critical customer environments.

What We Look For:
- 10–12 years of experience in the Big Data/data-warehousing ecosystem with expertise in Apache Spark™, including at least 4+ years in a managerial role.
- Proven ability to manage and mentor a team of Backline Engineers, guiding career development.
- Strong technical expertise in Apache Spark™, Databricks Runtime, Delta Lake, Hadoop, and cloud platforms (AWS, Azure, GCP) to troubleshoot complex customer issues.
- Ability to oversee and drive customer escalations, ensuring seamless coordination between Frontline Support and Backline Engineering.
- Experience in designing and developing best practices, runbooks/playbooks, and enablement programs to improve troubleshooting efficiency.
- Strong automation mindset, identifying tooling and process gaps and leading efforts to build scripts and automated tools to enhance support operations.
- Skilled in collaborating with Engineering and Product Management teams, contributing to support readiness programs and shaping product supportability improvements.
- Experience in building monitoring and alerting mechanisms, proactively identifying long-running cases and driving early intervention.
- Ability to handle critical technical escalations, providing deep expertise in architecture, best practices, product functionality, performance tuning, and cloud operations.
- Strong interviewing and hiring capabilities, identifying and recruiting top Backline talent with expertise in big data and cloud ecosystems.

About Databricks
Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake, and MLflow. To learn more, follow Databricks on Twitter, LinkedIn, and Facebook.
Benefits
At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees. For specific details on the benefits offered in your region, please visit https://www.mybenefitsnow.com/databricks.
Our Commitment to Diversity and Inclusion At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics. Compliance If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.
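The "proactively identifying long-running cases" requirement above amounts to sweeping open tickets against an age threshold. A minimal Python sketch of that alerting check (the case tuples and the 14-day SLA are invented for illustration, not Databricks process):

```python
from datetime import datetime, timedelta

def long_running_cases(cases, now, sla=timedelta(days=14)):
    """Return IDs of cases still open longer than `sla`.

    `cases` is a list of (case_id, opened_at, closed) tuples -- a
    simplified stand-in for a real ticketing system's data model.
    """
    return [
        cid for cid, opened, closed in cases
        if not closed and now - opened > sla
    ]

now = datetime(2025, 8, 29)
cases = [
    ("C-100", datetime(2025, 8, 1), False),   # 28 days open -> flag
    ("C-101", datetime(2025, 8, 20), False),  # 9 days open  -> within SLA
    ("C-102", datetime(2025, 7, 1), True),    # closed       -> ignore
]
print(long_running_cases(cases, now))
```

Wired into a scheduled job, a check like this is what turns "early intervention" from intent into an alert someone actually receives.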
MLOps / DevOps Engineer
Data Science & Analytics
Data Engineer
Data Science & Analytics
Apply
August 29, 2025
Customer Support Engineer, India
Together AI
201-500
-
India
Full-time
Remote
true
Customer Support Engineer
Location: India (Remote)

About the role:
As a Customer Support Engineer at a pioneering AI company, you'll be the first line of defense supporting customers as they build out training, fine-tuning, and inference solutions with Together AI. You'll dive deep into complex technical challenges, providing swift and effective solutions while serving as a product expert. As part of the Customer Experience organization, you will collaborate closely with product and sales, driving continuous improvement of our offerings. This is an exciting opportunity for a deeply technical professional passionate about AI and customer success to make a significant impact in a fast-paced, innovative environment.

Responsibilities:
- Engage directly with customers to tackle and resolve complex technical challenges involving our cutting-edge GPU clusters and our inference and fine-tuning services; ensure swift and effective solutions every time.
- Become a product expert in all of our Gen AI solutions, serving as the last line of technical defense before issues are escalated to Engineering and Product teams.
- Collaborate seamlessly across Engineering, Research, and Product teams to address customer concerns; work with senior leaders both internally and externally to ensure the highest levels of customer satisfaction.
- Transform customer insights into action by identifying patterns in support cases and working with Engineering and Go-To-Market teams to drive Together's roadmap (e.g., future models to support).
- Maintain detailed documentation of system configurations, procedures, troubleshooting guides, and FAQs to facilitate knowledge sharing with the team and customers.
- Be flexible in providing support coverage during holidays, nights, and weekends as required by business needs to ensure consistent and reliable service for our customers.

Qualifications:
- 5+ years of experience in a customer-facing technical role, with at least 1 year in a support function in AI
- Strong technical background, with knowledge of AI, ML, and GPU technologies and their integration into high-performance computing (HPC) environments
- Familiarity with infrastructure services (e.g., Kubernetes, SLURM), infrastructure-as-code solutions (e.g., Ansible), high-performance network fabrics, NFS-based storage management, container infrastructure, and scripting and programming languages
- Familiarity with operating storage systems in HPC environments such as VAST and Weka
- Familiarity with inspecting and resolving network-related errors
- Strong knowledge of Python, TypeScript, and/or JavaScript, with testing/debugging experience using curl and Postman-like tools
- Foundational understanding of the installation, configuration, administration, troubleshooting, and securing of compute clusters
- Complex technical problem solving and troubleshooting, with a proactive approach to issue resolution
- Ability to work cross-functionally with teams such as Sales, Engineering, Support, Product, and Research to drive customer success
- Strong sense of ownership and willingness to learn new skills to ensure both team and customer success
- Excellent communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders
- Ability to operate in dynamic environments, adept at managing multiple projects, and comfortable with frequent context switching and prioritization

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work for the respective hiring region. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
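When debugging customer-reported API errors of the kind this role handles, a standard first move is to distinguish transient failures from persistent ones with a retry-and-backoff probe. A generic Python sketch of the pattern (the `flaky` endpoint is simulated here; nothing in it is specific to Together AI's API):

```python
import time

def with_retries(call, attempts=3, base_delay=0.01):
    """Retry `call` with exponential backoff; re-raise after the last try.

    `call` is any zero-argument function -- in practice, an HTTP request
    made with a client library or curl-equivalent.
    """
    for i in range(attempts):
        try:
            return call()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i))

# Simulate an endpoint that fails twice, then succeeds.
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("upstream reset")
    return 200

print(with_retries(flaky))
```

If the call succeeds under retries, the issue is likely transient (network, load); if it never does, it escalates as a persistent fault, which is exactly the triage split support engineers document.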
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
August 28, 2025
Datacenter Liquid Cooling Architect
Tenstorrent
1001-5000
USD
0
100000
-
500000
Canada
United States
Full-time
Remote
false
Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch, and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.

At Tenstorrent, we're building the future of AI compute, and keeping that future cool requires innovation at scale. We're looking for an engineer who thrives on solving complex infrastructure challenges to design and deliver the next generation of liquid cooling systems for large AI clusters. In this role, you'll work closely with cross-functional teams, create resilient and reliable cooling strategies, and help shape datacenter infrastructure that powers breakthrough AI workloads.

This role is hybrid, based out of Toronto, Canada; Austin, Texas; or Santa Clara, California. We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.

Who You Are:
- An engineer with a background in datacenter thermal design (a degree in Electrical or Computer Engineering is valuable but not required).
- Someone who enjoys tackling complex liquid cooling challenges and has experience working directly with cooling systems.
- Comfortable with fluids, pressure testing, and leak detection to ensure safe, reliable designs.
- Familiar with monitoring and control systems and how they integrate with facility HVAC infrastructure.
- Experienced with single-phase liquid cooling (bonus if you've worked with two-phase).

What We Need:
- A technical leader to architect and implement liquid cooling infrastructure for AI training and inference clusters.
- An engineer to define operational standards, safety protocols, and CDU control strategies that maintain uptime.
- A collaborator who can partner with mechanical, software, and system engineering teams to deliver advanced cooling solutions.
- An innovator to design leak detection methods and monitoring systems that safeguard mission-critical environments.
- A trusted contributor to support AI cluster deployments for internal and external customers.

What You Will Learn:
- Collaboration with experts across thermal, mechanical, and systems engineering.
- Practical experience integrating telemetry, sensors, and CDU controls into datacenter operations.
- Exposure to next-generation liquid cooling technologies, including pumped two-phase solutions.
- A chance to help define industry-leading infrastructure that supports the world's most advanced AI systems.

Compensation for all engineers at Tenstorrent ranges from $100k to $500k, including base and variable compensation targets. Experience, skills, education, background, and location all impact the actual offer made. Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.

This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E:1, and E:2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or your ability to obtain prior license approval from the U.S. Commerce Department or the applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.
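Sizing the liquid cooling this role architects starts from the heat-balance relation Q = m_dot * c_p * dT: the coolant flow needed scales with heat load and inversely with the allowed temperature rise. A hedged Python sketch with water-like defaults (the numbers are textbook fluid properties and an illustrative dT, not a CDU spec or anything from the posting):

```python
def coolant_flow_lpm(heat_kw, delta_t=10.0, cp=4.18, density=1.0):
    """Required coolant flow in liters/minute to absorb `heat_kw` of IT load.

    Rearranges Q = m_dot * c_p * dT, with c_p in kJ/(kg*K), dT in Kelvin,
    and density in kg/L. Defaults approximate water; glycol mixes differ.
    """
    kg_per_s = heat_kw / (cp * delta_t)        # mass flow from heat balance
    return kg_per_s / density * 60.0           # convert kg/s to L/min

# A 100 kW rack at a 10 K coolant rise needs roughly 143-144 L/min of water.
print(round(coolant_flow_lpm(100), 1))
```

Halving the allowed temperature rise doubles the required flow, which is why dT budgets, pump sizing, and CDU control strategies are designed together.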
MLOps / DevOps Engineer
Data Science & Analytics
Apply
August 26, 2025
Senior Manager - Security Incident Detection and Response
Lambda AI
501-1000
USD
0
360000
-
540000
United States
Full-time
Remote
false
We're here to help the smartest minds on the planet build Superintelligence. The labs pushing the edge? They run on Lambda. Our gear trains and serves their models, our infrastructure scales with them, and we move fast to keep up. If you want to work on massive, world-changing AI deployments with people who love action and hard problems, we're the place to be.
If you'd like to build the world's best deep learning cloud, join us.
*Note: This position requires presence in our San Francisco office location 4 days per week; Lambda’s designated work from home day is currently Tuesday.

About the Role
Lambda Security protects some of the world's most valuable digital assets: invaluable training data, model weights representing immense computational investments, and the sensitive inputs required to leverage best-of-breed AI models. We're responsible for securing every byte that powers breakthrough artificial intelligence.

Reporting to the Head of Security, you'll lead the Detection & Response team that acts as an intelligent backstop, ensuring Lambda is the safest place to build with AI by catching security issues in real time while enabling our business to move at hypergrowth velocity. Your team will transform reactive security operations into a proactive threat management engine, dedicating the majority of their effort to automation, threat hunting, and capability building rather than constant firefighting.

Your team will directly affect customers’ trust in the safety of their data by implementing enterprise-grade detection capabilities, automating incident response workflows, and hardening our multi-cloud and bare-metal infrastructure, all while you build sustainable programs where senior engineers thrive solving novel security challenges. With unique access to LLMs hosted on our own infrastructure, your team will pioneer AI-powered security solutions that wouldn't be possible anywhere else. Key priorities include 24/7 operational coverage, maintaining customer trust through rapid incident response, and delivering a comprehensive D&R strategy within your first 6 months.

If you're excited about building security operations so efficient that they're "almost boring," where automation handles the routine so your team can focus on the novel, we want to talk.

We value diverse backgrounds, experiences, and skills, and we are excited to hear from candidates who can bring unique perspectives to our team. If you do not exactly meet this description but believe you may be a good fit, please still apply and help us understand your readiness for a Senior Manager role. Your application is not a waste of our time.

What You’ll Do

Team Leadership & Management:
Build, hire, and lead a high-performing Detection & Response team that can scale with Lambda's hypergrowth while maintaining 24/7 operational excellence
Define team processes, culture, and operating rhythms that balance startup agility with security discipline, creating an environment where senior engineers thrive on automation and novel challenges
Conduct regular one-on-ones, provide constructive feedback, and create clear career development paths that help security engineers advance their technical and leadership skills
Drive outcomes by managing project priorities, deadlines, and deliverables while establishing our blameless post-incident culture focused on systemic improvements rather than individual accountability

Technical Strategy & Execution:
Define and implement threat management frameworks that transform reactive security operations into proactive threat hunting and detection, establishing automation standards that eliminate repetitive work and enable your team to focus on novel challenges
Architect incident response processes and escalation frameworks that protect Lambda from impact while scaling with the company’s growth
Guide technology choices and evangelize new security tools, including pioneering AI-powered detection capabilities using our direct access to state-of-the-art LLMs

Strategic Collaboration & Business Impact:
Create data-driven insights showing where we are reacting most frequently to guide investments in preventative controls
Partner with Product and Platform engineering teams to evolve our detection and response capabilities as Lambda’s infrastructure grows
Establish executive reporting that translates technical incidents into business impact while maintaining a blameless culture focused on systemic improvements

Operational Excellence & Scaling:
Drive weekly operations reviews that ensure nothing falls through the cracks while building institutional knowledge and defining repeatable processes from every incident
Define sustainable on-call rotations and operational procedures that maintain 24/7 coverage without burning out senior engineers
Establish the team's 6-month strategic roadmap for comprehensive D&R capabilities while defining success criteria and measurable outcomes

What We Think a Candidate Needs to Demonstrate to Succeed
10+ years of security experience with 5+ years leading technical teams, demonstrating the ability to build and manage independently
Proven ability to define and build security programs from the ground up that accelerate business initiatives, with demonstrated experience establishing team processes, technical frameworks, and cross-functional partnerships
Excellence at building automation-first security programs where technology eliminates toil and teams never do the same thing twice
Clear understanding of the unique requirements of securing a cloud infrastructure provider
Proven ability to create sustainable team cultures where senior engineers thrive long-term rather than burning out on repetitive tasks
Strong judgment in security response, understanding real business impact and calibrating actions proportionally
Track record of translating technical security work into executive communications and business-aligned metrics
Thrives in high-ambiguity environments where you must build structure while executing at startup pace

Nice to Have
Excitement about leveraging our direct access to state-of-the-art LLMs to revolutionize security operations: imagine AI-powered threat hunting, automated security report generation, and intelligent vulnerability prioritization at a scale only possible when you host the AI infrastructure yourself
Experience building D&R programs at AI/ML companies
Track record using AI/ML for security operations automation (yes, we know it’s all brand new)
Background scaling security during hypergrowth (10x growth phases)
Deep technical background allowing hands-on contribution when needed
Experience with both build and buy decisions for security tooling
Experience driving or providing significant evidence for compliance audits, such as SOC 2, ISO 27001, PCI-DSS, HIPAA/HITECH, or FedRAMP

Salary Range Information
The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda
Founded in 2012, ~400 employees (2025) and growing fast
We offer generous cash & equity compensation
Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
Health, dental, and vision coverage for you and your dependents
Wellness and commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use

A Final Note:
You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
August 26, 2025
Detection and Response Engineer
Cerebras Systems
501-1000
-
India
Remote
false
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs. Cerebras' current customers include global corporations across multiple industries, national labs, and top-tier healthcare systems. In January, we announced a multi-year, multi-million-dollar partnership with Mayo Clinic, underscoring our commitment to transforming AI applications across various fields. In August, we launched Cerebras Inference, the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services.

About The Role
We are seeking an exceptional Detection and Response Engineer to serve on the front lines, where you will build systems to detect threats, investigate incidents, and lead coordinated response across teams. The right candidate brings hands-on experience creating reliable detections, automating repetitive tasks, and turning investigation findings into durable improvements to our security program, with an interest in exploring AI-driven automation.

Responsibilities
Create and optimize detections, playbooks, and workflows to quickly identify and respond to potential incidents.
Investigate security events and participate in incident response, including on-call responsibilities.
Automate investigation and response workflows to reduce time to detect and remediate incidents.
Build and maintain detection and response capabilities as code, applying modern software engineering rigor.
Explore and apply emerging approaches, potentially leveraging AI, to strengthen our security posture.
Document investigation and response procedures as clear runbooks for triage, escalation, and containment.

Skills And Qualifications
3–5 years of experience in detection engineering, incident response, or security engineering.
Strong proficiency in Python and query languages such as SQL, with the ability to write clean, maintainable, and testable code.
Practical knowledge of detection and response across cloud, identity, and endpoint environments.
Familiarity with attacker behaviors and the ability to translate them into durable detection logic.
Strong fundamentals in operating systems, networking, and log analysis.
Excellent written communication skills, with the ability to create clear documentation.

Why Join Cerebras
People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:
Build a breakthrough AI platform beyond the constraints of the GPU.
Publish and open source their cutting-edge AI research.
Work on one of the fastest AI supercomputers in the world.
Enjoy job stability with startup vitality.
Our simple, non-corporate work culture that respects individual beliefs.
Read our blog: Five Reasons to Join Cerebras in 2025.

Apply today and become part of the forefront of groundbreaking advancements in AI! Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth, and support of those around them. This website or its third-party tools process personal data.
For more details, click here to review our CCPA disclosure notice.
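The "detection and response capabilities as code" responsibility in the listing above is a common industry pattern rather than anything Cerebras-specific. As a rough, hypothetical sketch (the rule name, log fields, and threshold below are illustrative assumptions, not any employer's actual schema), a detection rule managed as code might look like:

```python
from dataclasses import dataclass
from typing import Callable

# A minimal "detection as code" sketch: each rule is plain data plus a predicate,
# so rules can live in version control and be unit-tested like any other code.
@dataclass
class DetectionRule:
    name: str
    severity: str
    predicate: Callable[[dict], bool]  # receives one parsed log event

    def matches(self, event: dict) -> bool:
        return bool(self.predicate(event))

# Hypothetical rule: many failed logins from one source in a single batch.
def too_many_failed_logins(event: dict) -> bool:
    return event.get("action") == "login_failure" and event.get("count", 0) >= 5

RULES = [DetectionRule("brute-force-login", "high", too_many_failed_logins)]

def run_rules(events: list[dict]) -> list[tuple[str, dict]]:
    """Evaluate every rule against every event; return (rule name, event) hits."""
    return [(r.name, e) for r in RULES for e in events if r.matches(e)]
```

Because each rule is an ordinary object, it can ship with assertions in CI, which is one concrete reading of the "modern software engineering rigor" the posting asks for.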
MLOps / DevOps Engineer
Data Science & Analytics
Apply
August 26, 2025
Engineering Manager - Site Reliability Engineering/SRE (f/m/d)*
Parloa
201-500
-
Germany
Full-time
Remote
false
YOUR MISSION:
As an Engineering Manager for Site Reliability Engineering & Developer Experience (f/m/d) at Parloa, you will nurture and support a collaborative team that ensures the reliability, scalability, and performance of our products while empowering engineers with thoughtful tools and workflows. Your mission is to cultivate and grow SRE practices that enable our architectural transformation and ensure our systems meet availability targets. You'll foster a caring culture where reliability is a shared responsibility and where automation helps reduce toil, empowering engineers to focus on meaningful work.

IN THIS ROLE YOU WILL:
Build and nurture a supportive team that harmonizes SRE excellence with developer experience
Collaborate to establish SRE practices: SLI/SLOs, error budgets, and trust-based postmortems using Datadog metrics
Create comprehensive observability strategies leveraging our monitoring stack
Support sustainable incident response, on-call processes, and automation using GitHub Actions and Terraform to improve MTTR
Partner with engineers and engineering teams to integrate reliability practices into CI/CD pipelines (ArgoCD, GitHub Actions) while supporting developer wellbeing
Foster adoption of reliability best practices across our Kubernetes-based infrastructure through mentorship and collaboration
Guide teams in leveraging our Azure cloud platform effectively while preparing for multi-cloud architectures
Support the thoughtful adoption of AI tools (Cursor, GitHub Copilot) for operational efficiency

WHAT YOU BRING TO THE TABLE:
Experience supporting SRE, DevOps, or platform teams with a focus on reliability and collaboration
Understanding of SRE principles: SLI/SLOs, error budgets, and toil reduction
Hands-on experience with our observability stack (Datadog for metrics/APM, ELK for sensitive logs) and production systems at scale
Deep empathy for developer workflows and creating sustainable on-call processes that support work-life balance
Familiarity with Infrastructure as Code using Terraform and container orchestration with Kubernetes
Experience with CI/CD platforms (GitHub Actions, ArgoCD) and integrating reliability into deployment pipelines
Understanding of Azure cloud services and preparing organizations for multi-cloud transformations
Warm communication skills for incident support, postmortem facilitation, and cross-team collaboration
Experience with authentication systems (e.g. Okta, EntraID) and their role in system reliability
Commitment to building inclusive teams that balance operational care with innovation and wellbeing
Appreciation for AI-assisted tools in improving operational efficiency and reducing toil
Background with databases (MySQL, Redis, MongoDB) and their reliability considerations is valued
Experience with multi-cloud architectures and distributed systems is warmly welcomed
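SLI/SLOs and error budgets appear twice in this posting. As a quick refresher on the standard SRE arithmetic behind them (generic, not anything Parloa-specific), an error budget is simply the downtime a service may accrue in a window while still meeting its availability SLO:

```python
# Generic SRE arithmetic: the error budget implied by an availability SLO.
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime in the window for a given availability SLO."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes

def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative means the SLO is blown)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget

# Example: a 99.9% SLO over 30 days allows 43.2 minutes of downtime;
# spending the budget faster than that is what burn-rate alerts watch for.
```

This is why "SLI/SLOs and error budgets" pair naturally with the posting's MTTR goal: the budget quantifies how much incident time the team can absorb before reliability work must take priority over feature work.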
WHAT'S IN IT FOR YOU?
Join a diverse team of 40+ nationalities with flat hierarchies and a collaborative company culture.
Opportunity to build and scale your career at the intersection of customer-facing roles and engineering in a dynamic startup on its journey to become an international leader in SaaS platforms for Conversational AI.
Deutschlandticket, Urban Sports Club, JobRad, Nilo Health, weekly sponsored office lunches.
Competitive compensation and equity package.
Flexible working hours, 28 vacation days, and workation opportunities.
Access to a training and development budget for continuous professional growth.
Regular team events, game nights, and other social activities.
Hybrid work environment. However, we love to build real connections and want to welcome everyone in our beautiful Berlin office on certain days.

Your recruiting process at Parloa: Recruiter video call → Meet your manager → Technical Interview + Technical Leadership Interview → Bar Raiser Interview

Why Parloa?
Parloa is one of the fastest growing startups in the world of Generative AI and customer service. Parloa’s voice-first GenAI platform for contact centers is built on the best AI technology to automate customer service with natural-sounding conversations for outstanding experiences on all communication channels. Leveraging natural language processing (NLP) and machine learning, Parloa creates intelligent phone and chat solutions for businesses that turn contact centers into value centers by boosting customer service efficiency. The Parloa platform resolves the majority of customer queries quickly and automatically, allowing human agents to focus on complex issues and relationships. Parloa was founded in 2018 by Malte Kosub and Stefan Ostwald and today employs over 400 people in Berlin, Munich, and New York.

When you join Parloa, you become part of a dynamic and innovative team made up of over 34 nationalities that’s revolutionizing an entire industry. We’re passionate about growing together and creating opportunities for personal and professional development. With our recent $120 million Series C investment, we’re expanding globally and looking for talented individuals to join us on this exciting journey.

Do you have questions about Parloa, the role, or our team before you apply? Please feel free to get in touch with our Hiring Team.

Parloa is committed to upholding the highest data protection standards for our clients' and employees' data. All our employees are instrumental in ensuring the utmost care, GDPR, and ISO compliance, including ISO 27001, in handling sensitive information.

* We provide equal opportunities to all qualified applicants regardless of race, gender, sexual orientation, age, religion, national origin, disability status, socioeconomic background, and other characteristics.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
August 26, 2025
Manager, HPC Design
Lambda AI
501-1000
USD
0
330000
-
550000
United States
Full-time
Remote
false
We're here to help the smartest minds on the planet build Superintelligence. The labs pushing the edge? They run on Lambda. Our gear trains and serves their models, our infrastructure scales with them, and we move fast to keep up. If you want to work on massive, world-changing AI deployments with people who love action and hard problems, we're the place to be.
If you'd like to build the world's best deep learning cloud, join us.
*Note: This position requires presence in our San Francisco office location 4 days per week; Lambda’s designated work from home day is currently Tuesday.

Engineering at Lambda is responsible for building and scaling our cloud offering. Our scope includes the Lambda website, cloud APIs and systems, as well as internal tooling for system deployment, management, and maintenance.

What You’ll Do
Lead a team of system designers responsible for translating architecture into detailed, executable infrastructure designs across compute, storage, and networking.
Build and mature repeatable processes that turn Lambda’s reference architectures into site- and customer-specific deployment plans.
Own the delivery of infrastructure design packages, ensuring solutions meet functional requirements, budget targets, and delivery timelines.
Partner closely with architecture, product, engineering, and customer teams to ensure alignment between design execution and platform roadmap.
Guide the creation and review of design specifications, integration plans, and validation processes for new deployments.
Mentor and grow a high-performing team of infrastructure designers, focused on disciplined execution and iterative delivery.

You
Have 5+ years of experience designing HPC or cloud infrastructure at scale, with 2+ years in a technical leadership or management role.
Understand the practical application of compute, storage, and network architectures in real-world, large-scale deployments.
Can take an established architectural direction and lead your team in producing high-quality designs that deliver reliably, on time, and within scope.
Are adept at managing tradeoffs and risk in infrastructure delivery, balancing technical ambition with operational realism.
Have experience mentoring senior-level technical contributors and building cohesive, execution-focused teams.

Nice to Have
Experience supporting AI/ML or simulation workloads in high-performance environments.
Familiarity with system integration, hardware bring-up, or working alongside hardware engineering teams.
Background in designing for hyperscale or enterprise environments, especially where customer requirements drive significant customization.
Understanding of delivery workflows, BOM creation, and vendor coordination in infrastructure deployment contexts.

Salary Range Information
The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda
Founded in 2012, ~400 employees (2025) and growing fast
We offer generous cash & equity compensation
Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
Health, dental, and vision coverage for you and your dependents
Wellness and commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use

A Final Note:
You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
August 25, 2025
Staff HPC Hardware Engineer
Lambda AI
501-1000
USD
349000
-
581000
United States
Full-time
Remote
false
We're here to help the smartest minds on the planet build Superintelligence. The labs pushing the edge? They run on Lambda. Our gear trains and serves their models, our infrastructure scales with them, and we move fast to keep up. If you want to work on massive, world-changing AI deployments with people who love action and hard problems, we're the place to be.
If you'd like to build the world's best deep learning cloud, join us.
*Note: This position requires presence in our San Jose office location 4 days per week; Lambda’s designated work from home day is currently Tuesday.
Engineering at Lambda is responsible for building and scaling our cloud offering. Our scope includes the Lambda website, cloud APIs and systems as well as internal tooling for system deployment, management and maintenance.
What You’ll Do
Serve as the technical lead for integrating OEM and white-label compute, storage, and network hardware into Lambda’s HPC platform reference architectures.
Drive the end-to-end process of new product introduction (NPI) for hardware systems, including evaluation, validation, documentation, and production readiness.
Partner with architects to translate platform blueprints into concrete hardware selections and system configurations.
Work cross-functionally with design, engineering, operations, and vendor engineering teams to ensure compatibility, performance, and scalability of new systems.
Identify and resolve hardware issues across thermal, power, firmware, and mechanical domains during evaluation and bring-up cycles.
Provide technical guidance during vendor engagements and benchmarking of next-generation platforms.

You
Have 7+ years of experience in hardware integration or systems engineering for HPC, data center, or cloud infrastructure environments.
Possess deep knowledge of server hardware platforms (x86 and ARM), PCIe accelerators, storage devices, and network fabrics.
Are experienced with vendor-led product development cycles and can drive hardware evaluation, risk mitigation, and feedback into roadmap decisions.
Can interpret platform-level architecture requirements and select or adapt OEM solutions to fit.
Are comfortable working hands-on in labs with rack-scale deployments, BIOS/firmware tuning, and performance validation.
Collaborate well across architecture, design, engineering, and vendor teams to deliver complete, production-ready hardware solutions.

Nice to Have
Experience supporting AI/ML infrastructure and accelerated compute hardware (e.g., NVIDIA, AMD, Intel).
Familiarity with system thermals, power delivery, or integration at rack scale.
Exposure to BMC/Redfish/IPMI configuration and automation.
Background in performance tuning, benchmarking, and systems validation workflows.
Prior experience contributing to reference designs or large-scale infrastructure blueprints.

Salary Range Information
The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda
Founded in 2012, ~400 employees (2025) and growing fast
We offer generous cash & equity compensation
Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
We are experiencing extremely high demand for our systems, with quarter-over-quarter, year-over-year profitability
Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
Health, dental, and vision coverage for you and your dependents
Wellness and commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use

A Final Note:
You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
MLOps / DevOps Engineer
Data Science & Analytics
Apply
August 22, 2025
Application Security Engineer, X
X AI
5000+
USD
180000
-
340000
United States
Full-time
Remote
false
About xAI
xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About the Role
We are seeking a skilled and innovative Application Security Engineer to join our technology-driven company. In this role, you will be responsible for ensuring the security and integrity of our cloud-native applications and systems throughout the software development lifecycle, with a particular focus on code security, CI/CD pipelines, and emerging AI technologies.

Focus
Conduct in-depth code reviews and static analysis to identify and mitigate security vulnerabilities in our applications
Design and implement secure coding guidelines and best practices for development teams
Collaborate closely with development teams to integrate security practices throughout the CI/CD pipeline
Perform threat modeling and risk assessments for applications, developing mitigation strategies for potential risks
Manage vulnerability tracking and remediation efforts, providing guidance to development teams
Support incident response activities related to application security
Stay current on emerging security threats and trends in cloud-native technologies and AI, continuously enhancing our security measures
Evaluate and secure software supply chains, including producing and maintaining Software Bills of Materials (SBOMs)
Address security concerns specific to AI and machine learning models, with a focus on the OWASP LLM Top 10

Ideal Experience
Bachelor's degree in Computer Science, Cybersecurity, or a related field
3-5 years of experience in application security, with a strong focus on code security practices
Deep understanding of secure coding practices, application security frameworks, and common vulnerabilities (e.g., OWASP Top 10)
Proficiency in Python or Rust and experience with secure coding practices in these languages
Experience securing CI/CD pipelines and implementing DevSecOps practices
Familiarity with software supply chain security and SBOM generation tools
Experience with security testing tools (e.g., Burp Suite, OWASP ZAP) and static/dynamic code analysis
Understanding of AI/ML security implications, particularly those outlined in the OWASP LLM Top 10
Excellent communication skills, able to explain complex security issues to both technical and non-technical audiences

Preferred Qualifications
Experience with cloud platforms (e.g., GCP, AWS, Azure) and their security features
Relevant security certifications (e.g., CSSLP, OSWE)
Background in data privacy and compliance regulations relevant to cloud-native applications and AI systems
Experience with GitOps and infrastructure-as-code security
Familiarity with federated learning and privacy-preserving machine learning techniques

Bonus Skills
Experience in building custom security tooling to enhance and automate security processes
Interest in leveraging AI to automate security tasks and improve efficiency
Contributions to open-source security projects or tools
Experience in securing AI/ML models and data pipelines

Annual Salary Range
$180,000 - $340,000 USD

Benefits
Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer.

California Consumer Privacy Act (CCPA) Notice
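On the SBOM responsibility above: real SBOMs are produced with dedicated generators and follow the SPDX or CycloneDX specifications, capturing licenses, hashes, and dependency relationships. Purely as a toy illustration of the core idea (the output shape here is an invented format, not a real SBOM schema), the Python standard library can already inventory the components of the running environment:

```python
import importlib.metadata

def minimal_sbom() -> dict:
    """Toy SBOM-like inventory: name and version of every installed distribution.

    Real SBOM formats (SPDX, CycloneDX) also record licenses, file hashes,
    suppliers, and dependency edges; this only shows the component list idea.
    """
    components = sorted(
        {
            (dist.metadata.get("Name"), dist.version)
            for dist in importlib.metadata.distributions()
            if dist.metadata.get("Name") and dist.version
        }
    )
    return {
        "format": "toy-inventory",  # invented label, not a real SBOM spec
        "components": [{"name": n, "version": v} for n, v in components],
    }
```

The point of an SBOM in supply chain security is that this inventory, generated at build time and stored alongside the artifact, lets you answer "are we shipping the vulnerable version of X?" without rebuilding anything.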
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
August 21, 2025
Detection and Response Engineer
Cerebras Systems
501-1000
-
Canada
Remote
false
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs. Cerebras' current customers include global corporations across multiple industries, national labs, and top-tier healthcare systems. In January, we announced a multi-year, multi-million-dollar partnership with Mayo Clinic, underscoring our commitment to transforming AI applications across various fields. In August, we launched Cerebras Inference, the fastest generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services.
About The Role
We are seeking an exceptional Detection and Response Engineer to serve on the front lines, where you will build systems to detect threats, investigate incidents, and lead coordinated response across teams. The right candidate brings hands-on experience creating reliable detections, automating repetitive tasks, and turning investigation findings into durable improvements to our security program, with an interest in exploring AI-driven automation.
Responsibilities
Create and optimize detections, playbooks, and workflows to quickly identify and respond to potential incidents.
Investigate security events and participate in incident response, including on-call responsibilities.
Automate investigation and response workflows to reduce time to detect and remediate incidents.
Build and maintain detection and response capabilities as code, applying modern software engineering rigor.
Explore and apply emerging approaches, potentially leveraging AI, to strengthen our security posture.
Document investigation and response procedures as clear runbooks for triage, escalation, and containment.
Skills And Qualifications
3–5 years of experience in detection engineering, incident response, or security engineering.
Strong proficiency in Python and query languages such as SQL, with the ability to write clean, maintainable, and testable code.
Practical knowledge of detection and response across cloud, identity, and endpoint environments.
Familiarity with attacker behaviors and the ability to translate them into durable detection logic.
Strong fundamentals in operating systems, networking, and log analysis.
Excellent written communication skills, with the ability to create clear documentation.
Why Join Cerebras
People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we've reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:
Build a breakthrough AI platform beyond the constraints of the GPU.
Publish and open-source their cutting-edge AI research.
Work on one of the fastest AI supercomputers in the world.
Enjoy job stability with startup vitality.
Our simple, non-corporate work culture that respects individual beliefs.
Read our blog: Five Reasons to Join Cerebras in 2025. Apply today and become part of the forefront of groundbreaking advancements in AI!
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them. This website or its third-party tools process personal data.
For more details, click here to review our CCPA disclosure notice.
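"Detection and response capabilities as code", as listed in this role's responsibilities, usually means expressing each detection as a small, version-controlled, unit-testable rule. A minimal sketch with an invented event schema (field names like "event", "mfa", and "count" are assumptions for the example, not Cerebras internals):

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative sketch of "detection as code": each detection is a small,
# testable rule over one normalized log event (represented as a dict).
@dataclass
class Detection:
    name: str
    severity: str
    matches: Callable  # predicate: dict -> bool

DETECTIONS = [
    Detection("console_login_without_mfa", "high",
              lambda e: e.get("event") == "console_login" and e.get("mfa") is False),
    Detection("mass_object_read", "medium",
              lambda e: e.get("event") == "object_read" and e.get("count", 0) > 1000),
]

def run_detections(event: dict) -> list:
    """Return the names of every detection that fires for one event."""
    return [d.name for d in DETECTIONS if d.matches(event)]

print(run_detections({"event": "console_login", "mfa": False}))
```

Keeping rules in this shape lets them live in a repository with ordinary code review and unit tests, which is the "modern software engineering rigor" the listing refers to.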
MLOps / DevOps Engineer
Data Science & Analytics
Apply
August 21, 2025
Kubernetes Architect
TensorWave
51-100
-
United States
Full-time
Remote
true
At TensorWave, we’re leading the charge in AI compute, building a versatile cloud platform that’s driving the next generation of AI innovation. We’re focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what’s possible in the AI landscape.
About the Role:
We are seeking an exceptional Kubernetes Architect to lead the design, development, and deployment of our next-generation infrastructure platform. This is a very senior role for someone who not only understands Kubernetes deeply, but can also write complex manifests, operators, and controllers from scratch and architect resilient, secure, and performant systems that scale to millions of users.
As a technical visionary and hands-on expert, you will lead the evolution of our cloud-native architecture, including designing serverless systems on Kubernetes, integrating with CI/CD, and ensuring observability, security, and cost-efficiency across environments.
Responsibilities:
Architect and implement end-to-end Kubernetes infrastructure for large-scale, cloud-native applications.
Design and build serverless platforms on top of Kubernetes using technologies such as Knative, OpenFaaS, or KEDA.
Develop and maintain Kubernetes custom resources (CRDs), controllers, operators, and admission controllers in Go or Python.
Define multi-tenant, multi-region architecture supporting millions of users with high availability and low latency.
Lead Kubernetes cluster lifecycle management (provisioning, upgrades, scaling, monitoring, troubleshooting).
Collaborate closely with engineering teams to containerize applications, write Helm charts or Kustomize overlays, and standardize deployment practices.
Implement infrastructure as code using tools like Terraform, Pulumi, or Crossplane.
Lead efforts around observability, policy enforcement, cost optimization, and RBAC/security hardening within the cluster.
Evaluate and integrate Kubernetes ecosystem tools (e.g., Istio/Linkerd, ArgoCD, Flux, Prometheus, Grafana, OPA, etc.).
Mentor and upskill DevOps engineers and SREs in Kubernetes best practices.
Essential Skills & Qualifications:
8+ years of experience in cloud infrastructure, DevOps, or platform engineering roles.
4+ years of hands-on Kubernetes experience, including deep knowledge of the Kubernetes API, internals, networking, and storage.
Proficiency in writing Kubernetes manifests, Helm charts, and custom Kubernetes controllers/operators (preferably in Go).
Proven experience designing cloud-native systems that scale globally (multi-region, multi-cloud or hybrid setups).
Experience with serverless technologies (Knative, OpenFaaS, AWS Lambda, etc.) in a production environment.
Strong knowledge of cloud platforms such as AWS, GCP, or Azure.
Experience with GitOps tools (ArgoCD, Flux), service meshes, policy engines (OPA/Gatekeeper), and CI/CD pipelines.
Deep understanding of security, compliance, and resilience in containerized workloads.
Additional/Preferred Qualifications:
Contributions to Kubernetes open-source projects or CNCF-related tooling.
Experience with service mesh design (Istio, Linkerd).
Familiarity with eBPF, Cilium, or network-level observability.
Background in building PaaS or developer platforms on top of Kubernetes.
What Success Looks Like:
A production-grade Kubernetes platform that can support millions of users globally, with self-healing, autoscaling, and strong observability.
Developer teams can deploy serverless applications with ease, speed, and reliability.
Infrastructure is resilient, secure, cost-optimized, and compliant.
Kubernetes practices and tooling are well-documented, standardized, and continuously improved across the company.
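The custom controllers and operators this role centers on all reduce to the same level-triggered reconcile pattern: observe actual state, diff it against desired state, act to converge. A minimal sketch against an in-memory fake API (all names invented; a real operator would talk to the Kubernetes API, e.g. via client-go in Go or kopf in Python):

```python
from dataclasses import dataclass, field

# Illustrative sketch of the reconcile loop behind Kubernetes controllers and
# operators, run against an in-memory fake API rather than a real cluster.
@dataclass
class FakeAPI:
    desired_replicas: int                            # the resource's spec
    actual_pods: list = field(default_factory=list)  # observed cluster state

    def create_pod(self, name: str) -> None:
        self.actual_pods.append(name)

    def delete_pod(self, name: str) -> None:
        self.actual_pods.remove(name)

def reconcile(api: FakeAPI) -> None:
    """One pass: diff desired vs. actual state and act to converge them.
    Controllers repeat this whenever the watched resources change."""
    diff = api.desired_replicas - len(api.actual_pods)
    for _ in range(diff):                            # scale up
        api.create_pod(f"pod-{len(api.actual_pods)}")
    for _ in range(-diff):                           # scale down
        api.delete_pod(api.actual_pods[-1])

api = FakeAPI(desired_replicas=3)
reconcile(api)
print(api.actual_pods)   # ['pod-0', 'pod-1', 'pod-2']
api.desired_replicas = 1
reconcile(api)
print(api.actual_pods)   # ['pod-0']
```

Because reconcile compares whole states rather than reacting to individual events, it is naturally self-healing: a missed event or a manually deleted pod is corrected on the next pass.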
What We Bring:
Stock Options
100% paid Medical, Dental, and Vision insurance
Life and Voluntary Supplemental Insurance
Short Term Disability Insurance
Flexible Spending Account
401(k)
Flexible PTO
Paid Holidays
Parental Leave
Mental Health Benefits through Spring Health
MLOps / DevOps Engineer
Data Science & Analytics
Apply
August 20, 2025
Site Reliability Engineer
TensorWave
51-100
-
United States
Full-time
Remote
true
At TensorWave, we’re leading the charge in AI compute, building a versatile cloud platform that’s driving the next generation of AI innovation. We’re focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what’s possible in the AI landscape.
About the Role:
We're looking for a Senior Site Reliability Engineer with a strong software engineering background to build and maintain highly scalable, secure, and resilient infrastructure. You’ll play a critical role in designing low-level systems, automating infrastructure with modern tooling, and ensuring platform reliability. This role is ideal for someone who’s comfortable working at the intersection of systems programming and DevOps: writing code in Go, JavaScript, Rust, C, or Zig while also managing infrastructure with NixOS, Kubernetes, and Terraform.
Responsibilities:
Design, build, and maintain infrastructure systems using Linux and NixOS.
Manage infrastructure-as-code with Terraform to provision and scale resources.
Architect and operate Kubernetes clusters with a focus on performance, security, and automation.
Write high-performance tooling and internal utilities in Go, JavaScript, or Rust.
Develop and maintain CI/CD pipelines for infrastructure and code deployments.
Monitor system performance, resolve issues, and improve reliability through observability tooling.
Collaborate closely with engineering teams to support deployment strategies and development workflows.
Essential Skills & Qualifications:5+ years in DevOps, Site Reliability, or Infrastructure Engineering roles.
Deep experience with Linux systems and configuration management (preferably NixOS).
Hands-on experience with Terraform, Kubernetes, and containerized environments.
Proficiency in one or more of the following languages: Rust, C, Zig, Go, or JavaScript.
Strong understanding of systems programming, performance tuning, and operating system internals.
Familiarity with CI/CD practices and infrastructure monitoring/alerting tools.
We’re looking for resilient, adaptable people to join our team: folks who enjoy collaborating and tackling tough challenges. We’re all about offering real opportunities for growth, letting you dive into complex problems and make a meaningful impact through creative solutions. If you're a driven contributor, we encourage you to explore opportunities to make an impact at TensorWave. Join us as we redefine the possibilities of intelligent computing.
What We Bring:
Stock Options
100% paid Medical, Dental, and Vision insurance
Life and Voluntary Supplemental Insurance
Short Term Disability Insurance
Flexible Spending Account
401(k)
Flexible PTO
Paid Holidays
Parental Leave
Mental Health Benefits through Spring Health
MLOps / DevOps Engineer
Data Science & Analytics
Apply
August 20, 2025
AI/HPC Network Development Engineer - Networking
X AI
5000+
-
Ireland
United States
Full-time
Remote
false
About xAI
xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.
About the Role
xAI was the first in the world to build a 100k GPU cluster on an Ethernet network, and then did it again in 92 days, floors, walls and all. We need an engineer with deep experience in RoCEv2 who can develop at hyperscale while optimizing performance and availability. xAI is building at a furious pace with the latest hardware to help people understand the universe. To make the next significant leap forward, we need to own our own destiny by understanding our current network performance and availability, then optimizing it for our training models and how we execute customer inference queries. You will spend most of your days deep inside NCCL, building metric dashboards and tweaking configurations to ensure no performance is left on the table. You will help design the next iteration of our backend and front-end networks, allowing us to seamlessly build out new GPU infrastructure with little to no engineering assistance. There will be a significant amount of travel to Memphis for building more capacity, as well as participation in a team on-call rotation and helping with other scaling and maintenance efforts. This will become easier as we build out the team and engineers contribute to deployment and operations frameworks that remove repetitive tasks.
Location
We have 2 openings, one based in Palo Alto, California and the other in Dublin, Ireland. There will be significant travel expected to Memphis, Tennessee for data center buildouts and to the head office in Palo Alto for team collaboration.
Ideal Experiences
A minimum of 10 years designing and operating large-scale networks, with 5 years in the Ethernet AI/HPC space.
Deep understanding of congestion control on Ethernet, with InfiniBand an added bonus.
Deep understanding of AI training and inference workloads and how they operate on the network; as part of this, you are able to use and debug NCCL and potentially commit to the library.
Expertise in creating a portfolio of metrics for performance and operations to optimize the fleet for training and inference traffic.
Experience with Python to automate away repetitive tasks and facilitate your daily job working with and analyzing large sets of data.
Interview Process
After submitting your application, the team reviews your CV and statement of exceptional work. If your application passes this stage, you will be invited to an initial interview (45 minutes - 1 hour) during which a member of our team will ask some basic questions. If you clear the initial phone interview, you will enter the main process, which consists of five interviews:
Coding assessment in a language of your choice.
Data center network technologies and RoCEv2.
Manager Interview.
Meet and greet with the wider team, where you will run through a presentation of a body of work you are proud of.
Benefits
Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.
xAI is an equal opportunity employer.
California Consumer Privacy Act (CCPA) Notice
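The Python-automation expectation in this role often looks like log crunching: pulling bandwidth figures out of benchmark output and summarizing them. A minimal sketch, with an invented log format (real NCCL and nccl-tests output differs):

```python
import re
from statistics import mean

# Hypothetical sketch of day-to-day automation: summarizing bus-bandwidth
# readings from benchmark logs. The log format below is invented for the
# example and is not real NCCL output.
SAMPLE_LOG = """\
step=1 size=1073741824 busbw_gbps=355.2
step=2 size=1073741824 busbw_gbps=348.9
step=3 size=1073741824 busbw_gbps=351.4
"""

PAT = re.compile(r"busbw_gbps=([0-9.]+)")

def busbw_stats(log: str) -> dict:
    """Summarize every busbw_gbps reading found in `log`."""
    vals = [float(v) for v in PAT.findall(log)]
    return {"n": len(vals), "mean": round(mean(vals), 1), "min": min(vals)}

print(busbw_stats(SAMPLE_LOG))
```

Feeding summaries like this into a dashboard is one way to build the "portfolio of metrics" the listing asks for, making regressions in collective-communication bandwidth visible per step and per cluster.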
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
August 20, 2025
IT Engineer
helsing
501-1000
-
Spain
Full-time
Remote
false
Who we are
Helsing is a defence AI company. Our mission is to protect our democracies. We aim to achieve technological leadership, so that open societies can continue to make sovereign decisions and control their ethical standards. As democracies, we believe we have a special responsibility to be thoughtful about the development and deployment of powerful technologies like AI. We take this responsibility seriously. We are an ambitious and committed team of engineers, AI specialists and customer-facing programme managers. We are looking for mission-driven people to join our European teams, and apply their skills to solve the most complex and impactful problems. We embrace an open and transparent culture that welcomes healthy debates on the use of technology in defence, its benefits, and its ethical implications.
The day-to-day
Configuring and operating endpoints, including workstations, laptops and servers
Supporting an office of about 30 employees
Setting up and maintaining on-prem compute environments built on Linux
Configuring and operating our network and VPN infrastructure
Partnering with Helsing's broader IT team on the operation and continuous optimisation of Helsing's corporate, development, and customer environments, built primarily using Microsoft 365 and Azure alongside other SaaS solutions
You should apply if you
Note: We operate in an industry where women, as well as other minority groups, are systematically under-represented. We encourage you to apply even if you don’t meet all the listed qualifications; ability and impact cannot be summarised in a few bullet points.
Have experience in an IT engineering or SysAdmin role, particularly working with macOS clients, Linux servers and orchestration software like Kubernetes
Are fluent in both English and Spanish; Helsing's language of business is English
Have managed modern network infrastructure, ideally including administration of network- and host-based security tools
Show great communication skills for engaging at all levels within an organisation
Employ strong problem-solving, critical thinking, and analytical skills combined with the ability to find creative solutions
Nice to have
Experience working with the Microsoft M365 stack
Experience working with Mobile Device Management tools like Intune, Jamf or JumpCloud
Experience working with PowerShell, Python or IaC tooling
Information security expertise, especially related to regulations specific to Germany
Join Helsing and work with world-leading experts in their fields
Helsing’s work is important. You’ll be directly contributing to the protection of democratic countries while balancing both ethical and geopolitical concerns.
The work is unique. We operate in a domain that has highly unusual technical requirements and constraints, and where robustness, safety, and ethical considerations are vital. You will face unique engineering and AI challenges that make a meaningful impact in the world. Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure.
The defence industry is entering the most exciting phase of the technological development curve. Advances in our field are not incremental: Helsing is part of, and often leading, historic leaps forward. In our domain, success is a matter of order-of-magnitude improvements and novel capabilities. This means we take bets, aim high, and focus on big opportunities. Despite being a relatively young company, Helsing has already been selected for multiple significant government contracts.
We actively encourage healthy, proactive, and diverse debate internally about what we do and how we choose to do it. Teams and individual engineers are trusted (and encouraged) to practise responsible autonomy and critical thinking, and to focus on outcomes, not conformity. At Helsing you will have a say in how we (and you!) work, the opportunity to engage on what does and doesn’t work, and to take ownership of aspects of our culture that you care deeply about.
What we offer
A focus on outcomes, not time-tracking
Competitive compensation and stock options
Relocation support
Social and education allowances
Regular company events and all-hands to bring together employees as one team across Europe
A hands-on onboarding program (affectionately labelled “Infraduction”), in which you will be building tooling and applications to be used across the company. This is your opportunity to learn our tech stack, explore the company, and learn how we get things done, all whilst working with other engineering teams from day one (specifically for engineering and AI)
Helsing is an equal opportunities employer. We are committed to equal employment opportunity regardless of race, religion, sexual orientation, age, marital status, disability or gender identity. Please do not submit personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, data concerning your health, or data concerning your sexual orientation. Helsing's Candidate Privacy and Confidentiality Regime can be found here.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
August 20, 2025
No job found
There is no job in this category at the moment. Please try again later