AI MLOps / DevOps Engineer Jobs | Top AI MLOps / DevOps Engineer Openings in 2025

Latest AI Jobs

Showing 61 – 79 of 79 jobs

Senior Manager, Engineering - AI Agent

Ironclad

501-1000

United States

Full-time

Remote

MLOps / DevOps Engineer

June 3, 2025

Contracts Manager, Infrastructure

OpenAI

5000+

United States

Full-time

Remote

MLOps / DevOps Engineer

June 3, 2025

Senior AI Engineer

Air Ops

1-10

United States

Full-time

Remote

MLOps / DevOps Engineer

June 3, 2025

Lead Cloud Network Engineer

Nice

5000+

United States

Full-time

Remote

MLOps / DevOps Engineer

June 2, 2025

Cloud Ops Engineer

Nice

5000+

United States

Full-time

Remote

At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

Cloud Operations Engineer

A Cloud Operations Engineer works with NICE customers and internal stakeholders, and their responsibilities include showing leadership in proactive alert case management. The Ops Engineer should be efficient in terms of accuracy in execution, documentation, timeliness, communication, consistency, and professionalism. The Ops Engineer will be responsible for liaising with management to follow and share feedback on existing and new processes, methodologies, best practices, and changes.

Key Responsibilities

* Work efficiently under pressure to meet tight deadlines and goals while keeping the momentum needed to drive initiatives to completion.
* Communicate effectively and proactively (both written and verbal) with various internal stakeholders/groups and customers daily.
* Maintain a high level of accountability at the individual level; provide service and support that exceeds client needs.
* Work independently and with other team members.
* Multi-task as required and provide rapid support in production.
* Understand complex process and data flows.
* Be self-motivated to strive for professional excellence in all aspects of work.
* Bring excellent interpersonal and communication skills.
* Bring good skills in team relationship building.
* Take a “Can-Do” approach and attitude, and deliver.
* Perform end-to-end operational duties, including application server health, as per documentation, process, and methodology.
* Identify and resolve operational issues (e.g., batch failures, network issues, client data feed errors).
* Monitor infrastructure and application performance, CPU, file systems, databases, and batch jobs.
* Maintain complete operational documentation, e.g., incident tracking and run books.
* Produce metric reports, including daily productivity status.
* Review client service request tickets under service SLAs.
* Provide on-call off-hour support and work during non-prime shift hours.

Qualifications

* 3+ years’ experience in application/production support with cloud-based hosting administration, management, and performance tuning under a high-availability SLA environment.
* Excellent proficiency with Unix, Linux, Windows, Tomcat, and SSH.
* NICE product knowledge and experience (from implementation or support) will be key.
* Experience with PowerShell and scripting skills.
* Experience with SQL Server, Oracle, and MySQL.
* Experience with application debugging, performance, and scalability.
* Familiarity with standard application security compliance best practices.
* Knowledge of fault detection and resolution processes.
* Must be able to provide on-call off-hour support and work during non-prime shift hours.
* Experience with AWS is good to have.
* Knowledge of ETL is a plus.

About NiCE

NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NiCE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions. Known as an innovation powerhouse that excels in AI, cloud and digital, NiCE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.

NiCE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.

MLOps / DevOps Engineer

June 2, 2025

Applied AI Lead

Tandem

1001-5000

United States

Full-time

Remote

MLOps / DevOps Engineer

June 1, 2025

Technical Services Engineer

Anyscale

501-1000

United States

Full-time

Remote

MLOps / DevOps Engineer

May 30, 2025

Engineering Manager, ML Acceleration

Anthropic

1001-5000

United States

Full-time

Remote

MLOps / DevOps Engineer

May 29, 2025

Engineering Manager, Inference

Anthropic

1001-5000

United States

Full-time

Remote

MLOps / DevOps Engineer

May 29, 2025

Engineering Manager, ML Performance and Scaling

Anthropic

1001-5000

United States

Full-time

Remote

MLOps / DevOps Engineer

May 29, 2025

AI Architect

Nice

5000+

United States

Full-time

Remote

At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

Job Description

NICE is looking for an exceptional Senior AI Architect to join our team. As a Senior AI Architect, you will help design features for our contact center solutions that use generative AI. You will work with a globally distributed team to explore and create solutions to industry problems using state-of-the-art large language models and other AI techniques. You bring to this team environment several years’ experience in software development and generative AI, along with the curiosity and grit to see a project to delivery.

Responsibilities

* Meet with internal stakeholders, including Product Management, to understand product requirements.
* Define design and architecture to support product features based on generative AI.
* Communicate design to Product Management, development teams, and other stakeholders.
* Compare design approaches based on factors such as cost, response quality, and latency.
* Collaborate with management to improve the development process, including testing standards.
* Help diagnose and resolve escalated production issues with AI-based features.
* Meet with globally remote implementation teams, sometimes outside US business hours.
* Build and optimize application prototypes that leverage generative AI.
* Stay informed of the latest advancements in AI application development and tools.

Minimum Skills and Experience

* Minimum of 1 year of experience working with generative AI applications.
* Minimum of 3 years of experience working in production software development.
* Minimum of 3 years of experience working with AWS resources.
* Excellent proficiency in Python programming.
* Ability to develop and maintain good working relationships with cross-functional teams.
* Ability to clearly communicate and present to internal and external stakeholders.

Additional Desired Skills and Experience

* Experience with Python and at least one web app framework for prototyping, e.g., Streamlit or Flask.
* Experience with data science or machine learning beyond generative AI.
* Experience working on international, globe-spanning teams.

About NiCE

NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NiCE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions. Known as an innovation powerhouse that excels in AI, cloud and digital, NiCE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.

NiCE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.

MLOps / DevOps Engineer

May 29, 2025

Agent Architect

Hippocratic AI

101-200

United States

Full-time

Remote

About Us:

Hippocratic AI is building a safety-focused large language model (LLM) for the healthcare industry. Our team, comprised of ex-researchers from Microsoft, Meta, Nvidia, Apple, Stanford, Johns Hopkins and HuggingFace, is reinventing the next generation of foundation model training and alignment to create AI-powered conversational agents for real-time patient-AI interactions.

About the Role:

We are looking for an Agent Architect to design, develop, and innovate the next generation of agentic systems that drive our healthcare-focused AI platform. This individual will serve as a central architect in shaping how our agents think, act, and interact across diverse clinical use cases.

You will be responsible for selecting the right agentic paradigms, ranging from tool use, retrieval-augmented generation (RAG), and prompt engineering to model training, and for defining the underlying architecture for intelligent, safe, and responsive agents. You will work closely with our research, engineering, and evaluation teams to integrate cutting-edge techniques and continuously push the boundaries of agent capabilities.

This role blends deep technical knowledge with strategic thinking and experimentation. It’s ideal for those who thrive at the intersection of LLM system design, product innovation, and applied AI research.

Responsibilities:

* Architect and design new AI agents across a variety of clinical and operational use cases
* Evaluate and select the optimal agentic paradigm for each scenario (e.g., tools, engines, prompting, RAG, model training)
* Choose the appropriate models from our internal model library for specific tasks
* Collaborate with the research team to fine-tune models and optimize agent behavior
* Define and iterate on evaluation protocols in partnership with the evaluation team
* Develop new agent patterns and workflows to enable novel capabilities and interactions
* Rapidly incorporate state-of-the-art techniques from the latest scientific literature and open-source developments
* Track and integrate capabilities from emerging foundational models to improve system performance and scope

Required Qualifications:

* 5+ years in a technical field such as software engineering, machine learning, data science, or AI product development
* Deep understanding of agentic system design and language model behaviors
* Proficiency with Python and modern ML tooling
* Experience building and evaluating non-deterministic AI/ML systems
* Strong analytical and problem-solving skills, with an experimental mindset
* Familiarity with LLM paradigms such as prompting, RAG, fine-tuning, and tool use

Preferred Skills:

* Experience designing agentic workflows or orchestration frameworks for LLMs
* Background in AI research or exposure to cutting-edge developments in NLP
* Ability to translate complex technical ideas into scalable architectures
* Interest in healthcare applications, patient interaction design, and safety-critical systems
* Excellent written and verbal communication skills, with the ability to clearly document design decisions and evaluations

Candidate Background:

We recognize that agent architecture is a novel and rapidly evolving field. Ideal candidates may come from varied backgrounds such as applied ML engineering, AI product design, prompt engineering, or even NLP-focused research roles. If you are excited about designing intelligent systems from the ground up and want to shape how LLMs interact with the world in a safe and impactful way, we encourage you to apply.

Why Join Our Team:

* Innovative Mission: We are developing a safe, healthcare-focused large language model (LLM) designed to revolutionize health outcomes on a global scale.
* Visionary Leadership: Hippocratic AI was co-founded by CEO Munjal Shah, alongside a group of physicians, hospital administrators, healthcare professionals, and artificial intelligence researchers from leading institutions, including El Camino Health, Johns Hopkins, Stanford, Microsoft, Google, and NVIDIA.
* Strategic Investors: We have raised a total of $278 million in funding, backed by top investors such as Andreessen Horowitz, General Catalyst, Kleiner Perkins, NVIDIA’s NVentures, Premji Invest, SV Angel, and six health systems.
* World-Class Team: Our team is composed of leading experts in healthcare and artificial intelligence, ensuring our technology is safe, effective, and capable of delivering meaningful improvements to healthcare delivery and outcomes.

For more information, visit www.HippocraticAI.com.

We value in-person teamwork and believe the best ideas happen together. Our team is expected to be in the office five days a week in Palo Alto, CA unless explicitly noted otherwise in the job description.

MLOps / DevOps Engineer

May 28, 2025

Senior Agent Engineer

Hippocratic AI

101-200

United States

Full-time

Remote

About Us:

Hippocratic AI is building a safety-focused large language model (LLM) for the healthcare industry. Our team, comprised of ex-researchers from Microsoft, Meta, Nvidia, Apple, Stanford, Johns Hopkins and HuggingFace, is reinventing the next generation of foundation model training and alignment to create AI-powered conversational agents for real-time patient-AI interactions.

About the Role:

We are looking for an AI Agent Engineer to bend language models to their will. In this role, you will collaborate with engineers and research scientists to enhance the effectiveness and safety of generative AI solutions by designing, testing, and improving prompts that drive clinical safety and patient experience. You will also create automations to test your creations. An ideal candidate is equal parts software engineer, prompt engineer, and product person, loves experimentation and tinkering, is extremely thorough and detail oriented, and has a passion for conversation and communication.

Responsibilities:

* Help achieve world-class performance on several safety-critical LLM applications
* Collaborate with product managers to write detailed product specs for how models should behave
* Help design agentic model workflows
* Use advanced prompting techniques to develop and optimize prompts for language models to improve model performance, clinical safety, and patient experience
* Contribute to automated model testing and evaluation infrastructure
* Build automations to create and select the best prompts for any given task
* Conduct experiments and analyze outcomes of model outputs to refine and iterate on prompt strategies
* Write and configure prompts to create engaging, patient-oriented conversations

Required Qualifications:

* Experience (5+ years) in a technical or semi-technical field, such as engineering, analytics, data science, or product management
* Demonstrated product sense and intuition for conversation design
* Proficiency writing simple scripts in Python
* Familiarity with basic principles of machine learning, conversational AI, and Large Language Models

Preferred Skills:

* Experience developing, designing, and/or interacting with non-deterministic AI/ML systems
* Experience designing and conducting experiments and statistical analysis, specifically measuring and optimizing the accuracy of model outputs
* Experience with written language, communications and/or conversation design
* Software engineering experience

Candidate Background:

Given that the field of prompt engineering is very new, we understand that it’s unlikely that you will have direct experience on your resume. While direct experience in prompt engineering is not expected, we value transferable skills that align with the responsibilities of the position. Ideal candidates might come from various related fields where they have honed relevant skills in problem-solving, analytical thinking, and written communication.

You might be a good fit if you have an engineering background and have moved or are looking to move into a product role. Perhaps you’re a data scientist who writes on the side. Or maybe you are a conversation designer who dabbled in computational linguistics.

We are looking for high-aptitude adaptive thinkers who are willing to jump in and solve new problems using a variety of technical and non-technical approaches.

Why Join Our Team:

* Innovative Mission: We are developing a safe, healthcare-focused large language model (LLM) designed to revolutionize health outcomes on a global scale.
* Visionary Leadership: Hippocratic AI was co-founded by CEO Munjal Shah, alongside a group of physicians, hospital administrators, healthcare professionals, and artificial intelligence researchers from leading institutions, including El Camino Health, Johns Hopkins, Stanford, Microsoft, Google, and NVIDIA.
* Strategic Investors: We have raised a total of $278 million in funding, backed by top investors such as Andreessen Horowitz, General Catalyst, Kleiner Perkins, NVIDIA’s NVentures, Premji Invest, SV Angel, and six health systems.
* World-Class Team: Our team is composed of leading experts in healthcare and artificial intelligence, ensuring our technology is safe, effective, and capable of delivering meaningful improvements to healthcare delivery and outcomes.

For more information, visit www.HippocraticAI.com.

We value in-person teamwork and believe the best ideas happen together. Our team is expected to be in the office five days a week in Palo Alto, CA unless explicitly noted otherwise in the job description.

MLOps / DevOps Engineer

May 28, 2025

Engineering Manager - AI Products

Perplexity

1001-5000

United States

Full-time

Remote

MLOps / DevOps Engineer

May 28, 2025

Implementation - Professional Services Engineer

Nice

5000+

Philippines

Full-time

Remote

MLOps / DevOps Engineer

May 28, 2025

Manager, Cloud Architecture and DevOps

Nice

5000+

Israel

Full-time

Remote

At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

So, what’s the role all about?

We are seeking an experienced Cloud Services Manager / Manager, Cloud Architecture and DevOps to lead and drive cloud infrastructure, DevOps, and automation initiatives in a global R&D environment. The ideal candidate will have extensive experience managing global teams, expertise in AWS and Azure, and a deep understanding of CI/CD, DevOps processes, and cloud migrations. This role requires close collaboration with R&D teams to ensure seamless cloud integration, automation, security, and compliance.

How will you make an impact?

* Lead and manage a Global Cloud Services team, driving innovation and efficiency in IT.
* Design, implement, and oversee Public Cloud Infrastructure Management, focusing on AWS and Azure.
* Develop and maintain Cloud Architecture and Automation strategies to optimize performance and scalability.
* Oversee Cloud Security and compliance management, ensuring adherence to industry standards.
* Provide trusted advisory support to stakeholders on cloud best practices, governance, and cost optimization.
* Drive IT infrastructure modernization and lead cloud migration projects for legacy systems.
* Implement and enhance CI/CD pipelines, DevOps workflows, and automation frameworks.

Have you got what it takes?

* 5+ years of experience leading global teams in cloud services or infrastructure.
* Strong expertise in AWS and Azure, including best practices, architecture, and management.
* Proven experience in CI/CD, DevOps methodologies, and automation frameworks.
* Hands-on experience in IT infrastructure, with a focus on cloud migrations.
* Deep understanding of R&D workflows and collaboration with development teams.
* Strong knowledge of security, compliance, and governance in cloud environments.
* Excellent communication skills and ability to provide strategic guidance to senior stakeholders.

You will have an advantage if you also have:

* Experience with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
* Knowledge of Kubernetes, Docker, and serverless architectures.
* Familiarity with FinOps principles to optimize cloud costs.

What’s in it for you?

Join an ever-growing, market-disrupting, global company where the teams – comprised of the best of the best – work in a fast-paced, collaborative, and creative environment! As the market leader, every day at NICE is a chance to learn and grow, and there are endless internal career opportunities across multiple roles, disciplines, domains, and locations. If you are passionate, innovative, and excited to constantly raise the bar, you may just be our next NICEr!

Enjoy NICE-FLEX!

At NICE, we work according to the NICE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work, each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.

Requisition ID: 7414
Reporting into: Director, Cloud Services & Infrastructure
Role Type: People Manager

About NiCE

NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NiCE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions. Known as an innovation powerhouse that excels in AI, cloud and digital, NiCE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.

NiCE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.

MLOps / DevOps Engineer

May 28, 2025

Agent Deployment Engineer (Contract)

Hippocratic AI

101-200

United States

Contractor

Remote

About Us:

Hippocratic AI has developed a safety-focused Large Language Model (LLM) for healthcare. The company believes that a safe LLM can dramatically improve healthcare accessibility and health outcomes in the world by bringing deep healthcare expertise to every human. No other technology has the potential to have this level of global impact on health.

About the Role:

We're seeking an Agent Deployment Engineer to join our collaborative team of engineers, scientists, and healthcare professionals working on transformative AI solutions. In this role, you'll help develop and maintain our conversation layer, using a mix of software and prompting skills to allow our AI agent to facilitate autonomous conversations.

What You’ll Do:

* Develop complex prompts in our SOTA conversation planning layer to enable complex agentic model workflows
* Collaborate with Product Managers, Software Engineers and Clinicians to create fully autonomous, clinical conversations
* Write, configure and iterate on prompts to create engaging, patient-oriented conversations
* Use advanced prompting techniques to develop and optimize prompts for language models to improve model performance, clinical safety, and patient experience
* Conduct experiments and analyze outcomes of model outputs to refine and iterate on prompt strategies

What We’re Looking For (Must-Have):

* Bachelor’s degree in Computer Science, Computer Engineering, or a related field (or equivalent coursework/projects)
* 2+ years industry experience
* Experience with Python
* Experience with LLM prompting (ChatGPT, Claude, etc.) through professional or personal use
* Exposure to working with databases and building simple RESTful APIs
* Strong problem-solving mindset and eagerness to learn new technologies

Nice-to-Have (But Not Required):

* Experience with personal or academic projects involving backend development
* Basic understanding of AI/ML concepts or a desire to learn about them
* Knowledge of modern web frameworks (e.g., Flask, Django, or Spring Boot)
* Awareness of data privacy and security best practices, especially in regulated environments

Why Join Our Team:

* Innovative Mission: We are developing a safe, healthcare-focused large language model (LLM) designed to revolutionize health outcomes on a global scale.
* Visionary Leadership: Hippocratic AI was co-founded by CEO Munjal Shah, alongside a group of physicians, hospital administrators, healthcare professionals, and artificial intelligence researchers from leading institutions, including El Camino Health, Johns Hopkins, Stanford, Microsoft, Google, and NVIDIA.
* Strategic Investors: We have raised a total of $278 million in funding, backed by top investors such as Andreessen Horowitz, General Catalyst, Kleiner Perkins, NVIDIA’s NVentures, Premji Invest, SV Angel, and six health systems.
* World-Class Team: Our team is composed of leading experts in healthcare and artificial intelligence, ensuring our technology is safe, effective, and capable of delivering meaningful improvements to healthcare delivery and outcomes.

For more information, visit www.HippocraticAI.com.

Our team values in-person collaboration, with on-site presence expected five days a week in Palo Alto, CA unless otherwise specified.

Why Join Us:

* Be part of an innovative team creating impactful AI solutions in the healthcare space.
* Receive mentorship and hands-on training to advance your technical and professional skills.
* Collaborate in a supportive, team-driven environment that values learning and development.
* Work on meaningful projects with real-world applications in patient care.

We’re excited to meet passionate engineers who are eager to make a difference. If you’re enthusiastic about learning and applying your skills to cutting-edge technology, we encourage you to apply!

MLOps / DevOps Engineer

May 27, 2025

Director, Field Engineering

Together AI

201-500

USD 230,000 – 300,000

United States

Full-time

Remote

Director, Field Engineering

Location: San Francisco, CA (Hybrid)

About the role:

As a Director, Field Engineering at Together AI, you will build and lead a deeply technical post-sales team that works with our customers to build AI applications. This leader will directly impact the path to renewal and expansion for our customers, making them an integral component of the overall revenue team. This is an exciting opportunity for a deeply technical professional passionate about AI and customer success to make a significant impact in a fast-paced, innovative environment.

Responsibilities

* Mentor and lead a team of post-sales engineers that are deeply embedded in Together AI’s largest customers, driving technical outcomes every day
* Manage account health of Together AI’s entire customer base, understanding active issues and overall trends leading to real-time customer health
* Proactively identify risk in accounts based on technical telemetry and overall business metrics
* Own the voice of the customer program to deliver high-value feedback to our Product, Engineering, and Research teams
* Take on IC responsibility as a technical advisor to 3-5 key customers, primarily assisting in maintaining reliability and performance of their GPU clusters
* Build educational content and tooling for both internal and external use around Together’s solutions (e.g., playbooks, blogs, demos)
* Build and maintain strong relationships with technical stakeholders within accounts, ensuring the successful deployment and scaling of their applications

Qualifications

* 10+ years of experience in a technical role with at least 5 years managing customer-facing technical teams
* Strong organizational skills and ability to manage dozens of customer implementations at once
* Strong technical background, with knowledge of AI, ML, GPU technologies and their integration into high-performance computing (HPC) environments
* Familiarity with infrastructure services (e.g., Kubernetes, SLURM), infrastructure as code solutions (e.g., Ansible), and container infrastructure (Docker)
* Strong sense of ownership and willingness to learn new skills to ensure both team and customer success

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $230K-300K + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our Privacy Policy at https://www.together.ai/privacy

MLOps / DevOps Engineer

Software Engineer

May 27, 2025

Customer Support Engineer

Together AI

201-500

USD 180,000 – 260,000

United States

Full-time

Remote

Customer Support Engineer

Location: San Francisco, CA (Hybrid)

About the role:

As a Customer Support Engineer at a pioneering AI company, you'll be the first line of defense to support customers as they build out training, fine-tuning, and inference solutions with Together AI. You'll dive deep into complex technical challenges, providing swift and effective solutions while serving as a product expert. As a part of the Customer Experience organization, you will collaborate closely with product and sales, driving continuous improvement of our offerings. This is an exciting opportunity for a deeply technical professional passionate about AI and customer success to make a significant impact in a fast-paced, innovative environment.

Responsibilities

* Engage directly with customers to tackle and resolve complex technical challenges involving our cutting-edge GPU clusters and our inference and fine-tuning services; ensure swift and effective solutions every time.
* Become a product expert in all of our Gen AI solutions, serving as the last line of technical defense before issues are escalated to Engineering and Product teams.
* Collaborate seamlessly across Engineering, Research, and Product teams to address customer concerns; collaborate with senior leaders both internally and externally to ensure the highest levels of customer satisfaction.
* Transform customer insights into action by identifying patterns in support cases and working with Engineering and Go-To-Market teams to drive Together’s roadmap (e.g., future models to support).
* Maintain detailed documentation of system configurations, procedures, troubleshooting guides, and FAQs to facilitate knowledge sharing with team and customers.
* Be flexible in providing support coverage during holidays, nights and weekends as required by business needs to ensure consistent and reliable service for our customers.

Qualifications

* 5+ years of experience in a customer-facing technical role with at least 1 year in a support function in AI
* Strong technical background, with knowledge of AI, ML, GPU technologies and their integration into high-performance computing (HPC) environments
* Familiarity with infrastructure services (e.g., Kubernetes, SLURM), infrastructure as code solutions (e.g., Ansible), high-performance network fabrics, NFS-based storage management, container infrastructure, and scripting and programming languages
* Familiarity with operating storage systems in HPC environments such as Vast and Weka
* Familiarity with inspecting and resolving network-related errors
* Strong knowledge of Python, TypeScript, and/or JavaScript with testing/debugging experience using curl and Postman-like tools
* Foundational understanding of the installation, configuration, administration, troubleshooting, and securing of compute clusters
* Complex technical problem solving and troubleshooting, with a proactive approach to issue resolution
* Ability to work cross-functionally with teams such as Sales, Engineering, Support, Product and Research to drive customer success
* Strong sense of ownership and willingness to learn new skills to ensure both team and customer success
* Excellent communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders
* Ability to operate in dynamic environments, adept at managing multiple projects, and comfortable with frequent context switching and prioritization

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey in building the next generation AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $180K-260K + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

MLOps / DevOps Engineer

Software Engineer

May 27, 2025
