Top AI MLOps / DevOps Engineer Job Openings in 2025

Looking for opportunities as an AI MLOps / DevOps Engineer? This curated list features the latest AI MLOps / DevOps Engineer job openings from AI-native companies. Whether you're an experienced professional or just entering the field, find roles that match your expertise, from startups to global tech leaders. Updated every day.

Latest AI Jobs

Showing 61-79 of 79 jobs
System Integrator
helsing
Germany
Full-time
Remote: no
Who we are
Helsing is a defence AI company. Our mission is to protect our democracies. We aim to achieve technological leadership, so that open societies can continue to make sovereign decisions and control their ethical standards.

As democracies, we believe we have a special responsibility to be thoughtful about the development and deployment of powerful technologies like AI. We take this responsibility seriously.

We are an ambitious and committed team of engineers, AI specialists and customer-facing programme managers. We are looking for mission-driven people to join our European teams and apply their skills to solve the most complex and impactful problems. We embrace an open and transparent culture that welcomes healthy debates on the use of technology in defence, its benefits, and its ethical implications.

The role
As a System Integrator, you are responsible for implementing all aspects of systems and technologies within one of Helsing’s programs that will determine the success of the forces. As part of our team, you will directly participate in our collaborations with industry partners, procurement agencies, and military forces. Your work will play a central role in developing complex systems that provide solutions to the challenges of tomorrow's battlefield. This includes, but is not limited to, gaining an in-depth understanding of state-of-the-art systems with a focus on land defense systems.

We have assembled a distinctive partnerships-and-programmes team across various fields of expertise and backgrounds. Together we lift ambitions and shape the thinking of our industry partners and customers on software and AI in defence, national security, and intelligence. You will work with and learn from leading experts to build lasting industry and customer relationships.

The day-to-day
Lead and execute initiatives in hardware and software integration to develop new systems within customer solutions
Develop and maintain essential test and verification documents, ensuring the highest standards of quality and accuracy
Offer expertise on the integration of systems within a multidisciplinary environment
Recommend designs, as well as innovative, scalable, and adaptive systems and solutions
Advise stakeholders and recommend solutions to complex problems by leveraging expertise in various technologies
Identify and address system vulnerabilities, driving improvements through professional insight and problem-solving skills

You should apply if you
Have 3 years of experience in the field of system integration or a comparable area
Hold a completed degree in Telecommunications Engineering, Electrical Engineering, Computer Science, Mechanical Engineering, or a related field
Possess experience in the domain of defence and/or security-related systems (a plus)
Exhibit strong teamwork skills, resilience, independence, and a high level of self-motivation
Are proficient in analyzing and resolving problems, addressing weaknesses, and managing dependencies

Note: We operate in an industry where women, as well as other minority groups, are systematically under-represented. We encourage you to apply even if you don’t meet all the listed qualifications; ability and impact cannot be summarised in a few bullet points.

Join Helsing and work with world-leading experts in their fields
Helsing’s work is important. You’ll be directly contributing to the protection of democratic countries while balancing both ethical and geopolitical concerns.
The work is unique. We operate in a domain that has highly unusual technical requirements and constraints, and where robustness, safety, and ethical considerations are vital. You will face unique Engineering and AI challenges that make a meaningful impact in the world.
Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure. The defence industry is entering the most exciting phase of the technological development curve. Advances in our field of work are not incremental: Helsing is part of, and often leading, historic leaps forward.
In our domain, success is a matter of order-of-magnitude improvements and novel capabilities. This means we take bets, aim high, and focus on big opportunities. Despite being a relatively young company, Helsing has already been selected for multiple significant government contracts.
We actively encourage healthy, proactive, and diverse debate internally about what we do and how we choose to do it. Teams and individual engineers are trusted (and encouraged) to practise responsible autonomy and critical thinking, and to focus on outcomes, not conformity. At Helsing you will have a say in how we (and you!) work, the opportunity to engage on what does and doesn’t work, and to take ownership of aspects of our culture that you care deeply about.

What we offer
A focus on outcomes, not time-tracking
Competitive compensation and stock options
Relocation support
Social and education allowances
Regular company events and all-hands to bring together employees as one team across Europe

Helsing is an equal opportunities employer. We are committed to equal employment opportunity regardless of race, religion, sexual orientation, age, marital status, disability or gender identity. Please do not submit personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, data concerning your health, or data concerning your sexual orientation.

Helsing's Candidate Privacy and Confidentiality Regime can be found here.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Support Engineer - Taiwan
Union
USD 70,000 - 180,000
Taiwan
Full-time
Remote: yes
About Us
At Union, we are solving one of the hardest challenges in AI infrastructure today: enabling high-velocity iteration while maintaining seamless production-readiness for AI workloads at scale. Flyte, the open-source project we steward, is the emerging standard for modern data and AI orchestration, with numerous leading technology organizations - like LinkedIn, Spotify, and Gojek - running millions of mission-critical workloads on the platform. We have a deep bench of infrastructure veterans from companies in the Big Three and beyond and a technical founding team who originally created Flyte while at Lyft.

The Opportunity
Reporting into our Head of Engineering, you will be accountable for delivering consistent and long-term customer value from Union. We are looking for team members who are innately curious, get joy out of solving complex technical problems, and thrive in collaborative, self-managed environments. The ideal candidate has a demonstrated ability to independently debug and troubleshoot issues across the entire infrastructure stack and manage technical operations. Bonus points if you have been a part of or contributed to the Flyte/Union communities or contribute to other OSS.

In this role, you will:
Resolve, triage, or escalate customer workflow issues by diagnosing platform/infrastructure vs. user code problems.
Drive retention and increase adoption of Union by developing deep knowledge of our customers’ business objectives and technical solutions, i.e. use cases, infrastructure configuration, integrations, etc.
Partner with Engineering, Product, Sales, and Customer Solutions to represent the voice of the customer, highlighting customer pain points and challenges in the adoption of the platform.
Collaborate with our Documentation team to develop and maintain customer-facing documentation, FAQs, common customer issues, best practices, etc.
Represent the voice of the customer in product design, solution, and prioritization conversations.
Learn, contribute to, and champion the Flyte open-source workflow orchestration system (see https://slack.flyte.org), as well as related Union managed services. Educate technical users on Union products.
Own meeting support KPIs and SLAs by implementing and managing expectations, processes, and tooling for customer support.
Collaborate with Customer Success to maintain a cadence of communication with customers about their adoption trends, sentiment, and opportunities for deeper engagement.

About you:
3+ years of Data or Software Engineering background with a strong understanding of public cloud technologies, containers and ML/data infrastructure.
You have strong Python programming skills and are able to debug code. Go experience is a plus.
Customer obsession. You take pride in solving customers' technical problems and have demonstrable experience supporting both SMB and enterprise customers with infrastructure products and solutions.
You have used or supported machine learning enabled solutions and/or MLOps/DevOps in an enterprise setting.
You have an awareness of (or strong interest in) the latest trends in the broader machine learning and data orchestration fields.
Possess excellent written and verbal communication skills, with a tone that conveys empathy.
Familiarity with data processing, ML systems, MLOps and Machine Learning infrastructure components.
Familiarity with Spark, Airflow, TensorFlow, PyTorch and other ML libraries is helpful, as is a grounding in the fundamentals of distributed systems.

Benefits & Belonging
At Union.ai we know that employees who feel their best can build amazing things and we are proud to offer best-in-class benefits that will continually evolve and grow as the needs of our employees do. Benefits may vary based on country.
Excellent medical - We pay 100% of your premiums and 90% for your dependents
Generous dental and vision plans - We pay 90% of the premiums for you and your dependents
Meaningful equity in the form of options: all employees are owners here
Unlimited time off + 12 company holidays
401K match - Union.ai matches 100% of contributions up to the first 3%, and 50% up to 5%
16 weeks paid parental leave for primary and secondary caregivers
Flexible work schedule (some restrictions apply)
For in-office employees: Lunch provided onsite and well-stocked kitchen with snacks and drinks.

We believe that our differences are what bring us together to achieve truly special outcomes. We strive to be inclusive and focus on building teams that embody that quality too. Union.ai is an equal-opportunity employer and we encourage you to apply, even if your experience doesn’t align exactly with our job description.
MLOps / DevOps Engineer
Data Science & Analytics
Data Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Staff Security Engineer
Lambda AI
USD 350,000 - 500,000
United States
Full-time
Remote: no
Lambda is the #1 GPU Cloud for ML/AI teams training, fine-tuning and inferencing AI models, where engineers can easily, securely and affordably build, test and deploy AI products at scale. Lambda’s product portfolio includes on-prem GPU systems, hosted GPUs across public & private clouds and managed inference services, servicing government, researchers, startups and enterprises worldwide.

If you'd like to build the world's best deep learning cloud, join us.

*Note: This position requires presence in our San Francisco office location 4 days per week; Lambda’s designated work from home day is currently Tuesday.

Engineering at Lambda is responsible for building and scaling our cloud offering. Our scope includes the Lambda website, cloud APIs and systems as well as internal tooling for system deployment, management and maintenance.

Lambda Security protects some of the world's most valuable digital assets: invaluable training data, model weights representing immense computational investments, and the sensitive inputs required to leverage best-of-breed AI models. We're responsible for securing every byte that powers breakthrough artificial intelligence.

As a Staff Security Engineer, you'll be the technical backbone of our security program, building and implementing security solutions that directly protect customer data and enable Lambda to be the safest place to build with AI. Reporting to the Senior Manager of Security and collaborating closely with Product Engineering, Platform Engineering, and embedded Technical Program Managers, you'll drive hands-on security improvements across our AI-focused infrastructure. Your work will span detection and response systems, vulnerability management, security architecture, and tooling that scales with our rapid growth while maintaining the highest security standards.

You will work on implementing enterprise-grade detection capabilities, automating incident response workflows, hardening our multi-cloud and bare metal infrastructure, and establishing security tooling that positions Lambda as the industry's most trusted AI computing platform. You'll have unique access to LLMs hosted on our own infrastructure to pioneer AI-powered security solutions that wouldn't be possible anywhere else.

If you thrive on solving complex security challenges in cutting-edge AI infrastructure and want to build security programs that scale from hundreds to thousands of systems, we'd love to talk.

We value diverse backgrounds, experiences, and skills, and we are excited to hear from candidates who can bring unique perspectives to our team. If you do not exactly meet this description but believe you may be a good fit, please still apply and help us understand your readiness for a Staff Security Engineer role. Applying will not waste our time.

What You’ll Do
Drive Security Improvements: Design and implement comprehensive security solutions including detection capabilities, automation, and endpoint detection and response (EDR) across Lambda's infrastructure.
Lead Incident Response: Drive critical security incident resolution, developing response automation and conducting post-incident reviews that strengthen our security posture.
Develop Security Architecture: Create security architecture patterns and implementation guides that engineering teams can adopt to build secure-by-default systems.
Build Detection & Response: Implement and tune SIEM/SOAR solutions, creating detection rules that identify threats while minimizing false positives.
Pioneer AI-Powered Security: Leverage Lambda's hosted LLMs to build next-generation security capabilities including automated threat analysis, intelligent alert correlation, and AI-assisted incident response that push far beyond traditional approaches.
Collaborate Across Engineering: Partner with Product and Platform Engineering teams to integrate security requirements into their development cycles at optimal moments.
Mentor Security Excellence: Coach engineers across the organization on secure coding practices and security tool usage, multiplying your impact.
Drive Vulnerability Management: Establish and operate vulnerability scanning, prioritization, and remediation programs that protect critical assets.
Develop Security Tooling: Build security tools and automations that enable teams to maintain security standards without sacrificing development velocity.
Advocate for Security: Communicate security value to stakeholders, translating technical risks into business impact for informed decision-making.

You
Have 5+ years of hands-on security engineering experience and 10+ years of total engineering experience, with demonstrated impact protecting enterprise infrastructure.
Thrive in high-speed, high-ambiguity startup environments where you build security programs while responding to immediate threats.
Have deep technical expertise with security tooling including SIEM/SOAR platforms, EDR solutions, vulnerability scanners, and cloud security monitoring.
Excel at solving complex problems in Python, Go, or similar languages, building automations that scale security impact.
Have a proven ability to work effectively with cross-functional technical teams both with and without authority (we're all on the same team!).
Have strong Linux systems experience in both bare metal and cloud environments, understanding infrastructure from kernel to application layer.
Have demonstrated experience driving security improvements that were enthusiastically adopted by engineering teams.
Excel at translating security concerns into business risk, enabling stakeholders to make informed decisions.

If you do not meet all of these requirements but believe you may be a good fit, please still apply and provide a cover letter that helps us understand your readiness for a staff security engineering role.

Nice to Have
You've led or developed major components of enterprise security programs (detection & response, vulnerability management, security architecture, security tooling).
Experience driving or providing significant evidence for compliance audits, such as SOC 2, ISO 27001, PCI-DSS, HIPAA/HITECH, or FedRAMP.
Deep experience working with virtualization solutions such as KVM, Hyper-V, or Xen in production environments.
Significant experience operating large-scale production services (SRE experience across thousands of hosts).
You've built or deployed critical security infrastructure like SIEM solutions, canaries/honeypots, IDS/IPS, or custom detection platforms.
Security certifications like CISSP, OSCP, or similar that demonstrate continued learning.
Experience with AI/ML infrastructure security or protecting high-value computational workloads.
Excitement about leveraging our direct access to state-of-the-art LLMs to revolutionize security operations: imagine AI-powered threat hunting, automated security report generation, and intelligent vulnerability prioritization at a scale only possible when you host the AI infrastructure yourself.

Salary Range Information
Based on market data and other factors, the annual salary range for this position is $350,000-$500,000. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda
Founded in 2012, ~350 employees (2024) and growing fast
We offer generous cash & equity compensation
Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
We are experiencing extremely high demand for our systems, with quarter over quarter, year over year profitability
Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
Health, dental, and vision coverage for you and your dependents
Wellness and Commuter stipends for select roles
401k Plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use

A Final Note:
You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Support Engineer - US
Union
USD 70,000 - 180,000
United States
Full-time
Remote: no
About Us
At Union, we are solving one of the hardest challenges in AI infrastructure today: enabling high-velocity iteration while maintaining seamless production-readiness for AI workloads at scale. Flyte, the open-source project we steward, is the emerging standard for modern data and AI orchestration, with numerous leading technology organizations - like LinkedIn, Spotify, and Gojek - running millions of mission-critical workloads on the platform. We have a deep bench of infrastructure veterans from companies in the Big Three and beyond and a technical founding team who originally created Flyte while at Lyft.

The Opportunity
Reporting into our Head of Engineering, you will be accountable for delivering consistent and long-term customer value from Union. We are looking for team members who are innately curious, get joy out of solving complex technical problems, and thrive in collaborative, self-managed environments. The ideal candidate has a demonstrated ability to independently debug and troubleshoot issues across the entire infrastructure stack and manage technical operations. Bonus points if you have been a part of or contributed to the Flyte/Union communities or contribute to other OSS.

In this role, you will:
Resolve, triage, or escalate customer workflow issues by diagnosing platform/infrastructure vs. user code problems.
Drive retention and increase adoption of Union by developing deep knowledge of our customers’ business objectives and technical solutions, i.e. use cases, infrastructure configuration, integrations, etc.
Partner with Engineering, Product, Sales, and Customer Solutions to represent the voice of the customer, highlighting customer pain points and challenges in the adoption of the platform.
Collaborate with our Documentation team to develop and maintain customer-facing documentation, FAQs, common customer issues, best practices, etc.
Represent the voice of the customer in product design, solution, and prioritization conversations.
Learn, contribute to, and champion the Flyte open-source workflow orchestration system (see https://slack.flyte.org), as well as related Union managed services. Educate technical users on Union products.
Own meeting support KPIs and SLAs by implementing and managing expectations, processes, and tooling for customer support.
Collaborate with Customer Success to maintain a cadence of communication with customers about their adoption trends, sentiment, and opportunities for deeper engagement.

About you:
3+ years of Data or Software Engineering background with a strong understanding of public cloud technologies, containers and ML/data infrastructure.
You have strong Python programming skills and are able to debug code. Go experience is a plus.
Customer obsession. You take pride in solving customers' technical problems and have demonstrable experience supporting both SMB and enterprise customers with infrastructure products and solutions.
You have used or supported machine learning enabled solutions and/or MLOps/DevOps in an enterprise setting.
You have an awareness of (or strong interest in) the latest trends in the broader machine learning and data orchestration fields.
Possess excellent written and verbal communication skills, with a tone that conveys empathy.
Familiarity with data processing, ML systems, MLOps and Machine Learning infrastructure components.
Familiarity with Spark, Airflow, TensorFlow, PyTorch and other ML libraries is helpful, as is a grounding in the fundamentals of distributed systems.

Benefits & Belonging
At Union.ai we know that employees who feel their best can build amazing things and we are proud to offer best-in-class benefits that will continually evolve and grow as the needs of our employees do. Benefits may vary based on country.
Excellent medical - We pay 100% of your premiums and 90% for your dependents
Generous dental and vision plans - We pay 90% of the premiums for you and your dependents
Meaningful equity in the form of options: all employees are owners here
Unlimited time off + 12 company holidays
401K match - Union.ai matches 100% of contributions up to the first 3%, and 50% up to 5%
16 weeks paid parental leave for primary and secondary caregivers
Flexible work schedule (some restrictions apply)
For in-office employees: Lunch provided onsite and well-stocked kitchen with snacks and drinks.

We believe that our differences are what bring us together to achieve truly special outcomes. We strive to be inclusive and focus on building teams that embody that quality too. Union.ai is an equal-opportunity employer and we encourage you to apply, even if your experience doesn’t align exactly with our job description.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Data Center Security Technical Lead
OpenAI
USD 490,000
United States
Full-time
Remote: yes
About the Team
Security is at the foundation of OpenAI’s mission to ensure that artificial general intelligence benefits all of humanity. The Security team protects OpenAI’s technology, people, and products. We are technical in what we build but are operational in how we do our work, and are committed to supporting all products and research at OpenAI. Our Security team tenets include: prioritizing for impact, enabling researchers, preparing for future transformative technologies, and engaging a robust security culture.

About the Role
We are seeking a highly experienced and strategic cybersecurity leader to join our team to lead our data center security efforts. This person will drive the overall vision and execution of cyber defense and security architecture efforts for OpenAI’s rapidly expanding datacenter footprint, spanning multiple, diverse environments. You will be directly responsible for leading a multidisciplinary team to ensure the security of these ambitious infrastructure programs, including key projects already in flight.

This role is open to remote employees, or relocation assistance is available to one of our OpenAI offices in San Francisco, Seattle, or New York City.

In this role, you will:
Spearhead the overall security vision, architecture, execution, and sustained operations to protect data center infrastructure across a global footprint.
Recruit, mentor, and lead a diverse, distributed, and multidisciplinary team of security professionals across multiple timezones to securely enable mission impact.
Integrate hardware‑rooted trust, segmented enclaves, side‑channel mitigations, and insider‑risk countermeasures tuned for persistent, well‑resourced adversaries.
Drive security design reviews, threat modeling, supply‑chain risk assessments, secure provisioning, runtime monitoring, incident response, and structured decommissioning.
Deeply collaborate with engineering, operations, and security teams to define, implement, enforce, and evolve security best practices, including network segmentation, physical security, identity management, and access control policies.
Engage with suppliers and partners to evaluate and mitigate risks associated with third-party hardware, firmware, and software dependencies.
Prepare and deliver executive-level updates on risk posture, compliance, incident status, and program progress.
Ensure our facilities meet the strictest compliance requirements.
Continually evolve the security of our data centers alongside advances in cybersecurity controls and adversary activity, commensurate with the increased sophistication and risks of our frontier models.

You might thrive in this role if you have:
15+ years in security, including 10+ years leading large-scale infrastructure or datacenter programs with global scope.
Proven track record protecting hyperscale, colocation, or hybrid environments against sophisticated adversaries, including insider threats and close-access attacks.
Demonstrated success recruiting, developing, and retaining multidisciplinary security teams across time zones and cultures.
Deep knowledge of security design and operations, including host-level hardening, network segmentation, IAM/PAM, secure provisioning, firmware and bare-metal security, and governance frameworks.
Strong familiarity with control frameworks (e.g., NIST, ISO 27001, SOC 2) and experience driving compliance in regulated environments.

About OpenAI
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

For additional information, please see OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement.

Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the San Francisco Fair Chance Ordinance, the Los Angeles County Fair Chance Ordinance for Employers, and the California Fair Chance Act. For unincorporated Los Angeles County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: protect computer hardware entrusted to you from theft, loss or damage; return all computer hardware in your possession (including the data contained therein) upon termination of employment or end of assignment; and maintain the confidentiality of proprietary, confidential, and non-public information. In addition, job duties require access to secure and protected information technology systems and related data security obligations.

We are committed to providing reasonable accommodations to applicants with disabilities, and requests can be made via this link.

OpenAI Global Applicant Privacy Policy

At OpenAI, we believe artificial intelligence has the potential to help people solve immense global challenges, and we want the upside of AI to be widely shared. Join us in shaping the future of technology.
MLOps / DevOps Engineer
Data Science & Analytics
Vibe Coding
Software Engineering
V&V Systems Engineer - Air
helsing
Germany
United Kingdom
France
Remote: no
Who we are Helsing is a defence AI company. Our mission is to protect our democracies. We aim to achieve technological leadership, so that open societies can continue to make sovereign decisions and control their ethical standards.  As democracies, we believe we have a special responsibility to be thoughtful about the development and deployment of powerful technologies like AI. We take this responsibility seriously.  We are an ambitious and committed team of engineers, AI specialists and customer-facing programme managers. We are looking for mission-driven people to join our European teams – and apply their skills to solve the most complex and impactful problems. We embrace an open and transparent culture that welcomes healthy debates on the use of technology in defence, its benefits, and its ethical implications.  The role You will be part of a team developing the world's first AI-enabled electronic warfare capability for a 4+ generation fighter jet, from concept to maiden flight. As a member of our chief engineering team, you will lead and guide all validation and verification (V&V) engineering activities. This includes defining the overall system and software V&V strategy, setting up the V&V process and tool chain, and executing all V&V activities in full compliance with aerospace standards. Your main challenge is to collaborate with our world-leading AI software team to ensure the developed software meets the contractual qualification requirements while exceeding customer expectations. 
The day-to-day
As a key member of the Chief Engineering team, your day-to-day responsibilities include:
Negotiate system qualification requirements and MoC with our customer, derive high-level software performance requirements, and validate high-level software performance requirements in close collaboration with our software leads.
Improve the traceability of system qualification requirements across all relevant software development artifacts, and develop a V&V product strategy to ensure the successful qualification of the final system and software.
Work collaboratively with the project manager to ensure proper planning of all V&V activities, monitor technical progress, and guide the team toward successful qualification.
Oversee the preparation of all V&V documents to ensure they meet compliance and certification standards. Authorize the documents before they are delivered to the customer.
Lead the final software product qualification activities with both customer and authorities.
Collaborate closely with our quality assurance team to ensure that all V&V deliverables adhere to established quality standards. Address any issues promptly to maintain the integrity of the project.
Provide regular updates on the V&V status and any technical issues identified to the Chief Engineering and Program Management teams, ensuring all stakeholders are informed of significant developments.
You should apply if you
Have already led a system V&V team of 10+ people for the development of avionics or electronic warfare systems, from initial requirements definition to final qualification.
Have successfully verified and validated at least two software-intensive avionics or electronic warfare systems currently in service.
Have tested embedded software for an avionics or electronic warfare system in accordance with DO-178C standards and successfully passed the final qualification review.
Have established a successful relationship with at least one civil or military aviation authority and participated in at least two panel meetings or Stage of Involvement (SOI) activities.
Hold a degree in aerospace engineering, electrical engineering, physics, computer science, or another engineering discipline that provides a strong technical understanding of complex topics.
Possess excellent communication skills and the ability to report and present results clearly and effectively to both internal and external stakeholders.
Demonstrate a passion for teamwork, with the ability to coordinate and guide the team to meet aerospace standards.
Note: We operate in an industry where women, as well as other minority groups, are systematically under-represented. We encourage you to apply even if you don’t meet all the listed qualifications; ability and impact cannot be summarised in a few bullet points.
Join Helsing and work with world-leading experts in their fields
Helsing’s work is important. You’ll be directly contributing to the protection of democratic countries while balancing both ethical and geopolitical concerns.
The work is unique. We operate in a domain that has highly unusual technical requirements and constraints, and where robustness, safety, and ethical considerations are vital.
You will face unique engineering and AI challenges that make a meaningful impact in the world.
Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure. The defence industry is entering the most exciting phase of the technological development curve. Advances in our field of work are not incremental: Helsing is part of, and often leading, historic leaps forward.
In our domain, success is a matter of order-of-magnitude improvements and novel capabilities. This means we take bets, aim high, and focus on big opportunities. Despite being a relatively young company, Helsing has already been selected for multiple significant government contracts.
We actively encourage healthy, proactive, and diverse debate internally about what we do and how we choose to do it. Teams and individual engineers are trusted (and encouraged) to practise responsible autonomy and critical thinking, and to focus on outcomes, not conformity. At Helsing you will have a say in how we (and you!) work, the opportunity to engage on what does and doesn’t work, and to take ownership of aspects of our culture that you care deeply about.
What we offer
A focus on outcomes, not time-tracking
Competitive compensation and stock options
Relocation support
Social and education allowances
Regular company events and all-hands to bring together employees as one team across Europe
Helsing is an equal opportunities employer. We are committed to equal employment opportunity regardless of race, religion, sexual orientation, age, marital status, disability or gender identity. Please do not submit personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, data concerning your health, or data concerning your sexual orientation.
Helsing's Candidate Privacy and Confidentiality Regime can be found here.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
Professional Services Engineer
Nice
-
Philippines
Full-time
Remote
false
At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.
About NICE and MPower Proactive AI Agents
NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the 100 largest corporations in the world, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NICE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions. Innovation efforts in cloud-native open platforms, artificial intelligence and analytics are driving rapid growth of our software products, with reported revenues of USD 1.7 billion in FY 2020.
MPower Proactive AI Agents has recently become a part of the NICE CXOne suite and is a market leader in proactive conversational AI.
MPower Proactive AI Agents is a cloud-hosted conversation platform. Proactive and omni-channel, MPower Proactive AI Agents leverages artificial intelligence, natural language understanding and machine learning so that global brands can transform their customer journeys. MPower Proactive AI Agents disrupts traditional call centres: it costs just a fraction as much to deploy and run, but achieves three times the customer engagement rate of its manual call-centre predecessors. This means that as well as being lower cost, MPower Proactive AI Agents can drive corporate KPIs harder and faster, at the same time as improving the customer experience.
About the role
As a rapidly growing business, we are building a professional services team here in Manila. In this new role, you will become an MPower Proactive AI Agents expert and advocate, delivering high-quality implementations, analysis, support, and more to clients across the globe.
Working initially with both the EU and US teams, you will provide implementation and support to our new and existing clients. Then, when MPower Proactive AI Agents expands into APAC, the Manila team will also lead the implementation, quality assurance, support, and delivery for local clients.
Key Responsibilities
Become an expert in MPower Proactive AI Agents solution design consultancy and implementation
Work with global teams to build, test and support customer journeys and conversations from specification using our internal tooling
Learn clients’ business processes to define and document requirements for automated transactions of proactive customer journeys
Document End-to-End Application Integration Specifications
Configure, perform E2E testing, debug, and deploy the SaaS platform based on client requirements
Define and implement MPower Proactive AI Agents customer journey optimization strategies by utilizing data analytics in conjunction with a deep understanding of the MPower Proactive AI Agents transactional data model
Write queries in SQL and build reports and dashboards
Write and debug code snippets (in JavaScript)
Design and test APIs
Assist with training of new implementation specialists
Provide on-call support to global customers
Basic Qualifications:
4+ years of experience in an implementation / professional services / consulting role
A degree in Computer Science, Business Analytics, Data Analytics, or another STEM field
Knowledge of basic programming languages and data structures
Knowledge of data analysis and SQL
A keen problem solver with a passion for learning
Ability to successfully work in a team-oriented, highly collaborative environment
Amenable to working shifting schedules
Preferred Qualifications:
If you don't have these but are willing to learn, please apply:
Technical expertise in data structure extraction and summarization
Experience with JavaScript, C#, SQL, HTML, CSS, Python
Experience with integration tools
Application knowledge of Excel, Tableau (or similar BI tool)
Requisition ID: 7852
Reporting into: Manager, Professional Services
Role Type: Individual Contributor
About NiCE
NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NiCE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions. Known as an innovation powerhouse that excels in AI, cloud and digital, NiCE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries. NiCE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.
MLOps / DevOps Engineer
Data Science & Analytics
Data Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
Senior IT Engineer
X AI
-
United States
Full-time
Remote
false
About xAI
xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All engineers and researchers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.
Senior IT Engineer Role
We are seeking a highly skilled and experienced Senior IT Engineer to join our Corporate IT team. In this senior-level role, you will be responsible for driving the architecture, integration, and optimization of our core IT infrastructure, with a strong focus on identity and access management (IAM), collaboration platforms, endpoint management, and employee lifecycle automation. You’ll work with technologies such as Okta, Slack, Google Workspace, JAMF, Intune, and HRIS platforms like Workday and Rippling to build secure, scalable, and efficient systems. The ideal candidate is a proactive, systems-oriented problem-solver with deep expertise in Identity Governance and Administration (IGA), Mobile Device Management (MDM), and automation frameworks, including low-code/no-code orchestration tools and AI-enhanced workflows.
Key Responsibilities
Architect and manage IAM solutions using IdPs such as Okta, Ping, and JumpCloud, with a focus on IGA: user lifecycle automation, RBAC, access workflows, compliance reporting, and policy enforcement.
Integrate IAM platforms with Google Workspace, Slack, HRIS systems (e.g., Workday, Rippling) and other SaaS tools to support SSO, MFA, directory sync, and role-based access provisioning. Administer and optimize collaboration platforms, especially Slack and Google Workspace, including advanced configuration, security settings, and workflow integrations. Implement and manage device compliance and security across macOS, iOS, Windows, and Android using MDM solutions like JAMF, Kandji, and Intune. Design and build automated IT workflows using scripting languages and modern orchestration tools — including low-code or AI-driven platforms that enable scalable, event-based automation and system integration across cloud services and APIs. Develop and maintain custom scripts in Python, Bash, and PowerShell to automate repetitive tasks, monitor systems, and streamline onboarding/offboarding processes. Leverage integrations with HRIS tools (Workday, Rippling) to drive lifecycle automation and ensure secure, consistent access to resources based on user roles and employment status. Maintain and support systems used by engineering teams, ensuring high availability of development tools, secure access, and streamlined workflows. Collaborate across teams (HR, Security, Engineering, Finance) to gather requirements, implement IT solutions, and align tools with business needs. Conduct regular security assessments of infrastructure and SaaS environments, and ensure compliance with internal and external standards. Mentor and support junior team members, contribute to documentation, and promote knowledge sharing and operational maturity across the IT team. Evaluate and implement new technologies, including emerging AI-driven or automation-enhancing platforms, to continuously improve efficiency, scalability, and security. Required Qualifications Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent practical experience). 
7+ years of experience in IT engineering, with 3+ years in a senior or staff-level capacity. Expertise in Okta or similar identity platforms, including SSO/MFA, lifecycle automation, RBAC, and compliance integration. Experience integrating IAM with Google Workspace, Slack, and HRIS tools like Workday or Rippling. Strong experience with MDM platforms such as JAMF, Kandji, and Intune, managing multi-OS device environments. Hands-on experience building automated IT workflows, including use of scripting and modern orchestration platforms that integrate APIs and event-based triggers. Proficiency in Python, Bash, and PowerShell for scripting, automation, and diagnostics. Understanding of IT architecture across SaaS, networking, and endpoint security. Proven track record of solving complex problems in fast-paced, security-conscious environments. Strong written and verbal communication skills, with the ability to work across both technical and non-technical stakeholders. Preferred Qualifications Relevant certifications: Okta Certified Professional, JAMF Certified Expert, Google Workspace Administrator, Microsoft Intune Administrator, or equivalent. Experience with other IAM platforms (e.g., Azure AD), or collaboration tools (e.g., Microsoft Teams). Familiarity with cloud environments (AWS, Azure) and containerization technologies (Docker, Kubernetes). Exposure to low-code/no-code platforms, AI-enhanced automation, or workflow orchestration tools. Understanding of ITIL practices, or experience working in Agile operational environments. Background in high-growth or enterprise-scale organizations with a strong security and compliance culture. xAI is an equal opportunity employer. California Consumer Privacy Act (CCPA) Notice
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
Open Source Infrastructure Engineer
Arize AI
USD
125000
-
225000
United States
Full-time
Remote
true
The Opportunity
AI is rapidly transforming the world. Whether it’s developing the next generation of human-level intelligence, enhancing voice assistants, or enabling researchers to analyze genetic markers at scale, AI is increasingly integrated into various aspects of our daily lives. Arize AI is the leading AI observability and evaluation platform, helping AI teams discover issues, diagnose problems, and improve the results of their AI applications. We are here to build world-class software that helps make AI applications work better, and we’re inviting you to join us on this mission.
About Arize Phoenix
Large Language Models (LLMs) like GPT4 are capable of human and super-human reasoning, but they don’t understand your personal data, are unable to perform complex tasks, cannot remember the actions they performed yesterday, and so on. The core mission of the Arize OSS team is to enable developers to build LLM-powered applications with safety and observability in mind. We are at the forefront of a new way to build software. Doing this successfully will usher in a new age of knowledge automation and completely redefine the modern tech stack. Our product offerings consist of open-source TypeScript libraries as well as an enterprise platform. Arize Phoenix has hundreds of thousands of monthly downloads and is used by organizations ranging from large enterprises to early-stage startups. It is both easy to get started with and robust enough for production use. We’re backed by TCV, Foundation Capital, Swift Ventures, and Battery Ventures.
The Role
You will focus on designing, building, and scaling the infrastructure that powers Arize’s platform services. You will work directly with our open-source team to ensure Arize Phoenix’s robustness, scalability, and accessibility for developers worldwide. This is a unique opportunity to contribute to foundational infrastructure that will shape the way organizations observe and evaluate AI systems.
What You’ll Work On
Collaborate with internal teams and the open-source community to architect and scale infrastructure that supports Arize Phoenix’s growth.
Enable and enhance Phoenix’s SaaS capabilities by building out features such as central authentication, data retention, and capacity scaling.
Build out a robust set of infrastructure tools and services that eases operations across Arize's suite of product offerings.
What Will Set You Apart
3+ years of experience building scalable infrastructure and developer tools, ideally in an open-source context.
Strong understanding of the open-source development lifecycle and community dynamics.
Expertise in Kubernetes, Terraform, and other modern infrastructure tools.
Familiarity with build tools like Bazel and CI/CD systems.
A pragmatic approach to solving problems and prioritizing impactful solutions over trends.
Experience working with AI frameworks, LLM integrations, or observability platforms is a plus.
Familiarity with Go, Python, and some TypeScript is a plus.
Location:
The estimated annual salary and variable compensation for this role is between $125,000 and $225,000, plus a competitive equity package. Actual compensation is determined based upon a variety of job-related factors that may include transferable work experience, skill sets, and qualifications. Total compensation also includes a comprehensive benefit package, including medical, dental, vision, a 401(k) plan, unlimited paid time off, a generous parental leave plan, and other mental health and wellness support. While we are a remote-first company, we have opened offices in New York City and the San Francisco Bay Area as an option for those in those cities who wish to work in person. For all other employees, there is a monthly WFH stipend to pay for co-working spaces.
More About Arize
Arize’s mission is to make the world’s AI work and work for the people.
Our founders came together through a common frustration: investments in AI are growing rapidly across businesses and organizations of all types, yet it is incredibly difficult to understand why a machine learning model behaves the way it does after it is deployed into the real world. Learn more about Arize in an interview with our founders: https://www.forbes.com/sites/frederickdaso/2020/09/01/arize-ai-helps-us-understand-how-ai-works/#322488d7753c
Diversity & Inclusion @ Arize
Our company's mission is to make AI work and make AI work for the people. We hope to make an impact on bias industry-wide, and that's a big motivator for people who work here.
We actively hope that individuals contribute to a good culture.
We regularly have chats with industry experts, researchers, and ethicists across the ecosystem to advance the use of responsible AI.
We hold culturally conscious events, such as LGBTQ trivia during Pride Month.
We have an active Lady Arizers subgroup.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
Senior Site Reliability Engineer (Observability & Resilience)
MagicSchool AI
USD
130000
-
150000
United States
Full-time
Remote
true
WHO WE ARE: MagicSchool is the premier generative AI platform for teachers. We're just over 2 years old, and more than 5.5 million teachers from all over the world have joined our platform. Join a top team at a fast-growing company that is working towards real social impact. Make an account and try us out at our website and connect with our passionate community on our Wall of Love.
Role Description:
As Senior Site Reliability Engineer (Observability & Resilience), you will lead observability across our platform and help design the resilient infrastructure our customers and educators rely on every day. In this hands-on, individual contributor role, you’ll drive instrumentation and telemetry strategy while partnering closely with product and engineering to plan for Resilience, Recovery, and Availability.
Responsibilities:
In this role, you will be responsible for driving the following outcomes:
Observability Leadership: Design and implement observability patterns (including metrics, logging, tracing, and alerting) to ensure we have clear, actionable visibility into platform behavior and performance. Build internal tooling and dashboards that empower our teams with real-time system insights.
Operational Excellence: Define and maintain SLIs and SLOs in partnership with product and engineering teams. Establish best practices for alert tuning and signal-to-noise balancing to reduce incident fatigue and improve response accuracy.
Platform Resilience: Architect and support infrastructure that prioritizes high availability, disaster recovery, and graceful degradation. Leverage Terraform and infrastructure-as-code to ensure consistent, reliable deployments across AWS and Google Cloud.
Cross-Functional Enablement: Collaborate with engineers across teams to embed resilient design and observability from the ground up. Provide training and pairing support to product engineers, helping them build and maintain telemetry that supports the full software lifecycle.
Experience & Qualifications:
To be successful in this role, you’ll bring the following experience and qualifications:
Professional Experience: At least 5 years in an SRE, DevOps, or observability-focused role, with a track record of success in fast-paced, high-growth environments.
Observability & Resilience: Experience designing and operating systems for high availability and disaster recovery. Familiarity with incident response, alert fatigue reduction, and signal-to-noise balancing.
Tooling Expertise: Deep experience with observability tools such as Grafana, Prometheus, Loki, Datadog, and OpenTelemetry. Proven ability to operationalize these tools for maximum team impact.
Infrastructure Skills: Strong proficiency with Terraform and infrastructure-as-code workflows. Experience with multi-cloud deployments and operating resilient systems at scale.
Enablement & Collaboration: Passion for enabling product engineers through training and pairing on observability patterns. Ability to drive cross-functional initiatives that improve system health and team effectiveness.
Communication Skills: Skilled at explaining complex infrastructure and observability concepts to both technical and non-technical audiences. Calm and decisive under pressure, especially during incident response.
Nice to Have:
Experience with Sentinel, Loki, or similar logging/metrics stacks.
Exposure to educational or compliance-heavy environments.
Strong debugging skills and a calm presence during incidents.
Notice: Priority Deadline and Review Start Date
Please note that applications for this position will be accepted until 7/18/25; applications received after this date will be reviewed on an intermittent basis. While we encourage early submissions, all applications received by the priority deadline will receive equal consideration. Thank you for your interest, and we look forward to reviewing your application.
Why Join Us?
Work on cutting-edge AI technology that directly impacts educators and students.
Join a mission-driven team passionate about making education more efficient and equitable.
Flexibility of working from home, while fostering a unique culture built on relationships, trust, communication, and collaboration with our team, no matter where they live.
Unlimited time off to empower our employees to manage their work-life balance. We work hard for our teachers and users, and encourage our employees to rest and take the time they need.
Choice of employer-paid health insurance plans so that you can take care of yourself and your family. Dental and vision are also offered at very low premiums.
Every employee is offered generous stock options, vested over 4 years.
Plus a 401k match & monthly wellness stipend.
Our Values:
Educators are Magic: Educators are the most important ingredient in the educational process - they are the magic, not the AI. Trust them, empower them, and put them at the center of leading change in service of students and families.
Joy and Magic: Bring joy and magic into every learning experience - push the boundaries of what’s possible with AI.
Community: Foster community that supports one another during a time of rapid technological change. Listen to them and serve their needs.
Innovation: The education system is outdated and in need of innovation and change - AI is an opportunity to bring equity, access, and serve the individual needs of students better than we ever have before.
Responsibility: Put responsibility and safety at the forefront of the technological change that AI is bringing to education.
Diversity: Diversity of thought, perspectives, and backgrounds helps us serve the wide audience of educators and students around the world.
Excellence: Educators and students deserve the best - and we strive for the highest quality in everything we do.
MLOps / DevOps Engineer
Data Science & Analytics
Apply
Senior Security Engineer (Application & Cloud Security)
MagicSchool AI
USD
150000
-
170000
United States
Full-time
Remote
true
WHO WE ARE: MagicSchool is the premier generative AI platform for teachers. We're just over 2 years old, and more than 5.5 million teachers from all over the world have joined our platform. Join a top team at a fast-growing company that is working towards real social impact. Make an account and try us out at our website and connect with our passionate community on our Wall of Love.
Role Description:
As Senior Security Engineer (Application & Cloud Security), you will lead the development of secure engineering practices across our products and infrastructure. In this hands-on, individual contributor role, you’ll drive threat modeling, secure architecture, and application security across our multi-cloud stack while partnering with teams across engineering, product, IT, and compliance. You’ll report directly to the VP of Engineering and play a critical role in protecting the systems educators and students rely on every day. This position is Remote (US-based).
Responsibilities:
In this role, you will be responsible for driving the following outcomes:
Secure Development Lifecycle: Champion secure development practices including threat modeling, code reviews, and vulnerability management. Lead the evaluation and implementation of tooling such as SAST, DAST, and SCA, and build developer-friendly workflows to “shift security left.”
Secure Architecture & Design: Collaborate with product and engineering teams to design secure systems and deployment models grounded in zero trust principles. Serve as a trusted advisor on cloud security best practices across AWS and Google Cloud environments.
Internal Enablement: Lead security education programs for engineers and staff, including workshops, incident simulations, and best practice sharing. Coach engineers on practical security techniques and tradeoffs in the software development lifecycle.
Security Assessments & Red Teaming: Plan and execute internal offensive security exercises, including red team–style assessments, penetration testing, and adversary emulation.
Incident Response & Preparedness: Own and evolve security incident response playbooks. Collaborate with technical and operational teams on real-world incident response and postmortems.
Cross-Functional Alignment: Partner with IT and Compliance to support programs aligned with SOC 2, FERPA, and COPPA, ensuring engineering efforts align with our regulatory commitments.
Experience & Qualifications:
To be successful in this role, you’ll bring the following experience and qualifications:
Professional Experience: At least 5 years of experience in application or cloud security, with a track record of advancing security practices in fast-paced technical environments.
Security Expertise: Hands-on experience with secure development tooling (SAST, DAST, SCA), and cloud-native security within AWS and/or Google Cloud. Prior involvement in offensive security or red teaming.
Threat Modeling & Architecture: Strong experience conducting or facilitating threat modeling, whether using formal frameworks (e.g., STRIDE, PASTA) or more lightweight and iterative team-based approaches.
Communication & Influence: Ability to communicate complex security topics to both technical and non-technical stakeholders. Skilled in influencing engineering teams and leading by example.
Mentorship & Enablement: Experience coaching engineers or teams on security principles and integrating security without compromising developer velocity.
Nice to Have:
Experience supporting security components of SOC 2, FERPA, or COPPA programs
Prior experience in a high-growth startup or fast-paced engineering environment
Notice: Priority Deadline and Review Start Date
Please note that applications for this position will be accepted until 7/18/25; applications received after this date will be reviewed on an intermittent basis. While we encourage early submissions, all applications received by the priority deadline will receive equal consideration. Thank you for your interest, and we look forward to reviewing your application.
Why Join Us?
Work on cutting-edge AI technology that directly impacts educators and students.
Join a mission-driven team passionate about making education more efficient and equitable.
Flexibility of working from home, while fostering a unique culture built on relationships, trust, communication, and collaboration with our team, no matter where they live.
Unlimited time off to empower our employees to manage their work-life balance. We work hard for our teachers and users, and encourage our employees to rest and take the time they need.
Choice of employer-paid health insurance plans so that you can take care of yourself and your family. Dental and vision are also offered at very low premiums.
Every employee is offered generous stock options, vested over 4 years.
Plus a 401k match & monthly wellness stipend.
Our Values:
Educators are Magic: Educators are the most important ingredient in the educational process - they are the magic, not the AI. Trust them, empower them, and put them at the center of leading change in service of students and families.
Joy and Magic: Bring joy and magic into every learning experience - push the boundaries of what’s possible with AI.
Community: Foster community that supports one another during a time of rapid technological change. Listen to them and serve their needs.
Innovation: The education system is outdated and in need of innovation and change - AI is an opportunity to bring equity, access, and serve the individual needs of students better than we ever have before.
Responsibility: Put responsibility and safety at the forefront of the technological change that AI is bringing to education.
Diversity: Diversity of thought, perspectives, and backgrounds helps us serve the wide audience of educators and students around the world.
Excellence: Educators and students deserve the best - and we strive for the highest quality in everything we do.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Sr Automation and IT Engineer
Horizon3ai
USD
150000
-
210000
United States
Full-time
Remote
false
Get to Know Us
Horizon3.ai is a fast-growing, remote cybersecurity company dedicated to the mission of enabling organizations to proactively find, fix, and verify exploitable attack vectors before criminals exploit them. Our flagship product, the NodeZero™ platform, delivers production-safe autonomous pentests and other key assessment operations that scale across the largest internal, external, cloud, and hybrid cloud environments. NodeZero has been adopted by organizations of all sizes, from small educational institutions to government agencies and Global 100 enterprises. It is used by IT Ops/SecOps teams, consulting pentesters, and MSSPs and MSPs.

We are a fusion of former U.S. Special Operations cyber operators, startup engineers & operators, and formerly frustrated cybersecurity practitioners. We're committed to helping solve our common security problems: ineffective security tools and false positives, resulting in alert fatigue, blind spots, “checkbox” security culture, the cybersecurity skills shortage, and the long lead time and expense of hiring outside consultants. Collectively, we are a team of learn-it-alls, committed to a culture of respect, collaboration, ownership, and results.

As a remote-first company, we require a minimum 25 Mbps consumer-grade broadband connection.

What You’ll Do
We are seeking a skilled Sr Automation and IT Engineer with strong experience in day-to-day IT Help Desk operations to join our growing IT operations team. As a key member of our team, you’ll be responsible for the operational health and effectiveness of our end-user environment. You should have a solid understanding of both Windows and macOS systems, experience with Unified Endpoint Management (UEM) platforms, and a foundational knowledge of cybersecurity best practices.

This role is instrumental in helping us scale and mature our Help Desk and business-wide IT operational capabilities while maintaining a strong security posture across the organization.

This role will be responsible for:

Automation and AI
- Lead the automation of repetitive IT workflows (onboarding, offboarding, access requests, software deployment) using tools like Okta Workflows, Jira Automations, scripting (PowerShell, Bash, Python), APIs, or low-code platforms.
- Support adoption and safe usage of AI-enabled tools (e.g., Microsoft Copilot, Google Gemini, ChatGPT) by enabling policy controls, usage monitoring, and employee education.
- Continuously evaluate and implement technologies that support modern, scalable, secure, and efficient IT operations for a distributed workforce.

Technical Support
- Provide prompt and efficient technical support to end-users via phone, email, chat, and/or in-person.
- Troubleshoot and resolve a wide range of technical issues, including but not limited to: computer hardware and software issues (Windows and macOS); application issues; network connectivity problems (wired and wireless); application installation and configuration; printer and peripheral device issues; email delivery; mobile device support.

Documentation & Reporting
- Build and maintain internal documentation such as Standard Operating Procedures (SOPs) for tools.
- Maintain accurate records of support requests, troubleshooting steps, and resolutions.
- Create clear knowledge base documentation for common technical issues and solutions.
- Document all support incidents and resolutions in the ticketing system.
- Develop and maintain internal knowledge base articles and end-user guides to reduce support friction and improve self-service.

UEM Management & Assets
- Provision, manage, and support Unified Endpoint Management (UEM) technologies to ensure device compliance, monitor system performance, and maintain an inventory of devices in the organization.
- Develop and maintain automated provisioning workflows for device setup during onboarding, ensuring security baselines, encryption, and application policies are consistently applied at scale.
- Continuously evaluate new UEM features and vendor integrations, recommending enhancements that improve support efficiency, device visibility, and security enforcement across the fleet.
- Coordinate and oversee secure asset recovery processes for employee offboarding, including clear communications, retrieval workflows, shipping logistics, and data wipe validation.

Cybersecurity
- Assist in maintaining a secure computing environment by identifying and mitigating basic security vulnerabilities, supporting the implementation of security protocols, and ensuring that EDR and firewall systems are up to date.
- Collaborate with Security and IT leadership to implement and enforce technical controls (e.g., device hardening, local admin rights, USB/media restrictions).
- Assist in vulnerability management efforts, including patch compliance tracking and coordination with internal stakeholders to remediate findings from vulnerability scans or internal audits.
- Support device and identity hygiene audits for compliance with frameworks such as SOC 2, ISO 27001, and GDPR, assisting with evidence collection and policy implementation as required.

What You’ll Bring
- 5+ years in IT support, systems administration, or IT operations roles
- 3+ years of experience with device management platforms (Kandji, SureMDM, Intune, etc.)
- Proven experience supporting and enabling fully remote teams
- Deep expertise with both Windows and macOS systems
- Expertise with Unified Endpoint Management (UEM) platforms such as Kandji, Intune, and SureMDM
- Strong knowledge of Okta or similar identity provider platforms
- Demonstrated ability to lead IT initiatives and drive process improvements
- Experience supporting email platforms such as Gmail or Outlook 365
- Basic understanding of cybersecurity hygiene and common threats
- Experience with scripting or automation tools to streamline IT tasks (PowerShell, Python, Bash, or automation platforms)
- Excellent written communication skills for documentation and user training
- Strong project management and organizational skills to manage and prioritize multiple tasks in a fast-paced environment

What Sets You Apart?
- A+ Certification (or in progress) is mandatory.
- 3+ years of experience in a high-growth SaaS company or technology startup
- Experience leading IT teams or mentoring junior IT staff
- Proven experience with IT service management principles and advanced ticketing system workflows (Jira Service Desk)
- Expert understanding of Unified Endpoint Management (UEM) technologies, with proven implementations for remote environments
- Strong problem-solving skills, a mindset for continuous improvement, and the ability to operate independently in a fast-paced, security-conscious environment
- Basic knowledge of cybersecurity principles and practices (firewalls, antivirus software, patch management, etc.)
- Excellent problem-solving and troubleshooting skills with the ability to work independently or as part of a team
- Strong communication skills, with the ability to explain technical issues to non-technical users
- Familiarity with email flow, including how email systems work, troubleshooting common email-related issues, and understanding common security protocols (e.g., SPF, DKIM, DMARC)

Compensation and Values
At Horizon3, we believe that our people are our greatest asset, and our compensation philosophy reflects this core value. We are committed to fostering an environment where all employees feel valued, respected, and rewarded for their contributions. Our compensation structure is designed to be fair, competitive, and transparent, ensuring that every team member is recognized and compensated equitably across roles, levels, and locations.

In accordance with various states’ transparency regulations, we provide the following salary range information for this position:
- Base salary range: $150,000 - $210,000 annually. The exact salary will be determined based on the selected candidate’s location, qualifications, experience, and relevant skills.
- Additional compensation: This role may also be eligible for an equity package (in the form of stock options). If any other compensation benefits apply, they will be discussed during the interview process.

Perks of Horizon3.ai
- Inclusive Team: We value diversity and promote an inclusive culture where everyone can thrive.
- Growth Opportunities: Be part of a dynamic and growing team with numerous career development opportunities.
- Innovative Culture: Work in a collaborative environment that encourages creativity and out-of-the-box thinking.
- Competitive Compensation: We offer competitive salary and benefits, which include health, vision & dental care for you and your family, a flexible vacation policy, and generous parental leave.

You Belong Here
Horizon3 is not just an equal opportunity employer - we are a community that values diversity, equity, and inclusion as fundamental principles of our culture and success. We are dedicated to fostering a workplace where everyone feels welcome and respected, regardless of race, color, religion, sex, national origin, age, disability, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, hair length, or any other status protected by law.

Our commitment to diversity and inclusion means we strive to attract, develop, and retain a workforce that reflects the varied communities we serve. We believe that diverse perspectives drive innovation and strengthen our ability to create cutting-edge cybersecurity solutions. At Horizon3, every team member is valued and supported in an environment that encourages personal and professional growth.

We welcome candidates from all backgrounds and experiences, and we encourage all qualified individuals to apply. Come be a part of Horizon3, where your unique contributions are recognized, and your potential is limitless.

Other Duties
Please note this job description is not designed to cover or contain a comprehensive listing of activities, duties, or responsibilities that are required of the employee. Duties, responsibilities, and activities may change at any time with or without notice.

Application Note
In any materials you submit, you may redact or remove age-identifying information such as age, date of birth, or dates of school attendance or graduation. You will not be penalized for redacting or removing this information.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Senior Cloud Solutions Engineer
Lambda AI
USD
249600
-
374400
United States
Full-time
Remote
true
Lambda is the #1 GPU Cloud for ML/AI teams training, fine-tuning and inferencing AI models, where engineers can easily, securely and affordably build, test and deploy AI products at scale. Lambda’s product portfolio includes on-prem GPU systems, hosted GPUs across public & private clouds and managed inference services, serving government, researchers, startups and enterprises worldwide. If you'd like to build the world's best deep learning cloud, join us.

Engineering at Lambda is responsible for building and scaling our cloud offering. Our scope includes the Lambda website, cloud APIs and systems as well as internal tooling for system deployment, management and maintenance.

What You’ll Do

Advocate for Lambda’s Products
- Develop and maintain expertise in Lambda’s cloud products and services
- Demonstrate Lambda’s software and solutions to customers, partners and staff
- Create field enablement materials for technical audiences, lead workshops and support product advocacy efforts
- Provide technical feedback from customers to Lambda’s product and marketing teams

Own the technical side of Lambda’s sales process
- Partner with Lambda account executives to drive customer adoption and ensure successful delivery
- Evaluate and assess customers’ needs to deeply understand pain points, bottlenecks and expected outcomes
- Recommend appropriate cloud services and configurations to design a cohesive solution that supports the customer’s applications and workflows
- Document proposals and designs in formats including but not limited to presentations, white papers, visuals, Bills of Materials and rack elevations

Demonstrate expertise on Lambda’s cloud infrastructure
- Build structured and purposeful learning into your work routine
- Develop and support the internal Lambda community as a subject matter expert
- Be an expert at deploying AI/ML workloads on Lambda cloud
- Stay up to date on the latest deep learning trends and best practices, and experiment with them using internal tools and resources
- Develop high-quality processes and documentation

Reinforce Lambda’s culture
- Contribute positively throughout the organization
- Maintain a high level of agility and responsiveness
- Stay hyper-focused on customer satisfaction

You
- Have 8+ years of experience designing, deploying and scaling cloud infrastructure
- Have 4+ years of experience as a solutions architect or in a consultative capacity supporting cloud infrastructure and services
- Have 3+ years of experience working with cloud-based AI/ML services
- Have deep knowledge of the ML ecosystem, including common models, practical use cases, and supporting tools
- Have experience building with modern infrastructure tools such as Docker, Kubernetes, Ansible and Terraform
- Have deep knowledge of cloud infrastructure, security, networking, and cost optimization techniques
- Have experience coding in Python, C# or a similar programming language
- Have experience developing with NVIDIA GPUs
- Have led complex technical projects with diverse stakeholders
- Have demonstrated impact at an organizational/multi-departmental level
- Have experience leading complex cloud deals, influencing C-level stakeholders, and mentoring junior SEs or architects
- Thrive in dynamic settings and embrace radical ownership of initiatives and outcomes

Nice to Have
- Experience across the AI/ML lifecycle, from use case to algorithm selection, training, tuning, inference, and training pipeline design and build
- Participation in go-to-market (GTM) initiatives or product launches
- Experience working with LLM architectures
- Experience working with RESTful APIs and general service-oriented architecture

Salary Range Information
Based on market data and other factors, the salary range for this position is $249,600 - $374,400 OTE. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda
- Founded in 2012, ~350 employees (2024) and growing fast
- We offer generous cash & equity compensation
- Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
- We are experiencing extremely high demand for our systems, with quarter over quarter, year over year profitability
- Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
- Health, dental, and vision coverage for you and your dependents
- Wellness and Commuter stipends for select roles
- 401k Plan with 2% company match (USA employees)
- Flexible Paid Time Off Plan that we all actually use

A Final Note:
You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
MLOps / DevOps Engineer
Data Science & Analytics
Solutions Architect
Software Engineering
Site Reliability Engineer (Vancouver)
Gauss Labs
-
Canada
Full-time
Remote
true
Gauss Labs is seeking a highly skilled Site Reliability Engineer to join our team in Vancouver. As an SRE at Gauss Labs, you will play a critical role in ensuring our industrial AI platform's reliability, performance, and scalability. You will be responsible for building and maintaining a robust solution that supports our growing business at customer sites. This role requires a high level of technical expertise, a collaborative mindset, and a strong desire to continuously improve systems and processes.
MLOps / DevOps Engineer
Data Science & Analytics
Customer Engineer
MagicSchool AI
USD
80000
-
105000
United States
Full-time
Remote
true
WHO WE ARE: MagicSchool is the premier generative AI platform for teachers. We're just over 2 years old, and more than 5.5 million teachers from all over the world have joined our platform. Join a top team at a fast-growing company that is working towards real social impact. Make an account and try us out at our website and connect with our passionate community on our Wall of Love.

Customer Success Engineer (EdTech)

Role Description: As a Customer Success Engineer, you will bridge technical engineering with client success, ensuring seamless integrations, resolving technical challenges, and delivering outstanding onboarding and post-sales support. You will design and implement scalable solutions, customize generative AI tools to fit customer needs, and serve as a trusted technical advisor to internal teams and external clients. This role requires both technical expertise and strong communication skills to translate complex technical concepts into accessible solutions for educators, administrators, and non-technical stakeholders.

Responsibilities:
- Technical Implementation and Support: Serve as the primary technical point of contact for integrations (e.g., SSO, Clever, API), providing support during onboarding and implementation while reducing reliance on engineering.
- Generative AI Customization: Customize AI tools with prompt engineering and workflows based on customer needs to improve educator outcomes.
- Documentation and Operational Efficiency: Create and maintain clear, user-friendly technical documentation to streamline onboarding, integration, and support.
- Cross-Functional Collaboration: Partner with internal teams (engineering, CX, product, sales) to align on technical solutions and support both pre- and post-sales efforts.
- Proof-of-Concepts and Prototypes: Develop prototypes and integration demos to showcase technical capabilities in customer engagements.
- Customer-Focused Innovation: Leverage customer feedback and usage data to surface insights that inform product and engineering roadmaps.

Qualifications, Competencies, and Skills:
- Strong technical acumen with the ability to debug integration issues (e.g., SSO, APIs) and troubleshoot effectively.
- Proficiency in modern web development technologies, including JavaScript (Node.js, React, TypeScript), PostgreSQL, and SSO protocols (OAuth 2.0, OpenID).
- Familiarity with SQL, Metabase, or similar data tools for reconciliation and insights.
- Excellent communication skills to work with both technical and non-technical stakeholders.
- Strong problem-solving skills with a customer-first approach to technical challenges.

Experience:
Required:
- 2+ years in a technical role such as Customer Success Engineer, Technical Implementation Specialist, or Pre/Post-Sales Engineer.
- Demonstrated expertise in API integrations, SSO systems, and generative AI customization.
- Proven ability to manage and resolve technical integration challenges in client-facing environments.
Preferred:
- 5+ years in a technical role in EdTech such as Customer Success Engineer, Technical Implementation Specialist, or Pre/Post-Sales Engineer.
- Experience with Edlink or other EdTech integration platforms.
- Familiarity with ticketing systems and process improvement methodologies.
- Background in both Customer Success and Engineering functions, showcasing versatility and a collaborative mindset.
- Startup experience and the ability to thrive in a fast-paced, dynamic environment.

Notice: Priority Deadline and Review Start Date
Please note that applications for this position will be accepted until 6/29/25; applications received after this date will be reviewed on an intermittent basis. While we encourage early submissions, all applications received by the priority deadline will receive equal consideration. Thank you for your interest, and we look forward to reviewing your application.

Why Join Us?
- Work on cutting-edge AI technology that directly impacts educators and students.
- Join a mission-driven team passionate about making education more efficient and equitable.
- Flexibility of working from home, while fostering a unique culture built on relationships, trust, communication, and collaboration with our team, no matter where they live.
- Unlimited time off to empower our employees to manage their work-life balance. We work hard for our teachers and users, and encourage our employees to rest and take the time they need.
- Choice of employer-paid health insurance plans so that you can take care of yourself and your family. Dental and vision are also offered at very low premiums.
- Every employee is offered generous stock options, vested over 4 years.
- Plus a 401k match & monthly wellness stipend

Our Values:
- Educators are Magic: Educators are the most important ingredient in the educational process - they are the magic, not the AI. Trust them, empower them, and put them at the center of leading change in service of students and families.
- Joy and Magic: Bring joy and magic into every learning experience - push the boundaries of what’s possible with AI.
- Community: Foster community that supports one another during a time of rapid technological change. Listen to them and serve their needs.
- Innovation: The education system is outdated and in need of innovation and change - AI is an opportunity to bring equity, access, and serve the individual needs of students better than we ever have before.
- Responsibility: Put responsibility and safety at the forefront of the technological change that AI is bringing to education.
- Diversity: Diversity of thought, perspectives, and backgrounds helps us serve the wide audience of educators and students around the world.
- Excellence: Educators and students deserve the best - and we strive for the highest quality in everything we do.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Senior Developer Productivity Engineer
Together AI
USD
150000
-
230000
United States
Full-time
Remote
false
About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

As a Senior Developer Productivity Engineer at Together AI, you’ll own the systems and tooling that empower engineers to ship high-quality software faster. You’ll optimize workflows, enhance testing, enable reliable and reusable CI/CD, and work with developers to build out stable local environments. Your work will directly impact release velocity, developer happiness, and cross-cutting enablement, ensuring engineers spend less time churning on infrastructure and more time building.

Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field or 5+ years of industry experience in DevOps/SRE, developer tooling, or infrastructure engineering.
- Proficiency in Python, Go, and JavaScript/TypeScript (for tooling, automation, and FE enablement).
- Strong experience with CI/CD tools (GitHub Actions, ArgoCD, GitOps methodology) for building scalable, reusable pipelines.
- Experience with local dev environment tooling (e.g., Skaffold) and containerized development workflows.
- Experience creating starter templates in coordination with engineering to enable rapid spin-up of new services.
- Strong ownership and a builder mindset: you love creating tools others rely on.
- Problem-solving rigor and attention to detail (e.g., diagnosing flaky tests, build latency).

Nice to have
- Kubernetes expertise (EKS, K3s) and optimizing containerized builds.
- Infrastructure as Code (Terraform, Ansible, Pulumi).
- Front-end tooling familiarity (e.g., React, Next.js, Jest) to optimize FE dev workflows.
- Monitoring/observability (Prometheus, Grafana, Honeycomb) to debug bottlenecks.

Responsibilities
- Automate CI/CD pipelines for zero-downtime deployments (e.g., Canary, Blue/Green).
- Create smart pipelines encouraging reusable workflows and simplicity.
- Streamline build/test/deploy workflows.
- Build shared tooling (CLIs, codegen, IDE plugins) to accelerate teams.
- Reduce friction (e.g., faster builds, hot-reload, test tooling).
- Collaborate with developers to identify pain points and streamline workflows.
- Champion best practices through documentation.

Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $150,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Engineering Manager, AI & Infrastructure (AdCreative.ai) - Fully Remote, Turkey Based
Appier
-
Turkey
Full-time
Remote
true
About Appier
Appier is a software-as-a-service (SaaS) company that uses artificial intelligence (AI) to power business decision-making. Founded in 2012 with a vision of democratizing AI, Appier’s mission is turning AI into ROI by making software intelligent. Appier now has 17 offices across APAC, Europe and the U.S., and is listed on the Tokyo Stock Exchange (Ticker number: 4180). Visit www.appier.com for more information.

About AdCreative.ai
AdCreative.ai, based in Paris, France, is a leading AI-driven platform revolutionizing how businesses create advertising and marketing assets. By combining proprietary AI models, advanced technology, and a user-centric approach, AdCreative.ai empowers brands to generate high-performing ad creatives that deliver measurable ROI. Founded in 2021 and trusted by global brands like Snapchat, Pernod Ricard, Reckitt, BNP Paribas and Chopard, the platform merges cutting-edge AI technology with advertising expertise to shape the industry's future, making campaigns faster, smarter, and more impactful. As a trusted partner in the AI era, AdCreative.ai offers innovative tools like Creative Insights, AI Video Ad Generation, and Stock Image Creation, enabling businesses to scale with precision and achieve creative excellence in their campaigns.

About the role
We are seeking an experienced and visionary Engineering Manager to lead and grow a high-performing engineering team. This role requires a balance of strong technical expertise, people leadership, and strategic thinking to drive product and infrastructure initiatives forward. The ideal candidate has a deep background in backend or systems engineering, a solid understanding of DevOps practices, and, preferably, experience with MLOps workflows in production-grade machine learning environments.

Responsibilities
- Lead, mentor, and scale a team of engineers working across infrastructure, backend services, DevOps platforms, and MLOps
- Collaborate with product, data, and machine learning teams to align engineering efforts with business goals
- Oversee project planning, prioritization, and delivery with a focus on quality and velocity
- Promote engineering excellence by establishing best practices in code quality, testing, deployment, and observability
- Drive initiatives in system scalability, reliability, and maintainability
- Contribute to infrastructure decisions, ensuring efficient CI/CD pipelines and DevOps tooling
- (Preferred) Support and evolve MLOps pipelines in collaboration with ML engineers and data scientists

About you
[Minimum qualifications]
- 4+ years of hands-on software engineering experience, including at least 2 years in a leadership or management role
- Proven experience leading cross-functional technical teams in fast-paced environments
- Solid understanding of DevOps principles and modern cloud infrastructure (e.g., AWS, GCP, Azure)
- Familiarity with CI/CD tools (e.g., GitHub Actions, ArgoCD, Jenkins), infrastructure as code (e.g., Terraform, Pulumi), and container orchestration (e.g., Kubernetes)
- Strong communication skills and the ability to collaborate effectively with both technical and non-technical stakeholders
[Preferred qualifications]
- Experience with MLOps workflows (model training, deployment, monitoring, data pipelines)
- Background in machine learning infrastructure or working with ML engineering teams
- Exposure to data platforms, versioning tools (e.g., DVC), or ML orchestration frameworks (e.g., Kubeflow, MLflow)
- Ability to guide architectural decisions involving machine learning systems in production
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Staff Software Engineer, Core Infrastructure
Harvey
USD
250000
-
290000
United States
Full-time
Remote
false
Why Harvey
Harvey is a secure AI platform for legal and professional services that augments productivity and automates complex workflows. Harvey uses algorithms with reasoning-adept LLMs that have been customized and developed by our expert team of lawyers, engineers and research scientists. We’ve found product market fit and are scaling our team very quickly. Some reasons to join Harvey are:
Exceptional product market fit: We have partnered with the largest law firms and professional service providers in the world, including Paul Weiss, A&O Shearman, Ashurst, O'Melveny & Myers, PwC, KKR, and many others.
Strategic investors: Raised over $500 million from strategic investors including Sequoia, Google Ventures, Kleiner Perkins, and OpenAI.
World-class team: Harvey is hiring the best talent from DeepMind, Google Brain, Stripe, FAIR, Tesla Autopilot, Glean, Superhuman, Figma, and more.
Partnerships: Our engineers and researchers work directly with OpenAI to build the future of generative AI and redefine professional services.
Performance: 4x ARR in 2024.
Competitive compensation.

Role Overview
As a Staff Software Engineer on the Core Infrastructure team at Harvey, you'll play a critical role in designing and building new infrastructure systems while equally scaling and strengthening the existing infrastructure systems. Your contributions will ensure the reliability, scalability, and security of our cutting-edge legal AI platform. You'll work in an environment balanced between innovation (building new systems) and operational excellence, ensuring that Harvey remains resilient and efficient as it scales products, regions, customers, and usage. This role is based in San Francisco, CA. We use an in-person work model and offer relocation assistance to new employees.

What You’ll Do
Design, develop, and deploy new infrastructure services and automation tools to support platform growth and new product initiatives.
Manage and optimize existing infrastructure components (compute, storage, networking) across 50+ global regions.
Lead incident management processes, including postmortems, root cause analyses, and driving actionable improvements.
Implement short-term and long-term infrastructure scaling strategies to improve reliability, scalability, and performance.
Collaborate across teams to drive reliability, security, and compliance throughout the software lifecycle.
Provide technical mentorship and leadership, promoting best practices and fostering team growth.

What You Have
10+ years of experience in Infrastructure Engineering, Site Reliability Engineering, or similar roles in a production environment.
Long track record building and scaling complex, large-scale distributed systems.
Strong fluency in Infrastructure as Code (IaC) tools (Pulumi, Terraform, CloudFormation, etc.).
Deep proficiency with cloud infrastructure platforms (Azure, GCP, AWS, etc.).
Strong programming skills (Python, Bash, Go, or similar languages).
Experience with observability tools (Datadog, Sentry, etc.) and incident response practices (PagerDuty, IncidentIO, etc.).
Solid understanding of Kubernetes, networking, databases, OS, cloud storage, cloud security, CI/CD, etc.
Excellent problem-solving skills, a "spidey sense" of where things could go wrong, and a commitment to operational excellence.

Compensation Range
$250,000 - $290,000 USD

Please find our CA applicant privacy notice here.

Harvey is an equal opportunity employer and does not discriminate on the basis of race, gender, sexual orientation, gender identity/expression, national origin, disability, age, genetic information, veteran status, marital status, pregnancy or related condition, or any other basis protected by law.

We are in the early innings of a generational company. Joining early at a hypergrowth startup has proven to lead to exponential growth in responsibility, access, and ability. Apply here today!
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
ML Infrastructure Engineering Manager, Safeguards
Anthropic
USD 340,000 - 425,000
United States
Full-time
Remote: No
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the role
Anthropic is seeking an ML Infrastructure Engineering Manager to lead a critical team within our Safeguards organization. You'll manage and grow a team of infrastructure engineers who build and scale the foundational systems that power our AI safety and trust mechanisms. This role combines deep technical leadership in ML infrastructure with people management, driving both the strategic vision and day-to-day execution of systems that ensure our AI models operate safely and reliably at scale. Your team will be responsible for the infrastructure backbone that enables real-time safety evaluations and systems to make Claude safe. You'll work closely with research teams to translate cutting-edge safety research into production-ready systems, while partnering with Safeguards, Security, and Alignment teams to ensure our infrastructure meets the demanding requirements of safety-critical applications.

Responsibilities:
Set team vision and roadmap for ML infrastructure that powers Anthropic's safety and trust systems, ensuring scalability, reliability, and performance at production scale
Lead a team of ML infrastructure and software engineers to build robust platforms supporting real-time safety evaluations, feature stores, model serving, and data pipelines
Partner with Safeguards, Security, Research, and Product teams to identify infrastructure requirements and translate complex safety research into scalable production systems
Drive technical strategy for ML infrastructure architecture, making key decisions about technology choices, system design, and platform evolution
Maintain deep technical expertise in ML infrastructure, distributed systems, and safety-critical applications to provide technical leadership and guidance
Hire, support, and develop team members through continuous feedback, career coaching, and people management practices
Collaborate across teams to ensure infrastructure supports rapid experimentation while maintaining production reliability and safety standards
Champion engineering best practices including automated testing, deployment pipelines, monitoring, and incident response for safety-critical systems

You may be a good fit if you:
Have 4+ years of management experience leading technical teams focused on ML infrastructure, platform engineering, or distributed systems
Have 8+ years of hands-on experience building production ML infrastructure, ideally in safety-critical domains like fraud detection, content moderation, or risk assessment
Have a demonstrated ability to lead and manage high-performing technical teams through periods of rapid growth and scaling challenges
Possess deep technical knowledge of ML serving platforms, feature stores, data pipelines, and distributed systems architecture
Show excellent communication skills in translating complex technical concepts for various audiences, from individual contributors to executive leadership
Have strong project management skills with the ability to balance multiple priorities and coordinate across cross-functional teams
Have experience managing teams that bridge research and production, with a track record of productionizing experimental systems

Strong candidates may also:
Possess knowledge of modern ML frameworks, cloud platforms, and container orchestration in production environments
Excel at building strong relationships with research teams and translating their needs into infrastructure requirements
Have experience implementing automated testing, deployment, and monitoring systems for ML models in production
Demonstrate passion for ensuring the responsible development and deployment of AI systems
Have managed teams working on real-time, high-throughput systems with strict latency and reliability requirements
Have experience with compliance and security requirements for safety-critical applications

At Anthropic, we value diversity and are committed to creating an inclusive environment for all employees. We encourage applications from candidates of all backgrounds.

Deadline to apply: None. Applications will be reviewed on a rolling basis.

The expected salary range for this position is:
Annual Salary: $340,000 - $425,000 USD

Logistics
Education requirements: We require at least a Bachelor's degree in a related field or equivalent experience.
Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

How we're different
We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact (advancing our long-term goals of steerable, trustworthy AI) rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We're an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

Come work with us!
Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.
MLOps / DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Senior Systems Engineer (f/m/d)
AlephAlpha
Germany
Full-time
Remote: No
Overview:
We are seeking an experienced Senior Systems Engineer to join our growing infrastructure team. As we advance our AI stack and scale our infrastructure, you will play a pivotal role in designing, maintaining, and optimising our systems. Your expertise will help ensure high availability, security, and performance while enabling seamless deployment for our customers and internal teams. As a key technical team member, you will drive improvements across our infrastructure. Your contributions will be instrumental in shaping the future of our AI-powered solutions.

Your Responsibilities:
Be part of the design, development, and optimisation of the Pharia AI stack and the supporting infrastructure.
Help define best practices and guide teams in writing Helm charts and deploying their artefacts efficiently.
Maintain highly available Kubernetes (K8s) clusters on StackIT or similar cloud platforms.
You know how to design, build and maintain Kubernetes Operators.
Provide strategic guidance and hands-on assistance to customers for deploying and maintaining our products on their infrastructure.
Ensure compliance with security and reliability best practices; represent the team in audits and respond to security questionnaires.
Drive automation efforts and improve CI/CD pipelines to enhance deployment efficiency and system resilience.
Collaborate with cross-functional teams to align infrastructure with business and product goals.

Your Profile:
Extensive experience in designing, deploying, and maintaining Kubernetes clusters in production environments.
Automation & CI/CD Expertise: Proficiency in tools such as Helm, Ansible, Terraform, ArgoCD, GitLab CI, and JFrog.
Experience with Kubernetes Operators design and implementation.
Strong programming skills in at least one language from our stack: Rust or Go.
Deep understanding of security, reliability, and scalability best practices for infrastructure.
Excellent communication and collaboration skills, with a track record of contributing to a culture of learning and innovation.
Experience working in fast-paced startup environments is a plus.

What You Can Expect From Us:
Become part of an AI revolution!
30 days of paid vacation
Access to a variety of fitness & wellness offerings via Wellhub
Mental health support through nilo.health
Substantially subsidised company pension plan for your future security
Subsidised Germany-wide transportation ticket
Budget for additional technical equipment
Flexible working hours for better work-life balance and hybrid working model
Virtual Stock Option Plan
MLOps / DevOps Engineer
Data Science & Analytics