Top DevOps Engineer Job Openings in 2025

Looking for DevOps Engineer opportunities? This curated list features the latest DevOps Engineer job openings from AI-native companies. Whether you're an experienced professional or just entering the field, find roles that match your expertise, from startups to global tech leaders. Updated every day.


Senior Network Engineer (f/m/d)

Aleph Alpha
Germany
Full-time
Remote: No
At Aleph Alpha, we are shaping the future of AI with European values at the core. The heart of our product is developing cutting-edge generative AI solutions with a strong emphasis on sovereignty, ethical development, and societal benefit. Our generative AI offering empowers businesses, governments, and individuals to achieve their full potential.

Building world-class AI solutions requires a robust and secure technological backbone. Our Infrastructure Team ensures the scalability, reliability, and security of Aleph Alpha’s sovereign AI, enabling us to deliver cutting-edge innovation with confidence.

As a Senior Network Engineer at Aleph Alpha, you will play a critical role in managing and evolving the network infrastructure that powers our AI research and model training operations. You will be responsible for ensuring the stability, security, and scalability of our enterprise network across multiple datacenter locations.

Your responsibilities:
Design, implement, and maintain enterprise network infrastructure including routing, switching, and VLAN configurations
Manage and optimize firewall and security configurations to ensure compliance with high-security standards
Ensure full operational functionality of GPU clusters supporting large-scale AI/ML workloads
Build redundancy and establish standards to guarantee uninterrupted operations
Plan and prepare network infrastructure for datacenter expansion, locally and internationally
Collaborate with cross-functional teams including HPCM, DevOps, and Security
Document network configurations, processes, and best practices

Your profile:
Several years of experience in enterprise network management
Strong knowledge of routing, switching, VLANs, and firewall administration
Experience in high-security network environments
Familiarity with Fortinet/FortiGate firewalls (preferred)
Experience with Aruba and/or HPE networking equipment is a plus
Ability to work independently and adapt quickly to new technologies
Strong problem-solving skills and a proactive mindset
Fluent in English; German is a plus

What you can expect from us:
Become part of an AI revolution!
30 days of paid vacation
Access to a variety of fitness & wellness offerings via Wellhub
Mental health support through nilo.health
JobRad® Bike Lease
Substantially subsidized company pension plan for your future security
Subsidized Germany-wide transportation ticket
Budget for additional technical equipment
Flexible working hours for better work-life balance and hybrid working model
Virtual Stock Option Plan
DevOps Engineer
Data Science & Analytics

DevOps Engineer

Obviant
United States
Full-time
Remote: No
DevSecOps / Platform Engineer
Arlington, VA - Full Time

The defense market is surging, but the data that drives it hasn’t kept up. Companies, government, and investors are forced to perform heavily manual processes and piece together hundreds of disparate sources to make decisions. Obviant is building a data source of truth and AI tools for defense acquisition to solve this. We fuse information from thousands of sources – structured + unstructured – to provide a cohesive picture of budget, programs, the organizations running them, and much more. Whether it’s a company navigating GTM or a program manager developing capabilities, we’re providing all sides with the intelligence they need to execute effectively. We’re growing fast and backed by top funds and DoD/national security veterans.

We believe that public sector mission sets matter above anything else. If you feel the same way, we’d love for you to join us.

The Role
As a DevSecOps / Platform Engineer at Obviant, you will build and operate the foundational systems that power our data ingestion pipelines, application infrastructure, and secure deployment environments. You'll work across cloud infrastructure, CI/CD, container orchestration, and security automation to ensure our platform is reliable, scalable, and compliant with the needs of defense users.

This is a hands-on role with deep ownership. You’ll collaborate closely with engineering, data, and product teams to develop the infrastructure that supports fast iteration, an expanding feature surface, and mission-critical workflows for government stakeholders.

We move fast, simplify complexity, and build systems that scale.

Responsibilities
Design, implement, and operate secure, cloud-native infrastructure that powers Obviant’s core platform
Build and maintain CI/CD pipelines that enable high-velocity, high-reliability shipping across teams
Work with containerized workloads (Docker, Kubernetes) to automate deployments and manage environments
Develop Infrastructure-as-Code frameworks to standardize and scale system provisioning
Implement and uphold DevSecOps best practices: hardening images, managing vulnerabilities, and automating security controls
Collaborate with full-stack engineers and data teams to support ingestion pipelines, new product workflows, and user-facing features
Troubleshoot infrastructure issues in real time, identify root causes, and drive long-term improvements to reliability and performance
Participate in research and development to improve automation, observability, build tooling, and operational efficiency
Contribute directly to how government technology is built and delivered by shaping infrastructure strategy end-to-end

What You Bring
3+ years of experience operating cloud infrastructure (AWS preferred; GCP/Azure welcome)
Strong knowledge of containerization and orchestration (Kubernetes, EKS, Helm, etc.)
Experience implementing Infrastructure as Code (Terraform, Pulumi, CloudFormation, etc.)
Hands-on experience building and maintaining CI/CD pipelines
Understanding of container/image hardening and vulnerability management
Proficiency in at least one scripting language (TypeScript, Go, etc.)
Experience debugging distributed systems and automating workflows
Strong communication skills: clear, concise, and collaborative
Ability to thrive in ambiguity, move quickly, and drive outcomes with high ownership
Passion for national security and mission-driven work
Comfortable with the pace and expectations of a fast-growing startup

Bonus:
Experience supporting government or regulated environments
Exposure to IL4/IL5, compliance frameworks, or security accreditation processes
Kubernetes certifications (CKA, CKAD)
AWS Solutions Architect certification
Experience with multi-cloud networking and container hardening
Active security clearance or ability to obtain one

Our Working Style: Why You Might Love It Here
You care about government & are mission-oriented - Our work is important, and is critical to improving a system that impacts us all.
Perseverance and endurance - Hard problems are worth solving, and solving them can take a long time. There is no such thing as exhausting all options; it’s just time to look for new ones.
Empowerment > micro-management – We’re building a culture of high-performers. Our job is to equip them with what they need and eliminate roadblocks for them to succeed. We trust their judgment, skills, and experience from there.
We’re collaborative and communicate well - Constructive dialogue that takes all viewpoints into account is the only way we get to the right decision. Respect, trust, and complete transparency with each other are critical - keep it all in the open.
You’re really good at what you do… but it speaks for itself – High output, no ego. Being humble is extremely important to us.
You don’t mind change and are comfortable with uncertainty - We’re deliberate about setting goals, but we’re comfortable changing course and dealing with discomfort to get there. We’re still figuring things out, and that demands being flexible and iterative.
Work doesn’t feel like “work” to you – We’re passionate about what we’re going after, and we devote more time to it than a typical 9-5. That often means putting in extra time at night and occasionally on weekends. However, maintaining your own personal balance comes above all else, and you should establish that however you need to - flexible schedule, taking advantage of time off, or anything else you need.
You like to move fast and have a bias towards action - Our roadmap is directional at this stage - speed and a feeling of urgency are key to prove it out. We expect each other to proactively determine what needs to get done and go for it.
Integrity is never negotiable – Transparency, honesty, and respect come above all else.

Benefits & Structure
We’re a tight-knit team headquartered in Arlington, VA. We work in the office together most days, and believe being in the same place is a competitive advantage.
Flexible schedule - We all have other things going on in our lives. Doctor visits, kids’ activities, dog walks - take care of it whenever you have to. And work from home when you need to.
Competitive compensation + Sizeable equity - We’re building something with massive upside potential, and you’ll have ownership in that. This is ours.
Flexible vacation time - Use what you want, as long as you’re taking care of what needs to get done.
Full health, dental, and vision insurance.
And more…
DevOps Engineer
Data Science & Analytics

Staff Software Engineer, Infrastructure

Decagon
Salary: USD 300,000 to 375,000
United States
Full-time
Remote: No
About Decagon
Decagon is the leading conversational AI platform empowering every brand to deliver concierge customer experience. Our AI agents provide intelligent, human-like responses across chat, email, and voice, resolving millions of customer inquiries across every language and at any time.

Since coming out of stealth, Decagon has experienced rapid growth. We partner with industry leaders like Hertz, Eventbrite, Duolingo, Oura, Bilt, Curology, and Samsara to redefine customer experience at scale. We've raised over $200M from Bain Capital Ventures, Accel, a16z, BOND Capital, A*, Elad Gil, and notable angels such as the founders of Box, Airtable, Rippling, Okta, Lattice, and Klaviyo.

We’re an in-office company, driven by a shared commitment to excellence and velocity. Our values (customers are everything, relentless momentum, winner’s mindset, and stronger together) shape how we work and grow as a team.

About the Team
The Infrastructure team builds and operates the foundations that power Decagon: networking, data, ML serving, developer platform, and real-time voice. We partner closely with product, data, and ML to deliver high-scale, low-latency systems with clear SLOs and great developer ergonomics.

We organize around five focus areas:
Core Infra: The foundational cloud stack (networking, compute, storage, security, and infrastructure-as-code) to ensure reliability, scale, and cost efficiency.
Data Infra: Streaming/batch data platforms powering analytics/BI and customer-facing telemetry, including for customer-managed and on-prem environments.
ML Infra: GPU and model-serving platforms for LLM inference with multi-provider routing and support for on-prem/air-gapped deployments.
Platform (DevEx): CI/CD, paved paths, and core services that make shipping fast, safe, and consistent across teams.
Voice Infra: Telephony/WebRTC stack and observability enabling ultra-low-latency, high-quality voice experiences.

Our mission is to deliver magical support experiences: AI agents working alongside humans to resolve issues quickly and accurately.

About the Role
We’re hiring a Senior Infrastructure Engineer to design, build, and operate production infrastructure for high-scale, low-latency systems. You’ll own critical services end-to-end, improve reliability and performance, and create paved paths that let every Decagon engineer ship confidently.

In this role, you will
Design and implement critical infrastructure services with strong SLOs, clear runbooks, and actionable telemetry.
Partner with research and product teams to architect solutions, set up prototypes, evaluate performance, and scale new features.
Tune service latencies: optimize networking paths, apply smart caching/queuing, and tune CPU/memory/I/O for tight p95/p99s.
Evolve CI/CD, golden paths, and self-service tooling to improve developer velocity and safety.
Support various deployment architectures for customers with robust observability and upgrade paths.
Lead infrastructure-as-code (Terraform) and GitOps practices; reduce drift with reusable modules and policy-as-code.
Participate in on-call and drive down toil through automation and elimination of recurring issues.

Your background looks something like this
8+ years building and operating production infrastructure at scale.
Depth in at least one area across Core/Data/AI-ML/Platform/Voice, with curiosity to learn the rest.
Proven track record meeting high availability and low latency targets (owning SLOs, p95/p99, and load testing).
Excellent observability chops (OpenTelemetry, Prometheus/Grafana, Datadog) and incident response (PagerDuty, SLO/error budgets).
Clear written communication and the ability to turn ambiguous requirements into simple, reliable designs.

Even better
Experience being an early backend/platform/infrastructure engineer at another company
Strong Kubernetes experience (GKE/EKS/AKS) and experience across multiple cloud providers (GCP, AWS, and Azure)
Experience with customer-managed deployments

Benefits
Medical, dental, and vision
Flexible time off
Daily lunch/dinner & snacks in the office
DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering

Staff Infrastructure Security Engineer

Crusoe
United States
Full-time
Remote: No
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI, without sacrificing scale, speed, or sustainability.

Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.

We are seeking a highly skilled Staff Infrastructure Security Engineer to architect, deploy, and operationalize the foundational security services that will underpin our shift to a Zero Trust model.

In this strategic role, you will define and establish the "roots of trust" for our organization, serving as a technical leader in Secrets Management and Identity architecture. While your immediate focus is to serve as the Subject Matter Expert (SME) driving our enterprise HashiCorp Vault platform from Proof-of-Concept (PoC) to global production readiness, your long-term scope is far broader. You will be responsible for evolving our credentials management strategy, onboarding engineering teams to secure self-service workflows, and designing scalable trust patterns across our hybrid multi-cloud environment.

Key Responsibilities

1. Strategic Architecture & Governance
Zero Trust Architecture: Architect a highly available, disaster-resilient, and scalable multi-cluster secrets management platform that serves as the foundation for the organization’s Zero Trust strategy.
Technical Leadership: Drive consensus across Cloud Engineering, DevOps, and SRE teams to define standardized secret management workflows and integrate security patterns into the SDLC.
Compliance & Governance: Ensure the platform design meets rigorous internal policies and external compliance frameworks (e.g., SOX, ISO 27001).
Policy as Code: Design and implement advanced governance controls, including Sentinel Policy as Code, to automate security guardrails and access decisions.

2. Platform Engineering & Implementation
Infrastructure as Code (IaC): Lead the engineering of the Vault infrastructure using Terraform, ensuring all deployments are reproducible, version-controlled, and automated.
Identity Integration: Architect the integration between the secrets platform, Identity Providers (Okta), and workload identities (Kubernetes Service Accounts) to establish robust machine-to-machine authentication.
Advanced Secrets Capabilities: Configure and tune essential secrets engines (KV, Transit, KMIP) and Enterprise features (Performance Replication, Seal automation) to support diverse engineering use cases.

3. Operational Excellence & Developer Enablement
Vault as a Service (VaaS): Operationalize the platform by building self-service mechanisms, distinct "paved road" onboarding procedures, and documentation that allows engineering teams to easily consume security services.
Observability: Implement comprehensive monitoring, alerting, and audit logging to ensure platform health, provide visibility into usage patterns, and satisfy audit requirements.
Lifecycle Management: Own the full operational lifecycle of the production environment, including patching, version upgrades, backup/restore procedures, and incident response runbooks.

Required Qualifications
6+ years (or equivalent) hands-on experience in cloud security, DevOps, or infrastructure engineering.
Deep expertise and proven track record deploying and managing HashiCorp Vault in an enterprise environment (experience with the Enterprise edition is highly preferred).
Expert-level knowledge of Secrets Management, X.509 PKI (Public Key Infrastructure), Certificate Authority Operations, and Cryptography concepts.
Strong experience with Google Cloud Platform (GCP) and cloud native identity and access management (IAM).
Proficiency with Infrastructure as Code (IaC) tools, especially Terraform, for automating the deployment and configuration of Vault and its dependent infrastructure.

Technical Skills
Fluent in at least one programming language (ideally Go or Python).
Demonstrable experience with Kubernetes and container security principles, especially integrating secrets into microservices architectures.
Strong understanding of network security concepts (IP addressing, IP routing, firewalls, segmentation, Zero Trust).

Benefits:
Industry competitive pay
Restricted Stock Units in a fast growing, well-funded technology company
Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
Employer contributions to HSA accounts
Paid Parental Leave
Paid life insurance, short-term and long-term disability
Teladoc
401(k) with a 100% match up to 4% of salary
Generous paid time off and holiday schedule
Cell phone reimbursement
Tuition reimbursement
Subscription to the Calm app
MetLife Legal
Company paid commuter benefit; $300 per month

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Security Engineer
Software Engineering

Forward Deployed Engineer, Infrastructure Specialist (Public Sector)

Cohere
Canada
Full-time
Remote: Yes
Who are we?
Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

About North:
North is Cohere's cutting-edge AI workspace platform, designed to revolutionize the way enterprises utilize AI. It offers a secure and customizable environment, allowing companies to deploy AI while maintaining control over sensitive data. North integrates seamlessly with existing workflows, providing a trusted platform that connects AI agents with workplace tools and applications.

Why This Role?
This role offers a unique opportunity to shape how enterprises harness the power of AI in real-world applications. As a bridge between our core North product and our clients’ engineering teams, you’ll be at the forefront of solving complex problems and securely integrating AI into critical sectors such as finance, healthcare, and telecommunications. Our esteemed clients include industry leaders like RBC, Dell, and LG CNS.

We are seeking engineers who deeply care about customers and want to work at the cutting edge of Agentic AI.

In this role, you will:
Lead end-to-end deployment of North in private cloud and on-premises environments, including planning, configuration, testing, and rollout.
Partner with enterprise IT teams to assess infrastructure, security requirements, and data management practices.
Experiment at a high velocity and with a high level of quality to engage our customers and ultimately deliver solutions that exceed their expectations
Design and implement deployment strategies tailored to client needs, ensuring compliance with data privacy and security standards.
Troubleshoot and resolve deployment-related technical issues, providing timely solutions to minimize downtime.

You may be a good fit if:
Canadian citizenship and security clearance required.
You have experience with and enjoy working directly with customers
You have experience deploying enterprise software in private/hybrid cloud environments
You have proven experience administering production Kubernetes clusters and expertise with Helm
Familiarity with DevOps practices, CI/CD pipelines, and tools like Git for version control
You have strong expertise in cloud infrastructure (Azure, AWS, GCP), networking, and virtualization
You excel in fast-paced environments and can execute while priorities and objectives are a moving target

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply!

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Full-Time Employees at Cohere enjoy these Perks:
🤝 An open and inclusive culture and work environment
🧑‍💻 Work closely with a team on the cutting edge of AI research
🍽 Weekly lunch stipend, in-office lunches & snacks
🦷 Full health and dental benefits, including a separate budget to take care of your mental health
🐣 100% Parental Leave top-up for up to 6 months
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
✈️ 6 weeks of vacation (30 working days!)
DevOps Engineer
Data Science & Analytics
Solutions Architect
Software Engineering

Enterprise Security Engineer

PhysicsX
Salary: USD 85,000 to 125,000
United States
Full-time
Remote: No
About us
PhysicsX is a deep-tech company with roots in numerical physics and Formula One, dedicated to accelerating hardware innovation at the speed of software. We are building an AI-driven simulation software stack for engineering and manufacturing across advanced industries. By enabling high-fidelity, multi-physics simulation through AI inference across the entire engineering lifecycle, PhysicsX unlocks new levels of optimization and automation in design, manufacturing, and operations, empowering engineers to push the boundaries of possibility. Our customers include leading innovators in Aerospace & Defense, Materials, Energy, Semiconductors, and Automotive.

The Role
As a Compliance Engineer, you will be building and operationalizing our compliance program and overseeing the day-to-day implementation of controls, helping us pass audits, and scaling our governance processes in a fast-paced SaaS environment. You’ll own the systems, tools, and automation workflows that allow us to meet and maintain standards like SOC 2 and ISO 27001, without slowing down the business.

Key skills:
Experience with implementing one or more security automation platforms (e.g. Thoropass, Vanta, Drata, Secureframe)
Experience with automating SOC 2 compliance
Experience with interacting with corporate customers in a business-to-business setting
Excellent communication and collaboration skills.
Experience interacting with auditors

What you will do
Design and manage GRC tools, evidence collection workflows, and vendor risk processes
Support responses to customer security assessments and RFPs
Collaborate with cross-functional teams to align security with product, legal, and customer trust requirements
Track, measure, and report on control effectiveness and risk posture
Lead and manage audits, internal readiness assessments, and third-party risk processes
Automate and operationalize the compliance roadmap (e.g., SOC 2, ISO 27001)
For new compliance standards identify gaps and help drive

What you bring to the table
8+ years in compliance roles
A systems-thinking mindset and a drive to eliminate manual, repetitive compliance tasks.
Experience building compliance programs that scale with speed and minimal overhead.
Proven experience implementing or supporting compliance frameworks such as SOC 2 or ISO 27001
Hands-on experience with GRC platforms and automating compliance workflows.
Excellent communication and documentation skills.

Nice to Have Skills
Experience deploying and scaling GRC tooling in early-stage environments
Familiarity with customer trust programs and security questionnaire automation
Experience with AI compliance and governance

Salary range estimated at $85,000 to $125,000

We value diversity and are committed to equal employment opportunity regardless of sex, race, religion, ethnicity, nationality, disability, age, sexual orientation or gender identity. We strongly encourage individuals from groups traditionally underrepresented in tech to apply. To help make a change, we sponsor bright women from disadvantaged backgrounds through their university degrees in science and mathematics.

We collect diversity and inclusion data solely for the purpose of monitoring the effectiveness of our equal opportunities policies and ensuring compliance with UK employment and equality legislation. This information is confidential, used only in aggregate form, and will not influence the outcome of your application.
DevOps Engineer
Data Science & Analytics

Freelance AI Red Team Engineer

Mindrift
Salary: up to USD 44 per hour
Singapore
Part-time
Remote: Yes
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective intelligence to ethically shape the future of AI.

What we do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills. If you join the platform as an AI Tutor in Coding, you’ll have the opportunity to collaborate on these projects.

Although every project is unique, you might typically:
Evaluate and red team AI models, agents, and machine learning systems for vulnerabilities and safety risks.
Create offline reproducible & auto-evaluable test cases to test safety & capability of AI agents.
Develop and implement automation scripts, custom tools, environments and test harnesses.
Lead or contribute to security research initiatives, especially in AI safety, creating and implementing realistic and challenging attack scenarios for the model.
Advise on cybersecurity best practices and policy implications.

How to get started
Simply apply to this post, qualify, and get the chance to contribute to projects aligned with your skills, on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements
You hold a Bachelor's or Master’s Degree in Computer Science, Software Engineering, Cybersecurity, Digital Forensics or other related fields.
Your level of English is advanced (C1) or above.
Proficient in scripting and automation using Python, Bash, or PowerShell.
Experienced with containerization and CI/CD security tools, especially Docker.
Hands-on experience with penetration testing across web, API, network, and infrastructure environments.
Knowledge of vulnerabilities in current AI models, including prompt injections, with knowledge of OWASP Top 10 for Large Language Models (LLMs).
Familiar with AI red-teaming frameworks such as garak or PyRIT.
Experience in AI/ML security, evaluation, and red teaming, particularly with LLMs, AI agents, and RAG pipelines.
Proficient in offensive exploitation and exploit development.
Skilled in reverse engineering using tools like Ghidra or equivalents.
Expertise in network and application security, including web application security.
Knowledge of operating system security concepts such as Linux privilege escalation and Windows internals.
Familiar with secure coding practices for full-stack development.
You are ready to learn new methods, able to switch between tasks and topics quickly and sometimes work with challenging, complex guidelines.

Our freelance role is fully remote, so you just need a laptop, an internet connection, available time, and enthusiasm to take on a challenge.

Benefits
Why might this freelance opportunity be a great fit for you?
Get paid for your expertise, with rates that can go up to $44/hour depending on your skills, experience, and project needs.
Take part in a part-time, remote, freelance project that fits around your primary professional or academic commitments.
Work on advanced AI projects and gain valuable experience that enhances your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
DevOps Engineer
Data Science & Analytics
Machine Learning Engineer
Data Science & Analytics

Freelance AI Red Team Engineer

Mindrift
Salary: up to USD 65 per hour
United States
Part-time
Remote: Yes
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective intelligence to ethically shape the future of AI.

What we do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role
GenAI models are improving very quickly, and one of our goals is to make them capable of addressing specialized questions and achieving complex reasoning skills. If you join the platform as an AI Tutor in Coding, you’ll have the opportunity to collaborate on these projects.

Although every project is unique, you might typically:
Evaluate and red team AI models, agents, and machine learning systems for vulnerabilities and safety risks.
Create offline reproducible & auto-evaluable test cases to test safety & capability of AI agents.
Develop and implement automation scripts, custom tools, environments and test harnesses.
Lead or contribute to security research initiatives, especially in AI safety, creating and implementing realistic and challenging attack scenarios for the model.
Advise on cybersecurity best practices and policy implications.

How to get started
Simply apply to this post, qualify, and get the chance to contribute to projects aligned with your skills, on your own schedule. From creating training prompts to refining model responses, you’ll help shape the future of AI while ensuring technology benefits everyone.

Requirements
You hold a Bachelor's or Master’s Degree in Computer Science, Software Engineering, Cybersecurity, Digital Forensics or other related fields.
Your level of English is advanced (C1) or above.
Proficient in scripting and automation using Python, Bash, or PowerShell.
Experienced with containerization and CI/CD security tools, especially Docker.
Hands-on experience with penetration testing across web, API, network, and infrastructure environments.
Knowledge of vulnerabilities in current AI models, including prompt injections, with knowledge of OWASP Top 10 for Large Language Models (LLMs).
Familiar with AI red-teaming frameworks such as garak or PyRIT.
Experience in AI/ML security, evaluation, and red teaming, particularly with LLMs, AI agents, and RAG pipelines.
Proficient in offensive exploitation and exploit development.
Skilled in reverse engineering using tools like Ghidra or equivalents.
Expertise in network and application security, including web application security.
Knowledge of operating system security concepts such as Linux privilege escalation and Windows internals.
Familiar with secure coding practices for full-stack development.
You are ready to learn new methods, able to switch between tasks and topics quickly and sometimes work with challenging, complex guidelines.

Our freelance role is fully remote, so you just need a laptop, an internet connection, available time, and enthusiasm to take on a challenge.

Benefits
Why might this freelance opportunity be a great fit for you?
Get paid for your expertise, with rates that can go up to $65/hour depending on your skills, experience, and project needs.
Take part in a part-time, remote, freelance project that fits around your primary professional or academic commitments.
Work on advanced AI projects and gain valuable experience that enhances your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
DevOps Engineer
Data Science & Analytics
Machine Learning Engineer
Data Science & Analytics
Apply
Anduril Industries.jpg

Senior Machine Learning/MLOps Engineer

Anduril
USD
0
146000
-
194000
US.svg
United States
Full-time
Remote
false
Anduril Industries is a defense technology company with a mission to transform U.S. and allied military capabilities with advanced technology. By bringing the expertise, technology, and business model of the 21st century’s most innovative companies to the defense industry, Anduril is changing how military systems are designed, built and sold. Anduril’s family of systems is powered by Lattice OS, an AI-powered operating system that turns thousands of data streams into a real-time, 3D command and control center. As the world enters an era of strategic competition, Anduril is committed to bringing cutting-edge autonomy, AI, computer vision, sensor fusion, and networking technology to the military in months, not years.

ABOUT THE TEAM
The Corp Tech Acquisition team scopes and manages the implementation of Anduril’s acquired companies. We help enable the new acquisitions to build, ship, and deploy products at scale with Anduril’s systems and processes. As we continue to acquire companies and expand our capabilities, we are seeking a highly skilled Acquisition Program Manager specializing in Mergers & Acquisitions (M&A). This role will lead and coordinate the acquisition process, work with leadership and cross-functional teams to ensure a smooth integration, and manage all aspects of program planning and execution.

ABOUT THE JOB
- Oversee the acquisition program lifecycle from due diligence, integration, and adoption to completion across multiple acquisitions
- Work closely with cross-functional stakeholders (IT, Legal, HR, Supply Chain, Manufacturing, Mission Operations, Finance, Product, Deployments) to root cause problems and scope key requirements, milestones, and dependencies for acquisition implementation success
- Own building the program management foundation for the acquisition team
- Own defining, managing, and improving program management processes for all acquisition implementations
- Help implement risk management strategies, identifying potential issues and developing contingency plans
- Manage the program timeline across all related acquisitions, ensuring milestones are met and programs stay on track
- Define program scope, goals, and deliverables in collaboration with stakeholders and senior management
- Facilitate communication and collaboration across cross-functional teams and departments
- Provide regular updates and/or risks to the appropriate management channels and escalate issues, as necessary, according to each acquisition's integration plan
- Analyze each program status and, when necessary, revise the scope, schedule, or resources to ensure that program requirements can be met
- Establish and maintain relationships with relevant stakeholders, providing day-to-day contact on program status and changes

REQUIRED QUALIFICATIONS
- 50%+ travel required
- Insanely high execution bar, and will see all programs through from conception to tactical completion to move Anduril forward
- 5+ years of program management experience, preferably with managing complex systems and operations implementations
- 5+ years of experience with managing executive communication, board of director goals or driving cross company initiatives
- Excellent written and verbal communication skills and strong presentation skills, able to clearly articulate needs to leadership team and a wide variety of cross-functional stakeholders
- Collaborate across teams, strategizing how to bridge different parts of the organization to achieve cross-functional outcomes
- Ability to observe and anticipate potential risks across programs, milestones, timelines, etc.
- You are incredibly organized, detail-oriented, and excel in strategic planning
- You have both high ownership and low ego, approaching everything with strong outcome orientation and high humility
- You’re discerning and an incredibly fast learner
- U.S. Persons status is required as this position needs to access export-controlled data

US Salary Range
$146,000–$194,000 USD

The salary range for this role is an estimate based on a wide range of compensation factors, inclusive of base salary only. Actual salary offer may vary based on (but not limited to) work experience, education and/or training, critical skills, and/or business considerations. Highly competitive equity grants are included in the majority of full-time offers and are considered part of Anduril's total compensation package. Additionally, Anduril offers top-tier benefits for full-time employees, including:

Healthcare Benefits
- US Roles: Comprehensive medical, dental, and vision plans at little to no cost to you.
- UK & AUS Roles: We cover full cost of medical insurance premiums for you and your dependents.
- IE Roles: We offer an annual contribution toward your private health insurance for you and your dependents.

Additional Benefits
- Income Protection: Anduril covers life and disability insurance for all employees.
- Generous time off: Highly competitive PTO plans with a holiday hiatus in December. Caregiver & Wellness Leave is available to care for family members, bond with a new baby, or address your own medical needs.
- Family Planning & Parenting Support: Coverage for fertility treatments (e.g., IVF, preservation), adoption, and gestational carriers, along with resources to support you and your partner from planning to parenting.
- Mental Health Resources: Access free mental health resources 24/7, including therapy and life coaching. Additional work-life services, such as legal and financial support, are also available.
- Professional Development: Annual reimbursement for professional development
- Commuter Benefits: Company-funded commuter benefits based on your region.
- Relocation Assistance: Available depending on role eligibility.

Retirement Savings Plan
- US Roles: Traditional 401(k), Roth, and after-tax (mega backdoor Roth) options.
- UK & IE Roles: Pension plan with employer match.
- AUS Roles: Superannuation plan.

The recruiter assigned to this role can share more information about the specific compensation and benefit details associated with this role during the hiring process.

To view Anduril's candidate data privacy policy, please visit https://anduril.com/applicant-privacy-notice/.
DevOps Engineer
Data Science & Analytics
Machine Learning Engineer
Data Science & Analytics
Apply
Lambda.jpg

Data Center Substation/Utility Engineering - Electrical

Lambda AI
USD
0
185000
-
327000
US.svg
United States
Full-time
Remote
false
Lambda, The Superintelligence Cloud, builds Gigawatt-scale AI Factories for Training and Inference. Lambda’s mission is to make compute as ubiquitous as electricity and give every person access to artificial intelligence. One person, one GPU. If you'd like to build the world's best deep learning cloud, join us.

Note: This position prefers presence in our Bay Area office locations, but is open to remote presence for the right candidate.

About the Job
Lambda is seeking a Data Center Substation/Utility Electrical Engineer to provide strategic, technical, and executive leadership for Lambda’s high-voltage substation, transmission, and power-generation programs across North America. This role oversees multi-site, multi-state programs involving 69kV–500kV substation development, transmission line interconnections, grid integration, protection and control systems, and utility interface management critical to Lambda’s AI/Cloud power infrastructure.

This key role works heavily with the Energy team and sets program strategy, defines engineering and construction standards, ensures regulatory and utility compliance, and drives the delivery of complex power infrastructure safely, on time, and within budget. This position requires a deep command of HV electrical systems, utility coordination, EPC oversight, and lifecycle management for mission-critical substations supporting Lambda or developer data centers.

What You’ll Do

Technical Leadership in HV Substation & Transmission Systems
- Provide technical oversight and governance for all HV substation designs (69kV–500kV), including bus configurations, breaker schemes, transformer sizing, reactive compensation, protection systems, SCADA integration, and grounding.
- Provide guidance on large-scale diesel and natural gas generator plants (microgrid).
- Possess a strong understanding of Battery Energy Storage Systems (BESS) and renewable resource integration (solar, fuel cells, etc.).
- Support site selection with powered land, as well as self-build infrastructure.
- Oversee the design and performance criteria for:
  - Generator control systems
  - Load sharing, load shedding, and black start capabilities
  - Generator transient response and dynamic stability
  - Emissions systems (SCR, DEF, CO/NOx compliance)
  - Utility parallel operation
  - Demand response and curtailment strategies
- Determine technical standards for:
  - Transmission interconnections and line tap arrangements
  - Relay protection and control philosophies
  - Metering schemes and revenue-quality instrumentation
  - Insulation coordination and equipment BIL requirements
  - Substation communication protocols (IEC 61850, DNP3, Modbus)
  - Arc-flash, fault current, and short-circuit design considerations
- Lead the review and approval of all major engineering deliverables, such as:
  - One-line diagrams
  - Physical, Civil, and Protection & control schematics
  - Relay settings and coordination studies
  - HV switching, grounding, and lightning protection plans
- Guide technical investigations and root-cause analysis for power system events, outages, and equipment failures.

Transmission & Utility Interconnection Strategy
- Establish and maintain relationships with utilities, ISOs/RTOs, and transmission owners; lead interconnection negotiations and technical discussions.
- Oversee all aspects of utility interconnection from feasibility through energization, including:
  - Load flow and stability studies
  - Short circuit and protection coordination
  - Transmission planning requirements
  - Interconnection application strategy and milestone tracking
- Ensure that program decisions align with utility standards, NERC/FERC requirements, and state regulatory frameworks.

Program Planning, Delivery & Execution
- Direct the execution of multiple HV substation and transmission projects, ensuring engineering integrity, equipment standardization, and construction quality.
- Oversee:
  - EPC contractor selection (behind the meter or traditional utility interconnection)
  - Factory acceptance testing (FAT) for HV equipment
  - Field acceptance testing (FAT/SAT) for protection and control systems
  - Commissioning procedures and energization plans
- Ensure long-lead equipment procurement strategies for transformers, breakers, relays, controls, GIS/AIS gear, and transmission structures.

Construction & Field Oversight
- Provide executive and technical oversight for construction sequencing, clearance planning, switching coordination, and commissioning safety.
- Ensure adherence to construction standards related to:
  - High-voltage safety and switching procedures
  - Transmission structure erection and conductor installation
  - Ground grid installation, testing, and validation
  - Relay testing (end-to-end, point-to-point, functional)
  - Permitting
- Resolve complex field issues with EPCs, utilities, and commissioning teams.

Senior/Executive-Level Duties
- Lead substation program standards, templates, modular designs, and equipment specifications to ensure repeatability and scale across core markets.
- Drive decision-making related to transformer sizing, redundancy (N-1, N+1), load growth, and grid capacity planning.
- Approve acceptance criteria for protective relay settings, verifying alignment with utility and internal standards.
- Orchestrate system modeling and analytical studies for AI load support (ETAP, SKM, ASPEN, PSLF, PSS/E).
- Provide executive reporting on technical risk, system reliability, NERC compliance impacts, and substation performance KPIs.

You
- Bachelor’s degree in Electrical Engineering.
- 8+ years in HV substation, transmission engineering, EPC leadership, or power delivery program management. Power generation experience is a plus.
- Extensive knowledge of HV electrical systems.
- Understanding of utility interconnection processes, NERC requirements, RTO/ISO rules, and state regulatory protocols.
- Demonstrated experience delivering large-scale, multi-site, mission-critical HV infrastructure projects.
- Prior responsibility for technical approval of designs, engineering packages, relay settings, FAT/SAT, and energization.

Nice to Have
- Ability to translate highly technical concepts for executive audiences.
- Strong commercial acumen with experience negotiating EPC agreements, equipment contracts, and utility service arrangements.
- Excellent leadership, risk-management, and cross-functional communication skills.
- Proficient in reading, analyzing, and interpreting technical specifications, financial reports, and legal documents.
- Combination of PE, MS, PMP or PgMP preferred.

Salary Range Information
The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda
- Founded in 2012, ~400 employees (2025) and growing fast
- We offer generous cash & equity compensation
- Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
- We are experiencing extremely high demand for our systems, with quarter over quarter, year over year profitability
- Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
- Health, dental, and vision coverage for you and your dependents
- Wellness and Commuter stipends for select roles
- 401k Plan with 2% company match (USA employees)
- Flexible Paid Time Off Plan that we all actually use

A Final Note:
You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
DevOps Engineer
Data Science & Analytics
Apply
Replit.jpg

SOC Engineer

Replit
USD
0
180000
-
250000
US.svg
United States
Full-time
Remote
false
Replit is the agentic software creation platform that enables anyone to build applications using natural language. With millions of users worldwide and over 500,000 business users, Replit is democratizing software development by removing traditional barriers to application creation.

We are looking for a SOC Engineer to join our Security Operations team and help defend a fast-moving, cloud-native AI vibe-coding platform. In this role, you will stay on top of emerging threats (from 0-days and active exploitation campaigns to bug bounty findings and customer-reported issues) and rapidly determine their relevance and potential impact to Replit. You will conduct investigations, analyze signals across our environment, and collaborate with Security, SRE, and Engineering teams to develop and drive effective containment and mitigation strategies.

This is a hands-on, investigative role requiring strong technical depth, understanding of modern software engineering and CI/CD systems, familiarity with cloud-native infrastructure (especially GCP), and the ability to work across multiple teams in a fast-paced environment.

Responsibilities

Threat Awareness & Rapid Assessment
- Continuously monitor emerging threats, including bad actor activity, 0-day vulnerabilities, public exploitation campaigns, bug bounty reports, and customer-reported security issues.
- Quickly assess the applicability of these threats to Replit’s cloud infrastructure, SaaS services, internal tooling, and platform components.

Investigation & Impact Analysis
- Conduct targeted investigations to determine whether Replit is already impacted by a newly discovered threat, vulnerability, or exploit.
- Analyze logs, telemetry, and system behaviors using SIEM, metrics, Cloud Logging, and related tools.
- Identify gaps or weaknesses in existing detection or visibility and propose improvements.

Containment, Mitigation & Cross-Team Collaboration
- Research potential impact paths and develop mitigation strategies for confirmed or applicable threats.
- Partner closely with Security, SRE, and Engineering teams to coordinate and implement containment, patches, configuration updates, or code-level fixes.
- Document findings, mitigations, and follow-up actions clearly for internal teams.

Required Skills & Experience
- Strong understanding of software engineering fundamentals, including code structure, build systems, dependencies, and package ecosystems, enabling effective partnership with Engineering teams.
- Understanding of CI/CD pipelines and DevOps workflows, enabling collaboration with Infrastructure and DevOps teams.
- Solid knowledge of cloud architecture, especially Google Cloud Platform (GCP) services used in modern cloud-native deployments.
- Familiarity with SaaS architectures, identity systems, and integration patterns for effective collaboration with Cloud Security teams.
- Hands-on experience with SIEM, Cloud Logging, and log-based investigation workflows.
- Ability to perform investigations using log data, behavioral indicators, and threat intelligence.
- General understanding of vulnerability lifecycles, exploitability analysis, and common attack vectors.

Preferred Qualifications
- Experience with threat intelligence, security research, or vulnerability analysis.
- Familiarity with Kubernetes, containers, serverless infrastructure, or modern distributed systems.
- Ability to write scripts or small tools for investigation or automation (Python, Go, Bash).
- Experience working with bug bounty programs or coordinated vulnerability disclosure workflows.
- Experience in fast-paced, cloud-native, or AI/ML-driven environments.

What We Value
- Curiosity & initiative: Strong desire to understand attacker behaviors, emerging threats, and how they apply to real-world systems.
- Speed & analytical rigor: Ability to quickly assess high-risk vulnerabilities with clear, evidence-based reasoning.
- Collaboration: Comfort working across cross-functional teams spanning Security, SRE, Engineering, and Infrastructure.
- Clear communication: Ability to explain findings, risks, and mitigation strategies to stakeholders at all levels.
- Ownership mindset: Takes initiative to drive investigations, improvements, and remediations to completion.
- Continuous learning: Passion for staying up to date on new vulnerabilities, exploit trends, and cloud-native security best practices.

This is a full-time role that can be held from our Foster City, CA office. The role has an in-office requirement of Monday, Wednesday, and Friday.

Full-Time Employee Benefits Include:
💰 Competitive Salary & Equity
💹 401(k) Program
⚕️ Health, Dental, Vision and Life Insurance
🩼 Short Term and Long Term Disability
🚼 Paid Parental, Medical, Caregiver Leave
🚗 Commuter Benefits
📱 Monthly Wellness Stipend
🧑‍💻 Autonomous Work Environment
🖥 In Office Set-Up Reimbursement
🏝 Flexible Time Off (FTO) + Holidays
🚀 Quarterly Team Gatherings
☕ In Office Amenities

Want to learn more about what we are up to?
Meet the Replit Agent
Replit: Make an app for that
Replit Blog
Amjad TED Talk
Interviewing + Culture at Replit
Operating Principles
Reasons not to work at Replit

To achieve our mission of making programming more accessible around the world, we need our team to be representative of the world. We welcome your unique perspective and experiences in shaping this product. We encourage people from all kinds of backgrounds to apply, including and especially candidates from underrepresented and non-traditional backgrounds.
DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
Replit.jpg

Security Operations Lead

Replit
USD
0
220000
-
325000
US.svg
United States
Full-time
Remote
false
Replit is the agentic software creation platform that enables anyone to build applications using natural language. With millions of users worldwide and over 500,000 business users, Replit is democratizing software development by removing traditional barriers to application creation.

We are looking for a Security Operations Lead (SOC Lead) to build, mature, and operate our 24/7 detection and response capabilities across a modern cloud-native and AI-driven environment. This role leads the global SOC function (monitoring, SIEM ownership, detection engineering, alert triage, and operational readiness) while also evaluating and integrating emerging AI-based SOC products and autonomous response platforms.

You will oversee monitoring across multi-cloud environments (GCP primary, AWS/Azure secondary), Kubernetes, SaaS services, endpoints, developer tools, and AI workloads. You’ll collaborate closely with Cloud Security, Compliance/GRC, SRE, Platform Engineering, IT/Endpoint teams, and AI Infrastructure to ensure our detection strategy scales and stays ahead of evolving threats.

This is a hands-on leadership role perfect for someone who wants to shape the SOC of the future while solving complex challenges in a high-scale AI setting.

What You’ll Do

SOC Leadership & 24/7 Monitoring
- Lead, mentor, and scale a global SOC team responsible for 24/7 monitoring, alert intake, triage, correlation, and escalation.
- Build operational rigor: processes, runbooks, SLAs, metrics, and quality standards for high-scale environments.
- Cover monitoring across:
  - Cloud infrastructure (GCP, AWS, Azure)
  - Kubernetes/GKE/EKS/AKS clusters
  - SaaS platforms (Google Workspace, GitHub, Slack, Okta, etc.)
  - Endpoints (macOS, Linux, Windows) including EDR/XDR telemetry
  - Developer platforms + CI/CD pipelines
  - AI/ML systems and model-serving workflows

AI-Based SOC Integration & Innovation
- Evaluate, adopt, and integrate AI-native SOC technologies for triaging, detection, and correlation.
- Identify opportunities to automate triage, investigations, enrichment, and reporting.
- Serve as the internal expert on the capabilities and limitations of AI-based SOC tooling.

SIEM & Telemetry Ownership
- Own the entire SIEM ecosystem: ingestion, normalization, correlation, enrichment, tuning, dashboards, and metrics.
- Expand telemetry across:
  - Cloud logs, API logs, system events
  - SaaS audit logs and admin events
  - Identity providers (Okta, Google, Azure AD)
  - Endpoint EDR/XDR event streams
- Standardize data schemas and improve detection signal quality across sources.

Detection Engineering
- Develop high-fidelity detections for:
  - Cloud-native attacks
  - Identity threats and lateral movement
  - SaaS misconfigurations and privilege abuse
  - Endpoint malware/behavior anomalies
  - Insider threats and account takeover patterns
- Use MITRE ATT&CK, MITRE Cloud Matrix, and threat intel to drive detection coverage.
- Collaborate with Engineering, Cloud Security, and SRE to ensure telemetry supports detection use cases.

Triage, Threat Analysis & Escalation
- Lead day-to-day triage and threat analysis activities, ensuring accurate categorization and prioritization.
- Drive complex investigations involving correlated events across cloud, SaaS, endpoints, and developer platforms.
- Guide root cause analysis and work with owners to drive remediation and architectural improvements.
- Continuously refine logic, reduce false positives, and improve signal quality.

Cross-Functional Collaboration
- Partner with Cloud Security on cloud posture and preventative controls.
- Work with Compliance/GRC to support SOC 2, ISO 27001, and audit readiness.
- Collaborate with SRE and Engineering to instrument new services with structured logs and detection hooks.
- Coordinate with IT / Endpoint teams to ensure full endpoint telemetry and EDR response readiness.
- Communicate threats, gaps, and trends to leadership and engineering stakeholders.

Required Skills & Experience
- 7+ years of experience in Security Operations, with 3+ years in a senior or lead capacity.
- Experience leading or collaborating with 24/7 SOC environments (internal, hybrid, or MSSP).
- Strong experience with SIEM platforms (Chronicle, Splunk, Elastic, Sentinel, Panther, etc.).
- Deep understanding of:
  - Cloud security monitoring (GCP required; AWS/Azure preferred)
  - SaaS security monitoring (Okta, Google Workspace, GitHub, Slack, etc.)
  - Endpoint security telemetry (EDR/XDR tools such as CrowdStrike, SentinelOne, or Defender)
  - Kubernetes and container detection
- Hands-on detection engineering skills, event correlation, threat hunting, and log analysis.
- Familiarity with AI-based SOC platforms and LLM-driven detection/triage tools.
- Strong understanding of identity security, OAuth/OIDC, and API telemetry patterns.
- Experience with SOAR and scripting (Python, Go, Bash).
- Knowledge of MITRE ATT&CK, cloud kill chains, behavioral detections, and detection lifecycle management.

Preferred Qualifications
- Experience with UBA/UEBA, ML-driven anomaly detection, or autonomous remediation systems.
- Previous experience at a high-growth tech company.
- Security certifications (GCIH, GCIA, GCTI, GCDA, GCFA, etc.).

What We Value
- Operational excellence: Building reliable, scalable SOC systems.
- Analytical rigor: Capable of making sense of large, complex, multi-source telemetry.
- Leadership: Mentorship and guidance of analysts and engineers.
- Adaptability: Comfortable evaluating and integrating next-gen AI-based SOC tools.
- Clear communication: Able to articulate risk, incidents, and recommendations to both technical and executive audiences.
- Automation mindset: Focused on reducing manual toil via SOAR, scripting, and AI augmentation.
- Curiosity: Passion for learning, experimenting, and staying ahead of evolving threats, especially those targeting cloud-native and AI systems.

This is a full-time role that can be held from our Foster City, CA office. The role has an in-office requirement of Monday, Wednesday, and Friday.

Full-Time Employee Benefits Include:
💰 Competitive Salary & Equity
💹 401(k) Program
⚕️ Health, Dental, Vision and Life Insurance
🩼 Short Term and Long Term Disability
🚼 Paid Parental, Medical, Caregiver Leave
🚗 Commuter Benefits
📱 Monthly Wellness Stipend
🧑‍💻 Autonomous Work Environment
🖥 In Office Set-Up Reimbursement
🏝 Flexible Time Off (FTO) + Holidays
🚀 Quarterly Team Gatherings
☕ In Office Amenities

Want to learn more about what we are up to?
Meet the Replit Agent
Replit: Make an app for that
Replit Blog
Amjad TED Talk
Interviewing + Culture at Replit
Operating Principles
Reasons not to work at Replit

To achieve our mission of making programming more accessible around the world, we need our team to be representative of the world. We welcome your unique perspective and experiences in shaping this product. We encourage people from all kinds of backgrounds to apply, including and especially candidates from underrepresented and non-traditional backgrounds.
DevOps Engineer
Data Science & Analytics
Apply
Replit.jpg

Cloud Security Lead

Replit
USD
0
220000
-
325000
US.svg
United States
Full-time
Remote
false
Replit is the agentic software creation platform that enables anyone to build applications using natural language. With millions of users worldwide and over 500,000 business users, Replit is democratizing software development by removing traditional barriers to application creation.Join us at the forefront of AI and cloud-native security as we work to secure one of the most innovative developer platforms in the world. As the Cloud Security Lead, you will shape the cloud and infrastructure security program that protects millions of developers, enables safe AI-assisted development, and ensures organizations can confidently bring our platform into enterprise environments.In this role, you will own cloud security across GCP (primary) and supplemental environments in AWS and Azure, as well as containerized systems, SaaS platforms, and our multi-tenant AI infrastructure. You’ll improve our security posture through strong architecture, posture management, secure-by-default development practices, and close partnership with Engineering, Compliance, Security Architecture, and Platform teams.This is a highly impactful, hands-on leadership role—perfect for someone who wants to solve complex security challenges at scale while influencing product, engineering, and go-to-market teams.What You’ll Do:Cloud Security EngineeringLead configuration hardening across GCP, with additional oversight of workloads and integrations running in AWS and Azure.Own and optimize CSPM platforms across multi-cloud environments—establishing configuration baselines, guardrails, and remediation workflows.Secure critical SaaS platforms, ensuring proper configurations, access controls, and engineering integrations.Lead infrastructure vulnerability management across multi-cloud systems, containers, registries, and platform services.Enhance security across containerized and Kubernetes (GKE/EKS/AKS) workloads, including runtime protections, network policies, and workload identity.Assess secure logging 
configurations across cloud/SaaS providers, ensuring audit logs, retention, and routing meet monitoring and architecture needs. Secure Development & Architecture EnablementPartner with engineering teams to make services secure by default, embedding security into development workflows, CI/CD pipelines, and cloud-native deployments. Cross-Functional ResponsibilitiesCollaborate with Security Monitoring, Compliance/GRC, Architecture, DevOps, Platform Engineering, and ML Infrastructure.Participate in communicating security advisories, best practices, and updates to Replit’s customers.Support incident investigations as a cloud security subject-matter expert.Required Skills & Experience:7+ years of experience in cloud engineering, with 3+ years in a senior or lead role.Hands-on experience with CSPM tools (Wiz, Lacework, Prisma, Orca, SCC, etc.).Deep expertise in GCP security (IAM, VPC, KMS, GKE, Cloud Logging).Experience securing and governing SaaS platforms and identity integrations.Operational experience with infrastructure vulnerability management across cloud and container environments.Working knowledge of AWS and/or Azure security services and configurations.Experience with container and Kubernetes security across GKE, EKS, or AKS.Strong IaC security experience with Terraform, Pulumi, or similar tooling.Familiarity with compliance standards (SOC 2, ISO 27001, PCI DSS).Preferred Qualifications:Experience supporting engineering teams in building secure-first, cloud-native or PaaS environments.Background securing AI/ML pipelines, model-serving infrastructure, or developer platform services.Experience in high-growth technology or cloud-native product companies.Experience with securing AI/agentic systems and sensitive data pipelines.Automation/scripting with Python.Relevant certifications (e.g., GCP Professional Cloud Security Engineer, AWS/Azure security certs).What We Value:Problem-solving mindset — Ability to break down complex security and operational challenges into 
clear engineering solutions.
Autonomy: Comfortable leading initiatives, collaborating effectively, and driving outcomes with minimal oversight.
Communication excellence: Able to translate deep technical concepts for engineers, executives, and enterprise customers.
Continuous learning: Passion for staying current with AI security, cloud-native advances, and emerging threats.
Automation-first approach: Belief in reducing operational toil and building scalable, self-healing systems.
This is a full-time role that can be held from our Foster City, CA office. The role has an in-office requirement of Monday, Wednesday, and Friday.
Full-Time Employee Benefits Include:
💰 Competitive Salary & Equity
💹 401(k) Program
⚕️ Health, Dental, Vision and Life Insurance
🩼 Short Term and Long Term Disability
🚼 Paid Parental, Medical, Caregiver Leave
🚗 Commuter Benefits
📱 Monthly Wellness Stipend
🧑‍💻 Autonomous Work Environment
🖥 In Office Set-Up Reimbursement
🏝 Flexible Time Off (FTO) + Holidays
🚀 Quarterly Team Gatherings
☕ In Office Amenities
Want to learn more about what we are up to?
Meet the Replit Agent
Replit: Make an app for that
Replit Blog
Amjad TED Talk
Interviewing + Culture at Replit
Operating Principles
Reasons not to work at Replit
To achieve our mission of making programming more accessible around the world, we need our team to be representative of the world. We welcome your unique perspective and experiences in shaping this product. We encourage people from all kinds of backgrounds to apply, including and especially candidates from underrepresented and non-traditional backgrounds.
DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Apply
Symbolica AI.jpg

Talent Sourcer – AI & ML Research

Symbolica AI
-
US.svg
United States
Full-time
Remote
false
About Us
Symbolica is an AI research lab pioneering the application of category theory to enable logical reasoning in machines. We’re a well-resourced, nimble team of experts on a mission to bridge the gap between theoretical mathematics and cutting-edge technologies, creating symbolic reasoning models that think like humans: precise, logical, and interpretable. While others focus on scaling data-hungry neural networks, we’re building AI that understands the structures of thought, not just patterns in data. Our approach combines rigorous research with fast-paced, results-driven execution. We’re reimagining the very foundations of intelligence while simultaneously developing product-focused machine learning models in a tight feedback loop, where research fuels application. Founded in 2022, we’ve raised over $30M from leading Silicon Valley investors, including Khosla Ventures, General Catalyst, Abstract Ventures, and Day One Ventures, to push the boundaries of applying formal mathematics and logic to machine learning. Our vision is to create AI systems that transform industries, empowering machines to solve humanity’s most complex challenges with precision and insight. Join us to redefine the future of AI by turning groundbreaking ideas into reality.
About the Role
As a DevOps Engineering Lead working closely with our Head of ML Engineering, you will lead the design, build, and optimization of the infrastructure and tools that enable us to take our research and development efforts from the lab into a highly reliable, performant, and secure software stack in production. You'll help accelerate the processes involved in turning research prototypes into production- and enterprise-ready platforms with security, availability, and reliability in mind. Your work will be at the intersection of research and engineering, ensuring our R&D team has the robust platform they need to push the boundaries of AI, working with our GPU vendors, cloud providers, and on-prem servers.
📍 This is an onsite role that is based in our SF office (345 California St.)
Key Responsibilities
- Focus on improving the reliability and performance of our Lambda cluster and model training pipeline.
- Assist in managing multiple Kubernetes environments across cloud providers.
- Maintain and build the internal observability platform across all environments, covering everything from GPUs and AI applications to distributed backend systems.
- Take ownership of our model training and deployment systems, bringing them to a more scalable, production-ready state.
- Aid in building comprehensive CI tests for GitOps repositories and promotion systems.
- Build and maintain different environments for research and client-facing products according to best practices.
About You
- 5+ years of experience in DevOps or infrastructure roles, with at least 2 years in machine learning infrastructure or MLOps. It would be a benefit if you have built, maintained, or managed ML infrastructure using DevOps practices in the past.
- Proficient in cloud-native architectures, with the ability to make the right tradeoffs where necessary.
- Experienced with Linux, containers, GPU management, Nix, and Kubernetes, with an interest in making sure the infrastructure behind our models is secure by design.
- Exceptional problem-solving skills with the ability to nimbly solve edge cases with minimum disruption.
- Solid software engineering skills in Rust, Golang, or Python.
What We Offer
- Competitive salary and early-stage equity package.
- A high-trust, execution-first culture with minimal bureaucracy.
- Direct ownership of meaningful projects with real business impact.
- A rare opportunity to sit at the interface between deep research and real-world productization.
Read more about Symbolica:
https://fortune.com/2024/04/09/vinod-khosla-former-tesla-autopilot-engineer-ai-models/
https://venturebeat.com/ai/move-over-deep-learning-symbolicas-structured-approach-could-transform-ai/
Symbolica is an equal opportunities employer. We celebrate diversity and are committed to creating an inclusive environment for all employees, regardless of race, gender, age, religion, disability, or sexual orientation.
DevOps Engineer
Data Science & Analytics
Apply
Figure.jpg

Legal Intern [Summer 2026]

Figure AI
USD
150000
-
350000
US.svg
United States
Full-time
Remote
false
Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The goal of the company is to ship humanoid robots with human-level intelligence. Its robots are engineered to perform a variety of tasks in the home and commercial markets. Figure is headquartered in San Jose, CA. Figure’s vision is to deploy autonomous humanoids at a global scale.
Our Helix team is looking for an experienced Training Infrastructure Engineer to take our infrastructure to the next level. This role is focused on managing the training cluster and implementing distributed training algorithms, data loaders, and developer tools for AI researchers. The ideal candidate has experience building tools and infrastructure for a large-scale deep learning system.
Responsibilities
- Design, deploy, and maintain Figure's training clusters
- Architect and maintain scalable deep learning frameworks for training on massive robot datasets
- Work together with AI researchers to implement training of new model architectures at a large scale
- Implement distributed training and parallelization strategies to reduce model development cycles
- Implement tooling for data processing, model experimentation, and continuous integration
Requirements
- Strong software engineering fundamentals
- Bachelor's or Master's degree in Computer Science, Robotics, Engineering, or a related field
- Experience with Python and PyTorch
- Experience managing HPC clusters for deep neural network training
- Minimum of 4 years of professional, full-time experience building reliable backend systems
Bonus Qualifications
- Experience managing cloud infrastructure (AWS, Azure, GCP)
- Experience with job scheduling / orchestration tools (SLURM, Kubernetes, LSF, etc.)
- Experience with configuration management tools (Ansible, Terraform, Puppet, Chef, etc.)
The US base salary range for this full-time position is between $150,000 and $350,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
DevOps Engineer
Data Science & Analytics
Apply
webAI.jpg

DevSecOps Engineer

webAI
-
US.svg
United States
Full-time
Remote
false
About Us:webAI is pioneering the future of artificial intelligence by establishing the first distributed AI infrastructure dedicated to personalized AI. We recognize the evolving demands of a data-driven society for scalability and flexibility, and we firmly believe that the future of AI lies in distributed processing at the edge, bringing computation closer to the source of data generation. Our mission is to build a future where a company's valuable data and intellectual property remain entirely private, enabling the deployment of large-scale AI models directly on standard consumer hardware without compromising the information embedded within those models. We are developing an end-to-end platform that is secure, scalable, and fully under the control of our users, empowering enterprises with AI that understands their unique business. We are a team driven by truth, ownership, tenacity, and humility, and we seek individuals who resonate with these core values and are passionate about shaping the next generation of AI.About the Role:We are seeking a DevOps/Compliance Engineer to support our Public Sector initiatives by building, securing, and maintaining compliant infrastructure environments for deploying AI models within government and regulated systems. This role bridges modern DevOps practices with the strict compliance and security standards required for federal engagements. 
You will play a critical role in designing infrastructure automation, ensuring FedRAMP and NIST compliance, and helping deliver secure, auditable, containerized AI solutions to our public sector partners.Responsibilities:Design, implement, and maintain scalable, secure cloud and edge infrastructure for AI workloads in government environments.Manage containerization and orchestration technologies such as Docker and Kubernetes, optimizing for performance, isolation, and compliance.Develop and maintain Infrastructure as Code (IaC) using Terraform, Ansible, or Pulumi to automate secure, compliant infrastructure provisioning.Implement and manage CI/CD pipelines with integrated security controls, encryption, and vulnerability scanning.Ensure compliance with federal security frameworks such as NIST SP 800-53, FedRAMP, and DISA STIGs.Collaborate with Security, Legal, and Public Sector teams to maintain continuous compliance posture and generate audit-ready evidence.Package and deliver software artifacts (containers, binaries, configurations) for deployment in restricted or air-gapped environments.Configure and maintain monitoring, logging, and observability tools to ensure system reliability and compliance visibility.Support MLOps workflows to productionize AI models with consistent, secure automation.Contribute to documentation and knowledge sharing on infrastructure and compliance best practices.Qualifications:Active US Security clearance or eligibility and willingness to obtain a US Security clearance5+ years of experience in DevOps, Site Reliability, or Infrastructure Engineering.Proficiency with Docker, Kubernetes, and cloud-native deployment tools.Strong experience implementing Infrastructure as Code with Terraform, Ansible, or Pulumi.Deep understanding of security and compliance frameworks such as NIST SP 800-53, FedRAMP, and DISA STIGs.Experience with MLOps tools and practices for automating and scaling model deployments.Proficiency in Python, Bash, or Go for 
automation and scripting.Experience integrating security controls into CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, etc.).Familiarity with observability tools such as Prometheus, Grafana, ELK, or CloudWatch.Preferred SkillsExperience working within FedRAMP Moderate or High environments.Familiarity with AI model deployment pipelines and model governance best practices.Knowledge of Zero Trust architecture and secure identity management.Strong collaboration skills and ability to work cross-functionally across technical, security, and compliance teams.Excellent written and verbal communication skills for interfacing with government stakeholders and auditors. We at webAI are committed to living out the core values we have put in place as the foundation on which we operate as a team. We seek individuals who exemplify the following:Truth - Emphasizing transparency and honesty in every interaction and decision.Ownership - Taking full responsibility for one’s actions and decisions, demonstrating commitment to the success of our clients. 
Tenacity - Persisting in the face of challenges and setbacks, continually striving for excellence and improvement.
Humility - Maintaining a respectful and learning-oriented mindset, acknowledging the strengths and contributions of others.
Benefits:
Competitive salary and performance-based incentives.
Comprehensive health, dental, and vision benefits package.
401k Match (US-based only)
$200/month Health and Wellness Stipend
$400/year Continuing Education Credit
$500/year Function Health subscription (US-based only)
Free parking for in-office employees
Unlimited Approved PTO
Parental Leave for Eligible Employees
Supplemental Life Insurance
webAI is an Equal Opportunity Employer and does not discriminate against any employee or applicant on the basis of age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We adhere to these principles in all aspects of employment, including recruitment, hiring, training, compensation, promotion, benefits, social and recreational programs, and discipline. In addition, it is the policy of webAI to provide reasonable accommodation to qualified employees who have protected disabilities to the extent required by applicable laws, regulations and ordinances where a particular employee works.
DevOps Engineer
Data Science & Analytics
Apply
Shield AI.jpg

Staff Engineer, Systems Test (R4151)

Shield AI
USD
140000
-
210000
US.svg
United States
Full-time
Remote
false
Founded in 2015, Shield AI is a venture-backed deep-tech company with the mission of protecting service members and civilians with intelligent systems. Its products include the V-BAT and X-BAT aircraft, Hivemind Enterprise, and the Hivemind Vision product lines. With nine offices and facilities across the U.S., Europe, the Middle East, and the Asia-Pacific, Shield AI’s technology actively supports operations worldwide. For more information, visit www.shield.ai. Follow Shield AI on LinkedIn, X, Instagram, and YouTube.
Job Description:
We’re seeking a Staff Integration & Test Engineer to lead advanced integration and test activities for Shield AI’s Hivemind autonomy systems in Frisco, TX. You’ll define and execute test strategies that span Hivemind Software integration, simulation, hardware-in-the-loop, vehicle-in-the-loop, and live flight operation, ensuring robust system performance and reliability in real-world mission environments.
As a senior technical leader, you’ll architect test infrastructure, collaborate with cross-functional teams, mentor teammates, and drive continuous improvement in validation methodologies and test automation. This role is deeply hands-on and highly collaborative, working across software, hardware, systems, and flight test disciplines to ensure seamless integration of Hivemind autonomy on platforms such as V-BAT.
Shield AI is scaling and growing rapidly. The ideal candidate will demonstrate adaptability, a growth mindset, and a willingness to learn new technologies and methodologies quickly in a fast-paced, evolving environment. This is an opportunity to grow alongside a company that is changing the world, building something insanely great with a mission-driven culture, a sense of urgency, and an unwavering commitment to protecting those who serve.
What you'll do:Lead system-level integration, test planning, and validation for advanced autonomous aircraft systemsDefine and implement test architectures, methodologies, and strategies across simulation, HIL, VIL, and flight environments. Own and manage comprehensive test plans, defining objectives, success criteria, procedures, and resource needs. Architect and evolve test infrastructure and automation frameworks that enable scalable and repeatable validation. Define Hivemind Software test release processes and quality release gates. Collaborate closely with software, hardware, and systems engineering teams to ensure robust integration and system readiness. Conduct hands-on debugging and validation of autonomy, avionics, and control systems. Lead flight test preparation, system configuration, and real-time troubleshooting during live events.Develop tools and utilities in Python (and optionally C++) to support automation, data analysis, and telemetry validation.Establish test documentation standards, ensuring traceability, repeatability, and knowledge sharing across teams. Mentor and provide technical direction to junior and senior engineers, fostering a culture of technical rigor and continuous improvement. 
Partner with program and mission teams to communicate test readiness, progress, and system performance effectively.Required qualification:Bachelor’s or Master’s degree in Engineering, Computer Science, Robotics, Aerospace Engineering, or related technical discipline.8+ years of experience in system integration, test planning, and validation of complex systems—ideally within robotics, aerospace, or autonomy.Proven expertise in test planning, including test plan creation, test case design, and validation tracking and software quality release processes.Deep understanding of test infrastructure, automation, and validation methodologies.Strong proficiency in Python for scripting, automation, and analysis; working knowledge of C++ preferred.Experience architecting and maintaining Hardware-in-the-Loop (HIL), Vehicle-in-the-Loop (VIL), or similar real-time test systems.Proven ability to troubleshoot complex, multidisciplinary systems involving software, hardware, and controls.Demonstrated success leading technical projects, mentoring engineers, and defining test strategies for multi-system programs.Excellent communication and cross-functional collaboration skills.Adaptability, growth mindset, and willingness to learn new technologies quickly in a scaling, fast-paced environment.Self-starter with strong sense of urgency, initiative, and comfort operating in ambiguity.U.S. Citizenship and ability to obtain and maintain a SECRET clearance.Preferred qualifications:Experience testing or integrating autonomous air, ground or sea vehicles.Background in defense, aerospace, or mission-critical robotics systems.Experience developing test infrastructure and automation frameworks at organizational scale.Familiarity with simulation and modeling tools for system-level validation.Knowledge of configuration management, verification processes, and data analytics for test reporting.Experience supporting flight test operations, including safety, instrumentation, and post-flight analysis. 
140,000 - 210,000 a year
Full-time regular employee offer package: Pay within range listed + Bonus + Benefits + Equity
Temporary employee offer package: Pay within range listed above + temporary benefits package (applicable after 60 days of employment)
Salary compensation is influenced by a wide array of factors including but not limited to skill set, level of experience, licenses and certifications, and specific work location. All offers are contingent on a cleared background and possible reference check. Military fellows and part-time employees are not eligible for benefits. Please speak to your talent acquisition representative for more information.
Shield AI is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, marital status, disability, gender identity or Veteran status. If you have a disability or special need that requires accommodation, please let us know.
DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Robotics Engineer
Software Engineering
Apply
Cohere Health.jpg

Staff Software Engineer, GPU Infrastructure (HPC)

Cohere
-
CA.svg
Canada
Full-time
Remote
true
Who are we?Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what’s best for our customers.Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.Join us on our mission and shape the future!Why this team?The internal infrastructure team is responsible for building world-class infrastructure and tools used to train, evaluate and serve Cohere's foundational models. By joining our team, you will work in close collaboration with AI researchers to support their AI workload needs on the cutting edge, with a strong focus on stability, scalability, and observability. You will be responsible for building and operating superclusters across multiple clouds. Your work will directly accelerate the development of industry-leading AI models that power Cohere's platform North. We’re hiring software engineers at multiple levels. Whether you’re early in your career or a seasoned staff engineer, you’ll find opportunities to grow and make an impact here.Please Note: All of our infrastructure roles require participating in a 24x7 on-call rotation, where you are compensated for your on-call schedule. 
As a Staff Software Engineer, you will:Build and scale ML-optimized HPC infrastructure: Deploy and manage Kubernetes-based GPU/TPU superclusters across multiple clouds, ensuring high throughput and low-latency performance for AI workloads.Optimize for AI/ML training: Collaborate with cloud providers to fine-tune infrastructure for cost efficiency, reliability, and performance, leveraging technologies like RDMA, NCCL, and high-speed interconnects.Troubleshoot and resolve complex issues: Proactively identify and resolve infrastructure bottlenecks, performance degradation, and system failures to ensure minimal disruption to AI/ML workflows.Enable researchers with self-service tools: Design intuitive interfaces and workflows that allow researchers to monitor, debug, and optimize their training jobs independently.Drive innovation in ML infrastructure: Work closely with AI researchers to understand emerging needs (e.g., JAX, PyTorch, distributed training) and translate them into robust, scalable infrastructure solutions.Champion best practices: Advocate for observability, automation, and infrastructure-as-code (IaC) across the organization, ensuring systems are maintainable and resilient.Mentorship and collaboration: Share expertise through code reviews, documentation, and cross-team collaboration, fostering a culture of knowledge transfer and engineering excellence. 
You may be a good fit if you have:Deep expertise in ML/HPC infrastructure: Experience with GPU/TPU clusters, distributed training frameworks (JAX, PyTorch, TensorFlow), and high-performance computing (HPC) environments.Kubernetes at scale: Proven ability to deploy, manage, and troubleshoot cloud-native Kubernetes clusters for AI workloads.Strong programming skills: Proficiency in Python (for ML tooling) and Go (for systems engineering), with a preference for open-source contributions over reinventing solutions.Low-level systems knowledge: Familiarity with Linux internals, RDMA networking, and performance optimization for ML workloads.Research collaboration experience: A track record of working closely with AI researchers or ML engineers to solve infrastructure challenges.Self-directed problem-solving: The ability to identify bottlenecks, propose solutions, and drive impact in a fast-paced environment.If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.Full-Time Employees at Cohere enjoy these Perks:🤝 An open and inclusive culture and work environment 🧑‍💻 Work closely with a team on the cutting edge of AI research 🍽 Weekly lunch stipend, in-office lunches & snacks🦷 Full health and dental benefits, including a separate budget to take care of your mental health 🐣 100% Parental Leave top-up for up to 6 months🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement🏙 Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend✈️ 6 weeks of vacation (30 working days!)
DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering
Machine Learning Engineer
Data Science & Analytics
Apply
Lambda.jpg

Manager - Security Architecture

Lambda AI
USD
297000
-
495000
US.svg
United States
Full-time
Remote
false
Lambda, The Superintelligence Cloud, builds Gigawatt-scale AI Factories for Training and Inference. Lambda’s mission is to make compute as ubiquitous as electricity and give every person access to artificial intelligence. One person, one GPU. If you'd like to build the world's best deep learning cloud, join us.

*Note: This position requires presence in our San Francisco, San Jose, or Seattle office location 4 days per week; Lambda’s designated work from home day is currently Tuesday.

About the Role

Lambda Security protects some of the world's most valuable digital assets: invaluable training data, model weights representing immense computational investments, and the sensitive inputs required to leverage best of breed AI models. We're responsible for securing every byte that powers breakthrough artificial intelligence.

Reporting to the Senior Manager of Security, your team serves dual functions: building security for the business and demonstrating that work directly to customers. As security advisors to Product Engineering, Platform Engineering, and IT teams, your team will establish security policies and architecture standards, conduct threat modeling and design reviews for critical systems, and create implementation guidance that engineering teams can adopt. In support of our customers, your team will develop customer-facing security documentation and participate directly in enterprise security discussions. This work ensures the right security decisions get made across Lambda's AI infrastructure while protecting customer data, enabling hypergrowth velocity, and building the trust that closes enterprise deals.

As Manager of the Security Architecture team, you'll build and lead a team of 4-5 security engineers with expertise across application security, infrastructure security, and corporate security. You'll hire strong specialists, coach them through complex security problems, set team priorities and architectural direction, and create a culture where security judgment accelerates business velocity rather than creating friction.

Your success is measured by the security decisions your team enables across the business: engineering teams building secure-by-default systems, compliance frameworks mapped to technical controls, and customers trusting Lambda's infrastructure with their most valuable AI workloads. Your team will balance proactive architecture work (defining what "good" looks like) with reactive consultation (reviewing designs and answering complex security questions).

Your immediate focus will be building your team, establishing processes for design reviews and architecture guidance that scale with Lambda's growth, and developing a 6-12 month roadmap aligned with Lambda's 2026 security strategic plan including compliance initiatives like ISO 27001.

We're looking for engineering managers who pair strong people leadership with enough security depth to coach specialists, set architectural direction, and translate security decisions into business value. If you're energized by building high-performing teams, enabling security at scale through excellent judgment rather than brute force, and helping enterprise customers trust their most valuable AI workloads to Lambda's infrastructure, we'd love to talk.

We value diverse backgrounds, experiences, and skills, and we are excited to hear from candidates who can bring unique perspectives to our team. If you do not exactly meet this description but believe you may be a good fit, please still apply and help us understand your readiness for this role. Your application is not a waste of our time.

What You'll Do

Team Leadership & Development
Build, hire, and develop a high-performing team of 4-5 security engineers with deep expertise across application security, infrastructure security, and corporate security.
Foster a culture where security judgment accelerates business velocity, creating an environment where specialists thrive through clear expectations, regular coaching, and opportunities for growth.
Conduct regular one-on-ones and provide constructive feedback that helps your engineers advance their technical depth and expand their cross-functional impact.
Set team priorities and architectural direction, ensuring your team focuses on the highest-impact security decisions across Lambda's AI infrastructure.

Strategic Architecture & Program Management
Own your team's 6-12 month roadmap, balancing proactive architecture work (defining security standards and patterns) with reactive consultation (design reviews and complex security questions).
Establish security policies and architecture standards that enable Product Engineering, Platform Engineering, and IT teams to build secure-by-default systems.
Define measurable success criteria for your team's work, translating security architecture decisions into business impact that stakeholders understand.
Proactively guide the evolution of Lambda's security architecture program as the company matures, ensuring architecture decisions align with compliance commitments and evolving customer security requirements.

Cross-Functional Collaboration & Customer Enablement
Partner deeply with Product Engineering, Platform Engineering, and IT teams to integrate security architecture guidance at optimal moments in their development cycles.
Conduct and oversee threat modeling and design reviews for critical systems, ensuring your team provides actionable recommendations that balance security rigor with development velocity.
Enable your team to create implementation guidance and architecture patterns that engineering teams voluntarily adopt because they make secure development easier.
Support enterprise sales by developing customer-facing security documentation and coaching your team through direct security discussions with prospective customers evaluating Lambda's infrastructure.
Collaborate with peer security teams (Detection & Response, Platform, Program Coordination) to ensure cohesive security architecture across all security functions.

What We Think a Candidate Needs to Demonstrate to Succeed
5+ years of security engineering or security architecture experience with 3+ years leading technical teams, demonstrating ability to build and develop high-performing security specialists.
Proven track record building team cultures where specialists thrive through clear expectations, effective coaching, and career development that expands both technical depth and cross-functional impact.
Strong technical background in security architecture, threat modeling, and secure design principles with enough depth to guide team decisions, evaluate complex tradeoffs, and coach engineers through difficult security problems.
Experience working across application security, infrastructure security, or corporate security domains, with demonstrated ability to set architectural direction and security standards that engineering teams adopt.
Excellent collaboration skills working with highly technical engineering teams both with and without authority, building relationships that enable security architecture guidance at optimal moments in development cycles.
Skilled communicator who translates security architecture decisions into business value, helping stakeholders understand how technical security work protects customer data and enables business velocity.
Ability to thrive in high-speed, high-ambiguity startup environments where you balance building team capability and security architecture foundations while executing at a fast pace.

Nice to Have
Prior experience in AI/ML infrastructure companies or cloud service providers where you've navigated the unique security challenges of multi-tenant systems and customer data isolation at scale.
Hands-on experience driving compliance audits (SOC 2, ISO 27001, PCI-DSS, HIPAA/HITECH, or FedRAMP) including evidence collection, control mapping, and managing auditor relationships.
Deep familiarity with bare metal infrastructure security in addition to cloud platforms, understanding physical security considerations and hardware-level security controls.
Experience creating security architecture patterns that were adopted widely across multiple teams or organizations, demonstrating ability to build reusable solutions that scale beyond a single use case.
Experience managing security engineers through significant career transitions, such as promoting ICs to lead roles or helping specialists successfully pivot between security domains.
Enthusiasm about leveraging Lambda's access to state-of-the-art LLMs to pioneer AI-powered security architecture capabilities: imagine automated threat modeling, intelligent design review assistance, and architecture validation at scale only possible when you host the AI infrastructure yourself.

Salary Range Information
The annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.

About Lambda
Founded in 2012, ~400 employees (2025) and growing fast
We offer generous cash & equity compensation
Our investors include Andra Capital, SGW, Andrej Karpathy, ARK Invest, Fincadia Advisors, G Squared, In-Q-Tel (IQT), KHK & Partners, NVIDIA, Pegatron, Supermicro, Wistron, Wiwynn, US Innovative Technology, Gradient Ventures, Mercato Partners, SVB, 1517, Crescent Cove.
We are experiencing extremely high demand for our systems, with quarter over quarter, year over year profitability
Our research papers have been accepted into top machine learning and graphics conferences, including NeurIPS, ICCV, SIGGRAPH, and TOG
Health, dental, and vision coverage for you and your dependents
Wellness and Commuter stipends for select roles
401k Plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use

A Final Note:
You do not need to match all of the listed expectations to apply for this position. We are committed to building a team with a variety of backgrounds, experiences, and skills.

Equal Opportunity Employer
Lambda is an Equal Opportunity employer. Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.
DevOps Engineer
Data Science & Analytics
Software Engineer
Software Engineering