Infrastructure Engineer
The Infrastructure Engineer is responsible for designing, building, and deploying robust, secure, and scalable cloud infrastructure for AI and machine learning workflows. They will work in a cross-functional team and partner with technical and non-technical stakeholders from the initial idea generation through to implementation and shipping. The role involves enabling Machine Learning Engineers and Data Scientists by contributing to internal best practices, standards, and reusable code repositories. The engineer will proactively identify and recommend new ways customers can leverage cloud infrastructure to address their key challenges, create and maintain reusable company-wide libraries and infrastructure-as-code, and research and integrate the best open-source technologies to enhance Faculty's infrastructure capabilities.
Staff DevOps Engineer
As a Staff DevOps Engineer at webAI, you will design and architect secure, scalable cloud and edge infrastructure for deploying AI workloads across multi-cloud and hybrid environments. You will build and maintain production-grade Infrastructure as Code managing over 100 resources with GitOps workflows and automated validation, design and operate production Kubernetes clusters optimized for AI/ML workloads with GPU support, and implement secure CI/CD pipelines with integrated security controls and automated deployment workflows. You will also lead MLOps infrastructure initiatives, including model deployment pipelines and monitoring; design observability and monitoring systems with tools like Prometheus and Grafana aligned to performance indicators; and implement security best practices, including least-privilege access and automated compliance validation. On the reliability side, you will lead incident response and reliability initiatives, including on-call rotations and post-mortems, and architect disaster recovery and business continuity strategies with automated backup and failover processes. Finally, you will develop reusable infrastructure modules to standardize deployment patterns, mentor engineers on cloud architecture and DevOps best practices, and drive technical documentation and knowledge sharing, including runbooks and infrastructure standards.
Site Reliability Engineer, Managed AI
The Site Reliability Engineer is responsible for designing and operating reliable managed AI services focused on serving and scaling large language model workloads. They build automation and reliability tooling to support distributed AI pipelines and inference services, define, measure, and improve SLIs/SLOs across AI workloads to ensure performance and reliability, and collaborate with AI, platform, and infrastructure teams to optimize large-scale training and inference clusters. Additionally, they automate observability by building telemetry and performance tuning strategies for latency-sensitive AI services, investigate and resolve reliability issues in distributed AI systems using telemetry, logs, and profiling, and contribute to the architecture of next-generation distributed systems designed specifically for AI-first environments.
Site Reliability Engineer, Inference Infrastructure
As a Site Reliability Engineer on the Model Serving team, you will build self-service systems that automate managing, deploying, and operating services, including custom Kubernetes operators supporting language model deployments. You will automate environment observability and resilience, enabling all developers to troubleshoot and resolve problems, and take steps to ensure defined SLOs are met, including participating in an on-call rotation. Additionally, you will build strong relationships with internal developers and influence the Infrastructure team’s roadmap based on their feedback, as well as develop the team through knowledge sharing and an active review process.
DevOps Engineer
The DevSecOps / Platform Engineer will design, implement, and operate secure, cloud-native infrastructure powering core data and application platforms for a defense-focused company. They will develop CI/CD pipelines, automate deployments, uphold security practices, and collaborate across teams to ensure reliability, scalability, and compliance for government users.
Staff Software Engineer, Infrastructure
You will design, build, and operate production infrastructure for high-scale, low-latency systems, owning critical services end-to-end to improve reliability and performance. The role also involves partnering with research and product teams, optimizing service latencies, evolving CI/CD and self-service tooling, and leading infrastructure-as-code and GitOps practices.
Staff Infrastructure Security Engineer
The engineer will architect, deploy, and operationalize foundational security services to support Crusoe's move toward Zero Trust, serving as a technical leader for secrets management and identity architecture. Responsibilities range from driving enterprise-wide platforms like HashiCorp Vault to defining trust patterns and secure onboarding in a hybrid, multi-cloud environment.
Enterprise Security Engineer
You will be responsible for building and operationalizing the company's compliance program, implementing controls, and supporting audits in a fast-paced SaaS environment. Key tasks include managing GRC tools, automating workflows for compliance standards such as SOC 2 and ISO 27001, and supporting responses to customer security assessments.
Freelance AI Red Team Engineer
As a Freelance AI Red Team Engineer, you will evaluate and red team AI models, agents, and machine learning systems for safety risks and vulnerabilities. You will also develop automation tools, create rigorous test scenarios, and contribute to security research initiatives in the AI domain.
Freelance AI Red Team Engineer
Evaluate and red team AI models and agents for vulnerabilities and safety risks, and develop automation tools and test harnesses for AI systems. Contribute to security research initiatives, including designing and implementing challenging attack scenarios for AI models.