Software Development in Test Intern
Advance inference efficiency end-to-end by designing and prototyping algorithms, architectures, and scheduling strategies for low-latency, high-throughput inference. Implement and maintain changes in high-performance inference engines such as SGLang- or vLLM-style systems and Together's inference stack, including kernel backends, speculative decoding (e.g., ATLAS), and quantization. Profile and optimize performance across GPU, networking, and memory layers to improve latency, throughput, and cost. Design and operate reinforcement learning (RL) and post-training pipelines (including RLHF, RLAIF, GRPO, DPO-style methods, and reward modeling) where most of the cost is inference, jointly optimizing algorithms and systems. Make RL and post-training workloads more efficient with inference-aware training loops such as asynchronous RL rollouts and speculative decoding techniques. Use these pipelines to train, evaluate, and iterate on frontier models based on the inference stack. Co-design algorithms and infrastructure to tightly couple objectives, rollout collection, and evaluation with efficient inference, identifying bottlenecks in training engines, inference engines, data pipelines, and user-facing layers. Run ablations and scale-up experiments to study trade-offs between model quality, latency, throughput, and cost and integrate findings into model, RL, and system design. Profile, debug, and optimize inference and post-training services under production workloads. Drive roadmap items requiring real engine modifications, including changing kernels, memory layouts, scheduling logic, and APIs. Establish metrics, benchmarks, and experimentation frameworks to rigorously validate improvements. Provide technical leadership by setting technical direction for cross-team efforts at the intersection of inference, RL, and post-training. Mentor engineers and researchers on full-stack ML systems work and performance engineering.
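For illustration of one technique named above, speculative decoding lets a cheap draft model propose tokens that the target model verifies, so most decode steps cost a fraction of a full forward pass. Below is a minimal sketch of the standard acceptance rule with toy numpy distributions standing in for real models; names and shapes are illustrative assumptions, not Together's ATLAS implementation.

```python
# Speculative-decoding acceptance rule, toy version: accept the draft token
# with prob min(1, p/q); on rejection, resample from the residual max(0, p - q).
import numpy as np

rng = np.random.default_rng(0)

def accept_or_resample(p_target, q_draft, proposed_token):
    """Return (emitted_token, accepted_flag) for one proposed draft token."""
    p, q = p_target[proposed_token], q_draft[proposed_token]
    if rng.random() < min(1.0, p / max(q, 1e-12)):
        return proposed_token, True
    residual = np.clip(p_target - q_draft, 0.0, None)
    residual /= residual.sum()
    return rng.choice(len(residual), p=residual), False

# Toy example: 8-token vocabulary, draft distribution slightly off-target.
vocab = 8
p_target = rng.dirichlet(np.ones(vocab))
q_draft = 0.7 * p_target + 0.3 * rng.dirichlet(np.ones(vocab))
draft_token = rng.choice(vocab, p=q_draft)
token, accepted = accept_or_resample(p_target, q_draft, draft_token)
print(f"draft={draft_token} emitted={token} accepted={accepted}")
```

The same rule generalizes to verifying a block of k draft tokens in one target pass, which is where the latency savings come from.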
Machine Learning Operations Engineer
Optimize orchestration processes to ensure efficient deployment and management of AI models. Implement cost-saving strategies to minimize infrastructure expenses while maximizing performance. Improve throughput to enhance the scalability and responsiveness of AI systems. Collaborate with cross-functional teams to identify bottlenecks and implement solutions that improve workflow efficiency. Ship new features and updates rapidly while maintaining high quality and reliability. Deploy and monitor machine learning models produced by deep learning engineers. Design, deploy, and maintain performant, scalable processes for data acquisition and manipulation to enhance dataset accessibility. Participate actively in the team's software development process, including design reviews, code reviews, and brainstorming sessions. Maintain accurate, up-to-date software development documentation.
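As a rough illustration of the deploy-and-monitor work described above, the sketch below wraps a model's predict call to record latency and error-rate rollups; the class and field names are hypothetical, not a specific monitoring stack.

```python
# Minimal monitoring wrapper: time each prediction, track errors, and
# expose simple rollup stats (p50/p95 latency, error rate).
import time
import statistics
from collections import deque

class MonitoredModel:
    def __init__(self, predict_fn, window=1000):
        self.predict_fn = predict_fn
        self.latencies_ms = deque(maxlen=window)
        self.errors = 0
        self.calls = 0

    def predict(self, features):
        self.calls += 1
        start = time.perf_counter()
        try:
            return self.predict_fn(features)
        except Exception:
            self.errors += 1
            raise
        finally:
            self.latencies_ms.append((time.perf_counter() - start) * 1000)

    def stats(self):
        lat = sorted(self.latencies_ms)
        return {
            "p50_ms": statistics.median(lat) if lat else None,
            "p95_ms": lat[int(0.95 * (len(lat) - 1))] if lat else None,
            "error_rate": self.errors / max(self.calls, 1),
        }

model = MonitoredModel(lambda x: sum(x))   # stand-in for a real model
for batch in ([1, 2], [3, 4], [5, 6]):
    model.predict(batch)
print(model.stats())
```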
Software Engineering Manager, Autonomous
As an Engineering Manager on the Autonomous team, you will lead and scale a high-calibre team of engineers dedicated to defining the future of AI agent development and advancing AI and backend systems. You will own the team's technical roadmap, translating architectural complexity into clear product strategies; mentor and support the professional growth of a diverse group of engineers; and partner closely with Product and Design to keep the agent-building tools intuitive and technically robust. You will champion a "show > tell" culture that ships rapidly while upholding high standards for technical stability and user experience, and you will clear technical and operational roadblocks so the team can operate with high agency and clarity. You will act as the bridge between product vision and technical execution.
Manager, Information Security
You will be responsible for defining operational domains and evaluating the reliability of the AI capabilities developed in-house. You will develop and extend the state-of-the-art in uncertainty quantification and uncertainty calibration. This involves understanding the AI systems built, interfacing with them, and evaluating their robustness in real-world and adversarial scenarios. You will contribute to impactful projects and collaborate with people across several teams and backgrounds.
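One concrete piece of the uncertainty-calibration work mentioned above is measuring how well a model's confidence tracks its accuracy. Below is a minimal sketch of expected calibration error (ECE) on synthetic scores; the data and bin count are illustrative assumptions.

```python
# Expected calibration error: bin predictions by confidence and compare
# average confidence to observed accuracy within each bin.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if not mask.any():
            continue
        avg_conf = confidences[mask].mean()
        avg_acc = correct[mask].mean()
        ece += mask.mean() * abs(avg_conf - avg_acc)
    return ece

# Overconfident toy model: high stated confidence, mediocre accuracy.
conf = np.array([0.95, 0.90, 0.85, 0.92, 0.60, 0.55])
hits = np.array([1, 0, 1, 0, 1, 0])
print(f"ECE = {expected_calibration_error(conf, hits):.3f}")
```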
Data Engineer - Foundational
As a Data Engineer on the Foundational team, you will build ETL/ELT pipelines to extract, decode, and store raw Electro-Optical (EO) and Infrared (IR) video in optimized formats such as WebDataset, TFRecords, or Parquet. You will develop algorithms to synchronize EO and IR frames temporally and spatially for model training inputs, and architect storage-to-GPU pipelines so that multi-node training clusters maintain over 90% GPU utilization without I/O bottlenecks. You will write and optimize distributed data processing jobs using Apache Spark, Ray, or Apache Beam to handle thousands of hours of tactical video logs, implement automated quality checks to filter corrupted or blank frames, maintain reproducible training runs with versioning and lineage tracking, and evaluate and implement advanced storage solutions such as MinIO or S3 tiering to manage datasets while optimizing cost and latency.
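For a sense of the temporal-alignment step above, the sketch below pairs each EO frame with the nearest IR frame within a skew tolerance; the timestamps and tolerance are made-up values, and a real pipeline would also handle spatial registration.

```python
# Nearest-timestamp alignment of two frame streams (EO vs. IR).
import bisect

def align_frames(eo_ts, ir_ts, max_skew_s=0.02):
    """Return (eo_index, ir_index) pairs whose timestamps differ by at most
    max_skew_s seconds. ir_ts must be sorted ascending."""
    pairs = []
    for i, t in enumerate(eo_ts):
        j = bisect.bisect_left(ir_ts, t)
        candidates = [k for k in (j - 1, j) if 0 <= k < len(ir_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(ir_ts[k] - t))
        if abs(ir_ts[best] - t) <= max_skew_s:
            pairs.append((i, best))
    return pairs

eo = [0.000, 0.033, 0.066, 0.100]          # ~30 fps EO stream
ir = [0.001, 0.034, 0.068, 0.099, 0.133]   # IR stream with slight jitter
print(align_frames(eo, ir))
```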
Lead AI/ML Engineer
Lead the design and implementation of scalable ML/AI systems focused on large language models, vector databases, and retrieval-based architectures. Integrate and apply foundation models from providers such as OpenAI, AWS Bedrock, and Anthropic for prototyping and production use cases. Adapt, evaluate, and optimize large language models for domain-specific enterprise applications. Build and maintain infrastructure for AI model experimentation, deployment, and monitoring in production. Improve model performance and inference workflows, addressing latency, cost, and reliability. Provide technical leadership by mentoring engineers and promoting best practices in ML engineering. Partner with product and cross-functional stakeholders to translate requirements into scalable ML solutions. Contribute to the evolution of internal standards for AI experimentation, evaluation, and deployment. Lead the design and delivery of end-to-end voice AI solutions that combine large language models with speech technologies, including speech-to-text, text-to-speech, and real-time streaming audio pipelines, architecting low-latency, highly reliable conversational voice systems and guiding a team through ambiguity toward production excellence. Understand and apply the constraints of voice experiences, such as latency, turn-taking, interruption handling, streaming inference, and audio quality, to create scalable, enterprise-grade systems.
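The retrieval step behind the vector-database architectures mentioned above can be reduced to nearest-neighbor search over embeddings. The sketch below uses random vectors as stand-ins; a real system would call an embedding model and a vector store rather than numpy, and the document strings are hypothetical.

```python
# Toy retrieval: cosine similarity between a query embedding and document
# embeddings, returning the top-k matches.
import numpy as np

rng = np.random.default_rng(0)

documents = [
    "refund policy for enterprise contracts",
    "latency targets for the voice assistant",
    "on-call escalation procedure",
]
doc_embeddings = rng.normal(size=(len(documents), 64))
doc_embeddings /= np.linalg.norm(doc_embeddings, axis=1, keepdims=True)

def retrieve(query_embedding, k=2):
    q = query_embedding / np.linalg.norm(query_embedding)
    scores = doc_embeddings @ q                 # cosine similarity
    top = np.argsort(scores)[::-1][:k]
    return [(documents[i], float(scores[i])) for i in top]

query = rng.normal(size=64)                     # stand-in for an embedded query
for doc, score in retrieve(query):
    print(f"{score:+.3f}  {doc}")
```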
Software Engineer, Inference Platform
The Software Engineer for the Inference Platform at Fluidstack will own inference deployments end-to-end, including initial configuration, performance tuning, production SLA maintenance, and incident response. They will drive measurable improvements in throughput, time-to-first-token (TTFT), and cost-per-token across diverse model families and customer workload patterns. Responsibilities include building and operating key-value (KV) cache and scheduling infrastructure to maximize utilization across concurrent requests, implementing and validating disaggregated prefill/decode pipelines, and managing Kubernetes-based orchestration at scale. The role requires profiling and resolving bottlenecks at compute, memory, and communication layers, instrumenting deployments for end-to-end observability, partnering with customers to translate model architectures, access patterns, and latency requirements into deployment configurations, and contributing to the inference platform architecture and roadmap focused on reducing deployment complexity, improving hardware utilization, and expanding support for new model classes and accelerators. Additionally, participation in an on-call rotation (up to one week per month) to maintain reliability and SLA commitments of production deployments is required.
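To make the KV-cache and scheduling responsibility above concrete, the sketch below greedily admits queued requests while their estimated cache footprint fits a GPU memory budget. All sizes (layers, heads, head dimension, budget) are illustrative assumptions, not a description of Fluidstack's scheduler.

```python
# KV-cache-aware admission: admit requests only while their estimated
# cache footprint fits the budget; the rest wait in the queue.
from dataclasses import dataclass
from collections import deque

@dataclass
class Request:
    rid: str
    prompt_tokens: int
    max_new_tokens: int

# 2 (K and V) * 2 bytes (fp16) * 32 layers * 8 KV heads * head_dim 128
BYTES_PER_TOKEN = 2 * 2 * 32 * 8 * 128

def kv_bytes(req: Request) -> int:
    return (req.prompt_tokens + req.max_new_tokens) * BYTES_PER_TOKEN

def admit(queue: deque, budget_bytes: int):
    running, used = [], 0
    while queue and used + kv_bytes(queue[0]) <= budget_bytes:
        req = queue.popleft()
        used += kv_bytes(req)
        running.append(req.rid)
    return running, used

q = deque([Request("a", 512, 256), Request("b", 2048, 512), Request("c", 128, 128)])
running, used = admit(q, budget_bytes=4 * 1024**3)   # 4 GiB KV budget
print(running, f"{used / 1024**2:.0f} MiB reserved")
```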
AI Researcher
The AI Researcher will work across the model development loop including designing and testing architecture changes and training regimes for large language models, running controlled experiments at scale to isolate causal effects, studying failure modes in reasoning, generalisation, robustness, and representation, shaping objectives, data mixtures, and optimisation choices that influence model behaviour, building and refining evaluations that measure capability and reliability, analysing training dynamics using logs, metrics, and model outputs, collaborating with ML systems engineers on distributed training and training operations, and writing clear internal notes to translate experimental results into design decisions. The role requires substantial time spent in code, training runs, logs, and evaluation outputs aiming for clarity about what improves the model and why.
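One small example of the controlled-experiment mindset above: comparing two model variants on the same evaluation set and bootstrapping a confidence interval on the accuracy difference, rather than trusting a single point estimate. The per-item scores below are synthetic.

```python
# Paired bootstrap CI for the accuracy delta between two model variants.
import numpy as np

rng = np.random.default_rng(0)
n_items = 500
baseline = rng.random(n_items) < 0.62     # per-item correctness, variant A
candidate = rng.random(n_items) < 0.65    # per-item correctness, variant B

def bootstrap_diff_ci(a, b, n_boot=5000, alpha=0.05):
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(a), len(a))   # resample items jointly
        diffs.append(b[idx].mean() - a[idx].mean())
    lo, hi = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return lo, hi

lo, hi = bootstrap_diff_ci(baseline, candidate)
print(f"accuracy delta 95% CI: [{lo:+.3f}, {hi:+.3f}]")
```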
AI Software Engineer (Model Training)
You will build and maintain the systems that support large scale model training, including designing and maintaining distributed training pipelines for large language models, building data ingestion and preprocessing systems for large training datasets, developing tooling for experiment management, checkpointing, and reproducibility, monitoring and debugging long running training jobs across clusters, improving reliability and observability across the training stack, optimizing training throughput across compute, memory, and data pipelines, working closely with researchers to translate experimental ideas into training runs, and diagnosing failures across infrastructure, training loops, and data pipelines. The work requires spending time inside code, logs, dashboards, and experiment outputs to make large scale training reliable.
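As a minimal sketch of the checkpointing-for-reproducibility theme above, the code below persists model and optimizer state together with the RNG state so a long run can resume deterministically after a failure. It uses plain PyTorch; the path and field names are illustrative.

```python
# Save and restore a training checkpoint, including RNG state.
import torch

def save_checkpoint(path, step, model, optimizer, rng_state):
    torch.save(
        {
            "step": step,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "torch_rng": rng_state,
        },
        path,
    )

def load_checkpoint(path, model, optimizer):
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    torch.set_rng_state(ckpt["torch_rng"])
    return ckpt["step"]

model = torch.nn.Linear(8, 8)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
save_checkpoint("ckpt.pt", step=100, model=model, optimizer=opt,
                rng_state=torch.get_rng_state())
print("resumed at step", load_checkpoint("ckpt.pt", model, opt))
```

In practice the data-loader position and any learning-rate scheduler state would be saved alongside these fields.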
Scientist/Sr Scientist, Display Technology (Contract)
The role calls for industry experience as a research engineer at an AI-related company and enthusiasm for working, learning, and teaching within a collaborative team on challenging problems.