Member of Technical Staff, GPU Optimization
- Optimize model training and inference pipelines (data loading, preprocessing, checkpointing, deployment) to improve throughput, latency, and memory efficiency on NVIDIA GPUs.
- Design, implement, and benchmark custom CUDA and Triton kernels for performance-critical operations.
- Integrate low-level optimizations into PyTorch-based codebases, including custom operators, low-precision formats, and TorchInductor passes.
- Profile and debug the entire stack, from kernel launches to multi-GPU I/O paths, using tools such as Nsight, nvprof, PyTorch Profiler, and custom tooling.
- Collaborate with colleagues to co-design model architectures and data pipelines that are hardware-friendly while maintaining state-of-the-art quality.
- Stay current with the latest GPU and compiler technologies and assess their impact.
- Work closely with infrastructure and backend teams to improve cluster orchestration, scaling strategies, and observability for large experiments.
- Provide clear, data-driven insights on performance, quality, and cost trade-offs.
- Contribute to a culture of fast iteration, thoughtful profiling, and performance-centric design.
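The benchmarking discipline this role calls for follows a standard warm-up/repeat/median pattern. The sketch below is a minimal, hypothetical pure-Python harness illustrating only that structure; a real GPU kernel benchmark would additionally synchronize the device around each timing (e.g. with CUDA events or `torch.cuda.synchronize`), which is omitted here.

```python
import statistics
import time


def benchmark(fn, *args, warmup=3, iters=20):
    """Time fn(*args): discard warm-up runs, return the median wall-clock ms.

    Hypothetical sketch: shows only the warm-up / repeat / median structure.
    GPU benchmarks must also synchronize the device before reading timers,
    since kernel launches are asynchronous.
    """
    for _ in range(warmup):  # warm-up: exclude one-time setup and caching costs
        fn(*args)
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn(*args)
        samples.append((time.perf_counter() - t0) * 1e3)  # milliseconds
    return statistics.median(samples)  # median is robust to scheduler noise


# Usage: compare two implementations of the same operation on identical inputs.
baseline_ms = benchmark(sorted, list(range(10_000, 0, -1)))
```

Reporting the median (rather than the mean or minimum) is a common choice because it is robust to occasional scheduler or thermal outliers.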
Member of Technical Staff, ML Engineer
The Machine Learning Engineer will partner closely with Researchers to bring large-scale multimodal video diffusion models into production. Responsibilities include:

- Developing high-performance GPU-based inference pipelines for large multimodal diffusion models.
- Building, optimizing, and maintaining serving infrastructure to deliver low-latency predictions at large scale.
- Collaborating with DevOps teams to containerize models, manage autoscaling, and ensure uptime SLAs.
- Applying techniques such as quantization, pruning, and distillation to reduce latency and memory footprint without compromising quality.
- Implementing continuous fine-tuning workflows to adapt models based on real-world data and feedback.
- Designing and maintaining automated CI/CD pipelines for model deployment, versioning, and rollback.
- Implementing robust monitoring (latency, throughput, concept drift) and alerting for critical production systems.
- Exploring cutting-edge GPU acceleration frameworks (e.g., TensorRT, Triton Inference Server, TorchServe) to continuously improve throughput and reduce costs.
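The quantization technique mentioned above can be illustrated with a minimal sketch of symmetric per-tensor int8 quantization, the idea underlying many weight-compression schemes. This is a hypothetical, dependency-free toy; production pipelines use library support (e.g. TensorRT or `torch.ao.quantization`) rather than hand-rolled code like this.

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: x ~= scale * q, q in [-127, 127].

    Hypothetical sketch of the core idea only: pick a scale from the largest
    magnitude, round each value to the nearest representable integer.
    """
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [scale * v for v in q]


weights = [0.02, -1.5, 0.73, 1.5]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Round-trip error per element is bounded by scale / 2.
```

Storing int8 codes plus one float scale cuts memory roughly 4x versus float32, which is the latency/footprint trade-off the listing refers to.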
Machine Learning Engineer
Design, build, and maintain scalable machine learning systems encompassing data ingestion, preprocessing, training, testing, and deployment. Develop and optimize end-to-end ML pipelines covering data collection, labeling, training, validation, and monitoring to ensure system reliability and reproducibility. Implement robust MLOps practices such as model versioning, experiment tracking, continuous integration and deployment (CI/CD) for ML, and continuous monitoring in production. Collaborate with product and engineering teams to integrate and deploy models into real-time products with an emphasis on efficiency and scalability. Ensure data quality, observability, and performance across all AI systems. Stay current with the latest advancements in AI infrastructure, tooling, and research to maintain leadership in AI innovation.
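The experiment-tracking and model-versioning practices described above can be sketched as a tiny in-memory tracker that keys each run by a content hash of its hyperparameters, so identical configurations map to the same run ID. All names here (`RunTracker`, `start_run`, `log_metric`) are hypothetical; real systems such as MLflow or Weights & Biases add persistent storage, artifacts, and UIs.

```python
import hashlib
import json
import time


class RunTracker:
    """Hypothetical minimal experiment tracker: params + metrics per run."""

    def __init__(self):
        self.runs = {}

    def start_run(self, params):
        # Content-address the run by its config so reruns of the same
        # hyperparameters are reproducibly identified.
        run_id = hashlib.sha256(
            json.dumps(params, sort_keys=True).encode()
        ).hexdigest()[:12]
        self.runs[run_id] = {"params": params, "metrics": [], "start": time.time()}
        return run_id

    def log_metric(self, run_id, name, value, step):
        self.runs[run_id]["metrics"].append(
            {"name": name, "value": value, "step": step}
        )


tracker = RunTracker()
rid = tracker.start_run({"lr": 3e-4, "batch_size": 256})
tracker.log_metric(rid, "val_loss", 0.42, step=100)
```

Hashing a canonical JSON dump (`sort_keys=True`) is what makes the ID deterministic, which is the property that supports the reproducibility goal in the listing.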
Founding Senior Machine Learning Engineer
Train and fine-tune large language models (LLMs) and audio models to maximize speed, accuracy, and production-readiness for real-time AI voice experiences. Build datasets, define rigorous metrics, and measure model performance across high-impact voice AI tasks to guide development. Work closely with engineering to deploy models into production, monitor their performance, and ensure they remain fast, reliable, and accurate at scale. Build scalable pipelines to collect structured human feedback, benchmark subjective quality, and inform model iterations. Design and maintain machine learning infrastructure needed for fast experimentation, robust training, and continuous deployment.
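One concrete example of the "rigorous metrics" for voice AI tasks mentioned above is word error rate (WER), the standard edit-distance-based metric for speech transcription. The sketch below is a straightforward dynamic-programming implementation, offered as an illustration; production evaluations would use a tested library rather than this.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Classic Levenshtein DP over word sequences; counts substitutions,
    insertions, and deletions equally.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)


wer = word_error_rate("turn on the lights", "turn off the light")
# 2 substitutions over 4 reference words -> 0.5
```

Note that WER can exceed 1.0 when the hypothesis contains many spurious insertions, which is why it is usually paired with the subjective-quality benchmarks the listing also calls for.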