Software Engineer, Inference Platform
The Software Engineer for the Inference Platform at Fluidstack owns inference deployments end-to-end, including initial configuration, performance tuning, production SLA maintenance, and incident response. The engineer drives measurable improvements in throughput, time-to-first-token (TTFT), and cost-per-token across diverse model families and customer workload patterns. Responsibilities include building and operating key-value (KV) cache and scheduling infrastructure to maximize utilization across concurrent requests, implementing and validating disaggregated prefill/decode pipelines, and managing Kubernetes-based orchestration at scale. The role also involves profiling and resolving bottlenecks at the compute, memory, and communication layers; instrumenting deployments for end-to-end observability; and partnering with customers to translate model architectures, access patterns, and latency requirements into deployment configurations. The engineer contributes to the inference platform architecture and roadmap, with a focus on reducing deployment complexity, improving hardware utilization, and expanding support for new model classes and accelerators. Participation in an on-call rotation (up to one week per month) is required to maintain the reliability and SLA commitments of production deployments.
Forward Deployed Engineer (FDE) - Seattle
Forward Deployed Engineers (FDEs) lead complex end-to-end deployments of frontier models in production alongside strategic customers, owning discovery, technical scoping, system design, build, and production rollout. They operate across multiple deployments from first prototype to stable production, build full-stack systems that deliver customer value, and embed closely with customer teams to understand their needs and guide adoption. They scope work, sequence delivery, remove blockers, and make trade-offs between scope, speed, and quality. They also contribute code directly when that adds clarity or unblocks progress, codify working patterns into reusable tools or playbooks, share feedback on model performance with Research and Product teams, and keep the team moving through clarity and follow-through.
Energy Engineering & Python Expert - Freelance AI Trainer
Design rigorous energy engineering problems reflecting professional practice; evaluate AI solutions for correctness, assumptions, and constraints; validate calculations or simulations using Python (NumPy, Pandas, SciPy); improve AI reasoning to align with industry-standard logic; apply structured scoring criteria to multi-step problems.
Software Engineer
Design a collaborative "Multiplayer Computer" that lets humans and AI agents work together on shared shells, filesystems, and state, conflict-free and in real time; build high-throughput backend applications and services; create tooling that helps AI systems minimize mistakes through static analysis and deterministic techniques; develop infrastructure (frontend and backend) that empowers product engineers to rapidly ship delightful user experiences; support sophisticated user interfaces, including terminals, code editors, window-management systems, and innovative experiences that require both creativity and algorithmic skill; and bridge the gap between prompt engineers and frontend engineers. Telecommuting is permitted, with in-office presence required three days a week (Monday, Wednesday, Friday). Only incidental domestic travel is required.
Software Engineer, GenAI
Design and build GenAI systems that turn large language models (LLMs) into composable, dependable tools, leveraging retrieval, tool use, agentic reasoning, and structured outputs. Collaborate with ML and infrastructure engineers to scale and optimize GenAI workflows, managing latency, context windows, and model choice. Write high-quality, modular code that handles failure gracefully, adapts to change, and is easy to iterate on. Own major decisions on workflow architecture, data flow, caching, and the structuring of generative outputs. Drive rigorous evaluation by building benchmark datasets, developing automated and human-in-the-loop evaluation frameworks, designing experiments to identify failure modes and edge cases, conducting A/B tests to inform deployment, and using clinician feedback to guide model improvement. Prototype rapidly with new models, open-source tools, and novel prompting techniques. Own the end-to-end productionization of LLM workflows: deploy models in low-latency, high-uptime environments, build monitoring and observability systems, implement post-processing guardrails, and manage workflow versioning.