AI Researcher
The AI Researcher will work across the model development loop. This includes designing and testing architecture changes and training regimes for large language models; running controlled experiments at scale to isolate causal effects; studying failure modes in reasoning, generalisation, robustness, and representation; shaping objectives, data mixtures, and optimisation choices that influence model behaviour; building and refining evaluations that measure capability and reliability; analysing training dynamics using logs, metrics, and model outputs; collaborating with ML systems engineers on distributed training and training operations; and writing clear internal notes that translate experimental results into design decisions. The role requires substantial time in code, training runs, logs, and evaluation outputs, aiming for clarity about what improves the model and why.
Senior Software Engineer
The Senior Software Engineer will build an ambitious product that reimagines customer support, helping define what an AI-first SaaS product looks like and addressing the unique UI/UX, capability, and data-model challenges of an AI-first company. They will lead ambitious, ambiguous projects that demand strong technical decision-making, effective implementation, and sound product and design instincts. The engineer will work across the tech stack, collaborate with a top-caliber team, and mentor or lead less experienced engineers. They will participate in an engineering-led culture where everyone owns working with users and building a great product, taking ownership of challenging problems and defining and implementing solutions.
Product Manager, Models
As the Product Manager for Heidi's models platform, you will own the product strategy and roadmap for the platform, including evaluation pipelines, fine-tuning infrastructure, model routing, and safety systems. Your responsibilities include prioritising your team's work across enablement requests, model safety and quality, and new capability bets; fixing platform issues that block product teams; building evaluation tooling and fine-tuning workflows usable in clinical settings; deciding on improvements based on clinician feedback, model quality signals, and product team needs; allocating engineering capacity among competing requests and clearly communicating deferrals; working with engineers on evaluation design, fine-tuning trade-offs, and model architecture decisions; setting model quality and safety targets based on clinical outcomes; consolidating duplicate infrastructure across product teams; and monitoring foundation model developments to adjust the roadmap accordingly. You will collaborate closely with engineers, researchers, product PMs, and clinical safety teams, and report to product leadership. This is a platform role whose outputs impact every user-facing product at Heidi.
AI Software Engineer (Model Training)
You will build and maintain the systems that support large-scale model training. This includes designing and maintaining distributed training pipelines for large language models; building data ingestion and preprocessing systems for large training datasets; developing tooling for experiment management, checkpointing, and reproducibility; monitoring and debugging long-running training jobs across clusters; improving reliability and observability across the training stack; optimizing training throughput across compute, memory, and data pipelines; working closely with researchers to translate experimental ideas into training runs; and diagnosing failures across infrastructure, training loops, and data pipelines. The work requires spending time in code, logs, dashboards, and experiment outputs to make large-scale training reliable.
Mechanical Engineer & Python Expert - Freelance AI Trainer
Contributors may design graduate- and industry-level mechanical engineering problems grounded in real practice; evaluate AI-generated solutions for correctness, assumptions, and engineering logic; validate analytical or numerical results using Python (NumPy, SciPy, Pandas); improve AI reasoning to align with first principles and accepted engineering standards; and apply structured scoring criteria to assess multi-step problem solving.
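The numerical-validation part of this work can be sketched in Python. The figures below are hypothetical, chosen only to illustrate checking an AI-generated closed-form answer against an independent numerical computation — here, the tip deflection of a cantilever beam verified via the unit-load method with SciPy:

```python
import numpy as np
from scipy.integrate import quad

# Hypothetical worked example: a cantilever beam of length L carries a tip
# load P. The AI-generated answer uses the closed form delta = P*L^3/(3*E*I).
P = 2_000.0   # N, tip load (assumed value)
L = 1.5       # m, beam length (assumed value)
E = 200e9     # Pa, Young's modulus for steel
I = 8.0e-6    # m^4, second moment of area (assumed value)

closed_form = P * L**3 / (3 * E * I)

# Independent check via the unit-load method:
# delta = integral over [0, L] of M(x)*m(x)/(E*I) dx,
# where M(x) = P*(L - x) is the real moment and m(x) = (L - x) the unit-load moment.
numeric, _ = quad(lambda x: P * (L - x) ** 2 / (E * I), 0.0, L)

assert np.isclose(closed_form, numeric, rtol=1e-6)
print(f"closed form: {closed_form:.6e} m, numerical: {numeric:.6e} m")
```

Agreement between the two routes is the kind of first-principles evidence used when scoring a model's multi-step solution.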
Energy Engineering & Python Expert - Freelance AI Trainer
Contributors may design rigorous energy engineering problems reflecting professional practice; evaluate AI solutions for correctness, assumptions, and constraints; validate calculations or simulations using Python (NumPy, Pandas, SciPy); improve AI reasoning to align with industry-standard logic; and apply structured scoring criteria to multi-step problems.
Statistics Expert (Python) - Freelance AI Trainer
Contributors design rigorous statistics problems reflecting professional practice; evaluate AI solutions for correctness, assumptions, and constraints; validate calculations or simulations using Python libraries such as NumPy, Pandas, SciPy, Statsmodels, and Scikit-learn; improve AI reasoning to align with industry-standard logic; and apply structured scoring criteria to multi-step problems.
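The validation step for statistics work can be sketched similarly: recompute a claimed statistic from first principles and compare it against a library implementation. The data here are synthetic and purely illustrative:

```python
import numpy as np
from scipy import stats

# Hypothetical check: an AI solution reports a pooled two-sample t statistic.
# Recompute it manually, then confirm against scipy.stats.ttest_ind.
rng = np.random.default_rng(0)
a = rng.normal(10.0, 2.0, size=30)  # synthetic sample A
b = rng.normal(11.0, 2.0, size=30)  # synthetic sample B

# Manual pooled-variance t statistic
na, nb = len(a), len(b)
sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
t_manual = (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))

# Library implementation with equal variances assumed (pooled test)
t_scipy, p_value = stats.ttest_ind(a, b, equal_var=True)

assert np.isclose(t_manual, t_scipy)
print(f"t = {t_manual:.4f}, p = {p_value:.4f}")
```

A mismatch between the manual derivation and the library result flags either a flawed AI solution or a hidden assumption worth scoring down.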
Senior Python Engineer - AI Testing Project (Freelance, Mindrift)
Create functional black box tests for large codebases in various source languages. Create and manage Docker environments to ensure 100% reproducible builds and test execution across different platforms. Monitor code coverage and configure automated scoring criteria to meet industry benchmark-level standards. Leverage large language models (LLMs) such as Roo Code and Claude to accelerate development cycles, automate repetitive tasks, and improve overall code quality.
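A functional black-box test drives the program purely through its public interface and asserts on observable behaviour, never on internals. A minimal sketch, using the standard Unix `wc -w` word counter as a stand-in for a build artifact of the codebase under test:

```python
import subprocess

def word_count(text: str) -> int:
    """Run the system under test (here: `wc -w`) via its CLI, feeding stdin
    and reading stdout — no knowledge of the implementation is used."""
    result = subprocess.run(
        ["wc", "-w"], input=text, capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())

# Behavioural assertions: happy path, edge case, whitespace handling.
assert word_count("one two three") == 3
assert word_count("") == 0
assert word_count("  spaced   out  ") == 2
print("black-box tests passed")
```

In practice the same pattern runs inside a pinned Docker image so the toolchain, locale, and binary under test are identical across platforms, which is what makes the runs reproducible.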
Physics Researcher (Python) - Freelance AI Trainer
Contributors design rigorous physics problems reflecting professional practice; evaluate AI solutions for correctness, assumptions, and constraints; validate calculations or simulations using Python (NumPy, Pandas, SciPy); improve AI reasoning to align with industry-standard logic; and apply structured scoring criteria to multi-step problems.