Evaluation Scenario Writer - AI Agent Testing Specialist
Design realistic, structured evaluation scenarios for LLM-based agents: create test cases that simulate complex human workflows, and define the gold-standard behavior and scoring logic against which agent actions are compared. Analyze agent logs, failure modes, and decision paths; work with code repositories and test frameworks to validate scenarios; iterate on prompts, instructions, and test cases to improve clarity and difficulty; and ensure that scenarios are production-ready, easy to run, and reusable.
MCP & Tools Python Developer - Agent Evaluation Infrastructure
Developing and maintaining MCP-compatible evaluation servers; implementing logic to check agent actions against scenario definitions; creating or extending tools that writers and QAs use to test agents; working closely with infrastructure engineers to ensure compatibility; occasionally helping with test writing or debug sessions when needed.
Head of Machine Learning (Remote - UK/Europe)
The Head of Machine Learning will manage 9 Machine Learning Engineers, including 3 Team Leaders, with responsibilities spanning People Management and project coordination. They will understand and coordinate the strategic direction of ML team projects, manage dependencies, allocate resources, and ensure alignment with business and product goals. This includes contributing to system architecture and development by empowering the team via 1:1s, code reviews, and discussions to deliver impactful features. The role involves leading and nurturing the ML engineering team through coaching and mentorship, leading team OKR discussions, coordinating projects, facilitating meetings, and collaborating with the CTO, Platform, and Product Managers to align team priorities with company OKRs. They will work with the People team on recruiting and onboarding talent, act as a sounding board for the team, support identifying and resolving bottlenecks and blockers to enable faster iteration, drive ML system development and deployment, optimize tools and infrastructure for efficiency, and promote a culture of collaboration and continuous learning while mentoring team members.
Forward Deployed Engineer - Paris
Lead customer discovery and design sessions to map business processes, identify automation opportunities, and define solution architecture. Design, build, and deploy integrations using low/no-code platforms (Zapier, Make, n8n, Workato) and CRM automation tools (HubSpot Workflows, Salesforce Flow) with API connectors. Collaborate with Engineering to validate technical feasibility, resolve blockers, and share field learnings that inform product improvements. Configure and optimize the AI Agent by defining intents, prompts, actions, guardrails, and performance metrics. Manage complex, cross-functional deployments by defining timelines, aligning stakeholders, ensuring accountability, and delivering on time and within scope. Create scalable models and reusable frameworks such as templates, playbooks, and reference architectures to expedite future projects. Champion continuous learning and enablement through training peers, running internal workshops, and documenting best practices to raise the technical bar across the team. Run global, targeted outbound campaigns within the existing customer base to generate pipeline and accelerate adoption, working closely with the customer marketing team. Collaborate with GTM leadership to embed routines and cadences that drive accountability for new product pipeline, forecast accuracy, and performance tracking. Own regional top-line targets for assigned products by collaborating with AEs and AMs who hold add-on quotas. Act as an internal product owner within the GTM function by defining product-specific MRR strategies, coordinating cross-functional support, and ensuring delivery of the AI-enabled communication platform. Collaborate with Product and PMM to shape the AI Voice Agent roadmap based on customer needs, integration insights, and field learnings. Drive internal and external product education including enablement for System Integrators and channel partners. Maintain deep awareness of AI and CX industry trends to keep Aircall's positioning competitive and feed insights back into product and GTM strategies.
Product Analyst Intern
Help users discover and master the Dataiku platform through user training, office hours, demos, and ongoing consultative support. Analyse and investigate various kinds of data and machine learning applications across industries and use cases. Provide strategic input to the customer and account teams that helps customers achieve success. Scope and co-develop production-level data science projects with customers. Mentor and help educate data scientists and other customer team members to aid in career development and growth.
Forward Deployed Engineer
The Forward Deployed Engineer will work closely with customers from onboarding through ongoing usage to integrate and optimize HappyRobot's AI solutions. Responsibilities include building new features, MVPs, and scalable solutions that directly impact customer outcomes, using full-stack development with React, TypeScript, Node.js, and Python. They will design, implement, and iterate on AI/ML applications such as LLM prompting, tuning voices, and transcribers to optimize use cases. The engineer will manage APIs and integrations with third-party systems to ensure seamless customer functionality. Collaboration with Product, Engineering, and Customer Success teams is required to deliver tailored solutions. They must continuously iterate and improve AI solutions based on customer feedback and evolving requirements, while prioritizing and managing multiple projects under tight deadlines with high-quality results.
Future AI Global leaders - Applied Science
Run pre-training and post-training, and deploy state-of-the-art models on clusters with thousands of GPUs. Design, train, and deploy state-of-the-art models in specialized fields like time series, edge devices, quantization, cybersecurity, multimodal or robotics. Generate and curate data for pre-training and post-training, work on evaluations and ensure the model's performance exceeds expectations. Develop tools and frameworks to facilitate data generation, model training, evaluation and deployment. Collaborate with cross-functional teams to tackle complex use cases using agents and foundational models. Manage research projects and communications with client research teams.
Future AI Global leaders - Applied AI & Engineering
Work on state-of-the-art Generative AI applications, ranging from consumer products to industrial use cases. Collaborate closely with research, product, and engineering teams to develop complex, high-impact, and scalable AI use cases. Assist in the deployment of AI models, including fine-tuning, Retrieval-Augmented Generation (RAG), and Agentic workflows. Engage in ongoing training and development to stay up-to-date with the latest advancements in AI technology. Receive mentorship from experienced AI professionals and contribute to several projects.
First-Line Supervisors of Food Preparation and Serving Workers - AI Trainer (Contract)
Responsibilities include evaluating AI model outputs related to food preparation and serving work, delivering clear and structured feedback to improve the AI model's understanding of workplace tasks and language, and developing prompts for AI models that reflect the field. The work is performed remotely and asynchronously with flexible hours, and involves leveraging professional experience in food preparation and serving supervision to train AI models.
Senior Business Psychologist
Apply psychological theories and methodologies to improve assessment and selection procedures; conduct research and analysis to identify key psychological factors and competencies for various job roles and industries; create custom assessments and competency frameworks tailored to client needs and psychometric standards; ensure the test bank is diverse, inclusive, and aligned with future work competencies; build strong client relationships throughout the assessment lifecycle including needs analysis and results interpretation; collaborate closely with Onboarding, Customer Success, Psychometricians, and Assessment Operations to deliver high-quality, scalable, and impactful assessments; communicate validation studies and statistical insights clearly for non-technical audiences; share insights externally on AI technology integration in assessments; ensure test development and AI-driven hiring systems meet ethical and compliance standards; assist in training internal teams in psychological best practices for talent acquisition; stay current on I/O psychology and AI research to apply insights to product development; contribute expert input for client proposals to ensure scientific rigor and innovation; represent the company by presenting actionable insights at industry conferences.