
AI Product & Operation Jobs

Latest roles in AI Product & Operation, reviewed by real humans for quality and clarity.

All Jobs

Showing 79 jobs

Machine Learning Engineer: ML Infra and Model Optimization

Genies
USD 40-50 per hour
United States · Intern · Remote
Genies is an avatar technology company powering the next era of interactive digital identity through AI companions. With the Avatar Framework and intuitive creation tools, Genies enables developers, talent, and creators to generate and deploy game-ready AI companions. The company's technology stack supports full customization, AI-generated fashion and props, and seamless integration of user-generated content (UGC). Backed by investors including Bob Iger, Silver Lake, BOND, and NEA, Genies' mission is to become the visual and interactive layer for the LLM-powered internet.

About the opportunity
We are looking for a Backend Software Engineer Intern (LLM) to join our AI Engineering Team based in San Francisco, CA or Los Angeles, CA. The team is responsible for developing the backend infrastructure powering the Genies Avatar AI framework. You will contribute to the next generation of AI 3D avatar entertainment experiences and be involved in designing, coding, and testing software according to the requirements and system plans. You will collaborate with senior engineers and other team members to develop software solutions, troubleshoot issues, and maintain the quality of our software. You will also be responsible for documenting your work for future reference and improvement. Our internship program has a minimum duration of 12 weeks.

Key Responsibilities
- Develop and deploy LLM agent systems within our AI-powered avatar framework.
- Design and implement scalable and efficient backend systems to support AI applications.
- Collaborate with AI and NLP experts to integrate LLMs and LLM-based systems and algorithms into our avatar ecosystem.
- Work with Docker, Kubernetes, and AWS for AI model deployment and scalability.
- Contribute to code reviews, debugging, and testing to ensure high-quality deliverables.

Minimum Qualifications
- Currently pursuing, or a recent graduate of, a Bachelor's or Master's degree in Computer Science, Engineering, Machine Learning, or a related field.
- Course or internship experience in areas such as operating systems, data structures and algorithms, or machine learning.
- Strong programming skills in Python, Java, or C++.
- Excellent written and verbal communication skills.
- Basic understanding of AI/LLM concepts and enthusiasm for learning advanced techniques.

Preferred Qualifications
- Experience building ML/LLM-powered software systems.
- Previous computer science or software engineering internship experience.
- Solid understanding of LLM agents, retrieval-augmented generation (RAG), and prompt engineering (a minimal RAG sketch follows this listing).
- Experience with AWS, Docker, and Kubernetes.
- Experience with CI/CD pipelines.
- Experience with API design and schema design.

Here's why you'll love working at Genies:
- Salary: $40-$50 per hour.
- You'll work with a team that you can learn from and grow with, including support for your own professional development.
- You'll be at the helm of your own career, shaping it with your own innovative contributions to a nascent team and product, with flexible hours and a hybrid (office + home) policy.
- You'll enjoy the culture and perks of a startup, with the stability of being well funded.
- Flexible paid time off, sick time, and paid company holidays, in addition to paid parental leave, bereavement leave, and jury duty leave for full-time employees.
- Health and wellness support through programs such as a monthly wellness reimbursement.
- Choice of MacBook or Windows laptop.

Genies is an equal opportunity employer committed to promoting an inclusive work environment free of discrimination and harassment. We value diversity and inclusion, and aim to provide a sense of belonging for everyone.
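The preferred qualifications above name retrieval-augmented generation (RAG) and prompt engineering. Purely as an illustration of that pattern, and not Genies' actual stack, the sketch below retrieves the closest snippets with a toy bag-of-words similarity and prepends them to a prompt; the documents, function names, and scoring are all hypothetical stand-ins for a real embedding model and vector store.

```python
# Minimal RAG illustration: retrieve relevant context, then build a prompt for an LLM call
# (the LLM call itself is not shown). All data and names here are hypothetical.
from collections import Counter
import math

documents = [
    "Avatars can equip AI-generated fashion items and props.",
    "The Avatar Framework exposes APIs for deploying game-ready AI companions.",
    "User-generated content is validated before it enters the avatar ecosystem.",
]

def _bow(text: str) -> Counter:
    # Toy "embedding": a bag-of-words term count.
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    q = _bow(query)
    return sorted(documents, key=lambda d: _cosine(q, _bow(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    # Retrieved passages become grounding context for the generation step.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    print(build_prompt("How do avatars get new outfits?"))
```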

AI Engineer (Partnerships)

Firecrawl
USD 130,000-180,000 per year
United States · Full-time · Remote
Salary Range: $130,000-$180,000/year (range shown is for U.S.-based employees in San Francisco, CA; compensation outside the U.S. is adjusted fairly based on your country's cost of living)
Equity Range: Up to 0.10%
Location: San Francisco, CA (Hybrid) OR Remote
Job Type: Full-Time (SF) OR Contract (Remote)
Experience: 2+ years

About Firecrawl
Firecrawl is the easiest way to extract data from the web. Developers use us to reliably convert URLs into LLM-ready markdown or structured data with a single API call. In just a year, we've hit millions in ARR and 70k+ GitHub stars by building the fastest way for developers to get LLM-ready data. We're a small, fast-moving, technical team building essential infrastructure for the AI era. We value autonomy, clarity, and shipping fast.

About the Role
We're looking for an AI Engineer to own the technical side of our partnerships motion. Your mission: make Firecrawl the default web data API that AI agents and tools reach for. You'll work directly with emerging AI-native companies - writing prompts, building evals, and ensuring Firecrawl integrations just work.

What You'll Do
- Craft and iterate on prompts that help AI agents reliably choose and use Firecrawl for web data tasks
- Build evaluation frameworks to test prompts across different models, use cases, and edge cases - then iterate relentlessly based on results (a hedged scrape-and-check example is sketched after this listing)
- Be the technical partner contact in Slack channels, helping partners implement Firecrawl into their products and troubleshoot issues in real time
- Test obsessively - new models drop, agent architectures evolve, and you're on top of how Firecrawl performs across all of them
- Create integration guides and templates that make it dead simple for partners to ship Firecrawl-powered features
- Identify new partnership opportunities by understanding how AI tools are using web data and where Firecrawl fits
- Collaborate with Product and Engineering to surface partner feedback and shape the roadmap

Who You Are
- 2+ years working with LLMs - you've written production prompts, understand model quirks, and know what makes agents tick
- You ship code. Python, TypeScript, whatever - you can build evals, write scripts, and prototype integrations quickly
- You're a clear communicator who can help non-technical partners implement technical solutions
- You thrive in ambiguity - partnerships are messy, timelines shift, and you figure it out
- You're responsive and reliable - when a partner pings in Slack, you're on it
- Bonus: You've worked at an AI-native company or have experience with agent frameworks (LangChain, CrewAI, OpenAI Agents SDK, etc.)
- Bonus: You've done developer relations, solutions engineering, or technical partnerships before

Benefits & Perks
Available to all employees
- Salary that makes sense - $130,000-180,000/year OTE (U.S.-based), based on impact, not tenure
- Own a piece - up to 0.10% equity in what you're helping build
- Generous PTO - 15 days mandatory; anything after 24 days, just ask (holidays excluded); take the time you need to recharge
- Parental leave - 12 weeks fully paid, for moms and dads
- Wellness stipend - $100/month for the gym, therapy, massages, or whatever keeps you human
- Learning & development - expense up to $150/year toward anything that helps you grow professionally
- Team offsites - a change of scenery, minus the trust falls
- Sabbatical - 3 paid months off after 4 years; do something fun and new

Available to US-based full-time employees
- Full coverage, no red tape - medical, dental, and vision (100% for employees, 50% for spouse/kids) - no weird loopholes, just care that works
- Life & disability insurance - employer-paid short-term disability, long-term disability, and life insurance - coverage for life's curveballs
- Supplemental options - optional accident, critical illness, hospital indemnity, and voluntary life insurance for extra peace of mind
- Doctegrity telehealth - talk to a doctor from your couch
- 401(k) plan - retirement might be a ways off, but future-you will thank you
- Pre-tax benefits - access to FSAs and commuter benefits (US-only) to help your wallet out a bit
- Pet insurance - because fur babies are family too

Available to SF-based employees
- SF HQ perks - snacks, drinks, team lunches, intense ping pong, and peak startup energy
- E-bike transportation - a loaner electric bike to get you around the city, on us

Interview Process
- Application review
- Intro chat (~25 min)
- Technical deep dive (~45 min)
- Paid work trial (1-2 weeks)
- Decision

If you're an AI engineer who lives in Slack, obsesses over prompt quality, and wants to make Firecrawl the infrastructure layer for AI agents everywhere - let's talk.
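The listing describes Firecrawl as turning a URL into LLM-ready markdown with a single API call, and the role centers on building evals around that behavior. The sketch below is a hedged illustration only: the endpoint path, payload fields, and response shape are assumptions based on Firecrawl's public v1 API and should be checked against the current docs; FIRECRAWL_API_KEY, must_contain, and eval_scrape are placeholder names.

```python
# Hedged sketch: scrape a URL via Firecrawl, then run a trivial "eval" check on the result.
# API details below are assumptions; verify against Firecrawl's documentation before use.
import os
import requests

def scrape_markdown(url: str) -> str:
    resp = requests.post(
        "https://api.firecrawl.dev/v1/scrape",              # assumed endpoint
        headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
        json={"url": url, "formats": ["markdown"]},          # assumed payload shape
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["data"]["markdown"]                   # assumed response shape

def eval_scrape(url: str, must_contain: list[str]) -> dict:
    # Toy evaluation: did the scrape surface the strings a partner integration expects?
    md = scrape_markdown(url)
    checks = {needle: (needle.lower() in md.lower()) for needle in must_contain}
    return {"url": url, "passed": all(checks.values()), "checks": checks}

if __name__ == "__main__":
    print(eval_scrape("https://docs.firecrawl.dev", ["scrape", "markdown"]))
```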

Member of Technical Staff, Senior/Staff MLE

Cohere
United States · Full-time · Remote
Who are we?
Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what's best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

Why This Role Is Different
This is not a typical "Applied Scientist" or "ML Engineer" role. As a Member of Technical Staff, Applied ML, you will:
- Work directly with enterprise customers on problems that push LLMs to their limits. You'll rapidly understand customer domains, design custom LLM solutions, and deliver production-ready models that solve high-value, real-world problems.
- Train and customize frontier models, not just use APIs. You'll leverage Cohere's full stack: CPT, post-training, retrieval and agent integrations, model evaluations, and SOTA modeling techniques.
- Influence the capabilities of Cohere's foundation models. Techniques, datasets, evaluations, and insights you develop for customers will directly shape the next generation of Cohere's frontier models.
- Operate with an early-startup level of ownership inside a frontier-model company. This role combines the breadth of an early-stage CTO with the infrastructure and scale of a deep-learning lab.
- Wear multiple hats, set a high technical bar, and define what Applied ML at Cohere becomes.

Few roles in the industry combine application, research, customer-facing engineering, and core-model influence as directly as this one.

What You'll Do
Technical Leadership & Solution Design
- Lead the design and delivery of custom LLM solutions for enterprise customers.
- Translate ambiguous business problems into well-framed ML problems with clear success criteria and evaluation methodologies.

Modeling, Customization & Foundations Contribution
- Build custom models using Cohere's foundation model stack, CPT recipes, post-training pipelines (including RLVR), and data assets.
- Develop SOTA modeling techniques that directly enhance model performance for customer use cases.
- Contribute improvements back to the foundation-model stack, including new capabilities, tuning strategies, and evaluation frameworks.

Customer-Facing Technical Impact
- Work closely with enterprise customers to identify high-value opportunities where LLMs can unlock transformative impact.
- Provide technical leadership across discovery, scoping, modeling, deployment, agent workflows, and post-deployment iteration.
- Establish evaluation frameworks and success metrics for custom modeling engagements.

Team Mentorship & Organizational Impact
- Mentor engineers across distributed teams.
- Drive clarity in ambiguous situations, build alignment, and raise engineering and modeling quality across the organization.

You May Be a Good Fit If You Have:
Technical Foundations
- Strong ML fundamentals and the ability to frame complex, ambiguous problems as ML solutions.
- Fluency with Python and core ML/LLM frameworks.
- Experience working with large-scale datasets and distributed training or inference pipelines.
- Understanding of LLM architectures, tuning techniques (CPT, post-training), and evaluation methodologies.
- Demonstrated ability to meaningfully shape LLM performance.

Experience & Leadership
- Experience engaging directly with customers or stakeholders to design and deliver ML-powered solutions.
- A track record of technical leadership at a team level.
- A broad view of the ML research landscape and a desire to push the state of the art.

Mindset
- Bias toward action, high ownership, and comfort with ambiguity.
- Humility and strong collaboration instincts.
- A deep conviction that AI should meaningfully empower people and organizations.

Join Us
This is a pivotal moment in Cohere's history. As an MTS in Applied ML, you will define not only what we build, but how the world experiences AI. If you're excited about building custom models, solving generational problems for global organizations, and shaping frontier-model capabilities, we'd love to meet you.

If some of the above doesn't line up perfectly with your experience, we still encourage you to apply! We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Full-Time Employees at Cohere enjoy these Perks:
🤝 An open and inclusive culture and work environment
🧑‍💻 Work closely with a team on the cutting edge of AI research
🍽 Weekly lunch stipend, in-office lunches & snacks
🦷 Full health and dental benefits, including a separate budget to take care of your mental health
🐣 100% Parental Leave top-up for up to 6 months
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
✈️ 6 weeks of vacation (30 working days!)

Member of Technical Staff, MLE

Cohere
United States · Full-time · Remote
Who are we?
Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what's best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

Why This Role Is Different
This is not a typical "Applied Scientist" or "ML Engineer" role. As a Member of Technical Staff, Applied ML, you will:
- Work directly with enterprise customers on problems that push LLMs to their limits. You'll rapidly understand customer domains, design custom LLM solutions, and deliver production-ready models that solve high-value, real-world problems.
- Train and customize frontier models, not just use APIs. You'll leverage Cohere's full stack: CPT, post-training, retrieval and agent integrations, model evaluations, and SOTA modeling techniques.
- Influence the capabilities of Cohere's foundation models. Techniques, datasets, evaluations, and insights you develop for customers will directly shape the next generation of Cohere's frontier models.
- Operate with an early-startup level of ownership inside a frontier-model company. This role combines the breadth of an early-stage CTO with the infrastructure and scale of a deep-learning lab.
- Wear multiple hats, set a high technical bar, and define what Applied ML at Cohere becomes.

Few roles in the industry combine application, research, customer-facing engineering, and core-model influence as directly as this one.

What You'll Do
Technical Leadership & Solution Design
- Contribute to the design and delivery of custom LLM solutions for enterprise customers.
- Translate ambiguous business problems into well-framed ML problems with clear success criteria and evaluation methodologies.

Modeling, Customization & Foundations Contribution
- Build custom models using Cohere's foundation model stack, CPT recipes, post-training pipelines (including RLVR), and data assets.
- Develop SOTA modeling techniques that directly enhance model performance for customer use cases.
- Contribute improvements back to the foundation-model stack, including new capabilities, tuning strategies, and evaluation frameworks.

Customer-Facing Technical Impact
- Work as part of Cohere's customer-facing MLE team to identify high-value opportunities where LLMs can unlock transformative impact for our enterprise customers.

You May Be a Good Fit If You Have:
Technical Foundations
- Strong ML fundamentals and the ability to frame complex, ambiguous problems as ML solutions.
- Fluency with Python and core ML/LLM frameworks.
- Experience working with (or the ability to learn) large-scale datasets and distributed training or inference pipelines.
- Understanding of LLM architectures, tuning techniques (CPT, post-training), and evaluation methodologies.
- Demonstrated ability to meaningfully shape LLM performance.

Experience & Leadership
- A broad view of the ML research landscape and a desire to push the state of the art.

Mindset
- Bias toward action, high ownership, and comfort with ambiguity.
- Humility and strong collaboration instincts.
- A deep conviction that AI should meaningfully empower people and organizations.

Join Us
This is a pivotal moment in Cohere's history. As an MTS in Applied ML, you will define not only what we build, but how the world experiences AI. If you're excited about building custom models, solving generational problems for global organizations, and shaping frontier-model capabilities, we'd love to meet you.

If some of the above doesn't line up perfectly with your experience, we still encourage you to apply! We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Full-Time Employees at Cohere enjoy these Perks:
🤝 An open and inclusive culture and work environment
🧑‍💻 Work closely with a team on the cutting edge of AI research
🍽 Weekly lunch stipend, in-office lunches & snacks
🦷 Full health and dental benefits, including a separate budget to take care of your mental health
🐣 100% Parental Leave top-up for up to 6 months
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
✈️ 6 weeks of vacation (30 working days!)

Applied AI Engineer – Agentic Workflows

Cohere
United States · Full-time · Remote
Who are we?
Our mission is to scale intelligence to serve humanity. We're training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, RAG, and agents. We believe that our work is instrumental to the widespread adoption of AI.

We obsess over what we build. Each one of us is responsible for contributing to increasing the capabilities of our models and the value they drive for our customers. We like to work hard and move fast to do what's best for our customers.

Cohere is a team of researchers, engineers, designers, and more, who are passionate about their craft. Each person is one of the best in the world at what they do. We believe that a diverse range of perspectives is a requirement for building great products.

Join us on our mission and shape the future!

Why this role?
We're a fast-growing startup building production-grade AI agents for enterprise customers at scale. We're looking for Applied AI Engineers who can own the design, build, and deployment of agentic workflows powered by Large Language Models (LLMs), from early prototypes to production-grade AI agents, to deliver concrete business value in enterprise workflows.

In this role, you'll work closely with customers on real-world business problems, often building first-of-their-kind agent workflows that integrate LLMs with tools, APIs, and data sources. While our pace is startup-fast, the bar is enterprise-high: agents must be reliable, observable, safe, and auditable from day one. You'll collaborate closely with customers, product, and platform teams, and help shape how agentic systems are built, evaluated, and deployed at scale.

What You'll Do
- Work with enterprise customers and internal teams to turn business workflows into scalable, production-ready agentic AI systems.
- Design and build LLM-powered agents that reason, plan, and act across tools and data sources with enterprise-grade reliability.
- Balance rapid iteration with enterprise requirements, evolving prototypes into stable, reusable solutions.
- Define and apply evaluation and quality standards to measure success, failures, and regressions.
- Debug real-world agent behavior and systematically improve prompts, workflows, tools, and guardrails.
- Contribute to shared frameworks and patterns that enable consistent delivery across customers.

Required Skills & Experience
- Bachelor's degree in Computer Science or a related technical field.
- Strong programming skills in Python and/or JavaScript/TypeScript.
- 3+ years of experience building and shipping production software; 2+ years working with LLMs or AI APIs.
- Hands-on experience with modern LLMs (e.g., GPT, Claude, Gemini), vector databases, and agent/orchestration frameworks (e.g., LangChain, LangGraph, LlamaIndex, or custom solutions).
- Practical experience with RAG, agent workflows, evaluation, and performance optimization.
- Strong agent design skills, including prompt engineering, tool use, multi-step agent workflows (e.g., ReAct), and failure handling (a minimal control-flow sketch follows this listing).
- Ability to reason about and balance trade-offs between customization and reuse, as well as autonomy, control, cost, latency, and risk.
- Strong communication skills and experience leading technical discussions with customers or partners.

Nice-to-Have
- Experience working in a fast-moving startup environment.
- Prior work delivering AI or automation solutions to enterprise customers.
- Familiarity with human-in-the-loop workflows, fine-tuning, or LLM evaluation techniques.
- Experience with cloud deployment and production operations for AI systems.
- Background in applied ML, NLP, or decision systems.

Additional Requirements
- Strong written and verbal communication skills.
- Ability and interest to travel up to 25% (flexible).

Why Join Us
- Build production-grade AI agents used in real enterprise workflows.
- Operate at scale while retaining end-to-end ownership.
- Work on hard problems in agent design, evaluation, and reliability.
- Shape shared platforms and standards, not just individual features.
- Move fast with a high bar for quality, safety, and reliability.

If some of the above doesn't line up perfectly with your experience, we still encourage you to apply! We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. Should you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Full-Time Employees at Cohere enjoy these Perks:
🤝 An open and inclusive culture and work environment
🧑‍💻 Work closely with a team on the cutting edge of AI research
🍽 Weekly lunch stipend, in-office lunches & snacks
🦷 Full health and dental benefits, including a separate budget to take care of your mental health
🐣 100% Parental Leave top-up for up to 6 months
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
✈️ 6 weeks of vacation (30 working days!)
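The required skills above name multi-step agent workflows such as ReAct and failure handling. Purely as an illustration of that control flow, and not Cohere's implementation, the sketch below alternates a stubbed reasoning step with tool calls and stops on a finish action or a step budget; the planner and both tools are hypothetical stand-ins for real LLM calls and integrations.

```python
# Illustrative ReAct-style loop: think, act with a tool, observe, repeat, then finish.
# Everything here is a stub; a production agent would add evaluation, guardrails, and audit logs.
from typing import Callable

TOOLS: dict[str, Callable[[str], str]] = {
    "lookup_order": lambda arg: f"Order {arg}: shipped 2 days ago",
    "send_email": lambda arg: f"Email queued to {arg}",
}

def plan_step(question: str, history: list[str]) -> tuple[str, str]:
    # Stub "reasoning" step; a real agent would prompt an LLM with the question and history
    # and parse a (thought, action) pair out of its response.
    if not history:
        return ("I should look up the order first", "lookup_order:12345")
    return ("I have enough information", "finish:The order shipped 2 days ago")

def run_agent(question: str, max_steps: int = 5) -> str:
    history: list[str] = []
    for _ in range(max_steps):
        thought, action = plan_step(question, history)
        name, _, arg = action.partition(":")
        if name == "finish":
            return arg
        observation = TOOLS.get(name, lambda a: f"unknown tool {name}")(arg)
        history.append(f"Thought: {thought}\nAction: {action}\nObservation: {observation}")
    return "Stopped: step budget exhausted (failure handling would escalate here)."

if __name__ == "__main__":
    print(run_agent("Where is order 12345?"))
```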

Forward Deployed Engineer

Warp
USD 220,000-275,000 per year
United States · Canada · Full-time · Remote
Warp: We're Building the Platform for Agentic Development
Warp began with the vision of reimagining one of the fundamental dev tools, the terminal, to make it more usable and powerful for all developers. As AI has advanced, Warp has evolved beyond its terminal roots into the platform for Agentic Development: a workbench for dispatching agents to code, deploy, and debug production software. With over 700k active developers and revenue that has grown over 20x this year so far, Warp is now one of the fastest-growing startups in the exploding AI development space.

We believe that soon developers will be "tech leads" for groups of agents; rather than opening a code editor to write code or a terminal to write commands, they will open Warp and prompt their computer to build features, fix bugs, and diagnose production issues. With its starting point as a reimagined command line, Warp is well positioned to support agent-first workflows: it sits at the lowest level in the dev stack, has access to all of a developer's context, and is set up for multitasking and long-running processes. In addition, Warp has state-of-the-art code editing features and built-in team knowledge sharing. It's the right interface for the agentic future. Our mission has remained the same even as AI has advanced: to empower developers to ship better software more quickly, freeing them to focus on the creative and rewarding aspects of their work. For more information on our team and culture, we highly recommend reading our How We Work guide.

Why this role?
Warp is fundamentally changing how developers interact with technology, moving from coding "by hand" to working "by prompt." We're hiring an Applied AI Engineer to accelerate this transformation. You'll be instrumental in developing innovative predictive AI features leveraging our unique user-generated content and team data. This Applied AI Engineer role emphasizes product development and implementation, distinguishing it from a purely research-oriented position. You will report directly to John Rector, our Head of Engineering. Your work will supercharge our natural language understanding, enhance predictive accuracy for commands, and build personalized, specialized AI agents. By continuously refining our AI-driven suggestions and agent interactions, you'll empower hundreds of thousands of developers globally to ship better software faster, significantly impacting Warp's core product.

As our first Applied AI Engineer, you will…
- Design, build, and deploy predictive AI features, including natural language detection, autosuggestions, and intelligent prompt recommendations.
- Leverage Warp's extensive user-generated content and team data to continuously refine AI prediction and personalization.
- Drive substantial improvements in code generation quality, including code completions, diff applications, and SWE-bench performance.
- Implement and iterate on specialized agents tailored for specific developer workflows and use cases.
- Optimize AI models through fine-tuning, advanced prompt engineering, and robust, data-driven feedback loops.
- Improve context retrieval systems, enabling Warp agents to retain and utilize memory effectively.
- Collaborate closely with product and engineering teams, rapidly shipping iterative improvements into production.
- Continuously elevate the user experience by refining interactions between developers and Warp AI.

You may be a good fit if…
- You have at least 5 years of experience applying AI/ML research to build and ship user-facing, production-grade products.
- You possess a strong software engineering background.
- You have experience in fine-tuning and deploying large language models and predictive systems.
- You're adept at prompt engineering and able to craft and iterate on prompts to optimize AI outputs and agent performance.
- You're comfortable building scalable, data-driven feedback loops to measure and improve model accuracy and user satisfaction.
- You thrive in a fast-paced environment, prioritizing shipping high-quality improvements over pure theoretical research.
- Bonus points if you've previously built or significantly enhanced developer-facing AI products, particularly those involving command-line or coding assistance.

At Warp, we are dedicated to building a diverse, inclusive, and authentic workplace. If you're excited about this role but your past experience doesn't align perfectly with every qualification, we encourage you to apply anyway! Most of us are learning new skills for the first time (like our engineers learning to program Warp in Rust). You might be just the right candidate for this or other roles.

Feeling playful? Try our optional hiring challenge and submit your answers with your application: Warp Hiring Challenge

Salary Transparency
Total compensation at Warp consists of two parts: 1) a competitive base salary, and 2) meaningful equity. When we find the right person, we try to put our best foot forward with an offer that excites you. The budgeted compensation range for this role is targeted at $220,000 - $275,000. Final total compensation depends on experience and expertise. In addition to salary, all employees receive further compensation in the form of equity in the company. This is a meaningful stock option grant with a four-year vesting period and a one-year cliff. Your equity is where most of the significant upside potential is. Comparing startup equity is always a bit tricky, so we're happy to walk you through different valuation scenarios at the offer stage to help paint a clearer picture of the upside. Final total compensation is determined by multiple factors, including your experience and expertise, and may vary from the amounts listed above.

What We Offer
- Competitive salary and meaningful equity - we will stretch to get the right talent on board
- Full medical, dental, and vision benefits for employees (80% coverage for dependents)
- Flexible remote-first culture, with optional office spaces in NYC and SF for folks who want to work together IRL
- Pre-tax FSA health savings plan
- Pre-tax commuter benefit
- 20 days of paid time off
- Unlimited sick time off
- 12 US holidays
- 16 weeks of paid parental leave for both birthing and non-birthing parents
- Twice-a-year company retreats
- Monthly gym and internet stipend
- Guideline 401(k)
- Complimentary One Medical membership

Individuals seeking employment at Warp are considered without regard to race, color, religion, national origin, age, sex, marital status, ancestry, physical or mental disability, veteran status, gender identity, or sexual orientation.

About Warp
We are a company run by product-first builders, building a core product for all developers. We are committed to understanding our users deeply. We will ultimately build the best product and business if that team includes developers and designers from a wide range of backgrounds. The early team comes from Google, Dropbox, Gem, LinkedIn, and Facebook. We are looking for passionate individuals to join us and help bring Warp to the world. We value honesty, humility, and pragmatism, and our core product principle is focusing on the user. If you're interested in learning more about our company values and the culture of our engineering team, please take a look at our internal 'How We Work' guide.

We're very fortunate to be backed by a great group of venture capital firms. In August 2023, we announced a $50M Series B funding round ($73M total raised), led by Sequoia Capital. Our other investors include Google Ventures, Neo, and Box Group. We are also backed by a network of passionate angels, including Dylan Field (Co-Founder and CEO, Figma), Elad Gil (early investor in Airbnb, Pinterest, Stripe, and Square), Jeff Weiner (Executive Chairman and Ex-CEO, LinkedIn), Marc Benioff (Founder and CEO, Salesforce), and Sam Altman (Co-Founder & CEO, OpenAI).

The Product
Here's our latest demo showing some of our current features…

Software Engineer Intern (Summer 2026)

Together AI
USD 160,000-230,000 per year
Full-time · Remote
About the Role
As an AI Researcher, you will push the frontier of foundation model research and make it a reality in products. You will work on developing novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that go beyond the state of the art. As a team, we have been pushing on all these fronts (e.g., Hyena, FlashAttention, FlexGen, and RedPajama). You will also work closely with the machine learning systems, NLP/CV, and engineering teams for inspiration on research problems and to jointly work on solutions to practical challenges. You will also interact with customers to help them in their journey of training, using, and improving their AI applications using open models. Your research skills will be vital in staying up to date with the latest advancements in machine learning, ensuring that we stay at the cutting edge of open model innovations.

Requirements
- Strong background in machine learning
- Experience building state-of-the-art models at large scale
- Experience developing algorithms in areas such as optimization, model architecture, and data-centric optimizations
- Passion for contributing to the open model ecosystem and pushing the frontier of open models
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field

Responsibilities
- Develop novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve over the state of the art
- Take advantage of the computational infrastructure of Together to create the best open models in their class
- Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
- Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
- Stay up to date with the latest advancements in machine learning

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Security Engineer Intern (Summer 2026)

Together AI
USD 160,000-230,000 per year
Full-time · Remote
About the Role
As an AI Researcher, you will push the frontier of foundation model research and make it a reality in products. You will work on developing novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that go beyond the state of the art. As a team, we have been pushing on all these fronts (e.g., Hyena, FlashAttention, FlexGen, and RedPajama). You will also work closely with the machine learning systems, NLP/CV, and engineering teams for inspiration on research problems and to jointly work on solutions to practical challenges. You will also interact with customers to help them in their journey of training, using, and improving their AI applications using open models. Your research skills will be vital in staying up to date with the latest advancements in machine learning, ensuring that we stay at the cutting edge of open model innovations.

Requirements
- Strong background in machine learning
- Experience building state-of-the-art models at large scale
- Experience developing algorithms in areas such as optimization, model architecture, and data-centric optimizations
- Passion for contributing to the open model ecosystem and pushing the frontier of open models
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field

Responsibilities
- Develop novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve over the state of the art
- Take advantage of the computational infrastructure of Together to create the best open models in their class
- Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
- Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
- Stay up to date with the latest advancements in machine learning

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Frontier Agents Intern (Summer 2026)

Together AI
USD 160,000-230,000 per year
Intern · Remote
About the Role
As an AI Researcher, you will push the frontier of foundation model research and make it a reality in products. You will work on developing novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that go beyond the state of the art. As a team, we have been pushing on all these fronts (e.g., Hyena, FlashAttention, FlexGen, and RedPajama). You will also work closely with the machine learning systems, NLP/CV, and engineering teams for inspiration on research problems and to jointly work on solutions to practical challenges. You will also interact with customers to help them in their journey of training, using, and improving their AI applications using open models. Your research skills will be vital in staying up to date with the latest advancements in machine learning, ensuring that we stay at the cutting edge of open model innovations.

Requirements
- Strong background in machine learning
- Experience building state-of-the-art models at large scale
- Experience developing algorithms in areas such as optimization, model architecture, and data-centric optimizations
- Passion for contributing to the open model ecosystem and pushing the frontier of open models
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field

Responsibilities
- Develop novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve over the state of the art
- Take advantage of the computational infrastructure of Together to create the best open models in their class
- Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
- Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
- Stay up to date with the latest advancements in machine learning

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Product Marketing Intern (Summer 2026)

Together AI
USD 160,000-230,000 per year
United States · Full-time · Remote
About the Role
As an AI Researcher, you will push the frontier of foundation model research and make it a reality in products. You will work on developing novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that go beyond the state of the art. As a team, we have been pushing on all these fronts (e.g., Hyena, FlashAttention, FlexGen, and RedPajama). You will also work closely with the machine learning systems, NLP/CV, and engineering teams for inspiration on research problems and to jointly work on solutions to practical challenges. You will also interact with customers to help them in their journey of training, using, and improving their AI applications using open models. Your research skills will be vital in staying up to date with the latest advancements in machine learning, ensuring that we stay at the cutting edge of open model innovations.

Requirements
- Strong background in machine learning
- Experience building state-of-the-art models at large scale
- Experience developing algorithms in areas such as optimization, model architecture, and data-centric optimizations
- Passion for contributing to the open model ecosystem and pushing the frontier of open models
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field

Responsibilities
- Develop novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve over the state of the art
- Take advantage of the computational infrastructure of Together to create the best open models in their class
- Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
- Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
- Stay up to date with the latest advancements in machine learning

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Research Intern, Model Shaping (Summer 2026)

Together AI
USD 160,000-230,000 per year
United States · Full-time · Remote
About the Role
As an AI Researcher, you will push the frontier of foundation model research and make it a reality in products. You will work on developing novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that go beyond the state of the art. As a team, we have been pushing on all these fronts (e.g., Hyena, FlashAttention, FlexGen, and RedPajama). You will also work closely with the machine learning systems, NLP/CV, and engineering teams for inspiration on research problems and to jointly work on solutions to practical challenges. You will also interact with customers to help them in their journey of training, using, and improving their AI applications using open models. Your research skills will be vital in staying up to date with the latest advancements in machine learning, ensuring that we stay at the cutting edge of open model innovations.

Requirements
- Strong background in machine learning
- Experience building state-of-the-art models at large scale
- Experience developing algorithms in areas such as optimization, model architecture, and data-centric optimizations
- Passion for contributing to the open model ecosystem and pushing the frontier of open models
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field

Responsibilities
- Develop novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve over the state of the art
- Take advantage of the computational infrastructure of Together to create the best open models in their class
- Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
- Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
- Stay up to date with the latest advancements in machine learning

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Research Intern RL & Post-Training Systems, Turbo (Summer 2026)

Together AI
USD 160,000-230,000 per year
Full-time · Remote
About the Role
As an AI Researcher, you will push the frontier of foundation model research and make it a reality in products. You will work on developing novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that go beyond the state of the art. As a team, we have been pushing on all these fronts (e.g., Hyena, FlashAttention, FlexGen, and RedPajama). You will also work closely with the machine learning systems, NLP/CV, and engineering teams for inspiration on research problems and to jointly work on solutions to practical challenges. You will also interact with customers to help them in their journey of training, using, and improving their AI applications using open models. Your research skills will be vital in staying up to date with the latest advancements in machine learning, ensuring that we stay at the cutting edge of open model innovations.

Requirements
- Strong background in machine learning
- Experience building state-of-the-art models at large scale
- Experience developing algorithms in areas such as optimization, model architecture, and data-centric optimizations
- Passion for contributing to the open model ecosystem and pushing the frontier of open models
- Excellent problem-solving and analytical skills
- Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field

Responsibilities
- Develop novel architectures, system optimizations, optimization algorithms, and data-centric optimizations that significantly improve over the state of the art
- Take advantage of the computational infrastructure of Together to create the best open models in their class
- Understand and improve the full lifecycle of building open models; release and publish your insights (blogs, academic papers, etc.)
- Collaborate with cross-functional teams to deploy your models and make them available to a wider community and customer base
- Stay up to date with the latest advancements in machine learning

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

MCP & Tools Python Developer - Agent Evaluation Infrastructure

Mindrift
USD up to 80 per hour
United States · Part-time · Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who we're looking for
Calling all security researchers, engineers, and penetration testers with a strong foundation in problem-solving, offensive security, and AI-related risk assessment. If you thrive on digging into complex systems, uncovering hidden vulnerabilities, and thinking creatively under constraints, join us! We're looking for someone who can bring a hands-on approach to technical challenges, whether breaking into systems to expose weaknesses or building secure tools and processes. We value contributors with a passion for continuous learning, experimentation, and adaptability.

About the project
We're on the hunt for hands-on Python engineers for a new project focused on developing Model Context Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You'll implement base methods for agent action verification, integrate with internal and client infrastructures, and help fill tooling gaps across the team.

What you'll be doing:
Developing and maintaining MCP-compatible evaluation servers
Implementing logic to check agent actions against scenario definitions
Creating or extending tools that writers and QAs use to test agents
Working closely with infrastructure engineers to ensure compatibility
Occasionally helping with test writing or debug sessions when needed

Although we're only looking for experts for this current project, contributors with consistent high-quality submissions may receive an invitation for ongoing collaboration across future projects.

How to get started:
Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements
The ideal contributor will have:
4+ years of Python development experience, ideally in backend or tools
Solid experience building APIs, testing frameworks, or protocol-based interfaces
Understanding of Docker, Linux CLI, and HTTP-based communication
Ability to integrate new tools into existing infrastructures
Familiarity with how LLM agents are prompted, executed, and evaluated
Clear documentation and communication skills - you'll work with QA and writers

We also value applicants who have:
Experience with Model Context Protocol (MCP) or similar structured agent-server interfaces
Knowledge of FastAPI or similar async web frameworks
Experience working with LLM logs, scoring functions, or sandbox environments
Ability to support dev environments (devcontainers, CI configs, linters)
JS experience

Benefits
Get paid for your expertise, with rates that can go up to $80/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
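For a concrete sense of the agent-action verification described above, here is a minimal, hypothetical sketch in Python. The scenario fields, tool names, and scoring rule are invented for illustration and are not Mindrift's actual MCP schema or evaluation code.

# Hypothetical sketch only: check an agent's observed tool calls against a
# scenario's expected "gold path". Field names and tools are invented.
from dataclasses import dataclass, field

@dataclass
class Scenario:
    # Ordered tool calls the agent is expected to make.
    gold_path: list[str]
    # Tool calls that must never appear (e.g. destructive actions).
    forbidden: set[str] = field(default_factory=set)

def score_agent_actions(scenario: Scenario, actions: list[str]) -> float:
    """Return a score in [0, 1]: fraction of gold steps hit in order,
    zeroed out if any forbidden action was taken."""
    if any(action in scenario.forbidden for action in actions):
        return 0.0
    matched = 0
    for action in actions:
        if matched < len(scenario.gold_path) and action == scenario.gold_path[matched]:
            matched += 1
    return matched / len(scenario.gold_path) if scenario.gold_path else 1.0

# Example with made-up tool names:
scenario = Scenario(
    gold_path=["search_docs", "open_ticket", "post_summary"],
    forbidden={"delete_ticket"},
)
print(score_agent_actions(scenario, ["search_docs", "list_users", "open_ticket", "post_summary"]))  # 1.0
print(score_agent_actions(scenario, ["delete_ticket"]))  # 0.0

A production evaluation server would presumably expose checks like this over MCP and log per-step results, but the core comparison of observed actions against a scenario definition stays the same.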
Apply

Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
USD 0 - 40 per hour
Singapore
Part-time
Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role
We're looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You'll create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You'll work to ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. You'll need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Although every project is unique, you might typically:
Create structured test cases that simulate complex human workflows.
Define gold-standard behavior and scoring logic to evaluate agent actions.
Analyze agent logs, failure modes, and decision paths.
Work with code repositories and test frameworks to validate your scenarios.
Iterate on prompts, instructions, and test cases to improve clarity and difficulty.
Ensure that scenarios are production-ready, easy to run, and reusable.

How to get started
Simply apply to this post, qualify, and get the chance to contribute to projects aligned with your skills, on your own schedule. From creating training prompts to refining model responses, you'll help shape the future of AI while ensuring technology benefits everyone.

Requirements
Bachelor's and/or Master's degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems, or another related field.
Background in QA, software testing, data analysis, or NLP annotation.
Good understanding of test design principles (e.g., reproducibility, coverage, edge cases).
Strong written communication skills in English.
Comfortable with structured formats like JSON/YAML for scenario description.
Can define expected agent behaviors (gold paths) and scoring logic.
Basic experience with Python and JS.
Curious and open to working with AI-generated content, agent logs, and prompt-based behavior.

Nice to Have
Experience in writing manual or automated test cases.
Familiarity with LLM capabilities and typical failure modes.
Understanding of scoring metrics (precision, recall, coverage, reward functions).

Benefits
Contribute on your own schedule, from anywhere in the world. This opportunity allows you to:
Get paid for your expertise, with rates that can go up to $40/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
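To illustrate the structured JSON/YAML scenario descriptions mentioned above, here is one hypothetical sketch in Python of how a scenario with a gold path and scoring block could be written and sanity-checked. The field names and actions are invented examples, not a Mindrift schema.

# Hypothetical example only: one possible shape for a scenario document and a
# small validator for the fields an evaluator would rely on.
import json

scenario_json = """
{
  "id": "refund-request-001",
  "description": "User asks the agent to process a refund for a duplicate charge.",
  "gold_path": ["lookup_order", "verify_duplicate_charge", "issue_refund", "confirm_with_user"],
  "forbidden_actions": ["close_account"],
  "scoring": {"step_weight": 0.2, "completion_bonus": 0.2}
}
"""

def validate_scenario(raw: str) -> dict:
    """Parse a scenario and check that the evaluation-critical fields exist."""
    scenario = json.loads(raw)
    required = {"id", "description", "gold_path", "scoring"}
    missing = required - scenario.keys()
    if missing:
        raise ValueError(f"scenario {scenario.get('id', '?')} is missing fields: {missing}")
    if not scenario["gold_path"]:
        raise ValueError("gold_path must contain at least one expected action")
    return scenario

print(validate_scenario(scenario_json)["gold_path"])

In practice, a scenario writer would mainly author the JSON/YAML document itself; the gold path and scoring fields are what an evaluator later compares agent behavior against.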
Apply

MCP & Tools Python Developer - Agent Evaluation Infrastructure

Mindrift
USD 0 - 30 per hour
Part-time
Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who we're looking for
Calling all security researchers, engineers, and penetration testers with a strong foundation in problem-solving, offensive security, and AI-related risk assessment. If you thrive on digging into complex systems, uncovering hidden vulnerabilities, and thinking creatively under constraints, join us! We're looking for someone who can bring a hands-on approach to technical challenges, whether breaking into systems to expose weaknesses or building secure tools and processes. We value contributors with a passion for continuous learning, experimentation, and adaptability.

About the project
We're on the hunt for hands-on Python engineers for a new project focused on developing Model Context Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You'll implement base methods for agent action verification, integrate with internal and client infrastructures, and help fill tooling gaps across the team.

What you'll be doing:
Developing and maintaining MCP-compatible evaluation servers
Implementing logic to check agent actions against scenario definitions
Creating or extending tools that writers and QAs use to test agents
Working closely with infrastructure engineers to ensure compatibility
Occasionally helping with test writing or debug sessions when needed

Although we're only looking for experts for this current project, contributors with consistent high-quality submissions may receive an invitation for ongoing collaboration across future projects.

How to get started:
Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements
The ideal contributor will have:
4+ years of Python development experience, ideally in backend or tools
Solid experience building APIs, testing frameworks, or protocol-based interfaces
Understanding of Docker, Linux CLI, and HTTP-based communication
Ability to integrate new tools into existing infrastructures
Familiarity with how LLM agents are prompted, executed, and evaluated
Clear documentation and communication skills - you'll work with QA and writers

We also value applicants who have:
Experience with Model Context Protocol (MCP) or similar structured agent-server interfaces
Knowledge of FastAPI or similar async web frameworks
Experience working with LLM logs, scoring functions, or sandbox environments
Ability to support dev environments (devcontainers, CI configs, linters)
JS experience

Benefits
Get paid for your expertise, with rates that can go up to $30/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
Apply

MCP & Tools Python Developer - Agent Evaluation Infrastructure

Mindrift
USD 0 - 30 per hour
Spain
Part-time
Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who we're looking for
Calling all security researchers, engineers, and penetration testers with a strong foundation in problem-solving, offensive security, and AI-related risk assessment. If you thrive on digging into complex systems, uncovering hidden vulnerabilities, and thinking creatively under constraints, join us! We're looking for someone who can bring a hands-on approach to technical challenges, whether breaking into systems to expose weaknesses or building secure tools and processes. We value contributors with a passion for continuous learning, experimentation, and adaptability.

About the project
We're on the hunt for hands-on Python engineers for a new project focused on developing Model Context Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You'll implement base methods for agent action verification, integrate with internal and client infrastructures, and help fill tooling gaps across the team.

What you'll be doing:
Developing and maintaining MCP-compatible evaluation servers
Implementing logic to check agent actions against scenario definitions
Creating or extending tools that writers and QAs use to test agents
Working closely with infrastructure engineers to ensure compatibility
Occasionally helping with test writing or debug sessions when needed

Although we're only looking for experts for this current project, contributors with consistent high-quality submissions may receive an invitation for ongoing collaboration across future projects.

How to get started:
Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements
The ideal contributor will have:
4+ years of Python development experience, ideally in backend or tools
Solid experience building APIs, testing frameworks, or protocol-based interfaces
Understanding of Docker, Linux CLI, and HTTP-based communication
Ability to integrate new tools into existing infrastructures
Familiarity with how LLM agents are prompted, executed, and evaluated
Clear documentation and communication skills - you'll work with QA and writers

We also value applicants who have:
Experience with Model Context Protocol (MCP) or similar structured agent-server interfaces
Knowledge of FastAPI or similar async web frameworks
Experience working with LLM logs, scoring functions, or sandbox environments
Ability to support dev environments (devcontainers, CI configs, linters)
JS experience

Benefits
Get paid for your expertise, with rates that can go up to $30/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
Apply

Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
USD 0 - 40 per hour
Saudi Arabia
Part-time
Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role
We're looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You'll create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You'll work to ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. You'll need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Although every project is unique, you might typically:
Create structured test cases that simulate complex human workflows.
Define gold-standard behavior and scoring logic to evaluate agent actions.
Analyze agent logs, failure modes, and decision paths.
Work with code repositories and test frameworks to validate your scenarios.
Iterate on prompts, instructions, and test cases to improve clarity and difficulty.
Ensure that scenarios are production-ready, easy to run, and reusable.

How to get started
Simply apply to this post, qualify, and get the chance to contribute to projects aligned with your skills, on your own schedule. From creating training prompts to refining model responses, you'll help shape the future of AI while ensuring technology benefits everyone.

Requirements
Bachelor's and/or Master's degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems, or another related field.
Background in QA, software testing, data analysis, or NLP annotation.
Good understanding of test design principles (e.g., reproducibility, coverage, edge cases).
Strong written communication skills in English.
Comfortable with structured formats like JSON/YAML for scenario description.
Can define expected agent behaviors (gold paths) and scoring logic.
Basic experience with Python and JS.
Curious and open to working with AI-generated content, agent logs, and prompt-based behavior.

Nice to Have
Experience in writing manual or automated test cases.
Familiarity with LLM capabilities and typical failure modes.
Understanding of scoring metrics (precision, recall, coverage, reward functions).

Benefits
Contribute on your own schedule, from anywhere in the world. This opportunity allows you to:
Get paid for your expertise, with rates that can go up to $40/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
Apply

Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
USD 0 - 30 per hour
Poland
Part-time
Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role
We're looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You'll create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You'll work to ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. You'll need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Although every project is unique, you might typically:
Create structured test cases that simulate complex human workflows.
Define gold-standard behavior and scoring logic to evaluate agent actions.
Analyze agent logs, failure modes, and decision paths.
Work with code repositories and test frameworks to validate your scenarios.
Iterate on prompts, instructions, and test cases to improve clarity and difficulty.
Ensure that scenarios are production-ready, easy to run, and reusable.

How to get started
Simply apply to this post, qualify, and get the chance to contribute to projects aligned with your skills, on your own schedule. From creating training prompts to refining model responses, you'll help shape the future of AI while ensuring technology benefits everyone.

Requirements
Bachelor's and/or Master's degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems, or another related field.
Background in QA, software testing, data analysis, or NLP annotation.
Good understanding of test design principles (e.g., reproducibility, coverage, edge cases).
Strong written communication skills in English.
Comfortable with structured formats like JSON/YAML for scenario description.
Can define expected agent behaviors (gold paths) and scoring logic.
Basic experience with Python and JS.
Curious and open to working with AI-generated content, agent logs, and prompt-based behavior.

Nice to Have
Experience in writing manual or automated test cases.
Familiarity with LLM capabilities and typical failure modes.
Understanding of scoring metrics (precision, recall, coverage, reward functions).

Benefits
Contribute on your own schedule, from anywhere in the world. This opportunity allows you to:
Get paid for your expertise, with rates that can go up to $30/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
Apply

Evaluation Scenario Writer - AI Agent Testing Specialist

Mindrift
USD 0 - 30 per hour
Part-time
Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform connects specialists with AI projects from major tech innovators. Our mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe.

About the Role
We're looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You'll create test cases that simulate human-performed tasks and define gold-standard behavior to compare agent actions against. You'll work to ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. You'll need a sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions.

Although every project is unique, you might typically:
Create structured test cases that simulate complex human workflows.
Define gold-standard behavior and scoring logic to evaluate agent actions.
Analyze agent logs, failure modes, and decision paths.
Work with code repositories and test frameworks to validate your scenarios.
Iterate on prompts, instructions, and test cases to improve clarity and difficulty.
Ensure that scenarios are production-ready, easy to run, and reusable.

How to get started
Simply apply to this post, qualify, and get the chance to contribute to projects aligned with your skills, on your own schedule. From creating training prompts to refining model responses, you'll help shape the future of AI while ensuring technology benefits everyone.

Requirements
Bachelor's and/or Master's degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems, or another related field.
Background in QA, software testing, data analysis, or NLP annotation.
Good understanding of test design principles (e.g., reproducibility, coverage, edge cases).
Strong written communication skills in English.
Comfortable with structured formats like JSON/YAML for scenario description.
Can define expected agent behaviors (gold paths) and scoring logic.
Basic experience with Python and JS.
Curious and open to working with AI-generated content, agent logs, and prompt-based behavior.

Nice to Have
Experience in writing manual or automated test cases.
Familiarity with LLM capabilities and typical failure modes.
Understanding of scoring metrics (precision, recall, coverage, reward functions).

Benefits
Contribute on your own schedule, from anywhere in the world. This opportunity allows you to:
Get paid for your expertise, with rates that can go up to $30/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
Apply

MCP & Tools Python Developer - Agent Evaluation Infrastructure

Mindrift
USD 0 - 30 per hour
Italy
Part-time
Remote
This opportunity is only for candidates currently residing in the specified country. Your location may affect eligibility and rates. Please submit your resume in English and indicate your level of English proficiency.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

What we do
The Mindrift platform, launched and powered by Toloka, connects domain experts with cutting-edge AI projects from innovative tech clients. Our mission is to unlock the potential of GenAI by tapping into real-world expertise from across the globe.

Who we're looking for
Calling all security researchers, engineers, and penetration testers with a strong foundation in problem-solving, offensive security, and AI-related risk assessment. If you thrive on digging into complex systems, uncovering hidden vulnerabilities, and thinking creatively under constraints, join us! We're looking for someone who can bring a hands-on approach to technical challenges, whether breaking into systems to expose weaknesses or building secure tools and processes. We value contributors with a passion for continuous learning, experimentation, and adaptability.

About the project
We're on the hunt for hands-on Python engineers for a new project focused on developing Model Context Protocol (MCP) servers and internal tools for running and evaluating agent behavior. You'll implement base methods for agent action verification, integrate with internal and client infrastructures, and help fill tooling gaps across the team.

What you'll be doing:
Developing and maintaining MCP-compatible evaluation servers
Implementing logic to check agent actions against scenario definitions
Creating or extending tools that writers and QAs use to test agents
Working closely with infrastructure engineers to ensure compatibility
Occasionally helping with test writing or debug sessions when needed

Although we're only looking for experts for this current project, contributors with consistent high-quality submissions may receive an invitation for ongoing collaboration across future projects.

How to get started:
Apply to this post, qualify, and get the chance to contribute to a project aligned with your skills, on your own schedule. Shape the future of AI while building tools that benefit everyone.

Requirements
The ideal contributor will have:
4+ years of Python development experience, ideally in backend or tools
Solid experience building APIs, testing frameworks, or protocol-based interfaces
Understanding of Docker, Linux CLI, and HTTP-based communication
Ability to integrate new tools into existing infrastructures
Familiarity with how LLM agents are prompted, executed, and evaluated
Clear documentation and communication skills - you'll work with QA and writers

We also value applicants who have:
Experience with Model Context Protocol (MCP) or similar structured agent-server interfaces
Knowledge of FastAPI or similar async web frameworks
Experience working with LLM logs, scoring functions, or sandbox environments
Ability to support dev environments (devcontainers, CI configs, linters)
JS experience

Benefits
Get paid for your expertise, with rates that can go up to $30/hour depending on your skills, experience, and project needs.
Take part in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
Influence how future AI models understand and communicate in your field of expertise.
Apply