Location
San Francisco, United States
Salary
Undisclosed
Date posted
January 14, 2026
Job type
Full-time
Experience level
Senior

Job Description

Our Mission

Reflection’s mission is to build open superintelligence and make it accessible to all.

We’re developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic and beyond.

About the Role

  • Drive the entire alignment stack, spanning instruction tuning, RLHF, and RLAIF, to push the model toward high factual accuracy and robust instruction following.

  • Lead research efforts to design next-generation reward models and optimization objectives that significantly improve human preference (HP) performance.

  • Curate high-quality training data and design synthetic data pipelines that close gaps in complex reasoning and model behavior.

  • Optimize large-scale RL pipelines for stability and efficiency, ensuring rapid iteration cycles for model improvements.

  • Collaborate closely with pre-training and evaluation teams to create tight feedback loops that translate alignment research into generalizable model gains.

About You

  • Graduate degree (MS or PhD) in Computer Science, Machine Learning, or related discipline.

  • Deep technical command of alignment methodologies (PPO, DPO, rejection sampling) and experience scaling them to large models.

  • Strong engineering skills, comfortable diving into complex ML codebases and distributed systems.

  • Experience improving model behavior through data, reward modeling, or RL techniques.

  • Evidence of owning ambitious research or engineering agendas that led to measurable model improvements.

  • Thrive in a fast-paced, high-agency startup environment with a bias toward action.

  • Passionate about advancing the frontier of intelligence.

What We Offer

We believe that to build superintelligence that is truly open, you need to start at the foundation. Joining Reflection means building from the ground up as part of a small, talent-dense team. You will help define our future as a company and the frontier of open foundation models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.

  • Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.

  • Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.

  • Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.

  • Opportunities to connect with teammates: Lunch and dinner are provided daily. We have regular off-sites and team celebrations.

Reflection is hiring a Member of Technical Staff - Alignment Lead. Apply through The Homebase and make the next move in your career!
Company size
1-10 employees
Founded in
2024
Headquarters
Paris, France
Country
France
Industry
Computer Software
