About BrightHire
BrightHire is a category-creating, Series B software company with a mission to give everyone the hiring experience they deserve.
We deliver on this mission by transforming the way many of the world’s leading companies build exceptional teams. We created the Interview Intelligence category, and our clients include some of the world’s most innovative companies, from Canva and Zapier to Rippling, as well as members of the Fortune 100.
About The Role
You will partner closely with our Engineering, Product, and Design teams to turn early-stage AI features into high-quality, performant production features that delight users at scale. Your focus will be on quality and safety testing: devising rigorous evaluation frameworks, refining prompts and pipelines, and optimizing model choices for cost, latency, accuracy, tone, and safety. You will help build the shared AI platform that powers products such as:
AI Interviewer conversation loops that adapt in real time
AI Fraud Signals that flag suspicious behavior with minimal false positives
AI Candidate skills matrices and assistants that surface instant insights
What You'll Do
Design and own comprehensive evaluations that measure accuracy, completeness, style, hallucination rate, bias, and safety across every release.
Tune and iterate on RAG pipelines, prompt chains, conversation loops, provider selections, and fine-tunes until quality bars are met or exceeded.
Build reusable data and evaluation pipelines, a shared semantic layer, and monitoring dashboards that make it easy for product teams to ship reliable AI quickly.
Optimize for cost and latency, continuously benchmarking models and negotiating trade-offs between performance and spend.
Implement robust data governance and lineage practices that satisfy enterprise compliance requirements and support our AI bias audit process.
Document best practices and share knowledge to raise the bar for AI development across BrightHire.
What You'll Bring
5+ years in Data Science or ML engineering with a strong focus on ML or NLP systems.
1+ year focused on Gen-AI or LLM systems.
Strong Python and SQL skills.
Experience creating automated evaluation suites for LLM outputs (accuracy, safety, bias, tone, style) and using results to guide iterative improvements.
Knowledge of prompt engineering, RAG techniques, vector search, embeddings, fine-tuning, and model selection across multiple providers.
Ability to communicate complex AI trade-offs clearly to engineers, designers, and executives alike.
Bias toward action, curiosity, and a passion for building high-quality user experiences.
About Our Team
High-impact projects in small, autonomous squads where you can lead platform initiatives or dive deep as a specialist
Thoughtful developer experience with fast CI, 1-click deploys, strong observability, and clean codebases
Sustainable remote culture: regular working hours, no-meeting Wednesdays, and flexible time off
Collaborative, kind teammates who value learning and growth
More About Us
Remote flexibility: Our team is fully remote, spanning North and South American time zones. We overlap our hours for a core chunk of the day but give everyone flexibility in how and when they get work done.
Impactful work: Play a critical role in delivering on our mission to give everyone the hiring experience they deserve.
Learning opportunities: Engage with a wide range of technologies and challenges, offering continuous learning and professional growth.
Collaborative environment: We’re always working together to brainstorm ideas about product, strategy, etc.
Use your own product: We use our product daily in our own hiring, which is rewarding and gives us product empathy!
Customer connection: We try to make sure everyone stays connected to users and clients by joining sales and client meetings, talking to end users, etc.
Autonomy: Everyone is self-motivated and autonomous, and seeks ways we can continuously improve as a company.
Fun: We’re generous, self-deprecating, look for reasons to laugh, and enjoy sharing our ideas for band names, posting photos from our walks, and reminiscing about previous travels.
Benefits
This is a full-time contractor role for long-term employment
15 days PTO
12 national holidays
Healthcare stipend
Work-from-home, learning, and vacation stipends
Company provided computer
The Selection Process
Silver.dev Recruiter Screen
Hiring Manager Interview
Deep Dive Work Experience Screen
System Design Screen
Executive Interview with CTO




