About Sana
Sana is an AI lab building superintelligence for work.
We believe organizations can accomplish their missions faster when humans can effortlessly access knowledge, automate repetitive work, and learn anything with the help of agentic AI.
As part of Workday, we are committed to building AI that augments humans.
If that’s a mission that excites you, you’re in the right place.
About the role
You’ll be the quality champion for Sana’s AI agent platform, ensuring our LLM-powered products are robust, reliable, and a delight to use. You’ll design and implement test strategies that keep pace with rapid iteration, automate critical workflows, and drive a culture of quality across engineering. This is a hands-on role for someone who thrives on building scalable ways to break things, uncovering the edge cases unique to agentic and LLM-based systems, and putting safeguards in place before issues reach production. You’ll help us deliver agent workflows that are safe, trustworthy, and enterprise-ready, for the AI landscape of today and tomorrow.
In this role, you will
Design and implement test plans for agent infrastructure, LLM-based APIs, and end-to-end user journeys
Build and maintain automated test suites for backend, frontend, and integration layers, including prompt and response validation for generative models
Develop tools and frameworks to accelerate testing and catch regressions early, especially in agent reasoning, tool use, and context handling
Collaborate closely with engineers to embed quality into every stage of development, with a focus on the unique challenges of AI/LLM systems (e.g., non-determinism, hallucinations, safety)
Lead root cause analysis and drive resolution for critical issues and incidents, including those arising from model updates or agent behaviors
Advocate for best practices in code quality, observability, and CI/CD pipelines, ensuring quality signals are actionable and visible
What success looks like
Critical bugs, regressions, and model failures are caught before they reach users, even as we scale and ship rapidly
Automated test coverage is high, reliable, and easy to maintain, including for LLM outputs and agent workflows
Release cycles are fast and safe. Confidence in shipping is high, even with evolving models and agent capabilities
Quality metrics (including model quality, agent reliability, and user experience) and dashboards provide clear, actionable signals to the team
You are a go-to partner for engineers, raising the bar for quality and reliability in AI-driven systems
Our tech stack
We build on a simple, modern stack optimized for both humans and AI:
Backend: TypeScript, Node.js
Frontend: TypeScript, React, Tailwind
Databases: Postgres, Redis
Cloud infra: GCP, Kubernetes, Terraform
What we offer
Help shape AI's future alongside brilliant minds from Notion, Dropbox, Slack, Databricks, Google, McKinsey, and BCG
Competitive salary complemented by a transparent, highly competitive options program
Swift professional growth in an evolving environment, supported by a culture of continuous feedback and mentorship from senior leaders
Work with talented teammates across 5+ countries and collaborate with customers globally
Regular team gatherings and events (recently in Italy and South Africa)