
🎤 Full Interview: Nir Gazit, Co-founder & CEO @ Traceloop
"We won't rest until AI developers can have the same confidence and discipline that has existed in software engineering for years."
Founder Story & Vision
Who they are & what theyâre building
Nir Gazit is the co-founder and CEO of Traceloop, a YC-backed startup building the observability and evaluation stack for LLM applications. Before Traceloop, he led ML engineering at Google and served as Chief Architect at Fiverr. He teamed up with Gal Kleinman (also ex-Fiverr) after realizing how difficult it was to debug and improve LLM agents in production. Their early GPT-3 experiments worked… until they didn't, and they had no idea why. That led to Traceloop.
Why now & what's the big bet
2025 is the year of GenAI production. Enterprises like Miro, Cisco, and IBM are finally putting LLM agents to work in support, code generation, and internal data tasks. But there's a critical gap: you can't improve what you can't measure. Nir's big bet is that AI apps need the same observability tooling we use for traditional software (tracing, metrics, alerts, evals) or else they risk hallucinations, UX failures, and erosion of user trust. With Traceloop, he wants to close the feedback loop between real-world usage and model improvement.
🧩 Real-World Use Cases
How it works in the wild
Miro uses Traceloop to monitor GPT-4.1 performance inside its collaborative whiteboard platform. They needed a safe rollout of an LLM-powered assistant that helps users plan, draw, and generate frameworks without "going rogue." Traceloop's platform gave them real-time insight into hallucination rates, latency, and tool-usage breakdowns, enabling confident deployment at scale.

IBM and Cisco use Traceloop to trace complex agent workflows in which LLMs use tools, call APIs, and make decisions. Traceloop helps flag tool-call failures, drift in behavior, and hidden costs before users notice.
Startups tap into OpenLLMetry, Traceloop's open-source SDK, to collect token-level traces, eval scores, and prompt versions. It's everything they need, with zero platform lock-in.
What you'll learn:
How Nir navigated the shift from LLM playgrounds to production systems
Framework for closing the loop between LLM usage, tracing, and automated evals
Real tactics behind open-source-led growth (500K+ downloads/month)
Lessons on founding during platform shifts (and pitching a technical tool to non-technical buyers)
How Traceloop is solving the "black box" problem of LLM agents in production
Some Takeaways:
Treat prompts like code: version, trace, test, and improve.
Observability isn't optional: it's your first defense against hallucinations and drift.
Open-source + product-led growth can build traction fast if devs find real value.
Traceloop's secret weapon: agent-first tracing, not just logging.
The next frontier: auto-improving agents using real-world data as feedback.
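The "treat prompts like code" takeaway can be made concrete with a small sketch. This is a hypothetical, minimal prompt registry (the names and structure are illustrative only, not Traceloop's or any vendor's API): each prompt gets an explicit version, lives in source control, and is asserted on like any other tested code.

```python
# Minimal "prompts as code" sketch: version, render, and test prompt
# templates like ordinary source files. All names here are illustrative,
# not part of Traceloop's or any vendor's API.
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    name: str
    version: int
    template: str

    def render(self, **kwargs) -> str:
        # Fill template slots; raises KeyError if a slot is missing,
        # which surfaces prompt/code mismatches early.
        return self.template.format(**kwargs)


# A tiny in-memory registry; in practice these definitions would live
# in git alongside the application code, so every change is reviewable.
REGISTRY = {
    ("summarize", 1): PromptVersion(
        "summarize", 1, "Summarize the following text:\n{text}"
    ),
    ("summarize", 2): PromptVersion(
        "summarize", 2,
        "Summarize the following text in one sentence:\n{text}",
    ),
}


def get_prompt(name: str, version: int) -> PromptVersion:
    return REGISTRY[(name, version)]


# "Test" step: assert the rendered prompt has the shape we expect,
# so a prompt edit shows up in review and CI exactly like a code change.
prompt = get_prompt("summarize", 2)
rendered = prompt.render(text="LLMs fail silently.")
assert "one sentence" in rendered
assert rendered.endswith("LLMs fail silently.")
```

Pinning an explicit version (rather than mutating a single prompt string in place) is what makes tracing useful downstream: a trace that records `("summarize", 2)` can be compared against traces from version 1 when behavior drifts.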
In this episode, we cover:
00:00 – Why Traceloop exists: from GPT chaos to clarity
03:00 – From Google & Fiverr to co-founding a YC startup
06:00 – LLMs fail silently: real-world horror stories
09:00 – OpenLLMetry, 2M+ downloads & go-to-market via open source
12:30 – Enterprise GenAI adoption: why 2025 is the turning point
16:00 – Prompt engineering is fake, AI coding is real (and weird)
20:00 – Junior vs. senior engineers in the age of AI
24:00 – AGI, AI limitations & what's realistically possible
28:00 – Traceloop's vision: auto-improving agents & full feedback loop
33:00 – Startup lessons, YC advice, hiring culture & future of work in AI
For inquiries about sponsoring the podcast, email david@thehomabase.ai
Referenced in the Episode:
OpenLLMetry – Traceloop's open-source SDK for LLM tracing
Traceloop Platform – Full observability stack for LLMs and agents
↳ traceloop.com
Cursor
↳ cursor.sh
Claude by Anthropic
↳ claude.ai
OpenTelemetry – The open standard Traceloop is built on
↳ opentelemetry.io
Y Combinator (W23) – Startup accelerator backing Traceloop
↳ yc.com
"Software 2.0" by Andrej Karpathy
↳ Link to essay
Find Case Studies of all other episodes here.