Location
New York, United States
Salary
(Yearly)
$225,000 – $325,000
Date posted
January 22, 2026
Job type
Full-time
Experience level
Mid Level 3+

Job Description

We’re looking for a Platform Engineer, Applied Evaluations to define and operationalize quality for the agentic systems that power Antimetal’s investigation and automation engine.

This role is core to our product. You’ll own online and offline evaluation pipelines that operate over petabytes of infrastructure data, and shape agent platform abstractions where necessary to ensure our agents are measurable, debuggable, and reliable. You’ll partner closely with platform, product, and research, leveraging quality signals to accelerate iteration across the company.

About Antimetal

Antimetal is building the future of infrastructure management. We're starting by creating a platform that investigates, resolves, and prevents issues—giving engineers their time back to focus on what they do best: building great products.

What you'll do:

  • Own the evaluation stack: Build online and offline eval pipelines that measure agent quality across ephemeral, voluminous MELT data, code, and unstructured docs. Set the metrics that define the experience.

  • Define quality at scale: Production incidents span hundreds of services, and the underlying data is ephemeral and high-volume, with ground truth that is only approximate. Design evals that capture trajectory quality, not just final outputs, and validate that your metrics predict real outcomes.

  • Build platform abstractions for agents: Design core agent architectures and extend internal frameworks (e.g. sub-agents, MCPs, middleware) that let product, platform, and research iterate with confidence and ship faster.

  • Productionize: Own latency, observability, and uptime.

What you bring:

  • At least 3 years of experience in ML platform engineering, data engineering, or a related role, preferably at a high-growth company.

  • Prior experience designing evaluation systems where ground truth is noisy, high-volume, and hard to label (e.g. computer vision, deep research pipelines)

  • Strong system design skills: you think about how data flows through distributed systems and how decisions compound at scale.

  • Proven ability to write clean, scalable code and strong data modeling skills.

  • Demonstrated ability to bring ambiguous goals from prototype to production, using data and experimentation to drive product and architectural decisions.

  • Proficient in Python and TypeScript, with experience using common ML libraries and data engineering tools.

Bonus:

  • Experience with SRE best practices and modern observability (OTEL, distributed tracing)

  • Strong on ML fundamentals: classification/regression, clustering, dimensionality reduction, evaluation + error analysis, probabilistic ML

  • Experience with agent architectures: multi-step reasoning, tool use, context management

Who you are:

  • Identify as a builder

  • Are excited to work in-person from our new and spacious office in New York

  • Love working in a startup environment (experience in a startup or obsession with going zero-to-one)

  • Enjoy working with people who are ambitious, caring, and think in systems

  • Thrive in a fast-paced iterative environment where experimentation is essential

What we bring:

  • Pay & ownership — Competitive salary with generous equity grants.

  • Full coverage + retirement — Fully covered health, dental, and vision, plus retirement benefits.

  • Unlimited PTO — Take the time you need to recharge.

  • Dinner on late nights — Working late? Dinner is on us.

  • Fitness stipend — Monthly support for your health and wellness.

  • Tools of the trade — Any equipment you need to do your best work.

  • Commute perks — Citi Bike + train benefits.

Interview process

  1. Application Review: Send us your stuff, and a quick note on why you're excited.

  2. Intro Chat: Share what you're looking for next and learn more about what we're building.

  3. Founder Interview: Talk with one of our founders in more detail about the role.

  4. Technical Interview: We’ll have you complete a short exercise specific to the role.

  5. Onsite: Come onsite and meet the team through a series of 1:1 interviews.

  6. Decision: We’ll move fast.

Antimetal is hiring an Evaluations - Platform Engineer. Apply through The Homebase and make the next move in your career!
Company size
11-50 employees
Headquarters
New York City, NY, United States
Country
United States
Industry
Computer Software