Location
New York, United States
Salary
(Yearly)
$215,000 – $235,000
Date posted
January 13, 2026
Job type
Full-time
Experience level
Senior (5+ years)

Job Description

At ASAPP, our mission is simple: deliver the best AI-powered customer experience—faster than anyone else. We are guided by principles that shape how we think, build, and execute, including deep customer obsession, purposeful speed, ownership, and a relentless focus on outcomes. We work in small, highly skilled teams, prioritize clarity over complexity, and continuously evolve through curiosity, data, and craftsmanship.

We’re building a globally diverse team of technologists and problem solvers who thrive in fast-paced environments, value collaboration, and approach every challenge with a Day 1 mindset, with hubs in New York City, Mountain View, Latin America, and India. If you’re driven by continuous learning, rapid iteration, and the challenge of building in a high-growth startup, this is more than a role: it’s a journey.

We are seeking a Speech Software Engineer to spearhead the architectural evolution of our voice infrastructure. This isn't just a maintenance role; you will be a primary architect in rebuilding our core speech stack from the ground up to support the next generation of real-time customer interactions. You will have the autonomy to make high-level technical decisions and the support of a team that thrives on deep thinking and startup-paced execution. 

You will join the GenerativeAgent team, bridging the gap between cutting-edge ASR (Automatic Speech Recognition) research and high-performance production systems. If you are passionate about low-latency streaming, distributed systems, and the intricacies of audio processing, this is your opportunity to make a massive impact for millions of users.

What you'll do

  • Architect & Modernize: Lead the design and implementation of a scalable, high-availability voice infrastructure that replaces legacy systems.
  • Optimize Performance: Build and refine multi-threaded server frameworks capable of handling thousands of concurrent, real-time audio streams with minimal jitter and latency.
  • Build for Scale: Deploy robust ASR > LLM > TTS pipelines that process thousands of calls concurrently.
  • Stream Engineering: Develop robust logic for handling media streams, ensuring seamless audio data flow between clients and our ML models.
  • System Observability: Build advanced monitoring and load-testing tools specifically designed to simulate high-concurrency voice traffic.
  • Collaborate: Partner with Speech Scientists and Research Engineers to integrate state-of-the-art models into a production-ready environment.

What you'll need

  • Experience: 5+ years of software engineering experience, with a proven track record of building and maintaining production-grade infrastructure.
  • Industry Knowledge: A background in building ASR/TTS products at scale that interact with foundational LLMs.
  • Language Mastery: Expert-level proficiency in Golang or Python, or a willingness to learn them.
  • Voice Fundamentals: Deep understanding of audio processing, including sample rates, codecs (Opus, G.711), network protocols, and buffering strategies.
  • System Design: Strong background in object-oriented design and the ability to architect systems that are both modular and performant.
  • Growth Mindset: The ability to navigate and refactor large existing codebases while transitioning to new, more efficient architectures.

What we'd like to see

  • Cloud Native: Hands-on experience with Kubernetes, Docker, and cloud providers (AWS/GCP/Azure) for deploying distributed speech services.
  • Event-Driven Architecture: Familiarity with event loops (Boost.Asio, uvloop) and asynchronous programming patterns.
  • Big Data: Experience with Hadoop, Spark, or Hive for analyzing massive datasets of speech logs to improve model accuracy.

$215,000 - $235,000 a year
The compensation includes salary plus performance bonus. The actual salary may be different depending upon non-discriminatory factors such as qualifications, experience, and other factors permitted by law.
ASAPP is committed to creating a diverse environment and is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, disability, age, or veteran status. If you have a disability and need assistance with our employment application process, please email us at careers@asapp.com to obtain assistance. #LI-AG1 #LI-Hybrid
ASAPP is hiring a Speech Software Engineer. Apply through The Homebase and make the next move in your career!
Company size
201-500 employees
Founded in
2014
Headquarters
New York City, NY, United States
Country
United States
Industry
Computer Software
