ML Engineer, TTS Systems
Location: San Francisco, CA or Remote (US)
About Bland
At Bland.com, we empower enterprises to build and scale AI phone agents. We are a fast-growing team in San Francisco, and our mission is to advance how customers interact with businesses through natural, reliable, and highly human-like voice technologies. We are backed by $65M in funding from leading Silicon Valley investors, including Emergence Capital, Scale Venture Partners, Y Combinator, and the founders of Twilio, Affirm, and ElevenLabs.
The Role: ML Engineer, TTS Systems
As an ML Engineer focused on text-to-speech (TTS), you will own the deployment, optimization, and maintenance of our production TTS systems. Your work will turn advanced research models into performant, scalable, and robust real-world systems serving millions of real-time voice interactions daily. You will collaborate with research and engineering teams to implement inference-optimized TTS models, streamline deployment processes, and monitor live systems to ensure best-in-class performance for enterprise clients.
What You Will Do
Deploy and optimize large-scale TTS models into production environments for reliable, low-latency inference.
Implement and refine post-training techniques (such as DPO, GRPO, and RLHF) and modern inference techniques to maximize throughput and audio quality.
Collaborate with cross-functional teams to ensure seamless rollout, A/B testing, and iterative improvement of production models.
Maintain high availability and scalable infrastructure for multi-speaker, expressive, and controllable TTS use cases.
Design and document best practices for efficient TTS inference and system reliability.
What Makes You a Great Fit
Hands-on experience deploying large-scale neural TTS models in cloud or on-prem production settings.
Deep expertise in TTS inference optimization (e.g., quantization, kernel optimization, batching strategies) and familiarity with post-training methods such as GRPO.
Strong understanding of real-time, low-latency audio processing pipelines and their challenges.
Working knowledge of distributed systems, GPU acceleration, and scalable production infrastructure.
Ability to diagnose and resolve quality, performance, and reliability issues in deployed voice systems.
Comfortable working in fast-paced startup environments and taking full ownership from deployment through system maintenance.
Bonus Points
Contributions to open-source TTS systems or production audio frameworks.
Prior work in telephony, streaming, or live enterprise communication environments.
Benefits and Compensation
Healthcare, dental, vision
Meaningful equity in a fast-growing company
Every tool you need to succeed
Beautiful office in Jackson Square, SF with rooftop views
Competitive salary: $160,000 to $250,000
If you’re passionate about scaling production TTS systems, driving inference excellence, and delivering seamless, human-like voice to millions of callers, we want to hear from you.




