Machine Learning Engineer - Speech Model Training
DeepRec.ai - San Francisco, CA
Job Description
Machine Learning Engineer - Speech Model Training | $250,000 - $300,000 | San Francisco, CA | Hybrid, 3x per week in office | Full time / Permanent

In this role you won't be wrapping APIs or fine-tuning existing models. You'll be building models across raw acoustic signal processing all the way through to production inference on edge devices. At a company that actually ships to 1.5M+ live users.

A profitable, fast-growing AI company ($250M ARR in under three years, no VC dependency) is standing up a SpeechLLM lab from scratch. This is a founding seat on that team.

They build a hardware-software AI companion used daily by over 1.5 million professionals worldwide. The next chapter is a world-class speech intelligence core and they need the engineers to architect it.

What you'd own
- Design and train large-scale speech models end-to-end: unified SpeechLLMs, ASR, expressive TTS, generative audio
- Own the full stack from acoustic feature engineering to GPU cluster optimisation
- Run and optimise distributed training at scale via PyTorch or JAX, FSDP, DeepSpeed, etc.
- Drive real-time inference performance with vLLM, TensorRT-LLM, or SGLang
- Apply RL alignment techniques to improve conversational quality
- Debug the hard problems in distributed infrastructure and ship solutions

You likely have
- Proven experience training large-scale audio or speech models from the ground up
- Deep PyTorch or JAX expertise with real distributed training experience
- Genuine comfort traversing the entire ML stack from signal processing to production
- A bias toward shipping: you take ownership, you iterate fast

Strong bonus
- Neural audio codecs, diffusion/flow-matching architectures, or LLM pretraining experience

Why join
- Profitable company at ~$250M run rate - you'll see the impact of your work immediately in a product used daily by professionals worldwide
- Direct ownership of the live speech quality stack, not a supporting role in a large org
- Hybrid San Francisco team with real access to large, diverse, multilingual audio datasets
- Short feedback loops - improvements ship fast and metrics are visible
- Clear path toward senior technical leadership as the audio team grows
Created: 2026-05-13