Site Reliability Engineer - Inference
Jobright.ai - San Francisco, CA
Job Description
Jobright is an AI-powered career platform that helps job seekers discover the top opportunities in the US. We are NOT a staffing agency. Jobright does not hire directly for these positions. We connect you with verified openings from employers you can trust.

Job Summary:
Lambda is the #1 GPU Cloud for ML/AI teams, providing tools for building, testing, and deploying AI products at scale. The Site Reliability Engineer - Inference will work on developing a large-scale platform for running AI models and building a high-throughput, low-latency API for distributed systems.

Responsibilities:
• Work on our Inference service, helping us to develop our large-scale platform for running new, cutting-edge models across tens of thousands of GPUs.
• Help build a high-throughput, low-latency API and routing system running at geographically distributed scale.
• Shape a highly reliable distributed system with a focus on reducing operational overhead, deep observability, and capacity management.
• Work with the team and our internal ML researchers to adopt and improve new inference engines, models, and architectures across a variety of different mediums (such as text, image, video, and audio).
• Tackle global networking challenges to deliver the lowest possible latency to our users across all of Lambda’s available capacity.
• Help push Lambda forward into the state of the art, and be part of a team that is operating right at the edge of new developments in the industry.

Qualifications:

Required:
• 8 or more years of experience as a software reliability engineer or software engineer working on large-scale, internet-facing production services.
• Highly skilled at writing Go and Python.
• Experience with bare-metal system installation and administration.
• Experience deploying applications and operators on Kubernetes.
• Product-focused, balancing operational needs and keeping overheads down with the need to ship features at a rapid pace.
• Proven track record of working in an environment with rapid deployment and the ability to stay on top of shifting priorities as the industry rapidly develops.
• Willingness to take ownership of projects and help drive them forward through design, implementation, launch, and maintenance.

Preferred:
• Experience working with machine learning models.
• Experience operating large-scale, geographically distributed systems.
• Experience developing Kubernetes operators and components.

Company:
Lambda provides infrastructure, cloud services, and software for the training and inferencing of AI models. Founded in 2012 and headquartered in San Jose, California, USA, Lambda has a team of 201-500 employees and is currently Late Stage. Lambda has a track record of offering H1B sponsorships.

Seniority level: Mid-Senior level
Employment type: Full-time
Industries: Software Development

Benefits (inferred from the description for this job):
• Medical insurance
• Vision insurance
• 401(k)
Created: 2025-09-17