Software Engineer, ML Infrastructure, Level 4
Snap Inc. - Los Angeles, CA
Job Description
Snap Inc. is a technology company. We believe the camera presents the greatest opportunity to improve the way people live and communicate. Snap contributes to human progress by empowering people to express themselves, live in the moment, learn about the world, and have fun together. The Company’s three core products are Snapchat, a visual messaging app that enhances your relationships with friends, family, and the world; Lens Studio, an augmented reality platform that powers AR across Snapchat and other services; and its AR glasses, Spectacles.

Snap Engineering teams build fun and technically sophisticated products that reach hundreds of millions of Snapchatters around the world, every day. We’re deeply committed to the well-being of everyone in our global community, which is why our values are at the root of everything we do. We move fast, with precision, and always execute with privacy at the forefront.

You’ll play a critical role in scaling our ML Infrastructure, building and scaling feature/training data platform, advancing ML quality & insights, and driving innovations that make Snapchat’s ranking and recommendation systems more efficient and impactful.

We’re looking for a Software Engineer, ML Infrastructure to join Snap Inc., as part of the ML Data team that builds the foundational data platforms powering Snap’s machine learning. The team develops and maintains large-scale systems for feature and training data, ensures data quality and lineage across the ML lifecycle, and drives continuous efficiency improvements to support Snap’s ML growth.

What you’ll do:
+ Design and optimize infrastructure systems for machine learning workloads at scale and drive reliability and efficiency improvements across Snapchat’s ML Infrastructure
+ Build and enhance feature/training data generation pipelines that power online inferencing and offline training/experimentation.
+ Build platform/infrastructure to support embedding, user sequence and other feature types to support business growth.
+ Build and expand the end to end ML model/data quality platform to enhance model debuggability and team accountability.
+ Build holistic ML insights by cataloging metadata and lineage through ML lifecycle.
+ Work closely with ML engineers to deploy cutting-edge models into production

Knowledge, Skills & Abilities:
+ Strong programming skills in Python, Java, Scala or C++
+ Strong problem-solving skills with a focus on system performance, scalability, and efficiency
+ Deep understanding of distributed systems and the infrastructure components of large-scale ML
+ Experience with big data processing frameworks such as Spark, Flink, or Ray
+ Ability to collaborate and work well with others
+ Proven track record of operating highly-available systems at significant scale
+ Ability to proactively learn new concepts and apply them at work

Minimum Qualifications:
+ Bachelor’s degree in a technical field such as computer science or equivalent experience
+ 2+ years of post-Bachelor’s software development experience; or Master’s degree in a technical field plus 1+ year of post-grad software development experience; or PhD in a relevant technical field
+ Experience building large scale production machine learning systems, distributed systems or big data processing

Preferred Qualifications:
+ Masters/PhD in a technical field such as computer science or equivalent industry experience
+ Experience working on large scale feature platform
+ Experience working on large scale training data preprocessing infrastructure
+ Deep expertise in understanding end to end ML lifecycle and ML quality

If you have a disability or special need that requires accommodation, please don’t be shy and provide us some information.
Created: 2025-11-01