ASE Observability SRE
Apple - Seattle, WA
Job Description
Seattle, Washington, United States
Software and Services

Summary
People at Apple don’t just build products; they craft the kind of experiences that have revolutionized entire industries. The diverse collection of our people and their ideas inspires innovation in everything we do. Imagine what you could do here! Join Apple, and help us leave the world better than we found it.

The Apple Services Engineering (ASE) team builds and provides systems and infrastructure that fuel Apple’s services (such as iCloud, iTunes, Siri, and Maps). We are the foundation on which Apple’s software developers build the products that our customers love. We are looking for passionate and talented Site Reliability Engineers to continue our focus on providing our customers the highest quality Apple Services experience.

Role Description
The Cloud Monitoring SRE organization is specifically tasked with enabling other teams to better understand their infrastructure and services by providing world-class observability capabilities. Keeping Apple services up and running 100% of the time is a challenging job. Accurately monitoring the health of every application and piece of infrastructure that makes up the Apple ecosystem 100% of the time is an order of magnitude more challenging.

As a Site Reliability Engineer on the Cloud Monitoring Team at Apple, you will work to improve the reliability and performance of the software systems that provide visibility into the services and infrastructure that run Apple. Our monitoring, alerting, and visualization platform analyzes billions of metrics per minute and forms the central nervous system of Apple's architecture.

Responsibilities
Apple Services Engineering infrastructure is BIG. Operating at our scale, across multiple geographically dispersed data centers and servicing hundreds of millions of users, presents unique challenges.
As an SRE at Apple, you'll need to solve these problems using data, teamwork, and your own expertise.

Minimum Qualifications
- B.S. in Computer Science or a related field.
- 4+ years of industry experience.
- Proven experience developing production-grade software in Python, Go, or Java.
- Strong sense of ownership and integrity demonstrated through clear communication and collaboration.
- Experience and confidence around incident response and incident management.
- Experience/knowledge in managing and scaling distributed systems in a public, private, or hybrid cloud environment.
- Experience/knowledge with the Prometheus ecosystem.
- Acute drive to automate manual operations and to improve them through repeated iteration.
- Comfortable with Open Source configuration management and orchestration tools (such as Helm, Puppet, and Spinnaker).
- Familiarity with micro-services architecture and container orchestration with Kubernetes.

Preferred Qualifications
- Master’s degree in Computer Science or a related field.
- Demonstrated ability to investigate complex systemic and latent reliability issues and collaborate cross-functionally with software and systems teams to implement sustainable solutions.
- Experience automating workflows and reducing operational toil through scalable solutions.
- Use of configuration management and deployment tools.
- Monitoring of systems and services, optimization of performance, and resource utilization.
- Prior on-call experience supporting large-scale, distributed systems.
- Collaborating with a global and asynchronously communicating team.

Pay & Benefits
At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role.

Apple employees also have the opportunity to become an Apple shareholder through participation in Apple’s discretionary employee stock programs.

Apple is an equal opportunity employer that is committed to inclusion and diversity.
We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.
Created: 2025-10-05