Principal Site Reliability Engineer
Harrison Clarke - San Francisco, CA
Job Description
Harrison Clarke is working with several high-profile companies that are seeking a Principal Site Reliability Engineer (SRE) to lead the design, implementation, and scaling of the infrastructure and systems that support their products. The ideal candidate has extensive experience designing highly scalable infrastructure, building systems, and performing testing, monitoring, and maintenance. The backend stack includes Python, PostgreSQL, DynamoDB, Redis, and Kubernetes.

What You'll Do
- Design and implement highly available, high-performance, and scalable systems.
- Maintain and optimize key-value and relational databases.
- Scale and load balance web server backends to meet rapidly changing needs.
- Participate in the on-call rotation.
- Monitor systems and applications, proactively identifying and resolving reliability, scalability, or performance issues.
- Develop monitoring tools, alerts, and dashboards to provide visibility into system health and performance.
- Collaborate with product engineering and security engineering teams to develop scalable automations.

What You'll Bring
- 7+ years of experience with cloud infrastructure built on AWS.
- Proficiency in database management and caching strategies.
- Excellent problem-solving and troubleshooting skills, with the ability to analyze, debug, and resolve complex technical issues.
- Experience working with containers (Docker, Kubernetes) and orchestration tools.
- Strong experience with Python and Terraform.

If you are looking to explore your next SRE role, we would love to hear from you. Apply below!
Created: 2025-10-01