Senior SRE/DevOps
Ipro Networks Pte. Ltd. - Sunnyvale, CA
Job Description
Overview

Job Title: Senior Site Reliability Engineer
Position Type: Full-time
Location: Sunnyvale, CA
Salary Range: $180,000 - $240,000 (USD)

Responsibilities

Develop stability standards and metrics
Covering robust architecture, R&D quality, release management, production environment operations, and more. Embedding stability into the technical R&D system.

Driving major stability governance campaigns
Initiatives such as full-stack disaster recovery, phased change rollout, the 1-5-10 emergency response mechanism (1-minute alerting, 5-minute triage, 10-minute recovery), and financial-loss prevention. Rapidly and continuously mitigating stability risks.

Building a stability-focused technical platform
Platform capabilities for unattended change management, red/blue team drills, emergency collaboration, risk and vulnerability inspection, and monitoring/alerting. Simplifying stability engineering through automation and tooling.

Executing production incident management
Emergency response, cross-team coordination, root cause analysis, rapid recovery, and post-incident reviews to drive systemic improvements.

Ensuring stability for large-scale customer events
Technical and operational support for critical activities such as the Olympics and customer business peak periods.

On-call responsibilities
Responding to customer issues within Service Level Agreement (SLA) timeframes, resolving problems proactively, and enhancing customer experience. Daily operations and maintenance of applications, databases, and middleware, as well as troubleshooting and answering customer inquiries. Collaborating with R&D to develop critical support plans based on customer business requirements during peak periods, including preparation during standby, on-duty support during critical periods, and post-standby review.

Qualifications

Bachelor's degree in Computer Science or a related field, with solid fundamentals.
Expert-level Linux system administration skills.
Proficient in open-source big data architectures.
5+ years of experience in development/operations of large-scale distributed systems.
Strong troubleshooting and performance optimization capabilities.
Cloud-native technical competency with hands-on Kubernetes experience (architecture understanding, issue diagnosis, change releases).
Strong scripting skills (Python/Shell) for automated troubleshooting, monitoring solutions, and operational automation.
Excellent communication skills.
Chinese language proficiency is a significant advantage.

About Us

Founded in 2009, IntelliPro is a global leader in talent acquisition and HR solutions. Our commitment to delivering unparalleled service to clients, fostering employee growth, and building enduring partnerships sets us apart. We continue leading global talent solutions with a dynamic presence in over 160 countries, including the USA, China, Canada, Singapore, Japan, Philippines, UK, India, Netherlands, and the EU.

IntelliPro, a global leader connecting individuals with rewarding employment opportunities, is dedicated to understanding your career aspirations. As an Equal Opportunity Employer, IntelliPro values diversity and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, or any other legally protected group status. Moreover, our Inclusivity Commitment emphasizes embracing candidates of all abilities and ensures that our hiring and interview processes accommodate the needs of all applicants. Learn more about our commitment to diversity and inclusivity at

Compensation

The pay offered to a successful candidate will be determined by various factors, including education, work experience, location, job responsibilities, certifications, and more. Additionally, IntelliPro provides a comprehensive benefits package, all subject to eligibility.
Created: 2025-09-17