SRE
United IT Solutions - Mountain View, CA
Job Description
• Design, implement, and maintain complex data systems supporting millions of customers, applying cloud-native principles and best practices to ensure highly available, secure, performant, and scalable database systems
• Build and maintain CI/CD pipelines in Jenkins
• Build and deploy services in Kubernetes clusters using Helm, Kustomize, etc.
• Contribute to infrastructure changes in AWS, with a deep understanding of AWS services
• Engage in on-call for pre-production and production systems supporting millions of users
• Write and review RCA docs to prevent recurrence of incidents, and share the learnings
• Contribute to major system upgrades, deployment automation, monitoring enhancements, and production changes
• Create operational playbooks, contribute to how-to articles, and gain domain knowledge to drive changes in the team
• Participate in and contribute to FMEA/chaos testing, security remediations, etc.
• Share best practices and patterns for operational excellence and cost optimization
• Reduce or eliminate manual steps by automating as much as possible
• Continuously look for opportunities to increase developer velocity and productivity

Qualifications:
• Bachelor's or master's degree in computer science or a related technical field. Equivalent experience will be considered
• 4+ years of hands-on development and operational experience building and maintaining infrastructure in AWS
• Extensive performance monitoring, troubleshooting, and tuning experience
• Experience with AWS services and hands-on knowledge of hosting in the cloud
• Experience with scripting languages for DevOps automation
• Experience with at least one of the following programming languages: Java, Python, Ruby
• Knowledge of Docker, Kubernetes, and ArgoCD
• Experience with monitoring and observability using Splunk, Wavefront, AppDynamics, Prometheus, tracing, etc.
Created: 2026-03-10