Site Reliability Engineer II
Kforce - Alpharetta, GA
Job Description
Kforce has a client that is seeking a Site Reliability Engineer II in Alpharetta, GA.

Summary:
We are seeking a skilled Site Reliability Engineer to join our team and help build, maintain, and scale our cloud-native infrastructure. You will work closely with development and operations teams to ensure our systems are reliable, scalable, and efficient. The ideal candidate is passionate about automation, observability, and infrastructure-as-code, and thrives in a collaborative, fast-paced environment.

Key Responsibilities:
* Design, implement, and manage cloud infrastructure on Azure using Terraform and Terragrunt
* Maintain and optimize Kubernetes clusters on Azure Kubernetes Service (AKS)
* Build and manage CI/CD pipelines using GitHub Actions/Workflows and ArgoCD for GitOps deployments
* Enhance system reliability by implementing monitoring, alerting, and observability solutions with Grafana
* Automate operational tasks to reduce toil and improve team efficiency
* Participate in on-call rotations, incident response, and post-mortem analysis
* Collaborate with development teams to improve application performance, scalability, and resilience
* Implement and advocate for SRE best practices, including SLIs, SLOs, and error budgets
* Continuously improve system performance, cost efficiency, and security

Requirements:
* Bachelor's degree required; Master's preferred
* 3+ years of experience in an SRE, DevOps, or cloud infrastructure role
* Strong experience with Azure cloud services and infrastructure
* Hands-on experience with Java, and with Terraform and Terragrunt for infrastructure-as-code
* Experience with CI/CD tools, especially GitHub Workflows/Actions and ArgoCD
* Solid understanding of observability tools like Grafana (Prometheus, Loki, Tempo experience is a plus)
* Proficiency with Kubernetes (preferably AKS), Databricks, and container orchestration
Created: 2026-04-02