Sr SRE Compute Infrastructure
NxT Level - Boston, MA
Job Description
Overview

Senior Site Reliability Engineer – Compute Infrastructure
Location: Boston, MA (Hybrid – Tues–Fri Onsite | Mondays Remote)
Compensation: $134,250 – $214,800 + Bonus + Equity + Full Benefits

We are representing a cutting-edge technology company that is seeking a Senior Site Reliability Engineer (SRE) to join their global infrastructure team. In this role, you'll play a critical part in scaling and optimizing the organization's cloud-native Kubernetes platform, the backbone for internal engineering teams delivering high-impact applications and services.

This role is ideal for an SRE who thrives in complex distributed environments, is passionate about developer enablement, and enjoys building robust systems that balance performance, reliability, and scalability.

Why You Should Apply:
- You'll work on global, mission-critical systems running on modern cloud infrastructure
- High autonomy in a fast-paced, high-impact engineering environment
- Opportunity to shape SRE best practices across the org
- Hybrid work culture that values face-to-face collaboration and innovation

What You’ll Do:
- Architect and scale cloud-native Kubernetes infrastructure to support internal engineering workflows
- Develop tools and platforms that empower product and infrastructure teams to deploy and manage services rapidly and securely
- Write clean, efficient, and maintainable code in languages such as Python, Go, C#, or Java
- Use Infrastructure as Code (IaC) tools like Terraform or Pulumi to provision and manage cloud resources
- Enhance observability and alerting systems using APM, metrics, and log aggregation tools
- Partner with developers to optimize CI/CD pipelines and ensure smooth software delivery lifecycles
- Provide strong documentation to promote self-service and onboarding across engineering
- Continually assess and improve platform reliability, operability, and cost-efficiency
- Contribute to system design reviews and mentor junior engineers on cloud-native best practices

What You Bring:
- 7+ years of experience in Platform Engineering or Site Reliability Engineering
- Proven experience managing Kubernetes platforms at scale (e.g., AKS, EKS, or GKE)
- Strong programming experience in Python, Go, C#, Java, or similar languages
- Deep understanding of cloud platforms like AWS or Azure
- Experience with ArgoCD, GitHub Actions, or similar CI/CD tools
- Proficiency with observability tooling (Datadog, Prometheus, Grafana, etc.)
- Expertise in networking, security protocols, and container orchestration
- Familiarity with communication protocols such as SPI, UART, RS485, and modern interfaces like TLS, X.509, etc.
- Experience building testable, scalable IaC modules and managing multi-environment deployments
- Strong collaboration and documentation habits in cross-functional teams
- Empathy for internal users and a customer-focused mindset

Benefits:
- Competitive base salary: $134,250 – $214,800 (based on experience & location)
- Bonus + equity opportunities
- Discretionary time off (DTO) policy
- Paid parental leave for all caregivers
- Medical, dental, and vision coverage
- Fitness and wellness reimbursements
- Mental health & professional development support
- Hybrid workplace with in-office perks (snacks, events, and team-building activities)

Note: Compensation and benefits may vary depending on experience level and geographic market.
Created: 2025-09-17