Lead Site Reliability Engineer (SRE)

Macpower Digital Assets Edge - Rockville, MD

Job Description

Note: This is a fully hands-on role. Architect-level applicants will not be considered.

Key Focus Areas:
  • Manage and optimize control towers, organizational policies, and multi-account environments.
  • Oversee AWS backups, SSM patching, AMI deployments, and configuration pushes across multiple accounts.
  • Manage and maintain core AWS services, including EC2, ECS, EKS, RDS, S3, SageMaker, CloudFront, and Lambda.
  • Implement S3, SFTP, and site externalization methods.
  • Develop Infrastructure as Code (IaC) using Terraform, CloudFormation, and Python.
  • Manage IAM policies, access controls, and permissions.

Core Responsibilities:
  • Manage and maintain cloud infrastructure to ensure high availability, reliability, and performance.
  • Serve as the primary escalation point for all cloud infrastructure issues.
  • Monitor cloud resource performance and cost efficiency.
  • Lead major incident management and communicate timely updates to stakeholders.
  • Perform due diligence and impact analysis before implementing changes to cloud platforms.
  • Lead and mentor a team of cloud engineers to ensure performance and collaboration.
  • Manage daily operations and ensure alignment with organizational objectives.
  • Develop and implement incident management processes and conduct root cause analysis.
  • Identify and automate repetitive infrastructure tasks using IaC principles.
  • Continuously improve operational processes and standard operating procedures.
  • Implement and enforce security controls, ensuring compliance with standards such as GDPR and HIPAA.
  • Monitor cloud usage and conduct capacity planning to balance efficiency and scalability.
  • Develop and test disaster recovery and business continuity plans.
  • Collaborate with IT, business units, and vendors to deliver scalable cloud solutions.
  • Document cloud configurations, processes, and reports, ensuring accessibility and version control.

Technical Skills:
  • Proficiency in AWS (EC2, ECS, EKS, RDS, S3, Lambda, SageMaker, CloudFront).
  • Experience with Azure and OCI cloud environments.
  • Infrastructure as Code and configuration management (Terraform, CloudFormation, Ansible, Puppet, Chef).
  • Scripting in Python and PowerShell.
  • Strong understanding of cloud architecture, monitoring, and automation tools.
  • System administration experience (Windows, Linux, VMware, Active Directory, Azure AD SSO).
  • Strong networking knowledge (DNS, DHCP, PKI, LAN/WAN).

Leadership and Behavioral Skills:
  • Demonstrated experience in leading teams and managing cloud operations.
  • Strong communication and stakeholder management across technical and business functions.
  • Proactive problem-solver with excellent analytical and root cause analysis skills.
  • Self-motivated with a continuous improvement mindset.
  • Experienced in vendor management and contract negotiations.

Basic Qualifications:
  • Bachelor's degree in Computer Science, Information Technology, Electrical Engineering, or equivalent.
  • Experience in cloud operations and team leadership in technical environments.

Preferred Certifications and Experience:
  • AWS Certified Solutions Architect - Associate or Professional.
  • Microsoft Certified: Azure Architect.
  • Familiarity with DevOps tools (CI/CD, Jenkins, Git).
  • Experience with ITIL or ITSM frameworks.

Created: 2026-03-04
