MCS Site Reliability Engineer
MSCCN - Aurora, CO
Job Description
Responsibilities
- Support the availability, reliability, and performance of IaaS services supporting mission systems
- Monitor infrastructure health using metrics, logs, and alerts; respond to and resolve incidents
- Perform root-cause analysis for infrastructure and service outages; implement corrective and preventative actions
- Improve system reliability through automation, standardization, and proactive engineering
- Support capacity planning, performance analysis, and scaling of infrastructure services
- Maintain and enhance monitoring, logging, and alerting solutions
- Participate in incident response, on-call rotations (as required), and post-incident reviews
- Collaborate with network, systems, platform, and application teams to resolve cross-stack issues
- Support infrastructure lifecycle activities including upgrades, patches, and configuration changes
- Apply security best practices and support compliance requirements in a regulated environment
- Develop and maintain runbooks, procedures, and operational documentation
- Contribute to CI/CD and Infrastructure-as-Code workflows supporting IaaS services
- Participate in Agile ceremonies and operational planning activities
- Perform other duties as assigned

Requirements
- 5+ years of professional experience in systems engineering, SRE, DevOps, or infrastructure operations
- Strong experience administering Linux systems
- Experience supporting on-prem, cloud, or hybrid infrastructure environments
- Hands-on experience with monitoring, logging, and alerting systems
- Strong troubleshooting skills across compute, storage, networking, and OS layers
- Experience scripting or automating tasks using Bash, Python, or similar languages
- Familiarity with Infrastructure as Code concepts and tooling
- Strong verbal and written communication skills
- Detail-oriented, self-motivated, and able to own issues through resolution
- Ability to obtain and maintain a DoD security clearance
- Ability to work on-site at the customer location

Candidates who have any of the following skills will be preferred:
- Experience working on an IaaS or platform operations team
- Experience with virtualization platforms (e.g., VMware vSphere)
- Experience supporting container platforms (Kubernetes, OpenShift)
- Experience with cloud environments (AWS, Azure, or GovCloud)
- Familiarity with SRE concepts such as SLIs, SLOs, error budgets, and toil reduction
- Experience with configuration management or automation tools (Ansible, Terraform)
- Experience with CI/CD pipelines (GitLab CI, Jenkins, or similar)
- Experience operating systems in government or secure environments
- Experience with incident management and operational readiness reviews

Benefits
SciTec offers a highly competitive salary and benefits package, including:
- 4% Safe Harbor 401(k) match
- 100% company paid HSA Medical insurance, with a choice of 2 buy-up options
- 80% company paid Dental insurance
- 100% company paid Vision insurance
- 100% company paid Life insurance
- 100% company paid Long-term Disability insurance
- Short-term Disability insurance
- Annual Profit-Sharing Plan
- Discretionary Performance Bonus
- Paid Parental Leave
- Generous Paid Time Off, including Holiday, Vacation, and Sick Pay
- Flexible work hours

The pay range for this position is $146,000 - $175,000 / year. SciTec considers several factors when extending an offer of employment, including but not limited to the role and associated responsibilities, a candidate's work experience, education/training, and key skills. This is not a guarantee of compensation.

SciTec is proud to be an Equal Opportunity employer. VET/Disabled.
Created: 2026-01-27