Linux System / Platform Engineer
Berkeley Lab - Berkeley, CA
Job Description
Overview
The National Energy Research Scientific Computing Center (NERSC) is seeking a versatile Linux System / Platform Engineer to join our team building and managing Linux-based infrastructure. More than ever, scientific discovery transforms our world. NERSC is at the forefront, operating some of the world’s largest supercomputers for thousands of researchers who use computational power to solve society’s most challenging problems.

This role focuses on building and managing container and virtual machine platforms and deploying systems that keep our supercomputing center running smoothly. Responsibilities include API endpoints, scientific research tools, authentication, identity and access management, databases, and more. You will collaborate with a group of systems and software engineers and work with other groups across NERSC on various projects, as well as with counterparts at peer scientific facilities to streamline cutting-edge research using automation, cloud-native, and AI tools and techniques.

If you are interested in science, have Linux experience, and would enjoy working in a fast-paced, creative environment with a diverse team and a beautiful Berkeley Hills view, we want to hear from you!

What You Will Do, at Level 3
- Work with a team to build and manage Linux systems and storage infrastructure.
- Troubleshoot and solve complex technical problems with other team members.
- Install, upgrade, and secure equipment and services.
- Develop and refactor scripts and other code.
- Participate in 24x7 on-call rotation.
- Coordinate small project teams or other initiatives (such as the rollout of a new service or system, or a major equipment or software upgrade).
- Work with vendors to prioritize efforts and enhance their technologies to meet user needs.
- Work with researchers to deploy services using Spin, our container cloud platform based on Kubernetes.
- Collaborate within NERSC and across the DOE community to develop services, integrate them into the new NERSC supercomputer Doudna, the NERSC data
center environment, and across multiple DOE facilities.
- Present developments to NERSC staff and the broader HPC community at science conferences and industry meetings.

Additional Responsibilities, at Level 4
- Analyze and solve complex technical problems requiring in-depth evaluation of variable factors.
- Work at a higher level of independence while carrying out work assignments.
- Research, select, and lead the implementation of new technologies.
- Develop team strategy and project plans.
- Provide leadership and technical guidance to group members and other colleagues at NERSC.
- Recommend and lead system improvement efforts that enhance system performance, reliability, and security.
- Identify and evaluate emerging HPC technologies and features that could introduce novel capabilities or enhance existing system performance and utility.
- Represent NERSC in technical or user advocacy groups to influence the HPC and DOE community to meet user needs.

What is Required, at Level 3
- Typically, 8+ years of related experience with a Bachelor’s degree; alternatively, 6+ years with a Master’s degree; or equivalent career experience.
- 4+ years of experience managing large-scale Linux-based system deployments in a high-performance computing, cloud computing, or hyper-scale environment.
- Experience with some or all of our key technologies:
  - containers (such as Docker or Kubernetes)
  - virtualization (such as Proxmox or VMware)
  - cloud-based deployment (such as AWS, Azure, or GCP)
  - using and developing AI (or machine learning) tools and services
  - identity and access management
  - database administration, tuning, and troubleshooting
  - networked storage systems
  - backup technologies
- Familiarity with automated provisioning systems (such as Chef, Foreman, or Terraform).
- Familiarity with configuration management systems (such as Ansible or Puppet).
- Working knowledge of Linux system engineering and security practices.
- Ability to resolve complex issues in creative and effective ways and derive technical solutions in a collaborative
environment to meet end-user requirements or needs.
- Demonstrated ability to work independently as well as collaboratively in large projects, and contribute to an active and respectful intellectual environment.
- Creative, positive, and collaborative work style.
- Excellent oral and written communication skills.

Additional Requirements, at Level 4
- Typically, 12+ years of related experience with a Bachelor’s degree; alternatively, 8+ years with a Master’s degree; or equivalent career experience.
- Experience in software engineering or complex scripting.
- Experience managing network equipment.
- Ability to lead and coordinate projects.
- Ability to analyze and resolve significant and unique issues requiring evaluation of multiple intangible factors.
- Ability to exercise independent judgment in methods, techniques, and evaluation criteria for obtaining results.

Notes
- This is a full-time, career appointment, exempt (monthly paid) from overtime pay.
- This position will be hired at a level commensurate with the business needs and the skills, knowledge, and abilities of the successful candidate.
- In-person interviews will consist of standard question and answer sessions and a presentation on a technical topic.
- The Level 3 salary range is $136,440 to $230,244 per year, and the position is expected to pay within a targeted range of $153,492 to $187,596 per year, depending upon full skills, knowledge, and abilities.
- The Level 4 salary range is $155,388 to $262,224 per year, and the position is expected to pay within a targeted range of $174,804 to $213,660 per year, depending upon full skills, knowledge, and abilities.
- This position is subject to a background check. Any convictions will be evaluated to determine if they directly relate to the responsibilities and requirements of the position.
Having a conviction history will not automatically disqualify an applicant from being considered for employment.
- This position requires substantial on-site presence, but is eligible for a flexible work mode, and hybrid schedules may be considered. Hybrid work is a combination of performing work on-site at Lawrence Berkeley National Lab, 1 Cyclotron Road, Berkeley, CA and some telework. Individuals working a hybrid schedule must reside within 150 miles of Berkeley Lab. Work schedules are dependent on business needs. In rare cases, full-time telework or remote work modes may be considered. A REAL ID or other acceptable form of identification is required to access Berkeley Lab sites.

Want to learn more? Visit careers.lbl.gov

Equal Employment Opportunity Employer
Berkeley Lab is an Equal Opportunity Employer. We strive to build a diverse and inclusive workforce. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, protected veteran status, or other protected categories under State and Federal law.
Created: 2025-09-25