StaffAttract

Senior HPC Engineer

ASRC Federal Holding Company - Mountain View, CA

Job Description

ASRC Federal is looking for a Senior HPC Engineer. ASRC Federal InuTeq provides High Performance Computing (HPC) services across the full HPC lifecycle, including computational requirements, architecture, acquisition, and operations, for federal government customers, while promoting innovation, continuous standards-driven improvement, and industry best practices.

This senior role supports the NASA NACS High Performance Computing contract by delivering continuous architectural enhancements and operational excellence. The successful candidate will serve as a proactive senior member of the team, reporting to the Manager of the HPC Computer Systems and Storage (CSS) group, and will bring extensive experience in designing, installing, maintaining, and upgrading large-scale HPC environments, including expertise with common batch schedulers such as PBS, Slurm, or Moab/Torque and with InfiniBand troubleshooting and optimization.

The candidate will actively participate in day-to-day HPC operations such as system patching, OS upgrades, new system deployments, scripting, troubleshooting, testing, benchmarking, and user tool development. The role also directly supports scientific users by diagnosing and reproducing application performance issues, analyzing trouble tickets for recurring patterns, and contributing to both system improvements and user education.

Key Responsibilities:
  • Design, deploy, and maintain HPC clusters of more than 2,000 nodes with InfiniBand and 100+ petabytes of data storage in production.
  • Shepherd and/or contribute to scalable feature designs through the entire software development process, from requirements and use cases to release.
  • Design and develop scripts for system administration, monitoring, and usage reporting.
  • Modify existing software to correct errors and/or improve performance.
  • Design and develop scripts for system regression and performance testing (file systems (Lustre), scheduler (PBS), interconnect (HDR/NDR, Slingshot), high availability, etc.).
  • Troubleshoot, isolate, and resolve application, system, and other technical problems (hardware, software, and network).
  • Understand research use cases; research and deploy new technologies, defining cost, performance, and other trade-offs.
  • Manage and maintain tools for provisioning, configuration management (HPCM, Ansible, and Git), resource management, scheduling, and all necessary aspects of HPC in accordance with best practices.
  • Research, deploy, and manage networking and security infrastructure, including development of policies and procedures.
  • Assist in developing and writing proposals and publications.
  • Create and provide clear documentation.
  • Mentor junior staff and cross-train peers.
  • Provide after-hours/weekend support as required.
  • Perform moderate supercomputing system administration that contributes to:
    • Day-to-day operations of the Linux HPC clusters and storage systems
    • Proactive monitoring, analysis, and correction of system issues
    • Development of scripts to automate repetitive tasks or tools to enhance support of the HPC systems
    • System performance analysis and tuning
    • Building, installing, and supporting user-requested software
    • Supporting evaluation and assessment of new HPC technology
    • Resolving user-reported issues and managing support ticket requests in Remedy

Requirements:
  • Bachelor's degree in computer science or related field
  • Strong computer science background with in-depth systems-level knowledge of operating systems and networking
  • A minimum of 10 years of experience in the administration of HPC systems and scheduling software (PBS, Slurm, or Moab/Torque)
  • A minimum of 10 years of experience in systems programming in heterogeneous, multi-platform HPC environments
  • Strong ability to analyze, debug, and maintain the integrity of an existing code base
  • Demonstrated equivalent of 5 years of Linux/UNIX user support experience and hands-on experience with administration of Linux systems
  • Experience working with HPC applications and proficiency in at least one of C, C++, or Fortran
  • Superior scripting skills and excellent attention to detail; proficiency in at least one of Python, Perl, or Bash
  • Strong ability to interact with customers to understand needs, elicit requirements, and get feedback on prototype solutions
  • Excellent communication and people skills; excellent time management and organizational skills
  • Experience with system configuration management tools, e.g., Puppet, Chef, Ansible
  • Experience with revision control software, e.g., CVS, SVN, Git
  • Track record of delivering commercial-quality software on schedule through multiple release cycles
  • Proficiency in documentation and technical writing

Preferred Skills:
  • Strong analysis and problem-solving skills for debugging and optimization of applications
  • Familiarity/proficiency with OpenMP and Message Passing Interface (MPI) programming
  • Experience with Lustre and InfiniBand
  • Experience with cloud technologies (AWS, Azure, GCP), OpenStack, or Kubernetes is a plus

We invest in the lives of our employees, both in and out of the workplace, by providing competitive pay and benefits packages. This position offers a pay range of $140,000.00 - $160,000.00, depending on experience, seniority, geographic location, and other factors permitted by law. Benefits offered may include healthcare, dental, vision, and life insurance; 401(k); education assistance; and paid time off, including PTO, holidays, and any other paid leave required by law.

Job Details
  • Job Family: Information Technology
  • Job Function: Systems Administration
  • Pay Type: Salary
  • Education Level: Bachelor's Degree

Created: 2026-03-10
