We are seeking a highly skilled and motivated HPC Systems Administrator to manage and support our high-performance computing (HPC) environment. The role involves maintaining and optimizing four unique HPC clusters, Globus data transfer nodes, GPU nodes, monitoring systems, IBM ESS storage appliances, GPFS (General Parallel File System), and the PBS Pro scheduler, as well as ensuring compliance with security and identity management requirements, including LDAP integration, Multi-Factor Authentication (MFA), and HSPD-12. The ideal candidate will ensure the reliability, performance, and scalability of our HPC infrastructure to support advanced computational workloads.
Key Responsibilities
HPC Cluster Management:
Administer and maintain four unique HPC clusters, ensuring optimal performance and uptime.
Perform system upgrades, patching, and configuration management.
Troubleshoot and resolve hardware and software issues.
Data Transfer Nodes & Globus:
GPU Nodes Administration:
Configure and maintain GPU nodes for computational workloads.
Optimize GPU utilization for machine learning, AI, and other GPU-intensive applications.
Monitoring & Visualization:
Storage Management:
Job Scheduling:
Identity & Access Management:
Implement and manage LDAP integration for centralized authentication and directory services.
Administer Linux account management, including user provisioning, permissions, and access controls.
Configure and support Multi-Factor Authentication (MFA) solutions to enhance system security.
Ensure compliance with HSPD-12 standards for identity verification and access control.
Documentation & Reporting:
Maintain detailed documentation of system configurations, processes, and procedures.
Generate regular reports on system performance, utilization, and incidents.
Collaboration & Support:
Work closely with researchers, developers, and other stakeholders to understand their computational needs.
Provide technical support and training to users of the HPC systems.
Security & Compliance:
Implement security best practices to protect sensitive data and computational resources.
Ensure compliance with organizational policies, industry standards, and government regulations such as HSPD-12.
May be required to perform other duties as assigned.
Position Requirements
Minimum Education and Experience Requirements: Bachelor's degree and 6+ years of experience, Master's degree and 4+ years of experience, or equivalent
Bachelor's degree in Computer Science, Information Technology, or a related field.
5+ years of experience in HPC systems administration or a similar role.
Proficiency in Linux/Unix system administration.
Experience with Globus, GPU nodes, and HPC cluster management.
Strong knowledge of IBM ESS storage appliances and GPFS.
Familiarity with PBS Pro scheduler and job queuing systems.
Expertise in LDAP integration, Linux account management, and Multi-Factor Authentication (MFA).
Hands-on experience with monitoring tools like Grafana.
Knowledge of HSPD-12 compliance requirements and implementation.
Excellent problem-solving and analytical skills.
Ability to work independently and manage multiple priorities.
Attention to detail and commitment to quality.
Ability to model Argonne’s core values of impact, safety, respect, integrity, and teamwork.
Interpersonal skills, oral and written communication skills, and ability to interact with people at all levels both within and outside the laboratory.
Preferred Knowledge, Skills, and Experience
Master's degree in a relevant field.
Certifications in HPC, Linux, or storage technologies.
Experience with scripting languages (e.g., Python, Bash) for automation.
Knowledge of networking protocols and security practices.
Work Environment
Job Family
Professional Technical (PT)
Job Profile
Systems Integration Admin/Support 4
Worker Type
Regular
Time Type
Full time
The expected hiring range for this position is $106,455.00 - $166,069.80.
Please note that the pay range information is a general guideline only. The pay offered to a selected candidate will be determined based on factors such as, but not limited to, the scope and responsibilities of the position, the qualifications of the selected candidate, business considerations, internal equity, and external market pay for comparable jobs. Additionally, comprehensive benefits are part of the total rewards package.
As an equal employment opportunity employer, and in accordance with our core values of impact, safety, respect, integrity and teamwork, Argonne National Laboratory is committed to a safe and welcoming workplace that fosters collaborative scientific discovery and innovation. Argonne encourages everyone to apply for employment. Argonne is committed to nondiscrimination and considers all qualified applicants for employment without regard to any characteristic protected by law.
Argonne employees, and certain guest researchers and contractors, are subject to particular restrictions related to participation in Foreign Government Sponsored or Affiliated Activities, as defined and detailed in United States Department of Energy Order 486.1A. You will be asked to disclose any such participation in the application phase for review by Argonne's Legal Department.
All Argonne offers of employment are contingent upon a background check that includes an assessment of criminal conviction history conducted on an individualized and case-by-case basis. Please be advised that Argonne positions require upon hire (or may require in the future) the individual to obtain a government access authorization that involves additional background check requirements. Failure to obtain or maintain such government access authorization could result in the withdrawal of a job offer or future termination of employment.