The Argonne Leadership Computing Facility’s (ALCF) mission is to accelerate major scientific discoveries and engineering breakthroughs for humanity by designing and providing world-leading computing facilities in partnership with the computational science community. We help researchers solve some of the world’s largest and most complex problems with our unique combination of supercomputing resources and computational science expertise.
The ALCF has an opening for a Software Engineer to help enable AI for science, specifically scalable inference on HPC systems and AI accelerators. The successful candidate will join the Data Services and Workflows group, which focuses on scientific workflows that combine large-scale data, simulations, analysis, and AI. In this position, the candidate can expect to explore and engineer solutions for AI inference integrated within scientific workflows, accessed either programmatically through standard programming interfaces (e.g., the OpenAI API) or through submission of large batches of prompts for parallel processing. Both scenarios require efficient execution on the underlying resources, including ALCF’s HPC systems and AI testbed machines. This position demands a good understanding of currently available AI models (LLMs and otherwise), their compute and memory requirements, and how to use the underlying hardware for high responsiveness. As the space of AI models evolves, we will adapt and deploy new models and functionality.
The Data Services and Workflows group, and this position, work in a highly collaborative environment with science application teams, academia, industry, and other national labs and agencies to solve some of the world’s largest and most complex problems in science and engineering. The candidate will engage with science application teams and contribute to broader scientific initiatives.
Position Requirements
Required skills and qualifications:
Experience with at least one AI framework, such as PyTorch or TensorFlow.
Comprehensive programming experience in one or more languages, such as Python or C/C++.
Ability to create, maintain, and support high-quality software is essential.
Ability to work with and contribute to domain-specific software and models.
Experience with version control software such as git.
Ability to work collaboratively in a fast-paced environment.
Effective written and oral communication skills.
Ability to model Argonne’s core values of impact, safety, respect, integrity and teamwork.
Preferred skills and qualifications:
Experience designing or operating distributed inference or data services, including request routing, asynchronous execution, queueing, fault tolerance, and performance monitoring.
Experience integrating services with HPC schedulers (e.g., Slurm, PBS), including resource provisioning, job lifecycle management, and balancing latency-sensitive and throughput-oriented workloads.
Experience optimizing AI inference performance (e.g., batching, memory management, model parallelism, quantization, accelerator utilization) on GPU- or accelerator-based systems.
Familiarity with secure, multi-user services, including authentication/authorization, API security, and operating within institutional or regulated environments.
Experience with running simulations or AI workflows on supercomputers.
This position can be hired at one of two levels (RD2 or RD3) and the requirements for each are as follows:
RD2: Bachelor’s degree and 5+ years of experience, Master’s degree and 3+ years of experience, or PhD, or equivalent. The expected pay range for this level is $94,486 - $147,399.
RD3: Bachelor’s degree and 8+ years of experience, Master’s degree and 5+ years of experience, or PhD and 4+ years of experience, or equivalent. The expected pay range for this level is $116,250 - $181,350.
Job Family
Research Development (RD)
Job Profile
Software Engineering 2
Worker Type
Regular
Time Type
Full time
The expected hiring range for this position is $94,486.00 - $147,398.94.
Please note that the pay range information is a general guideline only. The pay offered to a selected candidate will be determined based on factors such as, but not limited to, the scope and responsibilities of the position, the qualifications of the selected candidate, business considerations, internal equity, and external market pay for comparable jobs. Additionally, comprehensive benefits are part of the total rewards package.
As an equal employment opportunity employer, and in accordance with our core values of impact, safety, respect, integrity and teamwork, Argonne National Laboratory is committed to a safe and welcoming workplace that fosters collaborative scientific discovery and innovation. Argonne encourages everyone to apply for employment. Argonne is committed to nondiscrimination and considers all qualified applicants for employment without regard to any characteristic protected by law.
Argonne employees, and certain guest researchers and contractors, are subject to particular restrictions related to participation in Foreign Government Sponsored or Affiliated Activities, as defined and detailed in United States Department of Energy Order 486.1A. You will be asked to disclose any such participation in the application phase for review by Argonne's Legal Department.
All Argonne offers of employment are contingent upon a background check that includes an assessment of criminal conviction history conducted on an individualized and case-by-case basis. Please be advised that Argonne positions require upon hire (or may require in the future) that the individual obtain a government access authorization that involves additional background check requirements. Failure to obtain or maintain such government access authorization could result in the withdrawal of a job offer or future termination of employment.