HPC Systems Administrator
20 hours per week, 6-month contract, remote
Support and maintain high-performance computing (HPC) clusters used for large-scale research and engineering workloads. You’ll manage day-to-day operations, troubleshoot hardware and software issues, and ensure optimal system performance across compute, storage, and network environments.
Experience / Skills
- Bright Cluster Manager (CMSH, image maintenance) and Slurm administration
- Linux systems administration and user support
- Hardware troubleshooting: GPUs, servers, InfiniBand, and power issues
- BMC tools (Dell iDRAC, HPE iLO, Supermicro) for remote diagnostics
- Strong communication and coordination skills, including the ability to work with vendors and internal teams
Nice to Have
- Experience with InfiniBand, Panasas storage, and GPU management
- Familiarity with Active Directory integration for Linux (vasd, vastool)
- Vendor escalation and support coordination experience
Ideal Candidate: A hands-on Linux administrator with HPC or cluster experience who enjoys solving complex technical problems and coordinating hardware repairs while keeping compute environments stable and efficient.
Mainz Brady Group is a technology staffing firm with offices in California, Oregon, Washington and Texas. We specialize in Information Technology and Engineering placements on a Contract, Contract-to-hire and Direct Hire basis. Mainz Brady Group is the recipient of multiple annual Excellence Awards from the TechServe Alliance, the leading association for IT and engineering staffing firms in the U.S. Mainz Brady Group is an Equal Opportunity Employer. We are committed to Diversity & Inclusion and incorporate non-discrimination best practices in all our staffing processes. Mainz Brady Group does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, gender expression, age, disability or any other protected class.
