Location: Chicago, IL
Job Description:
Installing, configuring, and maintaining large computer clusters/servers and software.
Day-to-day operations of the systems, including systems administration, monitoring, and storage performance, up to and including network components.
Management of the system’s network switch, parallel file system and HPC software stack and tools.
Configuration of the scheduling and queuing system.
Diagnosing and resolving system operational problems quickly and effectively. Coordinating with vendors to resolve hardware and software problems. Assisting users with access and other help desk ticket requests or issues.
Building and deploying open source software and software from vendors/partners.
Providing reliable and efficient backups/restores for all managed systems. Maintaining and monitoring the security of the HPC systems and servers.
Documenting system administration procedures for routine and complex tasks.
Planning and installing necessary patches and upgrades for servers and their associated storage, network, communications, and peripheral subsystems. Installing and maintaining an appropriate level of intrusion detection, monitoring, and auditing software as required.
Tracking compliance and maintaining documentation for hardware, software, and service inventories for management reports.
Performing other related work as needed.
Preferred Qualifications
Education:
Bachelor’s degree in Computer Science or closely related field.
Experience:
A minimum of three years of Linux system administration experience in a large distributed computing environment.
At least two years of experience in HPC system administration or managing large HPC clusters.
Technical Skills or Knowledge:
Knowledge of Linux.
Experience scripting with one or more languages such as Python, shell, or Perl.
Experience with Linux build automation tools such as Puppet, Ansible, Git, and Docker highly preferred.
Experience implementing automation and monitoring using shell scripting and other related tools strongly preferred.
Experience with installing, configuring, and maintaining job management tools (such as SLURM, Moab, TORQUE, or PBS) strongly preferred.
Experience with operating system deployment tools (e.g., xCAT, Rocks) strongly preferred.
Experience configuring, administering, and supporting network storage subsystems (e.g., IBM, NetApp, DataDirect Networks, LSI) strongly preferred.
Experience with one or more distributed file systems (GPFS, Lustre, Gluster, etc.) strongly preferred.
Experience configuring, installing, tuning and maintaining scientific application software strongly preferred.
Experience configuring, installing, maintaining and/or using performance monitoring and optimization tools strongly preferred.
Experience documenting implementations and system related tasks.
Preferred Competencies
Understand and translate researchers' scientific goals into computational requirements.
Work well with faculty and researchers.
Identify and gain expertise in appropriate new technologies and/or software tools.
Function as part of an interactive team while demonstrating self-initiative to achieve the project's goals and the Research Computing Center's mission.
Strong analytical skills and problem-solving ability.
Application Documents
Resume or C/V (required)
Cover letter strongly preferred
The University of Chicago is an Affirmative Action/Equal Opportunity/Disabled/Veterans Employer and does not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national or ethnic origin, age, status as an individual with a disability, protected veteran status, genetic information, or other protected classes under the law. For additional information please see the University's Notice of Nondiscrimination.
Staff job seekers in need of a reasonable accommodation to complete the application process should call 773-702-5800 or submit a request via the Applicant Inquiry Form.
The University of Chicago's Annual Security & Fire Safety Report (Report) provides information about University offices and programs that provide safety support, crime and fire statistics, emergency response and communications plans, and other policies and information. The Report can be accessed online at: securityreport.uchicago.edu. Paper copies of the Report are available, upon request, from the University of Chicago Police Department, 850 E. 61st Street, Chicago, IL 60637.