Site Reliability Engineer

Company: University of Michigan
Location: Ann Arbor
Closing Date: 20/10/2024
Salary: £100 - £125 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

A cover letter is required for consideration for this position and should be attached as the first page of your resume. The cover letter should address your specific interest in the position and outline skills and experience that directly relate to this position.

Job Summary

As a Site Reliability Engineer in the Department of Biostatistics at the University of Michigan, you will ensure that all services, both internally critical and externally visible, have reliability and uptime appropriate to users' needs and improve at a fast rate. You will keep an ever-watchful eye on our systems' capacity and performance. This position will report to Albert Smith.

Responsibilities*

  1. Develop, scale, and automate our cloud infrastructure with a focus on efficiency, security, and reliability. (15%)
  2. Work closely with development teams to ensure that design, testing, and deployment of new products and features are optimized for reliability. (10%)
  3. Monitor system performance, configure alerts, and respond to incidents. (15%)
  4. Perform root cause analysis of production errors and resolve technical issues. (10%)
  5. Implement automation tools for efficient server management and operation. (10%)
  6. Participate in on-call rotations to handle and resolve high-priority incidents. (10%)
  7. Collaborate with team members to improve our engineering tools, systems, procedures, and data security. (10%)
  8. Conduct systems tests for security, performance, and availability. (10%)
  9. Develop and maintain documentation for key systems and processes. (10%)

Required Qualifications*

  1. 1-3+ years (intermediate) or 3-5+ years (senior) of experience with cloud services (AWS, Google Cloud Platform, Azure) and experience with container orchestration technologies (e.g., Kubernetes, Docker).
  2. Expertise in software development in one or more programming languages (Python, Go, Java, etc.).
  3. Proficiency with Unix/Linux systems, with scripting experience in Shell, Perl, or Python.
  4. 1-3+ years (intermediate) or 3-5+ years (senior) of experience with infrastructure as code (IaC) using tools like Terraform, Ansible, or Chef.
  5. Solid understanding of core internet technologies (e.g., TCP/IP, DNS, SMTP, HTTP, distributed networks), and ability to troubleshoot related issues.
  6. Ability to navigate, communicate, and negotiate priorities and technical risk across teams.

Application Deadline

Job openings are posted for a minimum of seven calendar days. The review and selection process may begin as early as the eighth day after posting. This opening may be removed from posting boards and filled anytime after the minimum posting period has ended.

U-M EEO/AA Statement

The University of Michigan is an equal opportunity/affirmative action employer.
