Site Reliability Engineer

Company: Dice
Location: San Jose
Closing Date: 26/10/2024
Salary: £150 - £200 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

Role:
Site Reliability Engineer

Employment Type:
Full Time, Permanent

Work location:
San Jose, CA

Work mode:
Hybrid: 2 days onsite / 3 days remote per week

About The Role:
We seek a highly skilled and dynamic Site Reliability Engineer. In this role, you will:

  1. Maintain and improve the reliability, performance, and availability of software systems.
  2. Act as a bridge between traditional IT operations and software development, bringing a software engineering approach to system administration.

Job Responsibilities:

  1. Creating and supporting automation scripts (Shell/Ansible/Python) for infrastructure deployments, validations, and monitoring to streamline operational tasks.
  2. Scheduling monitoring scripts using cron and Airflow.
  3. Monitoring using tools including Dynatrace, Apica, Grafana, etc.
  4. Database handling.
  5. Building CI/CD pipelines.
  6. Incident handling and problem management.

Mandatory Skills:

  1. Experience in Ansible/Python.
  2. Monitoring Tools: Dynatrace/Apica/Grafana.

Required Education:
Bachelor's degree in computer science or a related field.

Required Experience:

  1. 14+ years of IT infrastructure experience.
  2. Extensive experience working with Linux distributions such as RHEL/CentOS, including shells, filesystems, and utilities.
  3. Experience with Python programming and Ansible automation.
  4. Knowledge of distributed computing and experience working with container orchestration frameworks including on-prem and Rancher Kubernetes, with good knowledge of Kubernetes objects.
  5. Experience working with storage (ONTAP preferred): volumes, aggregates, backups, and DR planning.
  6. Experience scheduling monitoring scripts using cron and Airflow.
  7. Experience with monitoring tools including Dynatrace, Apica, Grafana, etc.
  8. Database knowledge including SQL and NoSQL databases.
  9. Experience building CI/CD pipelines (preferred).
  10. Cloud platform knowledge (specifically AWS) is required.

Travel Requirement (%):
NA
