Sr. SRE (Linux & Windows)

Company:  Omni Inclusive
Location: Los Angeles
Closing Date: 25/10/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description
The Systems Reliability Engineering (SRE) team helps elevate SRE practices at TWDC, promoting and on-boarding new technologies, solving complex problems and integrating with next generation digital platforms.
Systems Reliability Engineers use a software engineering approach to architect, design, automate, monitor, and build applications at scale. This includes operating and engineering software with close business segment alignment to deliver platforms through efficient, effective and resilient architectures. SREs are talented engineers who focus on improving quality through a data-driven approach: instrumentation, automation, and functional/unit testing.
The Senior SRE will help create, build and deliver amazing experiences for our guests, fans and businesses. Primary responsibilities include helping existing, new and emerging business teams onboard technologies or platforms to accelerate their businesses. This will include consultation, designing, building, and supporting development pipelines, automating infrastructure and operations, creating telemetry for monitoring, engineering high reliability and reinforcing best practices to secure our company and guest data.
The Senior SRE is expected to have systems administration skills on Linux and Windows platforms, and must have experience with software development (e.g. Python, Go, Java, Node), CI pipeline tools (e.g. Jenkins), Git source management, cloud hosting (AWS, GCP & Azure), container computing (e.g. Docker, OCI), web technologies and the DevOps team culture. The Senior SRE will also bring a working knowledge of systems, networking, operational excellence, and application stability, security, performance, and capacity management, as well as documentation.
The Senior SRE must be prepared to work with engineering, creative and production teams in an extremely collaborative and high-energy environment to brainstorm, architect, gather requirements, troubleshoot, and provide stellar customer support. The ideal Senior SRE is passionate about constantly learning, applying technology to solve complex problems, and is a highly motivated, optimistic, proactive, creative thought leader and project manager.
The Senior SRE will:
Translate ideas into tangible products that shape experiences by focusing on a systematic approach to automation, resiliency, efficiency, stability, security, performance, and capacity management, as well as documentation. Serve as a subject matter expert through internal and external tech talks and conferences.
Support initial discovery, architecture, design, automation, implementation and operationalization, including:
Business Engagement and Requirements Gathering
Architectural Review, Proof of Concept Work, and Onboarding
Project: Build and Operationalize New Systems/Sites/Services/Products
Systematic Load Testing, Troubleshooting, Optimization and Tuning
Create System and Application Monitors, Trending Metrics and Reports
Development: Tools and Automation Frameworks
Hosting Platforms and Infrastructure Design and Support
Documentation: Creation of Application Infrastructure Design documents, Operational Runbooks, and Knowledge Base Articles
Fluent in multiple scripting languages, with advanced skills in programming languages (e.g. Go, Python, Ruby, Dart, Node, Java, and similar) and the ability to build test coverage for all software being developed.
Systems administration skills on Linux and Windows platforms
Networking skills and protocols (e.g. HTTP, TLS, SSH, DNS)
Software Development Continuous Integration (CI) Pipeline knowledge (e.g. Jenkins, GitLab CI)
Experience with Distributed Systems and Container Platforms (e.g. Kubernetes/GKE, ECS, Mesos, Fargate, Nomad)
Experience with Source Control Management systems (e.g. Git)
Expertise in public and private cloud hosting services (AWS, Google Cloud, Azure)
Recognized as an expert on at least one OS and proficient in multiple operating systems, including OS performance monitoring, setup, configuration, tuning, and troubleshooting.
Proficient in web server technologies (e.g. Apache, Node.js, NGINX, Tomcat, IIS, Caddy Server) including setup, configuration, performance monitoring, tuning, clustering, and debugging (e.g. JConsole).
Proficient with data technologies (e.g. NoSQL, MySQL, MongoDB, Redis, Elastic) including being able to perform basic setup, configuration, and troubleshooting.
Able to implement existing base standards for new systems and/or applications for all of the following:
o Site/Systems monitoring and instrumentation
o Application monitoring and instrumentation
o System monitoring and instrumentation
o Resilience, performance, and telemetry data
Able to diagnose simple to complex system and process problems.
Demonstrate exceptional troubleshooting methodology, including the ability to author and instruct new methodologies to the SRE team.
Independently resolve moderately to highly complex system and application incidents.
Able to identify and propose system and application fixes for performance bottlenecks.
Able to evaluate new application requirements for capacity and run-time best practices.
Able to evaluate new systems and/or infrastructure solutions for technical feasibility against known requirements and standards.
Effective at dealing with change: Able to transition in role or handle a significant modification or technology with minimal ramp-up time and with very little guidance.
Communication and Leadership Requirements
Excellent verbal and written communication to all levels in the organization.
Demonstrates curiosity and continuous learning and self-improvement.
Ability to write operational specs, architectural diagrams, and test plans, and to manage requirements.
Communication of ideas and solutions in a clear and organized manner.
Clear and effective presentations to groups of people, including internal and external conference presentations.
Construction of concise and complete technical documentation.
Mentoring of other engineers on technical material.
Able to quickly and adeptly understand the needs of the business and be able to translate those needs into actionable items.
BS in Computer Science or related field with 5+ years of experience
Strong communication skills
Most important Technical skills to have:
- CI/CD Pipelines
- GitHub, GitLab
- Cloud environment experience
- Scripting in multiple languages
- Terraform