Site Reliability Engineer

Company:  Kobiton
Location: Atlanta
Closing Date: 02/11/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description

What we do:

Kobiton empowers enterprises to accelerate mobile app delivery through manual, automated, and no-code testing on real devices. Kobiton’s AI-augmented mobile testing platform uniquely delivers one-hour continuous testing and integration. Founded in 2016, Kobiton is venture-backed and headquartered in Atlanta. 

At Kobiton, we care a lot about experience - the experience we provide our enterprise customers, the experience our platform enables them to provide their users, and especially the experience we provide our internal customers - our employees. As we empower enterprises to deliver a better mobile experience, we strive to empower our employees by delivering a better work experience. 

We do this by committing to transparency (no, really) and a culture of collaboration, curiosity, and action in which we strive to work well together on things that actually matter. We also offer benefits that actually matter, including company-paid employee health benefits, self-managed PTO (otherwise known as unlimited PTO), and an annual stipend for employee development through our Growbiton program.

Kobiton was ranked the 18th Fastest-Growing Company in North America on the 2022 Deloitte Technology Fast 500™ and was named one of Georgia’s Top 40 Technology Companies in 2022 and 2023. As one of Atlanta’s “Best & Brightest,” we’re searching for the best and brightest to join our team and help us continue to deliver the best experiences - internally and externally.

What you’ll do:

As a Site Reliability Engineer at Kobiton, you will be responsible for ensuring the reliability, performance, and scalability of our systems and services. You will work closely with development and operations teams to build and maintain robust infrastructure, automate processes, and troubleshoot complex issues. Your role is crucial in providing a seamless and reliable experience for our customers.

Key Responsibilities:

  • System Reliability and Performance: You’ll help us monitor and maintain the reliability, availability, and performance of our data center and AWS cloud systems. You’ll implement and manage systems to detect and resolve performance issues proactively.
  • Automation and Tools Development: You’ll help automate repetitive tasks and processes to improve efficiency and reduce manual intervention. You’ll also develop and maintain tools and scripts for system monitoring, deployment, and maintenance.
  • Infrastructure Management: We strive to create remotely managed systems and infrastructure. You will have the opportunity to define and create approaches to automation, failover, and automated recovery systems.
  • Collaboration: We work closely with our development teams to integrate reliability and performance best practices into the software development lifecycle. You’ll provide guidance and support to ensure that applications are designed for reliability and scalability.
  • Capacity Planning and Scaling: We pay attention to the performance (and cost) of our systems. You’ll analyze system capacity, forecast future needs, and implement scaling strategies to handle growth and ensure system performance under varying loads.
  • Building custom ISOs: Kobiton offers proprietary software on a range of different hardware configurations, giving you the opportunity to work on a variety of Linux distributions, creating bootable ISO infrastructure as well as creating different hardware platforms for customers.

Requirements:

  • Technical Expertise: You’ll need to have strong experience in systems administration, infrastructure management, and cloud platforms, along with proficiency in scripting and automation tools such as Ansible and Bash. Ideally, you will also have experience with GitHub Actions, Terraform, troubleshooting and incident recovery, hardware management, and configuration as code.
    • Experience with containers and container orchestration systems (e.g., Docker, Kubernetes).
    • Understanding of microservices architecture and distributed systems.
    • Experience using Nexus and GitHub to automate build and deployment pipelines.
    • Experience managing DNS, DHCP, and virtualization platforms (VMware / Proxmox).
  • Monitoring and Incident Management: Experience with monitoring tools such as AWS Grafana Suite and incident management best practices. 
  • Collaboration Skills: You’ll have strong communication and collaboration skills, with the ability to work effectively with cross-functional teams. You’ll also be expected to provide technical support and guidance across the organization.
  • Problem-Solving Abilities: You’ll bring excellent analytical and problem-solving skills with a proactive approach to identifying, resolving, and preventing recurring issues.

Benefits

  • 100% company-paid Medical, Dental, & Vision insurance for you and 80% company-paid coverage for your family. 
  • Self-Managed Paid Time Off (aka Unlimited PTO).
  • 401(k) Retirement Plan.
  • $1,000 annual stipend for professional development through our Growbiton program. 
  • Paid Parental Leave Program, available from day one. 
  • Free access to coffee at Bellwood Coffee Shop and to the fitness center in the 1776 office.
  • Quarterly Culture program that provides a variety of team-building, social, educational, and wellness events for all team members on the third Wednesday of each month.

Kobiton is proud to be an equal opportunity employer. We care about our people and celebrate our differences. We want to work with talented, collaborative, and innovative people. We do not discriminate in hiring or any employment decision based on race, color, religion, national origin, age, sex (including pregnancy, childbirth, or related medical conditions), marital status, ancestry, physical or mental disability, genetic information, veteran status, gender identity or expression, sexual orientation, or other characteristics protected by law.
