Senior Site Reliability Engineer at Horizontal Talent

Location: Atlanta, Georgia

Job Description:

Our client is seeking a dynamic Senior Site Reliability Engineer to join their team. This role is ideal for a candidate well-versed in modern reliability disciplines, capable of driving cross-team reliability initiatives. These initiatives include enhancing our client's reliability engineering practices through increased application resiliency, improved uptime/availability, and optimized application performance.

Key Responsibilities:

- Setting SLOs/SLIs/error budgets and managing reliability for infrastructure and applications

- Scripting and automation with languages and tools such as JavaScript, Node.js, Python, Maven, Ansible, and Bash

- Handling diverse systems with configuration management systems like Puppet, Chef, Ansible

- Leveraging automation for toil elimination

- Using tools like PagerDuty for managing incidents

- Monitoring and alerting systems like Prometheus, Grafana, Dynatrace

- Working with standard networking protocols and components

- Working with the Serverless Application Framework

- Managing containerized workloads and platforms such as Docker or Kubernetes

- Familiarity with distributed systems, including microservices

- Infrastructure automation tools such as CloudFormation, Terraform

- Understanding of CI/CD processes and deployment automation tools

- Debugging, troubleshooting, and problem-solving

- Effective communication, collaboration, and negotiation skills

- Liaising with developers, operations staff, and third-party resources

- API integration projects

- Coaching/mentoring team members on reliability engineering aspects

Required Experience:

- 5+ years of experience in DevOps practices

- Hands-on experience with AWS Cloud and DevOps principles

- 2+ years of experience working with DevOps tools (GitLab CI, AWS CodePipeline)

- 2+ years of experience with scripting languages (Bash, Python, etc.)

- 1+ year of experience developing Node.js or TypeScript applications

- 2+ years of experience in building and supporting applications in AWS

- 1+ year of experience with AWS CDK

Preferred Experience:

- Experience with containerization technologies such as Kubernetes, OpenShift, and Docker

- Experience evaluating application resiliency using AWS FIS

- Experience using Litmus for chaos engineering

- Exposure to Red Hat OpenShift Service on AWS (ROSA)

As a lead engineer on our client's team, you will be at the forefront of Cloud and Big Data technology. This role will support highly available, business-critical applications and serve as the escalation point for complex issues in both on-premises and AWS environments. We are seeking talented engineers who are well versed in DevOps technologies, automation, infrastructure orchestration, configuration management, and continuous integration.

