Sr./Lead Site Reliability Engineer at Federal Reserve Bank (FRB)

Posted in Information Technology about 3 hours ago.

Type: Full-Time
Location: San Francisco, California

Job Description:

Company: Federal Reserve Bank of San Francisco

We are the Federal Reserve Bank of San Francisco—public servants with a mission to advance the nation’s monetary, financial, and payment systems to build a stronger economy for all Americans. We are a community-engaged bank, and are committed to understanding and serving the vibrant, expansive communities of the Twelfth District. That means we seek and appreciate new perspectives. We respect people for what they do and for who they are. We build opportunities to learn and grow. When you join the SF Fed, you become part of a diverse team united in its purpose to promote an economy that works for everyone.

As a Sr./Lead Site Reliability Engineer, you will work with the Cash Application Delivery Services (ADS) development, QA, DevOps, and National IT teams to manage the systems that support the Cash ADS application suite, both on-prem and in the cloud. Your main focus will be to ensure that all of our applications operate optimally and that every aspect of each application is monitored, so that issues can be diagnosed and resolved quickly as they arise.

We empower our people to balance their life and work responsibilities. That’s why we offer a flexible hybrid work model that allows you to collaborate with office colleagues on some days, and work from home on others.

Responsibilities:

Establish and run playbooks to support the resolution of incidents that occur in production environments.

Help design dashboards for effective monitoring of infrastructure resources in cloud environments.

Work with development teams to establish Service-Level Objectives and key Service-Level Indicators

Conduct Production Readiness Reviews to ensure services meet accepted standards of operational readiness before going live.

Ensure infrastructure aligns with security standards, assist in audits, and implement recommended practices to protect data and systems.

Facilitate the design and implementation of Disaster Recovery plans, including backups, failover, and recovery mechanisms, with the development and DBA subject matter experts.

As one of the SREs, drive improvement opportunities in infrastructure, tooling, and workflows using a continuous feedback loop between development and CloudOps

Ensure uptime and reliability of cloud-based infrastructure and systems by monitoring system performance and maintaining high availability of cloud-based assets.

Participate in incident response and troubleshooting by conducting root cause analysis and implementing solutions to prevent recurrence.

Establish thresholds for cloud-based services and capabilities, and set up and maintain monitoring systems to detect issues before they impact users.

Configure alerts for system anomalies, develop monitoring dashboards, and monitor resource usage, latency, and error rates.

Analyze system performance, establish metrics and thresholds, optimize service uptime, reduce latency, and improve customer experience through infrastructure modifications, configuration tuning, and similar measures.

Knowledge of technical troubleshooting approaches, tools and techniques, and the ability to anticipate, recognize, and resolve technical (hardware, software, application or operational) problems

Working experience in programming and scripting languages

Working tooling experience in Ansible, GitLab, Terraform, CloudWatch, Dynatrace, Grafana or equivalent is a must

Qualifications:

Bachelor’s degree in Computer Science, Information Systems, Computer Engineering, Systems Analysis or a related field or equivalent work experience

The Lead Site Reliability Engineer role typically requires 7+ years of industry experience building and supporting enterprise-level systems as a platform engineer or equivalent in a production environment. The Senior Site Reliability Engineer role typically requires 5+ years of hands-on experience implementing, supporting, and using the tools and services required for software orchestration, environment monitoring, and management (DevOps) best practices. Experience with Ansible, GitLab, Terraform, CloudWatch, Dynatrace, and Grafana is required.

2+ years hands-on experience with AWS services - building, deploying, and monitoring using AWS tools and services such as AWS Lambda, AWS CloudWatch, and AWS X-Ray

Must be a U.S. Citizen or a Green Card holder with the intent to become a U.S. Citizen

Base Salary Range: Sr. Site Reliability Engineer: Min: $113,600 - Mid: $147,600 - Max: $181,600 (Location: San Francisco)

Base Salary Range: Lead Site Reliability Engineer: Min: $138,900 - Mid: $180,400 - Max: $221,900 (Location: San Francisco)

Final salary and offer will be determined by the applicant’s background, experience, skills, internal equity, and alignment with market data.

We offer a wonderful benefits package including: Medical, Dental, Vision, Pre-tax Flexible Spending Account, Backup Child Care Program, Pre-Tax Day Care Flexible Spending Account, Paid Family Care Leave, Vacation Days, Sick Days, Paid Holidays, Pet Insurance, Matching 401(k), and Retirement/Pension.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, perform essential job functions, and receive other benefits and privileges of employment. The SF Fed is an Equal Opportunity Employer.

Full Time / Part Time: Full time
Regular / Temporary: Regular
Job Exempt (Yes / No): Yes
Job Category: Information Technology
Work Shift: First (United States of America)

The Federal Reserve Banks believe that diversity and inclusion among our employees is critical to our success as an organization, and we seek to recruit, develop and retain the most talented people from a diverse candidate pool. The Federal Reserve Banks are committed to equal employment opportunity for employees and job applicants in compliance with applicable law and to an environment where employees are valued for their differences.

Always verify and apply to jobs on Federal Reserve System Careers (https://rb.wd5.myworkdayjobs.com/FRS) or through verified Federal Reserve Bank social media channels.
