Lead Site Reliability Engineer at Federal Reserve Bank (FRB)

Posted in Information Technology 3 days ago.

Type: Full-Time
Location: San Francisco, California





Job Description:

Company: Federal Reserve Bank of San Francisco

We are the Federal Reserve Bank of San Francisco—public servants with a mission to advance the nation’s monetary, financial, and payment systems to build a stronger economy for all Americans. We are a community-engaged bank, and are committed to understanding and serving the vibrant, expansive communities of the Twelfth District. That means we seek and appreciate new perspectives. We respect people for what they do and for who they are. We build opportunities to learn and grow. When you join the SF Fed, you become part of a diverse team united in its purpose to promote an economy that works for everyone.

The Federal Reserve Bank of San Francisco is looking for a Lead Site Reliability Engineer (DevOps Expert) to join the Enterprise Architecture and Integrated DevOps Team. We empower the Federal Reserve business technology landscape by guiding the development and management of solutions built on cloud-centric strategic platforms. This is an exciting opportunity to help design, develop, deploy, and support an automation framework that enables IaaS capabilities across AWS Cloud. The role requires collaborating closely with cross-functional teams to translate infrastructure architectures into automated, scalable, cloud-native Integrated DevSecOps solutions. You will standardize deployment and data models to support rapid scaling with multi-tenancy and self-service functionality across our cloud services.

We empower our people to balance their life and work responsibilities. That’s why we offer a flexible hybrid work model that allows you to collaborate with office colleagues on some days, and work from home on others.


Responsibilities:


  • Design, implement, and maintain scalable, highly available, and secure infrastructure on AWS. Handle and optimize AWS services, including EC2, EKS, and other related services.

  • Work with internal collaborators and customers on planning, delivery, and service management.

  • Co-own ongoing ITIL processes, and implement and drive continuous improvement initiatives.

  • Build and maintain reliable, scalable systems and CI/CD tooling, and automate cloud-based, highly available, high-performing applications.

  • Design and deploy robust cloud infrastructure and container solutions, focusing on reliability, scalability, and performance

  • Create and use an automation framework to streamline IaaS provisioning and configuration across cloud environments, enabling efficient scaling and operational consistency.

  • Implement/leverage observability, monitoring, and SRE principles (e.g., error budgets, proactive incident management) to enhance system reliability and performance

  • Apply FinOps practices to monitor and optimize cloud resource usage, ensuring cost-effective operation across all environments

  • Guide engineering teams, fostering standard processes in cloud engineering, SRE, and automation

  • Adopt security standard processes within cloud infrastructure, applying secure design patterns and ensuring alignment with industry standards and regulatory requirements.

  • Proactively identify potential vulnerabilities and lead initiatives to ensure systems are prepared for rapid recovery, minimizing impact from disruptions

  • Contribute to the strategy for continuous monitoring and performance tuning of cloud systems to enhance efficiency and reliability. Use data-driven insights to identify optimization opportunities, address performance bottlenecks, and ensure cloud resources meet evolving business demands.

  • Design and manage microservices architecture for high performance and scalability.

  • Monitor and maintain system performance using cloud monitoring tools.

  • Collaborate with development teams to integrate applications into the CI/CD pipeline.

  • Automate configuration management and deployment processes


Qualifications:


  • Bachelor’s degree in Computer Science, Information Technology, or other related technical degree.

  • Typically requires 7+ years of experience with AWS services, container orchestration, infrastructure as code, and continuous integration/continuous deployment (CI/CD) processes.

  • Experience with microservices architecture, and integration of programming languages into CI/CD pipelines

  • Experience developing, customizing, and scaling cloud monitoring tools

  • Technical/functional expertise in tooling for ITIL, Agile, Project Management and SDLC

  • Experience supporting infrastructure for large multi-services applications

  • Familiarity with fault injection tooling (e.g., AWS Fault Injection Simulator, Gremlin, ChaosToolkit, Chaos Monkey)

  • Knowledge of standard methodologies for chaos engineering processes and implementation (chaos gamedays, business-critical KPIs, etc.)

  • Excellent problem-solving skills and ability to work in a fast-paced environment.

  • Good interpersonal skills and ability to collaborate effectively with multi-functional teams.

  • AWS Certified DevOps Engineer or similar certification.

  • Must be a U.S. citizen or a Green Card holder with intent to become a U.S. citizen

Base Salary Range Lead Site Reliability Engineer: Min: $138,900 - Mid: $180,400 - Max: $221,900 (Location: San Francisco)

Final salary and offer will be determined by the applicant’s background, experience, skills, internal equity, and alignment with geographic and other market data.

We offer a wonderful benefits package including: Medical, Dental, Vision, Pre-tax Flexible Spending Account, Backup Childcare Program, Pre-Tax Day Care Flexible Spending Account, Paid Family Care Leave, Vacation Days, Sick Days, Paid Holidays, Pet Insurance, Matching 401(k), and Retirement/Pension.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, perform essential job functions, and receive other benefits and privileges of employment.

The SF Fed is an Equal Opportunity Employer.


Full Time / Part Time: Full time
Regular / Temporary: Regular
Job Exempt (Yes / No): Yes
Job Category: Information Technology
Work Shift: First (United States of America)

The Federal Reserve Banks believe that diversity and inclusion among our employees is critical to our success as an organization, and we seek to recruit, develop and retain the most talented people from a diverse candidate pool. The Federal Reserve Banks are committed to equal employment opportunity for employees and job applicants in compliance with applicable law and to an environment where employees are valued for their differences.

Always verify and apply to jobs on Federal Reserve System Careers (https://rb.wd5.myworkdayjobs.com/FRS) or through verified Federal Reserve Bank social media channels.





