Senior Site Reliability Engineer

Overview

Senior Site Reliability Engineer, Knutsford (hybrid, 2 days per week in office). A leading Financial Services firm is recruiting a Senior Site Reliability Engineer to join a newly formed Core SRE Team that will establish a Centre of Excellence to enhance and promote SRE best practices.

About the Role

As a key hire, you will raise awareness and drive adoption of SRE methodologies within various teams. This is a hands-on engineering role where you will design, build, and optimise automation frameworks, observability tools, and incident response mechanisms. You will act as a trusted advisor, providing strategic guidance and consultative support to help teams improve reliability, scalability, and efficiency.

Responsibilities

  • Ensure the availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
  • Respond to, analyse, and resolve system outages and disruptions, and implement measures to prevent similar incidents from recurring.
  • Develop tools and scripts that automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
  • Monitor and optimise system performance and resource usage, identifying bottlenecks and applying best practices for performance tuning.
  • Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure smooth and efficient operations.

Required Skills

  • Proficiency in Programming and Scripting: languages such as Python, PowerShell, or Go for automating routine tasks and system deployments.
  • Incident Management and Troubleshooting: ability to manage incidents effectively, troubleshoot issues swiftly, and perform root cause analysis to prevent future incidents.
  • Systems Engineering and Automation: understanding of operating systems, networking, and cloud infrastructure; proficiency in automation tools for maintaining system reliability at scale.
  • Influential Communication Skills: ability to communicate effectively with team members and stakeholders to drive alignment and foster a collaborative environment for SRE practices.
  • Knowledge of Cloud Computing: familiarity with cloud platforms and services as infrastructure moves to the cloud.

Seniority level

  • Mid-Senior level

Employment type

  • Full-time

Job function

  • Information Technology

Industries

  • Technology, Information and Media
  • Financial Services
Location: Knutsford, England, United Kingdom
Salary: £80,000 - £100,000
Job Type: Full-time
Category: Engineering
