Site Reliability Engineer - 12 Month FTC (we have office locations in Cambridge, Leeds and London)

Are you driven by a deep curiosity about how complex distributed systems work and, more importantly, how they fail? Do you believe reliability is the most critical feature of any service? We're looking for a Site Reliability Engineer to ensure our platform is not just running, but is sustainably reliable, scalable, and resilient.

As an SRE advocate, you will actively collaborate with engineering squads to cultivate a culture of reliability. You will play a pivotal role in driving our technical evolution, influencing and shaping platform practices across the organisation.

Your responsibilities will include automating and optimising infrastructure to improve workload throughput. You will focus on implementing proactive measures to anticipate and address potential issues before they impact our users.

What You'll Be Doing Day-to-Day:

Champion Reliability: Work with engineering teams to define and measure what matters to our users, establishing and monitoring SLIs, SLOs, and error budgets that drive data-informed decisions.
Learn from Failure: Be involved in blameless post-incident reviews that focus on identifying contributing factors, ensuring we turn every failure into a valuable opportunity for systemic improvement.
Eliminate Toil: Systematically identify and automate repetitive, manual, and tactical operational processes.
Build Resilient Systems: Design, build, and maintain robust infrastructure across AWS and on-prem environments using Infrastructure as Code and automation.
Enable Developer Velocity: Develop CI/CD pipelines, release automation, and platform tooling that help our engineering squads deploy changes safely and efficiently, without sacrificing reliability.
Share Your Knowledge: Create clear, usable documentation and act as a consultant and advocate for SRE and DevOps best practices.

What You'll Bring:

Mindset & Approach: Deep-Seated Curiosity, Systems Thinker, Relentlessly Collaborative, Incident Responder, Views Failure as an Opportunity, Customer-Focused.
Technical Experience: Experience applying Site Reliability Engineering principles, strong hands-on experience with AWS services, deep understanding of distributed systems, experience with capacity planning, performance engineering, and designing systems that scale.
Nice to Haves: Exposure to new tech evaluation, lean experimentation, or platform tooling decisions, experience mentoring or sharing knowledge across teams, understanding of genomics, HPC, data-heavy workloads, or regulated environments.

Genomics England partners with the NHS to provide whole genome sequencing diagnostics. Our mission is to continue refining, scaling, and evolving our ability to enable others to deliver genomic healthcare and conduct genomic research.

We offer a competitive salary from £71,300 and a benefits package including generous leave, family-friendly arrangements, pension and financial benefits, learning and development opportunities, recognition and rewards, and health and wellbeing support.

Genomics England is actively committed to providing and supporting an inclusive environment that promotes equity, diversity and inclusion best practice. We are proud of our diverse community where everyone can come to work and feel welcomed and treated with respect.
Location:
Leeds, England, United Kingdom
Job Type:
Full-time
