Site Reliability Engineer
As a Site Reliability Engineer, you will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices.
You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency.
Using your engineering expertise, you will implement solutions that enhance reliability, including instrumenting services with tools such as OpenTelemetry, improving logging practices and developing features for maintainability. You will also help engineer tools and automation for effective service management.
Collaboration is key: you will work across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user demands and enhance overall service performance.
This role is eligible for inclusion in the Company’s hybrid working from home policy.
Qualifications
- Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction.
- Knowledge of contemporary observability tools, techniques and best practices, including Splunk, New Relic, Grafana and PagerDuty.
- Excellent knowledge of programming languages including Python, Golang and JavaScript.
- Knowledge and experience of modern software development techniques and lifecycles.
- Experience with Infrastructure as Code (IaC) automation and orchestration tools such as Ansible and Terraform.
- Prior experience working in a large-scale, 24/7 enterprise where system uptime and stability are of paramount importance to the Business.
- Keen interest in industry trends, particularly Platform Engineering.
- Proficiency in shell scripting for automation and system management tasks.
Additional Information
- Writing and contributing to code that enhances the reliability and observability of services, including telemetry, operational APIs and tooling.
- Developing and maintaining tools that facilitate effective management of our systems, ensuring they are operationally efficient and resilient.
- Working with automation and orchestration platforms to automate manual activity and reduce toil.
- Building sophisticated dashboards using a range of telemetry data and dashboarding technologies like Grafana, Splunk and New Relic.
- Maintaining and administering existing monitoring and analytic toolsets.
- Mentoring colleagues in the use of new technologies and practices.
- Actively participating in live incident resolution and post-mortem analysis, providing effective remediation strategies to improve overall system health and prevent future issues.
- Driving initiatives to enhance system reliability and observability, contributing to a culture of continuous improvement.
- Collaborating with the central Site Reliability Engineering and Observability teams to establish and uphold standards for reliability and observability, assisting teams in adhering to these practices.
- Working with IT Operations, providing and supporting the use of critical tooling to enable increasing levels of value to the Business.
By applying to us you are agreeing to share your Personal Data in accordance with our Recruitment Privacy Notice - http://www.bet365careers.com/privacypolicy.pdf
At bet365, we're committed to creating an environment where everyone feels welcome, respected and valued, and where all individuals can grow and develop, regardless of their background. We're Never Ordinary, and we're always striving to be better. If you need any adjustments or accommodations to the recruitment process, at either application or interview, please don’t hesitate to reach out.
- Location: Manchester
- Job Type: Full Time