Site Reliability Engineer
Capital on Tap was founded with the mission to help small business owners and make their lives easier. Today, we provide an all-in-one business credit card & spend management platform that helps business owners save time and money. Capital on Tap proudly serves over 200,000 businesses across the world and our goal is to help 1 million small businesses by 2030.
We empower you to be innovative and solve complex problems. Take ownership, make an impact, and thrive in our scaling and agile environment.
This is a hybrid role: the SRE team works from our London (Shoreditch) offices 1-2 days per week.
SRE at Capital on Tap
At Capital on Tap, we run a hybrid embedded SRE model. We aim to work closely with teams across Capital on Tap to give them the best support. Our main objective currently is to gain as much visibility as possible into our platform's health while offering scalable solutions.
What You'll Be Doing
As a Site Reliability Engineer (SRE), you will ensure the reliability, performance, and availability of our platforms. Your role includes designing, building, and monitoring systems to maximise uptime and efficiency while collaborating with the Platform teams to build reliable, scalable applications. You will also proactively address potential outages and performance issues by implementing structured monitoring and alerting. Finally, you will help determine when new features launch, using service-level agreements (SLAs) to define the required reliability of the platforms, expressed through service-level indicators (SLIs) and service-level objectives (SLOs).
- Manage and automate Azure, Datadog, NGINX & Cloudflare
- Develop and monitor Kubernetes and Serverless resources
- Maintain infrastructure code with Terraform & CRDs / Crossplane
- Improve systems, processes, and technologies; consult stakeholders to enhance platform performance
- Get involved in new application architecture & design processes
- Design solutions to reduce toil
- Create SLIs and SLOs; increase application visibility
- Align with the Product team on SLAs and core service objectives
- Collaborate with Platform Engineers for automated solutions and pipelines
- Enhance user experience with infrastructure and pipeline optimisation
- Support CI/CD tools such as Azure DevOps, Octopus Deploy and Flux to streamline software delivery
- Lead incident troubleshooting to safeguard customer experience
We're Looking For
- Experience in managing a public cloud (Azure advantageous)
- Experience in Azure DevOps, Octopus, Flux or other CI/CD tools
- Experience in Go (preferred), PowerShell (preferred), Python, C# or other scripting languages
- Experience with Linux and Microsoft Systems
- Excellent communication skills and ability to collaborate with multiple teams in an agile environment
- Proficiency with IaC technologies, including writing, managing, and optimising infrastructure with tools such as Terraform and Pulumi
- Experience working with a cloud monitoring solution (Datadog advantageous)
- Experience with Kubernetes and Docker
- Experience with Chaos Engineering practices
- Experience with IDPs
- Experience with software cataloguing
- Experience with observability and tracing best practices
Diversity & Inclusion
We welcome, consider and encourage applications from anyone who shares our commitment to inclusivity. Join us in creating a space where authenticity thrives, and everyone can do their best work.
Great Work Deserves Great Perks
We try not to take ourselves too seriously (all the time) so we make sure our office is decked out with a pool table, arcade machine, beer tap, and a couple of office dogs thrown in for good measure. Check out our benefits:
- Private Healthcare including dental and optician services through Vitality
- Worldwide travel insurance through Vitality
- Anniversary Rewards (£250, £500, £750, 4-week fully paid sabbatical)
- Salary Sacrifice Pension Scheme up to 7% match
- 28 days holiday (plus bank holidays)
- Annual Learning and Wellbeing Budget
- Enhanced Parental Leave
- Cycle to Work Scheme
- Season Ticket Loan
- 6 free therapy sessions per year
- Dog Friendly Offices
- Free drinks and snacks in our offices
Interview Process
First stage: 30-minute intro and values call with Talent Partner (Video call)
Second stage: 45-minute CV overview with Head of Department & Engineering Team Leads and/or PM (Video call)
Final stage: 60-minute questions and scenario-based interview with SRE Team Lead (Video call)
- Location: City of London, England, United Kingdom
- Salary: £100,000 - £125,000
- Job Type: Full-time
- Category: Engineering