Site Reliability Engineer

Overview

Duffel is building tools to simplify travel distribution, search and booking, delivering one common and seamless API for hundreds of airlines. We aim to redesign the infrastructure underpinning the travel industry and improve the reliability and developer experience as we scale globally.

Site Reliability Engineering at Duffel

As an SRE, you’ll be part of a small engineering team responsible for the reliability, performance, and resilience of our infrastructure and applications. You will work closely with engineering teams to understand their needs and help meet the demands of our product as we scale globally.

What we’re looking for

- An infrastructure and systems engineering generalist who is comfortable diving deep into the weeds on different issues.
    - A configuration issue between Google’s Load Balancer and the HTTP server in our main Elixir application causing HTTP 5XX responses to customers.
    - Debugging an issue in our OpenTelemetry pipelines causing us to silently drop spans.
- An enthusiasm for both software development and systems engineering.
- A high bar for code and configuration quality and readability.
- A good understanding of current observability and reliability practices.
- Experienced and comfortable in running incident response.
- Big picture thinking - ability to make trade-offs on technical work streams against business impact.
- Excellent communication skills; the ability to articulate what you’re working on and why to the team in a clear and structured way.
- Thrives in a collaborative environment, open to feedback and new ideas.
- Experience with Google Cloud Platform (GCP) products such as GKE, Cloud SQL for PostgreSQL, BigQuery, Memorystore (Redis), etc.
- Experience managing infrastructure and security for a PCI Cardholder Data Environment using Google Cloud Platform services and tooling.
- Experience with Infrastructure as Code (Terraform).
- Experience with a GitOps approach to Kubernetes using ArgoCD and Helm.
- Experience with high-availability metrics collection systems (Grafana, Thanos, Prometheus) and transitioning to OpenTelemetry and Honeycomb for traces and metrics.
- Experience with data pipelines (Pub/Sub, Airbyte, dbt).

Note: If your experience doesn’t exactly align with this stack, transferable skills will be considered. This description provides a sense of what you’ll be working with if you join the team.

What you can expect from us

We’re dedicated to your personal growth. Our environment is supportive, and we value ideas, concerns and questions. Everyone who joins Duffel owns a share of the company and takes pride in their work.

Equality and recruitment

We are an equal opportunities employer. We believe the key to our success is a diverse team; recruitment decisions are based on experience and skills. We welcome applications from everyone regardless of age, sex, disability, sexual orientation, race, religion or belief.

Note to recruitment agencies

Duffel does not accept speculative CVs from external parties. Unsolicited CVs will be treated as the property of Duffel; attached terms and conditions are null and void.

Roles and locations

Location: London, England, United Kingdom. This role is typically full-time and falls under the Seniority level: Mid-Senior level.

Location:
England, United Kingdom
Job Type:
Full-time