Senior Site Reliability Engineer, Observability

Overview

Chainlink Labs is hiring a Senior Site Reliability Engineer, Observability.

About us: Chainlink Labs is the primary contributing developer of Chainlink, the decentralized computing platform powering the verifiable web. Chainlink is the industry-standard platform for providing access to real-world data, offchain computation, and secure cross-chain interoperability across any blockchain. Chainlink Labs helps power verifiable applications for banking, DeFi, global trade, and gaming by collaborating with some of the world’s largest financial institutions, notably Swift, DTCC, and ANZ. Chainlink Labs also works with top Web3 teams, including Aave, Compound, GMX, Maker, and Synthetix. Newsweek ranked Chainlink Labs among its Global Top 100 Most Loved Workplaces in 2025.

The Observability Team enables Chainlink development and empowers engineers to continue building and supporting crucial products and services that have a profound impact in the blockchain industry. Reliability is vital to the success of our company. As a Senior SRE, you will help us accelerate and enable other engineering teams by increasing self-service and decreasing cognitive load.

This role is ideal for someone with a strong DevOps mindset who is passionate about building and maintaining a mature GitOps environment and has experience focusing on observability. The engineering team is expanding, offering opportunities to build, learn, and grow. We are committed to supporting diverse backgrounds and equal opportunity for all candidates.

Responsibilities

  • Build and orchestrate a modern OTEL-based observability platform
  • Support multiple telemetry types, including metrics, logs and traces
  • Define and support observability governance and address scaling issues
  • Ensure reliability, security, and performance exceed SLAs
  • Collaborate with engineers across the company to troubleshoot issues, deploy new products and services, and increase velocity while decreasing cognitive load
  • Lead the design and deployment of monitoring and observability services that detect issues and alert the team when action is needed
  • Ingest, aggregate, transform, and utilize data from multiple sources in real-time data pipelines
  • Oversee availability, performance, and supportability of observability infrastructure
  • Create processes around alert response operations and support the team to ensure reliable delivery of data
  • Recommend sufficient metrics for new feature releases to enable effective alerting
  • Champion reliability and security by doing work right the first time

Requirements

  • 7+ years of relevant professional experience in DevOps, infrastructure, SRE, or a related area
  • Ability to develop software beyond typical infrastructure requirements and configurations
  • Experience programming in C, C++, Java, Python, Go, Perl, or Ruby
  • Expert knowledge in designing, developing, and managing large real-time systems
  • Experience with monitoring and logging: Prometheus, Grafana, and centralized logging solutions (ELK Stack, Splunk, or Grafana Stack)
  • Experience with distributed systems and container orchestration, including Kubernetes clusters and deploying services on them
  • Strong communication skills and ability to participate in planning meetings and code reviews

Desired Qualifications

  • Interest in blockchain and Web3 technologies
  • Experience running infrastructure in the blockchain/web3 space
  • Ability to scale systems sustainably through automation and process improvements that reduce toil
  • Experience working remotely in a distributed team
  • Desire to continually improve and automate services

Tools and Services

  • AWS, Terraform/Terragrunt, Kubernetes, Calico, ArgoCD, Prometheus, Grafana, GitHub Actions, Packer

All roles with Chainlink Labs are global and remote-based. We ask that you try to overlap some working hours with Eastern Standard Time (EST).

We carefully review all applications and aim to respond to every candidate within two weeks after the job posting closes. The closing date is listed on the job advert. We encourage you to prepare your application thoughtfully. You will hear from us regarding the status of your application after the closing date.

Commitment to Equal Opportunity
Chainlink Labs is an equal opportunity employer. All qualified applicants will receive equal consideration for employment in compliance with applicable laws. If you need assistance or accommodations due to a disability or special need during the application or recruitment process, please contact us via this form.

Global Data Privacy Notice for Job Candidates and Applicants
Information collected as part of your Chainlink Labs Careers profile and job applications is subject to our Privacy Policy. By submitting your application, you agree to our use and processing of your data as required.

Seniority level: Mid-Senior level

Employment type: Full-time

Job function: Engineering and Information Technology

Industries: Technology, Information and Internet

Location: London

Job Type: Full-time

Category: Engineering
