Senior Site Reliability Engineer

Overview

Job Description

About the Role
A global leader in financial services is seeking a Site Reliability Engineer (SRE) to join its Cybersecurity function, specifically within the Data Loss Prevention (DLP) team. This role focuses on transitioning from traditional monitoring to a modern observability model built on telemetry, distributed tracing, actionable metrics, and SLO/SLI-driven alerting. It's ideal for candidates looking either to deepen their cybersecurity career or to transition into this space from a related discipline.

Location & Culture

The Glasgow office employs over 1,500 staff across multiple functions. It offers a centrally located, modern workspace with on-site amenities such as a gym and restaurant. The organization fosters a culture of meritocracy and gives back to the local community through charitable partnerships.

Key Responsibilities

  • Write, review, and optimize PromQL queries for Prometheus.
  • Operate and troubleshoot Prometheus in agent mode, ensuring efficient metrics collection.
  • Design and refine Grafana dashboards based on observability best practices such as the RED method or the Four Golden Signals.
  • Develop and optimize Splunk dashboards for effective log analytics and incident response.
  • Streamline and revise alerting rules (including PagerDuty orchestration) to reduce false positives and improve signal-to-noise ratio.
  • Collaborate with DLP squads to align alerting standards with SRE best practices.
  • Leverage telemetry data to generate actionable insights that improve DLP platform performance.
  • Participate in a 24/7 on-call rotation for DLP product support.

What You'll Bring

  • Strong critical thinking skills and a proactive mindset.
  • Proven experience setting up SLOs and managing error budgets.
  • Deep knowledge of SRE principles and operational best practices.
  • 3+ years of experience with Prometheus, PromQL, and exporters.
  • 3+ years of hands-on experience with Grafana and Splunk.
  • Proficiency in observability tools and dashboard creation.
  • Fluency in at least one programming or scripting language.
  • Experience working in CI/CD environments (e.g., Jenkins, Bitbucket).
  • Exposure to cloud platforms (e.g., AWS) and Unix/Linux systems.

Nice to Have

  • Familiarity with incident, problem, or change management tools.
  • Automation and scripting for operational efficiency.
  • Experience with DLP products or broader cybersecurity environments.
  • Agile team experience and understanding of OpenTelemetry standards.

Why Join Us

  • Work at the forefront of cybersecurity within a global organization.
  • Join a collaborative and forward-thinking technology team.
  • Enjoy a flexible working model and strong support for career development.
  • Be part of a company culture that values integrity, excellence, and work-life balance.

Interested? Apply now or contact our recruitment team to learn more about how this role can take your career to the next level in observability and cybersecurity.

We are committed to creating an inclusive recruitment experience. If you have a long-term health condition and require adjustments to the recruitment process, our Adjustment Concierge Service is here to support you. Please reach out to us at adjustments@robertwalters.com to discuss further.

This position is being recruited on behalf of our client through our Outsourcing service line. Resource Solutions Limited, trading as Robert Walters, acts as an employment business and agency, partnering with top organizations to help them find the best talent. We welcome applications from all candidates and are committed to providing equal opportunities.

Location: Glasgow
Job Type: Full-time
Category: Engineering
