Production Reliability Engineer


This is an opportunity for a technical, problem-solving SRE to join a leading global fintech that is currently growing its presence in Europe.

Role Responsibilities:
- Investigate, troubleshoot and diagnose incidents.
- Provide first- to third-line investigation and diagnosis of incidents and Service Requests.
- Be the Incident Coordinator for operational incidents on the core engineering production platform. This includes all internal technical communications, ensuring processes are followed, and all post-incident follow-up and analysis.
- Escalate incidents or Service Requests that require system, config or code changes to the appropriate on-call engineer.
- Manage engineering Service Requests, prioritizing them according to urgency and impact and ensuring they are serviced in a timely manner.
- Work with engineers to establish or update the runbooks and procedures needed for handling incidents and Service Requests.
- Develop and maintain the knowledge base and respond to customers' technical questions.
- Actively monitor integration endpoints and external programmatic dependencies (i.e. venue APIs).
- Maintain scripts, dashboards and other programmatic tools, whether acquired or built in-house.

Qualifications & Required Skillset:
- Ability to diagnose and troubleshoot technical issues both offline and in real time.
- Ability to handle multiple priorities and deal with ambiguity.
- Experience with incident and problem management processes.
- Experience working in an Application Support, DevOps or SRE role (preferably within Trading & Risk Management systems).
- Experience communicating with customers as well as with senior software engineers.
- Experience with Python, PostgreSQL and Unix.
- Experience writing intermediate to advanced SQL queries for data extraction and troubleshooting purposes.
- Experience using and troubleshooting programming interfaces, especially REST APIs and WebSockets.
- Experience with monitoring tools (Grafana, Datadog).
- Experience working with crypto and blockchain (DLT).
- Familiarity with common engineering development workflows and tools (e.g. JIRA, Confluence, GitHub, Scrum).
- Familiarity with the scaling, monitoring, and general production challenges of real-time (banking) systems.
- Familiarity with financial services infrastructure and processes (e.g. ITIL) and related systems in an SRE or DevOps capacity.
- Familiarity with AWS cloud infrastructure and processes.
- Familiarity with release management processes and the SDLC using agile methodologies and best practices.
- Motivated by working with people and solving their problems.
- Understanding of basic programming constructs (loops, conditionals, data types, regular expressions), with the ability to write and read non-trivial production and operational scripts.
Location:
London Area, United Kingdom
Job Type:
Full Time
Category:
Financial Services
