Site Reliability Engineer

Site Reliability Engineer role at J BANDY CONSULTING LTD

We are hiring for a next generation telecoms software company who are seeking a Network Autonomy Engineer to join their expanding team.

Primary Function Of The Position
Reporting to the Site Reliability Engineer Team Lead, the Site Reliability Engineer will be responsible for ensuring the reliability, scalability and performance of our systems.

The Responsibilities Include
- Develop the Site Reliability Engineering culture across the team by applying best practices, approaches and code.
- Apply automation and propose/implement software for any tasks or parts of the system that would deliver benefit.
- Monitor application performance, identifying and implementing improvements to application performance and stability.
- Collaborate on the design and implementation of the desired pipelines and processes for deployment to the production environment.
- Work closely with Platform and Software domains to ensure continuous improvement of performance and stability whilst adhering to standards.
- Undertake ad-hoc projects and other activities as required.

Key Accountabilities And Activities
- Drive evolution of the DevOps/GitOps toolchain, promoting improvements to streamline the software delivery process and showing improvements through metrics.
- Accountable for halting or stopping a project/product if the solution is not technically acceptable.
- Responsible for producing and maintaining documentation relating to application design, integration processes, testing procedures, and deployment approach, as well as collaborating with teams to create operational runbooks and playbooks.

Integration With Domains Including
- Collaborating with Domains to plan, design, test and maintain the application.
- Designing patterns for any component or structure under SRE responsibility.
- Implementing components such as Monitoring and Logging.
- Managing the runbook preparations of Domains.

Liaise And Support Other Teams On Work Items Including
- Developing, refining, and tuning integrations between application elements.
- Collaborating with stakeholders in the Enterprise, Solution and Development teams to produce and maintain standards and guidelines.
- Knowledge sharing and education of team members across the organisation.
- Acting as first point of contact for the Problem Management and Process Outcomes team.

Build And Guide Successful SRE Efforts Including
- Analysing and resolving technical and application issues.
- Researching and evaluating software products.
- Evaluating risks and defects, analysing specifications, and customising applications for specific customer needs.
- Identifying complex and manual processes and working to simplify and automate them.
- Continuously reviewing capabilities and roles critical to evolving DevOps and quality assurance practices, and being responsible for the acquisition, development, and maturity of these.
- Minimising outages by continuous improvement.

Experience And Skills
- Experience and demonstrable knowledge of SRE best practices
- Expert in Git and GitOps
- Expert in logging and monitoring solutions (Prometheus, Grafana etc.)
- Demonstrable knowledge of Cloud
- Expert knowledge of Kubernetes
- Proficient ability to communicate in English (Written and Verbal)
- Understanding of non-functional testing
- Significant DevOps experience

Desirable
- Proven ability to work independently and collaboratively in a fast-paced technical environment.
- Demonstrable knowledge of the telecommunications industry and technologies.
- Proven experience and ability to provide support to direct reports.
- Golang skills and experience

Location:
City Of London, England, United Kingdom
Job Type:
Full-time
