Machine Learning Infrastructure Engineer

Job Description

ML & Cloud Infrastructure Engineer

London

Up to £150,000


About the Role

A cutting-edge AI start-up is pioneering the development of frontier 3D foundation models, pushing the boundaries of computer vision and spatial computing. Their mission is to redefine how industries such as robotics, AR/VR, gaming, and film generate and interact with 3D content.

They are now seeking an ML & Cloud Infrastructure Engineer to join their growing team.


This is a unique opportunity to work at the forefront of AI innovation, building the infrastructure that underpins complex ML workloads and production systems. You’ll play a central role in scaling the company’s platforms and ensuring their pioneering technology reaches its full potential.


Key Responsibilities

  • Develop and maintain scalable, high-performance cloud-based infrastructure for ML workloads and API deployment.
  • Manage and optimize cloud platforms (AWS, Azure, GCP) and set up ML nodes for local and distributed training.
  • Install, configure, and monitor servers, ensuring system reliability.
  • Design and optimize storage solutions for large-scale ML datasets.
  • Manage containerized applications with Docker, Kubernetes, Terraform, and related tools.
  • Collaborate with ML engineers and researchers to ensure seamless orchestration of training and production environments.
  • Troubleshoot and respond to cloud/production incidents, implementing long-term solutions.


What We’re Looking For

  • At least 3 years of professional experience in a cloud-related engineering role (ML-related experience highly desirable).
  • Strong scripting skills (Bash, PowerShell, Python, etc.) for automation.
  • Proven expertise in at least one major cloud platform (AWS, GCP, or Azure).
  • Experience with containerization and orchestration (Docker, Kubernetes).
  • Ability to manage and optimize large-scale cloud infrastructure.
  • Familiarity with Python (Jupyter) and ML frameworks (e.g., PyTorch).
  • Experience with cloud monitoring tools (Prometheus, Grafana).
  • Exposure to cloud-based databases (RDS, Aurora, Spanner, etc.) and data-visualisation tools.
  • Knowledge of CI/CD tools (e.g., CircleCI).

Location: City of London
Job Type: Full-time
Category: Technology
