Staff AI Backend Engineer AWS / Node.js / TypeScript
Overview
Role: Staff AI Backend Engineer AWS / Node.js / TypeScript
Company: Serve First CX
Location: Hybrid, based in Milton Keynes (minimum 3 days/month in office)
Team: Engineers across UK, US, India, and Philippines
Reports to: CTO (US)
Works closely with: Head of Engineering (India)
Why Serve First
We’re a scrappy, well-funded (£4.5 million seed closed) AI startup turning raw customer feedback into real-time insight for businesses that care about CX. Our 2025 roadmap is ambitious: break apart our Node.js monolith into microservices, double our AI-driven workflows, and harden infrastructure for 100× traffic. Everyone ships. Everyone is on-call. Bureaucracy is nil. Velocity is high.
What You'll Do
- Break up the monolith: Define service boundaries and lead the transition to a microservices architecture. Implement REST + SQS communication between services, each containerized and run on ECS Fargate. You'll design services that scale, not snowball.
- Own AI integration: Build features using OpenAI APIs today, and pave the path for tomorrow: private model deployment, vector DBs, prompt orchestration frameworks, and usage monitoring. Lead the evolution toward multi-model support, caching layers, and Bedrock/RAG-native infrastructure.
- Build and scale AI infra: Design training/inference workflows. Spin up model-serving infra on AWS (Bedrock, SageMaker, or container-based). Help make our AI systems observable, secure, and cost-efficient. You’ll apply DevOps instincts to support LLM-powered production systems at scale.
- Architect the future of our AI platform: Build composable infrastructure for experimentation, scale, and optional self-hosting. Define boundaries between orchestration and inference, expose tracing and prompt history, and make systems that the team can iterate on without chaos.
- Champion testing and correctness: Define and enforce robust testing strategies: unit, integration, and load. Design systems that are testable by default, with clear mocks, interface contracts, and fast CI.
- Estimate and scope work: Own delivery for complex features, break them down into milestones, identify hidden risks, and clearly communicate tradeoffs. We don't over-spec; we trust senior engineers to lead the build and help shape the spec.
- Make it observable: Design systems with telemetry: structured logs, metrics, traces, and alerting. Help us make LLM behavior debuggable and traceable, from token usage to prompt mutation.
- Think like a secure infra engineer: We handle sensitive customer data. Make security a first-class concern in system design, including PII handling, IAM design, secrets management, rate limiting, and GDPR-readiness.
- Ship production-ready backend code: You'll work primarily in Node.js/TypeScript, using MongoDB (Mongoose), Redis, and job schedulers (cron/EventBridge).
- Design cloud-flexible infra: Keep our infrastructure cloud-agnostic. We’re AWS-first today, but we use modular Terraform so we can pivot to GCP for customer workloads if needed.
- Mentor, review, and raise the bar: Lead code reviews, pair with engineers, and mentor the team. Help reinforce best practices and know when to lean on AI tools (and when not to).
Must-Haves
- 8+ years of backend engineering, with deep experience building distributed systems using Node.js/TypeScript on AWS.
- System design fluency, especially in event-driven, autoscaling architectures.
- Production LLM experience: Shipped features using OpenAI, Claude, or similar APIs. You understand token limits, prompt shaping, cost tradeoffs, and context handling.
- Infra-aware AI developer: You've helped stand up inference infra via Bedrock, SageMaker, or containerized flows and you care about performance, cost, and traceability.
- Testing mindset: You design with testability in mind, and are confident in setting up or extending CI pipelines for coverage across microservices.
- MongoDB & Redis: You can tune indexes, optimize queries, and debug performance issues at the DB level.
- Terraform & Docker: You can bootstrap infra from scratch and debug cloud deployment issues without waiting on a DevOps team.
- Clear communicator: You write well, document clearly, and thrive in async-first workflows.
- Security + compliance basics: Comfortable designing within GDPR/SOC2 requirements, with a good grasp of secure architecture patterns.
- Estimation and delivery: You've scoped, built, and delivered complex backend features with minimal PM handholding.
Nice-to-Haves
- Background in CX, survey, or analytics SaaS.
- Bedrock, LangChain, or RAG experience.
- Experience with LLMOps: prompt versioning, feedback loops, prompt/response telemetry.
- GCP infra exposure or portability experience.
- React familiarity or empathy with frontend engineers.
- Incident response and blameless postmortem experience.
What We Offer
- Competitive salary (band shared at offer stage)
- Standard UK pension
- 20 days holiday + public holidays
- Generous hardware/kit budget
- High autonomy, massive scope
- Personal and Professional Development budget
- Additional Perks
Seniority level
- Mid-Senior level
Employment type
- Full-time
Job function
- Engineering and Information Technology
Industries
- Research Services
Location: Milton Keynes, England, United Kingdom
Salary: £80,000 - £100,000
Job Type: Full-time
Category: IT & Technology