Platform - SRE Engineer - #2082720

HelloKindred


Date: 10 hours ago
City: Sheffield
Contract type: Contractor
Work schedule: Full day

Who is HelloKindred?

HelloKindred specialises in staffing marketing, creative and technology roles, offering a range of talent solutions that can be delivered on-site, remotely or in a hybrid arrangement.

Our vision is to make work accessible and people’s lives better. We do this by disrupting traditional employment barriers – connecting ambitious talent to flexible opportunities with trusted brands.



Job Description

Anticipated Contract End Date/Length: November 30, 2026.

Work Setup: Hybrid (3 days per week in office)

Clearance required: BPSS

Our client in the Information Technology and Services industry is looking for a Platform / SRE Engineer to own deployment, observability, reliability, cost control, and production operations for an AI helpdesk platform. This role will support the design, deployment, and operational management of AI services and production environments while ensuring scalability, uptime, performance optimization, and operational resilience across cloud-based infrastructure.

The ideal candidate will bring strong expertise in DevOps and Site Reliability Engineering practices, along with experience managing cloud-native platforms, CI/CD pipelines, observability tooling, and AI/ML production workloads within complex enterprise environments.

What you will do:

  • Build and manage CI/CD pipelines, infrastructure, and runtime environments for AI services.
  • Deploy and operate model-serving, orchestration, and application workloads.
  • Implement monitoring, tracing, alerting, logging, and operational dashboards.
  • Manage scaling activities, release processes, rollback mechanisms, and production support operations.
  • Optimize inference cost, latency, uptime, and overall system reliability.
  • Create runbooks, operational standards, and incident response processes.
  • Support infrastructure automation and platform engineering initiatives.
  • Maintain observability and monitoring solutions across production environments.
  • Support release automation, secrets management, and production operational processes.
  • Collaborate with engineering teams to support AI platform reliability and operational readiness.
  • Troubleshoot production issues and support system diagnostics and remediation activities.
  • Ensure platform stability, scalability, and performance across cloud-native environments.

Qualifications

  • Strong experience in DevOps and Site Reliability Engineering environments.
  • Experience with Docker, Kubernetes, cloud platforms, and Infrastructure as Code practices.
  • Strong experience with monitoring, observability, and operational tooling.
  • Familiarity with CI/CD pipelines, release automation, secrets management, and production support processes.
  • Understanding of LLM deployment patterns and API-based model integrations.
  • Experience working with cloud platforms, particularly AWS.
  • Experience using Jira, Confluence, and ServiceNow.
  • Experience supporting AI/ML workloads in production environments is preferred.
  • Experience with GPU workloads, autoscaling, and cost optimization is preferred.
  • Strong troubleshooting, operational support, and incident response capabilities.
  • Strong communication and collaboration skills within cross-functional engineering teams.

Additional Information

All your information will be kept confidential according to EEO guidelines.

Candidates must be legally authorized to live and work in the country where the position is based, without requiring employer sponsorship.

HelloKindred is committed to fair, transparent, and inclusive hiring practices. We assess candidates based on skills, experience, and role-related requirements.

We appreciate your interest in this opportunity. While we review every application carefully, only candidates selected for an interview will be contacted.

HelloKindred is an equal opportunity employer. We welcome applicants of all backgrounds and do not discriminate on the basis of race, colour, religion, sex, gender identity or expression, sexual orientation, age, national origin, disability, veteran status, or any other protected characteristic under applicable law.
