Machine Learning DevOps Engineer

Newcastle upon Tyne

£45,000 - £50,000

Job Description

About the Role:
The ML DevOps Engineer is responsible for designing, deploying, automating, testing, and maintaining scalable machine learning and software delivery environments. This hybrid role combines Machine Learning Engineering, DevOps, QA Automation, and Infrastructure Engineering to ensure reliable end-to-end product delivery. The role supports agile teams by building robust CI/CD pipelines, managing cloud and on-premises infrastructure, integrating automated testing, and enabling production-grade ML systems.

Key Responsibilities:

ML Engineering & MLOps
- Design, develop, deploy, and maintain machine learning pipelines for training, validation, inference, and monitoring.
- Automate model lifecycle processes including data ingestion, feature engineering, model retraining, versioning, and rollback.
- Build scalable environments for model experimentation and production deployment.
- Integrate ML models into APIs, applications, and enterprise systems.
- Monitor model drift, performance degradation, and prediction reliability.
- Ensure reproducibility of experiments using version control and artifact management.

DevOps & Infrastructure Automation
- Design and implement infrastructure automation using Infrastructure as Code (IaC) tools such as Terraform, Ansible, or equivalent.
- Build, maintain, and optimize CI/CD pipelines for software and ML workloads.
- Manage containerized environments using Docker and orchestration platforms such as Kubernetes or Swarm.
- Support hybrid environments including cloud platforms and on-premises infrastructure.
- Administer Windows and Linux systems to ensure operational stability.
- Manage VMware / vSphere / storage / SAN infrastructure where required.

Quality Assurance & Test Automation
- Develop and maintain automated testing frameworks across UI, API, integration, regression, and acceptance layers.
- Deliver scheduled automated test suites across browsers, devices, and environments.
- Perform manual functional, exploratory, UAT, and systems integration testing where required.
- Define test strategies, test plans, scripts, and release readiness criteria.
- Integrate automated tests into CI/CD pipelines for continuous validation.
- Record, triage, monitor, and resolve defects efficiently.

Monitoring, Reliability & Observability
- Implement monitoring, alerting, logging, and diagnostics using tools such as Grafana, ELK, Prometheus, or similar.
- Ensure availability, scalability, security, and resilience of production systems.
- Perform root cause analysis for incidents and deliver preventive improvements.
- Build dashboards for application performance and ML services.

Agile Delivery & Stakeholder Collaboration
- Work within Agile / Scrum teams to deliver iterative business solutions.
- Collaborate with developers, data scientists, testers, product owners, and clients.
- Translate technical challenges into clear business-facing updates.
- Provide effort estimates, timelines, technical recommendations, and delivery plans.
- Support continuous improvement of engineering automation, delivery practices, and technical culture.

Essential Skills & Experience
- Strong hands-on experience in DevOps, automation engineering, QA automation, or MLOps.
- Proven experience designing enterprise-grade infrastructure and deployment pipelines.
- Experience supporting production systems with high availability requirements.
- Ability to troubleshoot across code, infrastructure, networking, and data layers.
- Strong communication skills in client-facing environments.
- Ownership mindset with the ability to work independently.
- Comfortable working in technically complex, fast-paced delivery environments.
- Exposure to cloud platforms such as Microsoft Azure or Google Cloud.
- Experience deploying ML systems into production environments.

Technical Skills Required
- Programming / Scripting: Python, PowerShell, Java / JavaScript / C#, Bash / Shell scripting
- DevOps Tooling: Azure DevOps, TeamCity, Octopus Deploy, Artifactory, Git / GitHub / GitLab
- Infrastructure & Containers: Docker, Kubernetes / Swarm, VMware / vSphere, Windows Server / Linux
- Testing Tools: Selenium, Cypress, Playwright, Karate, WebdriverIO, JMeter
- Monitoring / Quality: Grafana, ELK Stack, SonarQube, Control-M
- Databases: Microsoft SQL Server, relational database administration basics

Preferred Background:
- Experience in regulated industries or enterprise consulting.
- Master's or PhD degree in Computer Science, Engineering, Mathematics, or another STEM discipline.
- Familiarity with geospatial data.

Job Types: Full-time, Permanent

Pay: £45,000.00-£50,000.00 per year

Benefits:
- Casual dress
- Company events
- Company pension
- Cycle to work scheme
- Enhanced maternity leave
- Financial planning services
- Private medical insurance
- Sick pay
- Work from home

Ability to commute/relocate:
- Newcastle upon Tyne NE1: reliably commute or plan to relocate before starting work (required)

Work Location: In person