Machine Learning DevOps Engineer
Newcastle upon Tyne
£45,000-£50,000
Job Description
About the Role:
The ML DevOps Engineer is responsible for designing, deploying, automating, testing, and maintaining scalable machine learning and software delivery environments.
This hybrid role combines Machine Learning Engineering, DevOps, QA Automation, and Infrastructure Engineering to ensure reliable end-to-end product delivery.
The role supports agile teams by building robust CI/CD pipelines, managing cloud/on-prem infrastructure, integrating automated testing, and enabling production-grade ML systems.
Key Responsibilities:
ML Engineering & MLOps
Design, develop, deploy, and maintain machine learning pipelines for training, validation, inference, and monitoring.
Automate model lifecycle processes including data ingestion, feature engineering, model retraining, versioning, and rollback.
Build scalable environments for model experimentation and production deployment.
Integrate ML models into APIs, applications, and enterprise systems.
Monitor model drift, performance degradation, and prediction reliability.
Ensure reproducibility of experiments using version control and artifact management.
DevOps & Infrastructure Automation
Design and implement infrastructure automation using Infrastructure as Code (IaC) tools such as Terraform, Ansible, or equivalent.
Build, maintain, and optimize CI/CD pipelines for software and ML workloads.
Manage containerized environments using Docker and orchestration platforms such as Kubernetes or Swarm.
Support hybrid environments including cloud platforms and on-premise infrastructure.
Administer Windows and Linux systems to ensure operational stability.
Manage VMware / vSphere / storage / SAN infrastructure where required.
Quality Assurance & Test Automation
Develop and maintain automated testing frameworks across UI, API, integration, regression, and acceptance layers.
Deliver scheduled automated test suites across browsers, devices, and environments.
Perform manual functional, exploratory, UAT, and systems integration testing where required.
Define test strategies, test plans, scripts, and release readiness criteria.
Integrate automated tests into CI/CD pipelines for continuous validation.
Record, triage, monitor, and resolve defects efficiently.
Monitoring, Reliability & Observability
Implement monitoring, alerting, logging, and diagnostics using tools such as Grafana, ELK, Prometheus, or similar.
Ensure availability, scalability, security, and resilience of production systems.
Perform root cause analysis for incidents and deliver preventive improvements.
Build dashboards for application performance and ML services.
Agile Delivery & Stakeholder Collaboration
Work within Agile / Scrum teams to deliver iterative business solutions.
Collaborate with developers, data scientists, testers, product owners, and clients.
Translate technical challenges into clear business-facing updates.
Provide effort estimates, timelines, technical recommendations, and delivery plans.
Support continuous improvement of engineering automation, delivery practices, and technical culture.
Essential Skills & Experience
Strong hands-on experience in DevOps, automation engineering, QA automation, or MLOps.
Proven experience designing enterprise-grade infrastructure and deployment pipelines.
Experience supporting production systems with high availability requirements.
Ability to troubleshoot across code, infrastructure, networking, and data layers.
Strong communication skills in client-facing environments.
Ownership mindset with the ability to work independently.
Comfortable working in technically complex, fast-paced delivery environments.
Exposure to cloud platforms such as Microsoft Azure or Google Cloud.
Experience deploying ML systems into production environments.
Technical Skills Required
Programming / Scripting
Python
PowerShell
Java / JavaScript / C#
Bash / Shell scripting
DevOps Tooling
Azure DevOps
TeamCity
Octopus Deploy
Artifactory
Git / GitHub / GitLab
Infrastructure & Containers
Docker
Kubernetes / Swarm
VMware / vSphere
Windows Server / Linux
Testing Tools
Selenium
Cypress
Playwright
Karate
WebdriverIO
JMeter
Monitoring / Quality
Grafana
ELK Stack
SonarQube
Control-M
Databases
Microsoft SQL Server
Relational database administration basics
Preferred Background:
Experience in regulated industries or enterprise consulting.
Master's or PhD degree in Computer Science, Engineering, Mathematics, or another STEM discipline.
Familiarity with geospatial data.
Job Types: Full-time, Permanent
Pay: £45,000.00-£50,000.00 per year
Benefits:
Casual dress
Company events
Company pension
Cycle to work scheme
Enhanced maternity leave
Financial planning services
Private medical insurance
Sick pay
Work from home
Ability to commute/relocate:
Newcastle upon Tyne NE1: reliably commute or plan to relocate before starting work (required)
Work Location: In person.