The Machine Learning Operations (MLOps) Specialist ensures the stable, reliable, and secure operation of deployed
Machine Learning (ML) models and the Model Management Platform. The role manages the end-to-end operational
lifecycle of ML models, including deployment, monitoring, maintenance, incident resolution, platform support, and
continuous improvement. The MLOps Specialist works closely with Data Scientists, ML Engineers, Platform Engineers,
and cross-functional teams to sustain model performance and availability in production environments.
Minimum Qualifications
- Experience: 2 to 4 years of experience in MLOps, DevOps, Data Engineering, ML Engineering, or related operational roles
- Knowledge: Working knowledge of and expertise in the following:
  - Machine Learning fundamentals (model behavior, evaluation metrics, drift concepts)
  - MLOps tools and platforms (MLflow, Kubeflow, SageMaker, Vertex AI, or equivalent)
  - Monitoring and observability (Prometheus, Grafana, EvidentlyAI, ELK Stack)
  - CI/CD pipelines (Git, Jenkins, GitLab CI, Azure DevOps, or equivalent)
  - Containerization and orchestration (Docker, Kubernetes)
  - Scripting and automation (Python, Bash)
  - Data management, data pipelines, and operational data quality
  - Cloud infrastructure (AWS, GCP, Azure)
  - Model deployment strategies (batch, real-time, streaming)
  - API integration and microservices concepts
  - Database design and querying
  - Testing approaches for ML systems (functional tests, performance tests, canary tests)