Machine Learning Operations Specialist

  • Posted 5 hours ago

Job Description

Role Description

We are seeking a dedicated and skilled Machine Learning Operations Specialist to join our team. In this full-time, hybrid role (partly based in Taguig with work-from-home flexibility), you will design, implement, and optimize machine learning models for business applications.

Responsibilities

Model Monitoring and Maintenance

  • Monitor performance, health, and operational status of ML models deployed to production.
  • Detect anomalies, data drift, concept drift, or degradation in prediction quality.
  • Implement corrective actions or coordinate with Data Science teams for retraining or model refinement.

Incident Management

  • Triage and diagnose issues affecting model availability, accuracy, latency, or data quality.
  • Execute immediate stabilization tasks or escalate to appropriate teams when needed.
  • Document all incidents, resolutions, and preventive actions.

Deployment and Release Management

  • Plan and execute rollout of new model versions, platform updates, or configuration changes.
  • Validate deployments through regression checks, functional testing, and performance validation.
  • Manage model versioning, rollback procedures, and release documentation.

Model Lifecycle Operations

  • Manage routine operational tasks such as data refreshes, threshold tuning, and configuration updates.
  • Automate workflows for recurring operational tasks where applicable.
  • Support periodic model retraining cycles initiated by Data Science or ML Engineering teams.

Model Management Platform Support

  • Support the enhancement and maintenance of the Model Management Platform, including model registry, monitoring tools, lineage dashboards, and pipeline components.
  • Collaborate in implementing CI/CD pipelines for ML models (training, validation, deployment).
  • Integrate new tools or operational improvements to enhance model reliability and observability.

Operational Stability and Risk Mitigation

  • Address high-impact issues due to upstream data inconsistencies, environment drift, or infrastructure instability.
  • Identify operational risks and propose preventive measures to ensure model uptime, accuracy, and compliance.
  • Monitor and optimize resource usage related to ML tooling, serving infrastructure, and pipelines.

Cross-Functional Collaboration

  • Work closely with Data Scientists, ML Engineers, DevOps, SysOps, and Platform teams to maintain seamless ML operations.
  • Participate in post-incident reviews and recommend improvements for pipeline reliability, monitoring, and tooling.

Minimum Qualifications

  • Statistical modeling, ML theory, and practical ML applications.
  • Model interpretability and bias mitigation techniques.
  • Data preprocessing methods (handling missing values, normalization, outlier detection).
  • Anomaly detection algorithms (Isolation Forest, One-Class SVM, Autoencoders).
  • Forecasting techniques (ARIMA, LSTM, Prophet).
  • Model drift detection and mitigation strategies.
  • Time-series analysis for predictive modeling.
  • ML frameworks (TensorFlow, Keras, PyTorch, Scikit-learn, XGBoost).

Job ID: 147255761