Overview
We are looking for a Machine Learning Engineer to improve and maintain our machine learning models and pipelines, develop scalable ETL pipelines, and optimize the end-to-end data processing workflow. This role requires strong experience in computer vision, OCR, and deep learning, as well as the ability to deploy models in a production environment.
Responsibilities
- Design, develop, and maintain data pipelines to manage TIFF images, extracted fields, CSV, PDF, and XML files, and structured outputs.
- Build and optimize ETL workflows for preprocessing (resizing, rotation, denoising) and ML pipeline integration.
- Develop, train, evaluate, deploy, and monitor deep learning models in a production environment.
- Develop API endpoints and integrate structured data into the existing data keying application.
- Implement logging, monitoring, and error-handling mechanisms for model performance and data consistency.
- Work with Dockerized deployments to streamline ML and data workflows.
- Collaborate with ML engineers, software developers, and quality test engineers to ensure seamless integration.
Qualifications
- Bachelor's degree.
- 3+ years of experience in machine learning, ML engineering, or a hybrid role.
- Proficiency in Python for data manipulation, API development, and ML integration.
- Ability to write efficient ETL scripts and manage large datasets.
- Strong knowledge of database management across SQL databases (PostgreSQL or similar) and NoSQL stores.
- Familiarity with YOLO-based object detection and OCR processing.
- Knowledge of image processing and computer vision tools such as Pillow, OpenCV, pdf2image, etc.
- Experience training and testing machine learning models using TensorFlow, PySpark, and scikit-learn.
- Hands-on experience with Docker, Kubernetes, and containerized ML workflows.
- Experience working with Azure services (Data Factory, Blob Storage, ML Studio, and other compute resources).
- Experience with Flask, FastAPI, or other API frameworks.
- Knowledge of MLOps best practices for deploying and monitoring ML models in production.
- Hands-on experience with Git, GitHub or GitLab, and GitHub Actions.
- Exposure to DevOps practices for CI/CD pipelines in ML projects.
- Ability to quickly learn and apply enterprise AI tools and technologies to support technical workflows and business objectives.
Preferred Skills (Nice To Have)
- Experience working in insurance, document processing, or OCR-related applications.
- Knowledge of distributed data processing (Spark, Dask, or similar).
- Familiarity with NLP and LLM-based models.
We know your well-being and happiness are key to a long and successful career. We are delighted to offer country-specific benefits.