Role Overview:
To design, develop, and maintain scalable data engineering solutions that enable migration from the legacy big data platform to a modern cloud-based data environment, while ensuring seamless data operations and supporting ongoing and new business initiatives.
Key Responsibilities:
- Design, build, and optimize automated data pipelines, ETL/ELT processes, and data models within the new cloud platform to ingest, process, and store large volumes of data.
- Support the big data migration effort, ensuring data accuracy, performance efficiency, and minimal business disruption.
- Create ETL/ELT workflows to ingest, transform, and load data from various sources, ensuring performance and scalability.
- Collaborate with business and analytics teams to deliver data requirements for new strategic business imperatives.
- Design, develop, and generate data marts and customized data extracts in line with business needs.
- Ensure adherence to data governance, security, and compliance standards.
- Monitor pipeline health and performance, troubleshoot data incidents, and implement preventive measures.
- Document data workflows, schemas, technical specifications, and operational runbooks to support team knowledge transfer.
- Work closely with product owners, data architects, and data scientists to ensure our data infrastructure is reliable and efficient.
- Contribute to continuous improvement of data engineering practices, tools, and automation.
- Provide technical guidance and mentorship to junior engineers through code reviews, best-practice sharing, and troubleshooting support.
- Manage and deliver multiple projects or initiatives concurrently, ensuring timely and high-quality outcomes.
Technical Skills:
- Programming: Python, PySpark/Spark, Scala, SQL, and shell scripting
- Data Platforms: Databricks, Snowflake, Hadoop, Oracle
- Cloud Platforms: Azure (preferred), AWS (nice-to-have)
- Version Control & Automation: Git, CI/CD pipelines
- Architecture & Design: Lakehouse / Medallion patterns, ELT/ETL workflows, data integration and orchestration
- Optimization: SQL performance tuning and cloud cost management
- Documentation: ER modeling, data flow documentation, development standards and best practices
Work Set-Up:
- Hybrid (onsite 3 days per week)
- BGC, Taguig City
- Day shift