Role Overview:
To design, develop, and maintain scalable data engineering solutions that enable migration from the legacy big data platform to a modern cloud-based data environment, while ensuring seamless data operations and supporting ongoing and new business initiatives.
Key Responsibilities:
- Design, build, and optimize automated data pipelines, ETL/ELT processes, and data models within the new cloud platform to ingest, process, and store large volumes of data.
- Support the big data migration effort, ensuring data accuracy, performance efficiency, and minimal business disruption.
- Create ETL/ELT workflows to ingest, transform, and load data from various sources, ensuring performance and scalability.
- Collaborate with business and analytics teams to deliver data requirements for new strategic business imperatives.
- Design, develop, and generate data marts and customized data extracts in line with business needs.
- Ensure adherence to data governance, security, and compliance standards.
- Monitor pipeline health and performance, troubleshoot data incidents, and implement preventive measures.
- Document data workflows, schemas, technical specifications, and operational runbooks to support team knowledge transfer.
- Work closely with product owners, data architects, and data scientists to ensure our data infrastructure is reliable and efficient.
- Contribute to continuous improvement of data engineering practices, tools, and automation.
- Provide technical guidance and mentorship to junior engineers through code reviews, best-practice sharing, and troubleshooting support.
- Manage and deliver multiple projects or initiatives concurrently, ensuring timely and high-quality outcomes.
Technical Skills:
- Programming: Python, PySpark/Spark, Scala, SQL, and shell scripting
- Data Platforms: Databricks, Snowflake, Hadoop, Oracle
- Cloud Platforms: Azure (preferred), AWS (nice-to-have)
- Version Control & Automation: Git, CI/CD pipelines
- Architecture & Design: Lakehouse / Medallion patterns, ELT/ETL workflows, data integration and orchestration
- Optimization: SQL performance tuning and cloud cost management
- Documentation: ER modeling, data flow documentation, development standards and best practices
Work Set-Up:
- Hybrid (onsite 3 days per week)
- BGC, Taguig City
- Day shift