Key Responsibilities:
- Create proof-of-concepts and implement best-fit data/platform solutions aligned to the business/product owner's initiatives
- Promote and define enterprise-grade data engineering standards, including coding practices, testing frameworks, version control, and CI/CD processes
- Design, build, and support the complete data lifecycle (ingestion, transformation, storage, and access), ensuring data integrity, security, and quality
- Collaborate closely with data scientists, data analysts, data product owners, third-party teams, and business stakeholders to translate needs into data-driven/technical requirements and solutions
- Ensure alignment with regulatory requirements, data governance frameworks, and internal data policies, fostering trust and accountability in data assets
- Stay current with emerging data technologies and tools, evaluating and integrating them where they can deliver strategic advantage
- Identify opportunities for innovation in data infrastructure and automation
- Implement observability practices to monitor data platform performance and reliability, driving continuous improvement where necessary
- Coach and mentor junior to mid-level data engineers, fostering a high-performing engineering culture
Qualifications:
- 2+ years of experience in data engineering
- Proven experience designing and building large-scale data platforms and pipelines (batch and real-time) that meet varying security and governance compliance requirements across business entities and data sources
- Understanding of data architectures (data lake, data warehouse, lakehouse, delta architecture)
- Hands-on experience with various Cloud platform services to support data engineering and data lifecycle management
- Proficiency in data manipulation and pipeline development, with a scripting background
- Working knowledge of data cataloging, lineage tracking, and access controls, with familiarity with regulatory requirements and data privacy/security best practices
- Comfortable designing, implementing, and managing CI/CD pipelines for data workloads, version control, and Infrastructure as Code, together with testing and monitoring frameworks
- Experience implementing data quality, observability, and pipeline monitoring
- Prior experience in a lead or architect capacity, including technical leadership and decision-making
Skills and Competencies:
Essential
- Design, implementation, maintenance, and support of organization-wide data solutions and end-to-end data pipelines
- Expertise in data modeling, normalization, and schema design
- Enablement and operationalization of cloud components in relation to the data solutions
- Ingestion and integration of structured, semi-structured, and unstructured data from diverse data sources
- Design and implementation of CI/CD solutions for data pipelines
- Understanding of data access controls, lineage, and privacy regulations
- Establish, enable, and implement governance, compliance requirements, security of data and data pipelines via the Data Platform
- Implement pipeline health checks, monitoring, and alerting
- Cost and resource utilization optimization of pipelines and data solutions
- Ability to align data/data platform initiatives with data productization and long-term organization goals
- Strong written and verbal communication with business stakeholders around the design and implementation of data solutions and end-to-end data pipelines
- Encourage innovation and continuous improvement in data platform/data delivery
- Proficiency in various Data Engineering Tools:
- Platform - Databricks
- Programming - Python, SQL, Shell Scripting
- Orchestration - Apache Airflow
- Processing - Apache Spark, Kafka
- Storage - S3, NoSQL databases
- Cloud - AWS
- Integration - AWS Integration Services, APIs, CDC
- CI/CD - GitHub, GitHub Actions, Databricks Asset Bundles, IaC (Terraform)
Desirable
- Knowledge of Azure and GCP data engineering-related services
- Skilled in managing multiple system integrators (SIs) around data delivery
- Has other relevant Data, DevOps, and Cloud certifications
- Data Engineering Tools:
- Platform - Microsoft Fabric
- Cloud - GCP, Azure
- Integration - AWS Integration Services, various iPaaS
Education / Certifications:
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field
- Databricks Fundamentals Certification
- AWS Cloud Practitioner Certification
- Other related certifications are a plus