The Citco Group Limited

Data Engineer (Databricks & AWS)

Job Description

Position: Data Engineer (Databricks & AWS)

Company Overview

Citco is a global leader in financial services, delivering innovative solutions to some of the world's largest institutional clients. We harness the power of data to drive operational efficiency and informed decision-making. We are looking for a Data Engineer with strong Databricks expertise and AWS experience to contribute to mission-critical data initiatives.

Role Summary

As a Data Engineer, you will be responsible for developing and maintaining end-to-end data solutions on Databricks (Spark, Delta Lake, MLflow, etc.) while working with core AWS services (S3, Glue, Lambda, etc.). You will work within a technical team, implementing best practices in performance, security, and scalability. This role requires a solid understanding of Databricks and experience with cloud-based data platforms.

Key Responsibilities

Databricks Platform & Development

  • Implement Databricks Lakehouse solutions using Delta Lake for ACID transactions and data versioning
  • Utilize Databricks SQL Analytics for querying and report generation
  • Support cluster management and Spark job optimization
  • Develop structured streaming pipelines for data ingestion and processing
  • Use Databricks Repos, notebooks, and job scheduling for development workflows
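
For illustration only, a minimal PySpark sketch of the day-to-day pattern: a batch write to a Delta table, a time-travel read, and a streaming ingest. It assumes a Databricks runtime where the Delta format and Auto Loader (the `cloudFiles` source) are available; all paths and table names are hypothetical.

```python
from pyspark.sql import SparkSession

# Assumes a Databricks runtime; the "bronze" schema is hypothetical and
# would need to exist before saveAsTable is called.
spark = SparkSession.builder.appName("lakehouse-demo").getOrCreate()

# Batch write: Delta Lake provides ACID transactions and a versioned history.
df = spark.read.json("s3://example-bucket/raw/events/")
df.write.format("delta").mode("append").saveAsTable("bronze.events")

# Time travel: read the table as of an earlier version (data versioning).
v0 = spark.read.format("delta").option("versionAsOf", 0).table("bronze.events")

# Structured streaming: incrementally ingest newly arriving files.
stream = (
    spark.readStream.format("cloudFiles")  # Databricks Auto Loader
    .option("cloudFiles.format", "json")
    .load("s3://example-bucket/raw/events/")
)
(
    stream.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/events/")
    .toTable("bronze.events")
)
```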

AWS Cloud Integration

  • Work with Databricks and AWS S3 integration for data lake storage
  • Build ETL/ELT pipelines using the AWS Glue Data Catalog, AWS Lambda, and AWS Step Functions
  • Configure networking settings for secure data access
  • Support infrastructure deployment using AWS CloudFormation or Terraform
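
As a sketch of the kind of glue code involved, a hypothetical S3-triggered AWS Lambda handler that starts a Glue job for a newly arrived object (the job name and argument keys are made up for illustration):

```python
import json
import boto3

glue = boto3.client("glue")

def handler(event, context):
    """S3-triggered Lambda that starts a Glue job for the new object.

    The Glue job name ("nightly-etl") and argument keys are hypothetical.
    """
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    run = glue.start_job_run(
        JobName="nightly-etl",
        Arguments={"--input_path": f"s3://{bucket}/{key}"},
    )
    return {"statusCode": 200, "body": json.dumps(run["JobRunId"])}
```

In a fuller design, AWS Step Functions would typically sequence this job with downstream steps, with the Glue Data Catalog holding the table metadata.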

Data Pipeline & Workflow Development

  • Create scalable ETL frameworks using Spark (Python/Scala)
  • Participate in workflow orchestration and CI/CD implementation
  • Develop Delta Live Tables for data ingestion and transformations
  • Support MLflow integration for data lineage and reproducibility
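
As one possible shape for such a pipeline, a minimal Delta Live Tables sketch in Python (the source path and expectation are hypothetical; `dlt` and the implicit `spark` session are supplied by the DLT pipeline runtime):

```python
import dlt
from pyspark.sql import functions as F

# Bronze table: raw ingestion from a hypothetical S3 path.
@dlt.table(comment="Raw events ingested from S3")
def events_raw():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://example-bucket/raw/events/")
    )

# Silver table: timestamped, with a data-quality expectation that drops
# rows missing an event_id.
@dlt.table(comment="Cleaned events")
@dlt.expect_or_drop("valid_id", "event_id IS NOT NULL")
def events_clean():
    return dlt.read_stream("events_raw").withColumn(
        "ingested_at", F.current_timestamp()
    )
```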

Performance & Optimization

  • Implement Spark job optimizations (caching, partitioning, joins)
  • Support cluster configuration for optimal performance
  • Optimize data processing for large-scale datasets
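
To make the optimization bullet concrete, a small PySpark sketch of the usual levers: a broadcast join for a small dimension table, repartitioning on a hot key, and caching an intermediate result that is reused (table and column names are hypothetical):

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

facts = spark.read.table("silver.transactions")  # large fact table (hypothetical)
dims = spark.read.table("silver.merchants")      # small dimension table

# Broadcast join: ship the small table to every executor rather than
# shuffling the large one.
joined = facts.join(broadcast(dims), "merchant_id")

# Repartition on a frequently grouped/joined key to control shuffle and skew.
repartitioned = joined.repartition("merchant_id")

# Cache only when an intermediate result is reused more than once.
repartitioned.cache()
daily = repartitioned.groupBy(
    "merchant_id", F.to_date("ts").alias("day")
).count()
monthly = repartitioned.groupBy(
    "merchant_id", F.date_trunc("month", "ts").alias("month")
).count()
```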

Security & Governance

  • Apply Unity Catalog features for governance and access control
  • Follow compliance requirements and security policies
  • Implement IAM best practices
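
Unity Catalog access control is expressed in SQL; a minimal sketch issued from Python (the catalog, schema, table, and principal names are all hypothetical):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Grant a group the right to use a catalog and schema, and read one table.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data-engineers`")
spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.silver TO `data-engineers`")
spark.sql("GRANT SELECT ON TABLE analytics.silver.transactions TO `analysts`")
```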

Team Collaboration

  • Participate in code reviews and knowledge-sharing sessions
  • Work within an Agile/Scrum development framework
  • Collaborate with team members and stakeholders

Monitoring & Maintenance

  • Help implement monitoring solutions for pipeline performance
  • Support alert system setup and maintenance
  • Ensure data quality and reliability standards
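
A data-quality gate can be as simple as failing the run when basic invariants break; a hedged sketch (the table name and threshold are hypothetical, and in production the failure would feed an alerting channel such as CloudWatch/SNS):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical table; fail the pipeline run when basic invariants break.
df = spark.read.table("silver.transactions")

row_count = df.count()
null_ids = df.filter(df.transaction_id.isNull()).count()

if row_count == 0 or null_ids / row_count > 0.01:
    raise ValueError(
        f"Data quality gate failed: rows={row_count}, null ids={null_ids}"
    )
```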

Qualifications

Educational Background

  • Bachelor's degree in Computer Science, Data Science, Engineering, or equivalent experience

Technical Experience

  • Databricks Experience: 2+ years of hands-on Databricks (Spark) experience
  • AWS Knowledge: Experience with AWS S3, Glue, Lambda, and basic security practices
  • Programming Skills: Strong proficiency in Python (PySpark) and SQL
  • Data Warehousing: Understanding of RDBMS and data modeling concepts
  • Infrastructure: Familiarity with infrastructure as code concepts

Job ID: 135921939
