Are you a Data Engineer with a strong background in distributed systems and real-time data streaming looking to work on large-scale, high-impact data platforms?
We are looking for a modern data engineer who operates in a cloud-native, data-as-code environment built on AWS, with a heavy focus on highly scalable and real-time data processing.
The ideal candidate has hands-on, production-level experience architecting with streaming technologies such as Apache Kafka, Spark, Flink, or RabbitMQ as a core skill, alongside Apache Airflow for orchestration, and Python or Java for building high-throughput data pipelines.
This role suits engineers who think like software developers—comfortable with version control, CI/CD, testing, and distributed computing frameworks—rather than traditional ETL or legacy data warehouse practitioners.
Work Setup
- Hybrid: onsite twice per week
- Office Location: Mandaluyong (Rockwell Business Center, Sheridan)
- Schedule: Monday to Friday, 10:00 AM to 7:00 PM
What You'll Do
- Design, build, and maintain high-throughput, scalable data pipelines integrating internal and external data sources within a Databricks ecosystem.
- Develop, optimize, and maintain real-time streaming and complex batch data processing workflows.
- Architect distributed data transformation layers using code-first frameworks (e.g., PySpark, Spark SQL, Flink).
- Ensure data quality, reliability, and observability across live data streams through validation and monitoring frameworks.
- Build high-performance data APIs and services for internal and external consumption.
- Troubleshoot and resolve production infrastructure and streaming pipeline issues.
- Work closely with Product, BI, Engineering, and Infrastructure teams.
- Participate in code reviews and Agile ceremonies.
Must-Have Qualifications
- MUST have advanced, hands-on experience with streaming and real-time data technologies such as Apache Kafka, Spark (Core/Streaming), or RabbitMQ.
- MUST have hands-on experience with Databricks and its ecosystem.
- Proven experience building and maintaining distributed data pipelines using Spark and Apache Airflow.
- Strong programming skills in Python or Java with solid software engineering fundamentals.
- Advanced SQL skills and hands-on experience with relational, columnar, and NoSQL databases (MySQL, PostgreSQL, MongoDB, Elasticsearch).
- Experience architecting solutions within the AWS cloud ecosystem (e.g., EMR, MSK, Glue, S3).
- Strong understanding of modern data architectures, specifically Data Lake and Lakehouse patterns (e.g., Apache Iceberg, Delta Lake).
- Experience building and exposing APIs / REST services for data consumption.
- Strong data modeling, data quality, and pipeline observability practices.
- Excellent communication and collaboration skills.
Nice to Have
- Exposure to AI-assisted development tools (Cursor, Claude).
- Experience with marketing data (campaigns, rewards, segmentation).
- Work experience within the gaming or gambling industry.