The Data Engineer will be responsible for designing, developing, and managing data pipelines and architectures. They will ensure that data flows seamlessly from multiple sources into databases, data lakes, or warehouses and is processed in a way that supports business analysis and decision-making. The ideal candidate will have strong software engineering skills, experience with data processing frameworks, and the ability to optimize and scale large data systems.
Responsibilities
- Design, build, and maintain scalable data pipelines that support data collection, transformation, and integration from various sources.
- Maintain databases, data warehouses, and data lakes, ensuring data integrity, scalability, and reliability.
- Implement efficient ETL/ELT (Extract, Transform, Load / Extract, Load, Transform) processes to automate data collection, cleaning, and transformation tasks.
- Integrate data from various internal and external sources, including APIs, cloud storage, third-party systems, and databases.
- Optimize data architectures for performance, scalability, and cost efficiency. Ensure fast and reliable data retrieval.
- Work closely with the data team to understand data needs and deliver reliable, clean, and accessible data solutions.
- Implement data validation and monitoring processes to ensure data accuracy and integrity. Identify and troubleshoot data issues and bottlenecks.
- Manage and optimize cloud-based infrastructure (e.g., AWS, Azure) to support data storage, processing, and retrieval.
- Automate repetitive tasks and processes using scripting languages (e.g., Python and SQL) and workflows.
- Document data architectures, pipelines, and workflows so that the Information Technology (IT) department can understand them. Provide technical support and troubleshooting to the team.
- Work with business and technology partners to turn requirements and technical processes into acceptance criteria, drawing on a strong track record of such collaboration.
- Perform live user verification and testing to catch errors before changes reach the production environment.
- Provide support for data extraction and reports as required.
Qualifications
- Degree in Computer Engineering, Data Science, Statistics, Physics, or any other related field in IT.
- Preferably 2-3 years of experience as a Data Analyst.
- Knowledge of Agile methodologies (e.g., Scrum and Kanban).
- Experience with cloud computing platforms (Azure).
- Coding Languages
  - Python (Apache Spark)
  - SQL
  - DAX
- File types
  - parquet; sql; csv; xlsx; json; txt
- Systems / Software / Platforms
  - GitHub; Databricks; AWS; Power BI; Linux
- Other Qualifications
  - ETL/ELT; Integration; API; Statistics; Code/Performance Optimization; Data Wrangling; Data Pipelines; Data Modeling; Data Quality; Data Governance
- Nice to have but not required
  - LS Central; Dynamics 365 (Business Central); Microsoft SQL Server; Azure; Tableau; JavaScript; HTML; CSS; AL