Key Responsibilities
- Design and implement modern data platforms using the Databricks Lakehouse Platform.
- Build ingestion and transformation pipelines using Databricks Workflows, Apache Spark (PySpark / Spark SQL) and Delta Lake, supporting batch, streaming and near real-time processing.
- Architect focus: Define and apply platform standards for data ingestion, modelling, data quality and governance; ensure alignment to cloud, security, identity and cost management requirements.
- Develop and optimise Databricks (Apache Spark) pipelines and Delta Lake models to enable governed data consumption (for example, Power BI/Tableau) and support analytics, data science, and machine learning (ML) use cases.
- Implement monitoring, performance tuning and cluster optimisation; support CI/CD for notebooks, jobs and data pipelines and contribute to operational runbooks.
- Collaborate with analysts, data scientists and application teams; advise stakeholders on Databricks capabilities and design trade-offs, and mentor junior team members (where applicable).
Skills, Experience and Competencies
Technical Skills
- Strong expertise in the Databricks Lakehouse Platform.
- Deep experience with Apache Spark, PySpark, and Spark SQL.
- Knowledge of Delta Lake, data modelling, and performance tuning.
- Experience with streaming data (Structured Streaming, Kafka).
- Understanding of data security, governance, and cost optimisation.
Experience (Role Level Determination)
MDP Databricks Architect
- 10–15+ years of data and analytics experience.
- 7–10+ years designing enterprise-scale data platforms.
- Proven experience defining lakehouse architectures and data strategies.
MDP Databricks Consultant
- 6–9 years of data and analytics experience.
- 4–6 years delivering modern data platforms using Databricks.
- Strong delivery, design, and stakeholder engagement skills.
MDP Databricks Developer
- 4–6 years of overall data and analytics experience.
- 2–4 years of hands‑on Databricks and Spark experience.
- Strong focus on development and implementation.
Competencies
- Early alignment and shaping: clarifying use cases, data domains, success measures and non-functional requirements, then translating them into a practical Databricks delivery plan.
- Governance and controlled change: applying standards and change control (with impact assessment) across notebooks, jobs, pipelines, data models and environments.
- Trusted data, quality by design: building Delta Lake pipelines and models with validation, testing, lineage and reconciliation to support analytics and artificial intelligence/machine learning (AI/ML).
- Standardisation: using patterns, templates and accelerators (including CI/CD and infrastructure as code where relevant) to improve speed and consistency.
- Transparent delivery and operational discipline: communicating progress and risks clearly, implementing monitoring and observability, and optimising cost and performance for stable operations.
Qualifications and Certifications
- Databricks Certified Data Engineer (Associate / Professional).
- Databricks Certified Data Architect (role dependent).
- Cloud platform certifications (Microsoft Azure / Amazon Web Services desirable).