As an Analytics Engineer at Salmon, you will play a pivotal role in data modeling and transformation across the Databricks silver and gold layers. You will work closely with Data Scientists, Engineers, and Business System Analysts to ensure that datasets align with business needs.
Key responsibilities
Data Modeling & Transformation
- Design, build, and maintain scalable data models in Databricks silver (curated data) and gold (business-ready data) layers.
- Define clear data contracts between silver and gold to ensure consistency and reliability.
- Apply best practices for dimensional modeling (star/snowflake schemas) to support analytics and reporting; see the sketch after this list.
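For illustration only, a minimal sketch of what a gold-layer star schema might look like in Databricks SQL. The schema, table, and column names (gold.dim_customer, gold.fct_orders, and so on) are hypothetical, not part of Salmon's actual model.

```sql
-- Hypothetical gold-layer dimension table (business-ready, conformed from silver)
CREATE TABLE IF NOT EXISTS gold.dim_customer (
  customer_sk   BIGINT GENERATED ALWAYS AS IDENTITY,  -- surrogate key
  customer_id   STRING NOT NULL,                      -- natural key carried over from silver
  customer_name STRING,
  segment       STRING,
  valid_from    TIMESTAMP,                            -- SCD2 validity window
  valid_to      TIMESTAMP,
  is_current    BOOLEAN
) USING DELTA;

-- Hypothetical fact table keyed to the dimension by surrogate key
CREATE TABLE IF NOT EXISTS gold.fct_orders (
  order_id      STRING NOT NULL,
  customer_sk   BIGINT,
  order_date    DATE,
  order_amount  DECIMAL(18, 2)
) USING DELTA
PARTITIONED BY (order_date);
```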
Collaboration & Best Practices
- Partner with data scientists, platform engineers, and business analysts to ensure gold datasets meet business needs.
- Follow software engineering practices: version control (Git), CI/CD for data pipelines, code reviews, and testing.
- Contribute to the development of a shared analytics engineering framework (naming standards, reusable templates, testing frameworks).
ETL/ELT Development
- Develop and optimize transformation pipelines (PySpark/SQL/Delta Live Tables/Databricks Workflows) to process data from bronze to silver to gold.
- Implement incremental data processing strategies to minimize compute cost and improve pipeline performance; a sketch of one such pattern follows this list.
- Ensure data quality checks (validations, anomaly detection, deduplication, SCD handling, etc.) are built into transformations.
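As a rough sketch of what an incremental silver-to-gold load can look like in Databricks SQL, under stated assumptions: the table names reuse the hypothetical ones above, and the watermark handling is simplified to a literal rather than real pipeline state.

```sql
-- Hypothetical incremental upsert: only rows updated since the last load
-- are merged from the silver table into the gold fact table.
MERGE INTO gold.fct_orders AS tgt
USING (
  SELECT order_id, customer_sk, order_date, order_amount
  FROM silver.orders
  WHERE _updated_at > '2024-01-01T00:00:00'  -- placeholder watermark; in practice read from pipeline state
) AS src
ON tgt.order_id = src.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```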
Data Quality & Governance
- Establish and maintain data quality metrics (completeness, accuracy, timeliness) for silver and gold tables; an illustrative check is sketched after this list.
- Apply data governance standards: consistent naming conventions, documentation, and tagging across datasets.
- Collaborate with data platform engineers to enforce lineage and observability.
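One possible way to express such checks in Databricks SQL, shown only as a sketch; the constraint, tables, and columns are hypothetical and carried over from the earlier examples.

```sql
-- Hypothetical declarative rule enforced at write time on a silver table
ALTER TABLE silver.orders ADD CONSTRAINT order_amount_non_negative CHECK (order_amount >= 0);

-- Hypothetical completeness and timeliness metrics for a gold table
SELECT
  COUNT(*)                                                  AS row_count,
  AVG(CASE WHEN customer_sk IS NULL THEN 1.0 ELSE 0.0 END)  AS null_customer_ratio,  -- completeness
  MAX(order_date)                                           AS latest_order_date     -- timeliness
FROM gold.fct_orders;
```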
Business Enablement
- Work closely with analysts and business stakeholders to understand requirements and translate them into gold-layer datasets.
- Build reusable, business-friendly datasets that power dashboards, self-service BI tools, and advanced analytics.
- Maintain documentation (data dictionaries, transformation logic, lineage diagrams).
Performance & Optimization
- Optimize Databricks SQL queries and Delta Lake performance (Z-ordering, clustering, partitioning); see the example after this list.
- Monitor and tune workloads to control compute spend on silver and gold pipelines.
- Implement best practices for caching, indexing, and incremental updates.
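A small, hedged example of layout maintenance on a large gold table; the table and columns are the hypothetical ones used above, and the right Z-order keys depend entirely on actual query patterns.

```sql
-- Compact small files and co-locate rows that are frequently filtered together
-- (Z-order keys must not be partition columns, so order_date is excluded here)
OPTIMIZE gold.fct_orders
ZORDER BY (customer_sk);

-- Remove data files no longer referenced by the table (default retention applies)
VACUUM gold.fct_orders;
```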
Requirements and expectations
- Strong SQL expertise
  - Ability to write complex, performant queries (CTEs, window functions, joins); an illustrative query follows this block
  - Experience optimizing queries on large datasets
  - Strong understanding of analytical SQL patterns
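Purely as an illustration of the kind of analytical SQL pattern in scope (the tables and columns are the same hypothetical ones as above):

```sql
-- Hypothetical example: rank each customer's orders by recency and keep the latest one
WITH ranked_orders AS (
  SELECT
    d.customer_id,
    f.order_id,
    f.order_amount,
    ROW_NUMBER() OVER (PARTITION BY d.customer_id ORDER BY f.order_date DESC) AS rn
  FROM gold.fct_orders f
  JOIN gold.dim_customer d
    ON f.customer_sk = d.customer_sk
)
SELECT customer_id, order_id, order_amount
FROM ranked_orders
WHERE rn = 1;
```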
- Hands-on experience with dbt
  - Building and maintaining dbt models (staging, intermediate, marts); a minimal model is sketched after this block
  - Writing reusable macros and Jinja templates
  - Implementing tests, documentation, and exposures
  - Working with dbt version control and CI workflows
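A minimal sketch of an incremental dbt mart model with a Jinja-templated filter; the model name, upstream ref (stg_orders), and columns are assumptions for illustration, not an actual project layout.

```sql
-- models/marts/fct_orders.sql (hypothetical dbt model)
{{ config(materialized='incremental', unique_key='order_id') }}

select
    order_id,
    customer_id,
    order_date,
    order_amount
from {{ ref('stg_orders') }}

{% if is_incremental() %}
  -- on incremental runs, only process rows newer than what the target table already holds
  where order_date > (select max(order_date) from {{ this }})
{% endif %}
```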
- Data Modeling expertise
  - Strong understanding of dimensional modeling (facts, dimensions, star schemas)
  - Ability to translate business requirements into scalable data models
  - Designing metrics and semantic layers for analytics and BI
  - Experience maintaining a single source of truth for business metrics
- Analytics Engineering mindset
  - Strong focus on data quality, reliability, and consistency
  - Experience working closely with analysts and business stakeholders
  - Ability to balance technical best practices with business needs
- Production-ready analytics
  - Experience with data testing, monitoring, and debugging
  - Familiarity with ELT pipelines and modern data stack concepts
  - Comfortable working in Git-based workflows