Job Description
Responsibilities
- Design, develop, and optimize data workflows and notebooks using Databricks to ingest, transform, and load data from various sources into the data lake.
- Build and maintain scalable and efficient data processing workflows using Spark (PySpark or Spark SQL), adhering to coding standards and best practices.
- Collaborate with technical and business stakeholders to understand data requirements and translate them into technical solutions.
- Develop data models and schemas to support reporting and analytics needs.
- Ensure data quality, integrity, and security by implementing appropriate checks and controls.
- Monitor and optimize data processing performance, identifying and resolving bottlenecks.
- Stay up to date with the latest advancements in data engineering and Databricks technologies.
Requirements
- Bachelor’s or master’s degree in any field
- 7-14 years of experience in designing, implementing, and maintaining data solutions on Databricks
- Experience with at least one major cloud platform (Azure, AWS, or GCP)
- Experience with ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) processes
- Knowledge of data warehousing and data modelling concepts
- Experience with Python or SQL
- Experience with Delta Lake
- Understanding of DevOps principles and practices
- Excellent problem-solving and troubleshooting skills
- Strong communication and teamwork skills