We are seeking a motivated and enthusiastic intern to join our Business Insights & Analytics Data Engineering team for a 6-month internship. This position offers hands-on experience working within a modern cloud-native data platform and an opportunity to contribute to meaningful projects while gaining valuable skills in data engineering, pipeline development, and analytics infrastructure.
Key Accountabilities
Build and maintain data pipelines using Databricks (PySpark, Spark SQL) to ingest, transform, and deliver structured data across Bronze, Silver, and Gold layers following a Medallion Architecture.
Create and maintain multidimensional data models, including Star Schema and Snowflake Schema designs; work with Fact and Dimension tables registered as Delta Lake tables in Unity Catalog.
Write optimized SQL and Python code for ETL/ELT workflows; leverage Databricks notebooks and Databricks Workflows (Jobs) for scheduling, automation, and job monitoring.
Identify and implement process improvements by automating manual data tasks, optimizing data delivery pipelines, and improving infrastructure scalability using Databricks and cloud-native tools.
Support data ingestion from a wide variety of sources, including cloud storage (Azure Data Lake Storage / AWS S3), relational databases, and APIs, applying appropriate transformation logic.
Assist in building analytics-ready datasets and data products that provide actionable insights into commercial performance, customer engagement, and operational metrics.
Participate in code reviews, documentation, and best-practice adoption for Databricks development, including Delta table management, partitioning, and Z-ordering.
Basic Qualifications
Recently completed a Bachelor's Degree in Computer Science, Information Systems, Data Engineering, Statistics, or a related quantitative field.
Additional Skills / Preferences
Strong learning agility and intellectual curiosity; comfortable picking up new tools and technologies quickly.
Strong problem-solving skills and ability to work through ambiguous, real-world data challenges.
Excellent teamwork, communication, and collaboration skills; able to work effectively across technical and business teams.
Flexibility and adaptability in a dynamic, fast-paced analytics environment.
Working knowledge of SQL and Python; experience applying these in data manipulation or analysis tasks.
Experience with or exposure to Databricks (notebooks, PySpark, Delta Lake, or Databricks Workflows) is preferred.
Understanding of data modeling concepts (dimensional modeling, normalization, schema design) is an advantage.