Team Rannoch's Data Project
present

We know what you sold last Summer.

Create a data platform that extracts data from an operational database (and potentially other sources), archives it in a data lake, and makes it available in a remodelled OLAP data warehouse, using an ETL (extract, transform, load) process.

The Team

  • Luna Birtles
  • Ali Anvari
  • Stephen Molano-James
  • James Ault
  • Yaroslav Davydchuk

Technologies

We used: AWS (EventBridge, Lambda, S3, SNS, CloudWatch), Terraform, GitHub Actions, Python (pandas, pytest, boto3, moto), PostgreSQL

They were the most appropriate technologies available for the project.

Challenges Faced

We faced challenges in choosing appropriate properties for our Lambda functions (e.g. layers, memory and timeout), ensuring that NaN values from pandas were handled correctly in the PostgreSQL database, ensuring that updates to the Lambda code were deployed properly with Terraform, and sequencing our functions correctly to avoid foreign key issues when loading data into the OLAP database. We overcame all of these.
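
Two of these challenges, NaN handling and load sequencing, can be illustrated with a short sketch. This is not the team's actual code: the function names (nan_to_none, load_order), the table names and the sample rows are all hypothetical, and it uses only the Python standard library. It assumes rows arrive as dictionaries (for example from pandas' DataFrame.to_dict("records")) and that the foreign key relationships between tables are known in advance.

```python
import math
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def nan_to_none(records):
    """Replace float NaN values with None so a database driver can send SQL NULL."""
    return [
        {
            column: None if isinstance(value, float) and math.isnan(value) else value
            for column, value in row.items()
        }
        for row in records
    ]


def load_order(foreign_keys):
    """Return table names so that every referenced table precedes its dependants.

    foreign_keys maps each table to the set of tables it references.
    """
    return list(TopologicalSorter(foreign_keys).static_order())


# Hypothetical rows, shaped like the output of DataFrame.to_dict("records").
rows = [
    {"staff_id": 1, "email": "a@example.com", "manager_id": float("nan")},
    {"staff_id": 2, "email": "b@example.com", "manager_id": 1.0},
]
clean_rows = nan_to_none(rows)

# Hypothetical star schema: the fact table references both dimension tables,
# so both dimensions must be loaded before it.
schema = {
    "fact_sales_order": {"dim_staff", "dim_date"},
    "dim_staff": set(),
    "dim_date": set(),
}
order = load_order(schema)
```

Here clean_rows carries None in place of NaN, and order lists the dimension tables before the fact table. The sketch only covers float NaN; other missing-value markers such as pandas' NaT or NA would need their own handling.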