TotesOps' Data Project
present

TotesOps demo video

Streamline, Automate, Innovate: Revolutionising TotesOps

Our ETL data engineering project at TotesOps is written in Python and built on AWS services. The data processing pipeline relies on three AWS Lambda functions, which handle extraction, transformation and loading respectively.
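To illustrate how three handlers can be chained, here is a minimal sketch. It is not the project's actual code: the handler names, the event fields and the in-memory STORE dictionary (standing in for the S3 buckets and the warehouse) are all assumptions made for the example.

```python
import json

# In-memory stand-in for the storage layers (hypothetical names). A real
# deployment would read and write S3 objects through boto3 instead.
STORE = {"ingestion": {}, "processed": {}, "warehouse": {}}


def extract_handler(event, context):
    """Extract: store the raw source rows as a JSON object."""
    key = f"{event['table']}/raw.json"
    STORE["ingestion"][key] = json.dumps(event["rows"])
    return {"table": event["table"], "key": key}


def transform_handler(event, context):
    """Transform: tidy the raw rows and store the result."""
    rows = json.loads(STORE["ingestion"][event["key"]])
    cleaned = [
        {"name": row["name"].strip().title(), "amount": float(row["amount"])}
        for row in rows
    ]
    key = f"{event['table']}/processed.json"
    STORE["processed"][key] = json.dumps(cleaned)
    return {"table": event["table"], "key": key}


def load_handler(event, context):
    """Load: append the processed rows to the target table."""
    rows = json.loads(STORE["processed"][event["key"]])
    STORE["warehouse"].setdefault(event["table"], []).extend(rows)
    return {"table": event["table"], "rows_loaded": len(rows)}


def run_pipeline(table, rows):
    """Chain the three handlers, passing each result on as the next event."""
    extracted = extract_handler({"table": table, "rows": rows}, None)
    transformed = transform_handler(extracted, None)
    return load_handler(transformed, None)
```

Each handler takes the usual Lambda arguments (event, context) and returns a small dictionary that the next stage can use as its event, which keeps the stages independent and easy to test one at a time.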

Infrastructure as Code (IaC) with Terraform ensures automated deployment, scalability, and consistent resource management. Continuous Integration/Continuous Deployment (CI/CD) using GitHub Actions enhances workflow efficiency.

In terms of data storage, AWS S3 buckets provide scalability and durability for ingested and processed data. The inclusion of CloudWatch for monitoring and alerting adds a layer of proactive oversight, ensuring optimal performance.

Throughout the project, we have diligently embraced Agile methodologies, employing an iterative and adaptive approach to development. This commitment underscores our dedication to innovation and excellence in the field of data engineering. The project has challenged us in unexpected ways, and we are proud of the outcome of the final product.

The Team

  • Tom Roberts
  • Minnie Taylor Manson
  • Kirsten Brindle
  • Leah Morden-Tew
  • Cinthya Sánchez
  • Elliott Mullins

Technologies

We used AWS S3, Lambda, CloudWatch, IAM, Secrets Manager, Python, SQL (PostgreSQL), Terraform, GitHub Actions, Pandas, pytest, moto, unittest and Trello.

The combination of AWS services, GitHub Actions, Terraform and boto3 allowed for seamless deployment and automation of our project.

Pandas and PostgreSQL assisted in the data-wrangling aspect of our project.

For high-quality testing, we used pytest, moto and unittest.mock (MagicMock, patch, Mock) to verify code integrity.
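As an illustration of this style of testing, the sketch below uses MagicMock in place of a real S3 client. The write_rows helper, the bucket name and the key are hypothetical and are not taken from the project; put_object is the standard boto3 S3 client method the mock stands in for.

```python
import json
from unittest.mock import MagicMock


def write_rows(s3_client, bucket, key, rows):
    """Hypothetical helper: serialise rows to JSON and upload them to S3."""
    body = json.dumps(rows)
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return len(rows)


def test_write_rows_uploads_json():
    # MagicMock records every call, so no AWS credentials or network are needed.
    fake_s3 = MagicMock()
    count = write_rows(fake_s3, "example-bucket", "sales/raw.json", [{"id": 1}])
    assert count == 1
    fake_s3.put_object.assert_called_once_with(
        Bucket="example-bucket", Key="sales/raw.json", Body='[{"id": 1}]'
    )
```

Passing the client in as an argument keeps the helper easy to mock. When a function creates its own client internally, patch can replace the client at import level, and moto can emulate the S3 service itself.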

Challenges Faced

In the project's concluding phases, code refactoring was necessary because of unanticipated structural issues in the data warehouse schema. To address this challenge, we had to reassess our code's organisation and make necessary adjustments to align with the final schema.