Team Sidley Project
present

In Steve We Trust

We have built applications capable of handling Extract, Transform, and Load (ETL) processes, moving data from a prepared source, into a data lake, and finally a warehouse hosted in AWS.
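As a rough illustration of the three stages described above, here is a minimal sketch rather than the project's actual code. It assumes in-memory `sqlite3` databases as stand-ins for the Postgres source and warehouse, and a plain dictionary as a stand-in for the S3 data lake. The table names, column names and object keys are hypothetical.

```python
import json
import sqlite3


def extract(source_conn, lake, key):
    """Read rows from the (hypothetical) source table and store them as JSON in the lake."""
    cur = source_conn.execute("SELECT id, amount_pence FROM sales")
    rows = [{"id": r[0], "amount_pence": r[1]} for r in cur.fetchall()]
    lake[key] = json.dumps(rows)
    return key


def transform(lake, raw_key, processed_key):
    """Reshape the raw records into the (hypothetical) warehouse schema."""
    rows = json.loads(lake[raw_key])
    processed = [
        {"sale_id": r["id"], "amount_pounds": r["amount_pence"] / 100}
        for r in rows
    ]
    lake[processed_key] = json.dumps(processed)
    return processed_key


def load(lake, processed_key, warehouse_conn):
    """Insert the processed records into the warehouse table."""
    rows = json.loads(lake[processed_key])
    warehouse_conn.executemany(
        "INSERT INTO fact_sales (sale_id, amount_pounds) VALUES (?, ?)",
        [(r["sale_id"], r["amount_pounds"]) for r in rows],
    )
    warehouse_conn.commit()
    return len(rows)


def run_pipeline():
    """Wire the three stages together using in-memory stand-ins."""
    source = sqlite3.connect(":memory:")  # stand-in for the Postgres source
    source.execute("CREATE TABLE sales (id INTEGER, amount_pence INTEGER)")
    source.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 1250), (2, 399)])

    warehouse = sqlite3.connect(":memory:")  # stand-in for the warehouse
    warehouse.execute(
        "CREATE TABLE fact_sales (sale_id INTEGER, amount_pounds REAL)"
    )

    lake = {}  # stand-in for the S3 buckets that make up the data lake
    raw_key = extract(source, lake, "raw/sales.json")
    processed_key = transform(lake, raw_key, "processed/sales.json")
    load(lake, processed_key, warehouse)

    return warehouse.execute(
        "SELECT sale_id, amount_pounds FROM fact_sales ORDER BY sale_id"
    ).fetchall()
```

Keeping each stage as a separate function that only communicates through the lake mirrors the idea of running each stage independently, for example as its own Lambda function.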

The Team

  • Liam Dransfield

  • Russell Gooday

  • Simon Hossain

  • Isma'il Djelloul

  • Karl Smith

  • Zameer Mohamed

Technologies

We used:

  • AWS: Lambda, S3, CloudWatch, IAM, Secrets Manager, EventBridge, Simple Notification Service (SNS)

  • Terraform

  • Poetry

  • Python (pandas, boto3, moto, pytest, pg8000, pyarrow, fastparquet)

  • Postgres

  • GitHub (source control), GitHub Projects (kanban board), GitHub Actions (CI/CD)

We chose Python because libraries like pandas (for data manipulation) and boto3 (for AWS interactions) enable powerful, flexible scripting. GitHub provides source control, while GitHub Projects and GitHub Actions handle project management and CI/CD, automating workflows for faster, less error-prone deployments.

Terraform was chosen because it allows for consistent, repeatable infrastructure as code. We deployed the following AWS services for their efficiency, scalability, and automation benefits. AWS Lambda enables serverless computing, reducing operational overhead, while S3 offers cost-effective, scalable storage. CloudWatch provides real-time monitoring and logging, ensuring system reliability, and IAM enhances security through granular access control. Secrets Manager helps safeguard sensitive information, while EventBridge and SNS facilitate an event-driven architecture, improving communication and system decoupling.
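To make the event-driven idea more concrete, here is a small sketch of a Lambda handler reacting to an S3 object-created notification. This is an assumption for illustration: the page does not say how the project's Lambda functions were triggered, and the function names are hypothetical. The event layout (`Records` → `s3` → `bucket`/`object`) is the standard S3 notification shape, in which object keys arrive URL-encoded.

```python
import json
import logging
from urllib.parse import unquote_plus

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def objects_from_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in S3 notifications (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def lambda_handler(event, context):
    """Hypothetical handler: log each new object and report how many were seen."""
    objects = objects_from_event(event)
    for bucket, key in objects:
        # Anything logged here ends up in CloudWatch Logs when run in Lambda.
        logger.info("New object to process: s3://%s/%s", bucket, key)
    return {"statusCode": 200, "body": json.dumps({"processed": len(objects)})}
```

Separating the event parsing from the handler keeps the parsing logic easy to unit test with a plain dictionary, without needing any AWS resources.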

Overall, these technologies provide a strong foundation for a scalable, secure, and automated project architecture.

Challenges Faced

During the project, we faced several challenges. Using Poetry instead of venv to manage our dependencies and virtual environment required extra research to adapt to new workflows. We had to dive into new topics like visualization tools and database mocking with pytest-postgresql, which added complexity to our testing strategy. Testing and coverage, particularly with mocking, proved tricky and took more effort to implement properly. We also spent time revising and researching concepts learned during the bootcamp, such as Terraform and pandas, to make sure we were applying best practices. Overcoming these challenges deepened our understanding and strengthened our technical skills.

It was amazing working through the course. We had great tutors who helped lay the foundation for many of the skills we applied in this project. Massive thanks to Northcoders for the great opportunity and all the support!