Team Sidley Project present
In Steve We Trust
We have built applications that handle Extract, Transform, and Load (ETL) processes, moving data from a prepared source into a data lake and finally into a warehouse hosted in AWS.
The Team
Liam Dransfield
Russell Gooday
Simon Hossain
Isma'il Djelloul
Karl Smith
Zameer Mohamed
Technologies
We used:
AWS: Lambda, S3, CloudWatch, IAM, Secrets Manager, EventBridge, Simple Notification Service (SNS)
Terraform
Poetry
Python (pandas, boto3, moto, pytest, pg8000, pyarrow, fastparquet)
Postgres
GitHub (source control), GitHub Projects (kanban board), GitHub Actions (CI/CD)
We chose Python because libraries like pandas (data manipulation) and boto3 (AWS interactions) enable powerful, flexible scripting. GitHub provides robust source control, while GitHub Projects and GitHub Actions streamline project management and CI/CD, automating workflows for faster, less error-prone deployments.
Terraform was chosen because it allows for consistent, repeatable infrastructure as code. We deployed the following AWS services for their efficiency, scalability, and automation benefits. AWS Lambda enables serverless computing, reducing operational overhead, while S3 offers cost-effective, scalable storage. CloudWatch provides real-time monitoring and logging, ensuring system reliability, and IAM enhances security through granular access control. Secrets Manager helps safeguard sensitive information, while EventBridge and SNS facilitate an event-driven architecture, improving communication and system decoupling.
Overall, these technologies provide a strong foundation for a scalable, secure, and automated project architecture.
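As a rough illustration of how these pieces can fit together, here is a minimal sketch of an ingestion-style Lambda handler. It is not our actual project code: the bucket name, the key layout, the fetch_rows helper, and the injectable s3_client and now parameters are all assumptions made for the example. In a real deployment the rows would come from Postgres (for example via pg8000) and the handler would typically be triggered on a schedule by EventBridge.

```python
import csv
import io
import json
from datetime import datetime, timezone

# Hypothetical bucket name, used only for this sketch.
INGEST_BUCKET = "example-ingestion-bucket"


def fetch_rows():
    """Stand-in for a database query; returns hard-coded sample rows."""
    return [
        {"staff_id": 1, "first_name": "Ada"},
        {"staff_id": 2, "first_name": "Grace"},
    ]


def rows_to_csv(rows):
    """Serialise a list of dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def build_key(table, now):
    """Build a timestamped object key so each run writes its own file."""
    return f"{table}/{now:%Y/%m/%d}/{table}-{now:%H%M%S}.csv"


def lambda_handler(event, context, s3_client=None, now=None):
    """Extract rows and write them to the data lake bucket as CSV.

    s3_client and now are injectable (an assumption of this sketch) so the
    handler can be exercised without AWS access.
    """
    if s3_client is None:
        import boto3  # imported lazily so the sketch runs without boto3 installed

        s3_client = boto3.client("s3")
    now = now or datetime.now(timezone.utc)
    table = event.get("table", "staff")
    key = build_key(table, now)
    body = rows_to_csv(fetch_rows())
    s3_client.put_object(Bucket=INGEST_BUCKET, Key=key, Body=body.encode("utf-8"))
    return {"statusCode": 200, "body": json.dumps({"key": key})}
```

Passing the client in as a parameter is one possible design choice: it keeps the handler easy to test locally with a fake or mocked client, while the default path still uses boto3 inside Lambda.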
Challenges Faced
During the project, we faced several challenges. Using Poetry to manage dependencies and our virtual environment, instead of venv, required extra research to adapt to new workflows. We had to dive into new topics like visualization tools and database mocking with pytest-postgresql, which added complexity to our testing strategy. Testing and coverage, particularly with mocking, proved tricky and took more effort to implement properly. Additionally, we spent time revising and researching concepts learned during the bootcamp, such as Terraform and pandas, to ensure we were applying best practices. Overcoming these challenges deepened our understanding and strengthened our technical skills.
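To give a flavour of the mocking involved, here is a minimal sketch that tests a small S3 helper without touching AWS. The project itself used moto and pytest-postgresql; this example swaps in the standard library's unittest.mock so that it runs with no extra dependencies. The count_objects helper and the bucket and prefix names are hypothetical, written only for this illustration.

```python
from unittest.mock import MagicMock


def count_objects(s3_client, bucket, prefix=""):
    """Hypothetical helper: report how many objects sit under a prefix.

    Only reads the first page of results (list_objects_v2 returns at most
    1000 keys per call), which is enough for this sketch.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return response.get("KeyCount", 0)


def test_count_objects_uses_prefix():
    # The mock stands in for boto3.client("s3") and returns a canned response.
    fake_s3 = MagicMock()
    fake_s3.list_objects_v2.return_value = {"KeyCount": 3}

    assert count_objects(fake_s3, "example-bucket", "staff/") == 3
    # Check that the helper called S3 with the arguments we expect.
    fake_s3.list_objects_v2.assert_called_once_with(
        Bucket="example-bucket", Prefix="staff/"
    )


def test_count_objects_missing_count():
    # A response without KeyCount should be treated as zero objects.
    fake_s3 = MagicMock()
    fake_s3.list_objects_v2.return_value = {}

    assert count_objects(fake_s3, "example-bucket") == 0
```

Functions named test_... like these are collected automatically by pytest. A plain mock only checks how the client was called, whereas moto emulates the AWS service itself, so the two approaches complement each other.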
It was amazing working through the course. We had great tutors who helped lay the foundation for many of the skills we applied in this project. Massive thanks to Northcoders for the great opportunity and all the support!