This case study analyzes Chicago crime data from 2001 to the present using various data engineering tools and practices.
About
The purpose of this project is to analyze the trends and patterns in crime data in Chicago.
Usage
To use this project, follow these steps:
- Load the dataset into your preferred tool (e.g., Jupyter Notebook)
- Run a query to analyze crime trends and patterns
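As a rough illustration of the two steps above, the sketch below loads a few rows and counts incidents per year and crime type using only the Python standard library. The sample rows are made up for the example, and the column names `Year` and `Primary Type` are assumptions about the dataset's schema; adjust them to match the file you actually load.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows standing in for the real dataset.
# The column names "Year" and "Primary Type" are assumptions.
SAMPLE_CSV = """ID,Date,Primary Type,Year
1,01/05/2001 10:00:00 AM,THEFT,2001
2,03/12/2001 11:30:00 PM,BATTERY,2001
3,07/04/2002 09:15:00 AM,THEFT,2002
4,09/21/2002 02:45:00 PM,THEFT,2002
"""


def count_by_year_and_type(csv_file):
    """Count rows per (year, primary type) pair."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[(int(row["Year"]), row["Primary Type"])] += 1
    return counts


counts = count_by_year_and_type(io.StringIO(SAMPLE_CSV))
for (year, crime_type), n in sorted(counts.items()):
    print(year, crime_type, n)
```

To run this against a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="")`. For the full dataset, the same grouping could be expressed in PySpark instead.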
Deployment
For deployment, we will use AWS or GCP to scale the data processing tasks and handle large datasets.
Built Using
This project was built using various data engineering tools and practices, including:
* Airflow
* AWS/GCP for deployment
* dbt for SQL transformations in the database
* PySpark and PyFlink for processing large datasets
Authors
- Vivek Murali - Idea & Initial work.
Acknowledgements
- Hat tip to anyone whose code was used
- Inspiration from [inspiration source]
- References
Comparison of Practices
This section compares the performance, scalability, and ease of use of each tool when handling large datasets.
Project Organization
├── LICENSE
├── Makefile <- Makefile with commands like `make data` or `make train`
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
├── docs <- A default Sphinx project; see sphinx-doc.org for details
│...
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
...
└── src <- Source code for use in this project.
Development Updates
=====================
Task 1
- Set up local infrastructure using Helm charts and Kubernetes.
- Analyzed crime data from 2001 to 2010 using Jupyter Notebook
- Identified trends in violent crime rates over time
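One simple way to look at a trend like the one in the last item is the year-over-year percentage change in violent incident counts. The sketch below is only an illustration: the set of categories treated as violent (`VIOLENT_TYPES`) and the sample records are hypothetical, and it works on raw counts rather than population-adjusted rates.

```python
# Assumption: which categories count as "violent" is a hypothetical choice.
VIOLENT_TYPES = {"HOMICIDE", "ASSAULT", "BATTERY", "ROBBERY"}


def violent_counts_by_year(records):
    """Count violent incidents per year from (year, primary_type) pairs."""
    counts = {}
    for year, primary_type in records:
        if primary_type in VIOLENT_TYPES:
            counts[year] = counts.get(year, 0) + 1
    return counts


def year_over_year_change(counts):
    """Return the percentage change relative to the previous year in the data."""
    years = sorted(counts)
    return {
        curr: (counts[curr] - counts[prev]) / counts[prev] * 100.0
        for prev, curr in zip(years, years[1:])
    }


# Made-up records for illustration only.
records = [
    (2001, "BATTERY"), (2001, "ROBBERY"), (2001, "THEFT"),
    (2001, "ASSAULT"), (2001, "HOMICIDE"),
    (2002, "BATTERY"), (2002, "ROBBERY"), (2002, "ASSAULT"),
    (2002, "THEFT"),
]

yearly = violent_counts_by_year(records)
changes = year_over_year_change(yearly)
print(yearly)
print(changes)
```

Turning these counts into rates would additionally need yearly population figures, which are not part of this sketch.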