Theft behavior analysis in Chicago

Overview

This project explores the characteristics of theft in Chicago from a large volume of raw data, which is one of the harder problems in data analysis. To address this challenge, I implement an automated data-processing pipeline that cleans and normalizes the datasets and then reports to the DBA. Specifically, the DBA uploads a dataset to the server and receives a notification about the status of that data without having to take any further action. I then apply a series of approaches to extract insight from the data. First, I run OLAP queries to get an overview of theft behavior. Second, I fit a linear regression model to determine which factors contribute to the increase in theft in Chicago. Finally, I build a machine learning model that classifies whether an incident is theft or not. You can read the full report here.
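
The cleaning and reporting step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the field names (`case_id`, `primary_type`, `date`), the date format, and the shape of the status report are all assumptions made for the example.

```python
from datetime import datetime

def clean_records(rows):
    """Clean and normalize raw rows; return (clean_rows, status_report).

    Field names and the date format are hypothetical.
    """
    clean, seen, rejected = [], set(), 0
    for row in rows:
        case_id = (row.get("case_id") or "").strip()
        crime_type = (row.get("primary_type") or "").strip().upper()
        raw_date = (row.get("date") or "").strip()
        try:
            occurred = datetime.strptime(raw_date, "%m/%d/%Y")
        except ValueError:
            occurred = None
        # Reject rows with missing keys, unparseable dates, or duplicate IDs.
        if not case_id or not crime_type or occurred is None or case_id in seen:
            rejected += 1
            continue
        seen.add(case_id)
        clean.append({
            "case_id": case_id,
            "primary_type": crime_type,             # normalized to upper case
            "date": occurred.date().isoformat(),    # normalized to ISO 8601
            "is_theft": crime_type == "THEFT",
        })
    # The report is what a notification to the DBA could be built from.
    report = {"received": len(rows), "loaded": len(clean), "rejected": rejected}
    return clean, report

# Made-up sample rows, for illustration only.
raw = [
    {"case_id": "A1", "primary_type": " theft ", "date": "01/15/2020"},
    {"case_id": "A1", "primary_type": "THEFT", "date": "01/15/2020"},    # duplicate ID
    {"case_id": "A2", "primary_type": "Battery", "date": "not a date"},  # bad date
    {"case_id": "A3", "primary_type": "battery", "date": "02/03/2020"},
]
clean, report = clean_records(raw)
print(report)  # {'received': 4, 'loaded': 2, 'rejected': 2}
```

Returning the report alongside the cleaned rows keeps the notification logic separate from the cleaning rules, so either can change without touching the other.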

Architecture

The project follows the Cross-Industry Standard Process for Data Mining (CRISP-DM):

  1. Business understanding
  2. Data understanding
  3. Data preparation
  4. Modeling
  5. Evaluation

Figure: The data warehouse architecture

Dashboard

Several dashboards are built with Power BI.

Figure: The dashboard examines the number of theft crimes in Chicago
Figure: The dashboard shows the underlying trend in criminal behavior
Figure: The dashboard shows crime across areas
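
The kind of aggregation behind dashboards like these can be sketched as an OLAP-style roll-up. The example below uses an in-memory SQLite table as a stand-in for the data warehouse. The table name, column names, and rows are hypothetical and invented for the illustration; they are not taken from the project's schema or data.

```python
import sqlite3

# In-memory stand-in for the warehouse; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crimes (year INTEGER, area TEXT, primary_type TEXT)")
conn.executemany(
    "INSERT INTO crimes VALUES (?, ?, ?)",
    [
        (2019, "North", "THEFT"),
        (2019, "North", "BATTERY"),
        (2019, "South", "THEFT"),
        (2020, "North", "THEFT"),
        (2020, "North", "THEFT"),
        (2020, "South", "BATTERY"),
    ],
)

# Fine-grained view: theft counts per year and area.
theft_by_year_area = conn.execute(
    """
    SELECT year, area, COUNT(*) AS thefts
    FROM crimes
    WHERE primary_type = 'THEFT'
    GROUP BY year, area
    ORDER BY year, area
    """
).fetchall()

# Rolled-up view: the same measure aggregated to the year level.
theft_by_year = conn.execute(
    """
    SELECT year, COUNT(*) AS thefts
    FROM crimes
    WHERE primary_type = 'THEFT'
    GROUP BY year
    ORDER BY year
    """
).fetchall()

print(theft_by_year_area)  # [(2019, 'North', 1), (2019, 'South', 1), (2020, 'North', 2)]
print(theft_by_year)       # [(2019, 2), (2020, 2)]
```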

Data mining

Using linear regression

Figure: Coefficients of variables using linear and logistic regression
Figure: Positive and negative coefficients of variables
Figure: Relationship between the predictor variables and response variable
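
To show how the sign of a coefficient is read, here is a small ordinary least squares sketch in plain Python. It is not the model from the report: the predictor names `predictor_a` and `predictor_b` are placeholders, and the data is synthetic, generated exactly from y = 10 + 2a - 3b so that the recovered coefficients are known in advance.

```python
def fit_ols(X, y):
    """Return [intercept, b1, ..., bk] minimizing the squared error."""
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend intercept column
    k = len(rows[0])
    # Normal equations (A^T A) beta = A^T y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]

# Synthetic data: y = 10 + 2 * predictor_a - 3 * predictor_b (no noise).
X = [(1, 0), (2, 1), (3, 1), (4, 3), (5, 2), (6, 5)]
y = [10 + 2 * a - 3 * b for a, b in X]

intercept, coef_a, coef_b = fit_ols(X, y)
for name, coef in [("predictor_a", coef_a), ("predictor_b", coef_b)]:
    direction = "raises" if coef > 0 else "lowers"
    print(f"{name}: {coef:+.3f} ({direction} the predicted response)")
```

A positive coefficient means the predicted response rises as that predictor increases with the others held fixed, and a negative coefficient means it falls. On real data the estimates also carry uncertainty, so the sign alone should be read with care.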

Data modeling

Figure: Models built with different predictor variables
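
Comparing models built on different predictor sets can be sketched as below. This is an illustrative logistic regression trained by batch gradient descent in plain Python, not the model used in the project. The column names `informative` and `noise`, the learning rate, the epoch count, and the tiny synthetic dataset are all assumptions. Accuracy is measured on the training data only to keep the example short; a real comparison would use held-out data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=200):
    """Batch gradient descent on the log-loss; returns (bias, weights)."""
    n, k = len(X), len(X[0])
    bias, weights = 0.0, [0.0] * k
    for _ in range(epochs):
        grad_b, grad_w = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, xi))) - yi
            grad_b += err
            for j in range(k):
                grad_w[j] += err * xi[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return bias, weights

def accuracy(model, X, y):
    bias, weights = model
    preds = [
        int(sigmoid(bias + sum(w * v for w, v in zip(weights, xi))) >= 0.5)
        for xi in X
    ]
    return sum(p == t for p, t in zip(preds, y)) / len(y)

# Synthetic rows: (informative, noise, is_theft). Hypothetical columns.
data = [
    (-2.0, 1.0, 0), (-1.5, -1.0, 0), (-1.0, 1.0, 0), (-0.5, -1.0, 0),
    (0.5, 1.0, 1), (1.0, -1.0, 1), (1.5, 1.0, 1), (2.0, -1.0, 1),
]
columns = {"informative": 0, "noise": 1}
labels = [row[2] for row in data]

results = {}
for subset in [("informative",), ("noise",), ("informative", "noise")]:
    X = [[row[columns[c]] for c in subset] for row in data]
    results[subset] = accuracy(fit_logistic(X, labels), X, labels)

for subset, acc in results.items():
    print(", ".join(subset), "->", acc)
```

In this toy setup, the model that sees only the `noise` column cannot do better than chance, while the one that sees `informative` separates the classes. That is the kind of difference a comparison across predictor sets is meant to surface.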
