By: Charan Batth
Geovis Project Assignment @RyersonGeo, SA8905, Fall 2021
Introduction
The crime rate for the city of Chicago is significantly higher than the US average. In 2016, Chicago was responsible for nearly half of the increase in homicides in the US.
A time-series interactive dashboard will be used to visualize and analyze the distribution of homicides across Chicago over the last two decades. We will create this dashboard using Tableau Desktop, an interactive data visualization and analytics tool. In addition to the map, the dashboard will include two visualizations: a treemap and a line chart. The treemap will visualize aggregated homicides across police districts, and the line chart will visualize the number of homicides per month.
Data
The data used to produce the interactive dashboard was obtained from the Chicago Data Portal. The dataset consists of 7,424,694 crimes reported from 2001 to 2021. However, since our analysis is focused on homicides, the data was filtered in the portal by setting the field Primary Type equal to HOMICIDE, and the filtered data was then downloaded as a CSV file.
I will go through the step-by-step process of creating the time-series interactive dashboard. The dashboard can be viewed here for reference.
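For readers who prefer to check the filter outside the portal, the same Primary Type condition can be sketched in Python. This is only an illustration: the three sample rows are made up, the ID column is a placeholder, and the real export has many more columns.

```python
import csv
import io

# Hypothetical three-row sample standing in for the portal's CSV export.
SAMPLE_CSV = """ID,Date,Primary Type,District
1,01/15/2016 10:30:00 PM,HOMICIDE,011
2,02/03/2016 08:00:00 AM,THEFT,007
3,03/21/2017 01:15:00 AM,HOMICIDE,007
"""


def filter_homicides(csv_text):
    """Return only the rows whose Primary Type is HOMICIDE."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["Primary Type"] == "HOMICIDE"]


homicides = filter_homicides(SAMPLE_CSV)
print(len(homicides))
```

With a real file, the sample string would be replaced by the contents of the downloaded CSV.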
Creating the Interactive Dashboard
To get started on the interactive dashboard and the visualizations, we will first import the data. Since our dataset is in CSV format, we will select Text File under the To a File option. After opening the data, you will see a screen showing all the fields in the CSV file. On the bottom left, beside Data Source, you will see a tab called Sheet 1 (highlighted in orange); we will click on it to begin creating the dashboard. The Worksheet tabs will be used to create the map and visualizations, and the Dashboard tab will be used to create the dashboard.
In order to create a dot density map showing homicides across Chicago, we need to plot the latitude and longitude coordinates for each homicide. We do this by dragging the Longitude field into the Columns tab and the Latitude field into the Rows tab. We then set both fields to Dimension by right-clicking on each field. A map will automatically be created; however, there are two minor issues with the map, shown below.
The map reports 1 null point (displayed on the bottom right of the map) and plots a stray point in Missouri.
In order to fix these issues, we will first remove the null point by clicking on 1 null and selecting Filter Data. To remove the stray point located in Missouri, we will right-click on the point and select Exclude. This will remove the point from the map, and the map extent will automatically zoom to the Chicago area.
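The two clean-up steps above (dropping the null point and excluding the point in Missouri) can also be sketched as a pre-processing check in Python. The bounding box values are rough assumptions for illustration, not figures from the dataset, and the sample rows are hypothetical.

```python
# Assumed rough bounding box around Chicago (illustrative values only).
LAT_MIN, LAT_MAX = 41.6, 42.1
LON_MIN, LON_MAX = -88.0, -87.5


def has_valid_location(row):
    """True if the row has coordinates that fall inside the assumed box."""
    lat, lon = row.get("Latitude"), row.get("Longitude")
    if lat in (None, "") or lon in (None, ""):
        return False  # the "null" point Tableau reports
    lat, lon = float(lat), float(lon)
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


# Hypothetical rows: one in Chicago, one without coordinates, one far away.
rows = [
    {"ID": "1", "Latitude": "41.8781", "Longitude": "-87.6298"},
    {"ID": "2", "Latitude": "", "Longitude": ""},
    {"ID": "3", "Latitude": "36.6194", "Longitude": "-91.6868"},
]

mapped = [row for row in rows if has_valid_location(row)]
print([row["ID"] for row in mapped])
```

Unlike the manual Exclude in Tableau, a bounding box would drop any out-of-area point, so the bounds should be checked against the city limits before relying on it.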
Creating Time-series Map
To create the time-series map, we will drag the Year field into the Pages card. This will create a time slider that allows you to view the dot density map for any chosen year. The time slider also lets the user animate the map by clicking on the loop button, and the animation can be paused at any time.
For our dot density map, we will show specific attributes for each homicide location on the map. This can be accomplished by dragging the fields into the Marks card. For our map, we will show the following fields: Block, Description, District, Location Description, and Date.
To make our map more visually appealing, we will change its theme to Dark. This can be done by going to the Map header, hovering over Background Maps, and selecting Dark. To better visualize the locations of the data points, we will add zip code boundaries to the map. To do this, we will go to the Map header, choose Map Layers, and then select Zip Code Boundaries in the Map Layers pane (this will appear on the left side of the sheet). Lastly, we are going to change the colour and size of the data points. This can be done by going to the Marks card and selecting the Color and Size options.
Visualizations
Treemap
We will now create the visualizations to better understand the distribution of homicides in Chicago. To begin, we will create a new Worksheet and name it Treemap. To create a treemap, we will first drag the Year field into the Pages card, as the treemap needs to follow the same time slider as the time-series map. Since we want to see how homicides vary across police districts, we will drag the District field into the Marks card. To show the homicides, we will drag the Primary Type field onto both the Color and Size options in the Marks card. We will then set the Primary Type field to Measure and choose Count, as we want to show aggregated homicides. The final step is to make our worksheet transparent, so we can add it on top of our interactive map. This is done by going to the Format header and selecting Shading. In the Formatting pane, we will set the Worksheet and Pane background colour to None.
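The aggregation behind the treemap (a count of homicides per District for the year selected in the Pages card) can be sketched in Python as follows. The records are hypothetical, and the function name is only for illustration.

```python
from collections import Counter

# Hypothetical homicide records with only the fields the treemap needs.
records = [
    {"Year": "2016", "District": "011"},
    {"Year": "2016", "District": "011"},
    {"Year": "2016", "District": "007"},
    {"Year": "2017", "District": "007"},
]


def homicides_by_district(rows, year):
    """Count homicides per police district for a single year."""
    return Counter(row["District"] for row in rows if row["Year"] == year)


counts_2016 = homicides_by_district(records, "2016")
print(counts_2016.most_common())
```

Each district's count corresponds to what the treemap encodes as both rectangle size and colour.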
Line Chart
We will create a new Worksheet and name it chart. Our data does not contain a field for the month in which the incident occurred, but it does have the Date of the incident. So, in order to extract just the month, we will need to create a new field. This can be done by going to the Analysis header and choosing Create Calculated Field. We will give the field an appropriate name by changing the default name Calculation1 to MonthOfIncident. To extract the month, we first need to truncate the Date field, as it contains both the date and the time. We will use the LEFT function, which returns the leftmost characters of a string up to a specified length. The date portion consists of 10 characters (mm/dd/yyyy), so our expression would be LEFT([Date], 10). Next, we need to extract the month from the truncated string, so we will use the built-in function MONTH, which returns a number representing the month. However, the MONTH function requires its parameter to be of the date data type. So we need to convert our truncated string to a date, which we can do by applying the DATE function to the LEFT expression and then applying the MONTH function to the entire expression. Thus our expression for finding the month is:
MONTH(DATE(LEFT([Date], 10)))
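To make the three nested steps concrete, here is a minimal Python sketch of the same logic: truncate to 10 characters, parse as a date, take the month. It assumes the date string is month first (for example 03/21/2017 01:15:00 AM); if your export orders the fields differently, the format string would need to change.

```python
from datetime import datetime


def month_of_incident(date_text):
    """Mirror the LEFT, DATE, MONTH steps on a single date string."""
    date_part = date_text[:10]  # LEFT([Date], 10): drop the time portion
    parsed = datetime.strptime(date_part, "%m/%d/%Y")  # DATE(...): assumed month-first
    return parsed.month  # MONTH(...): 1 for January through 12 for December


print(month_of_incident("03/21/2017 01:15:00 AM"))  # 3
```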
Now we can finally begin creating the line chart. As the map is a time series, the line chart needs to follow the same time slider. So, we will drag the Year field into the Pages card. Next, we will drag the MonthOfIncident field into the Columns tab and Primary Type into the Rows tab. Since we want to show the total number of homicides, we will set the Primary Type field to Measure and select Count. We will make this worksheet transparent as well, so we will go to the Format header and select Shading. In the Formatting pane, we will set the Worksheet and Pane background colour to None.
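The series the line chart draws for a given year (a count of homicides per month) can be sketched in Python as below. The dates are hypothetical and assumed to be month first. Filling months without incidents with zero is a choice made in this sketch, not necessarily what Tableau displays.

```python
from collections import Counter
from datetime import datetime

# Hypothetical incident dates in an assumed month-first format.
dates = [
    "01/15/2016 10:30:00 PM",
    "01/20/2016 02:00:00 AM",
    "07/04/2016 11:45:00 PM",
    "03/21/2017 01:15:00 AM",
]


def monthly_counts(date_strings, year):
    """Return 12 counts (January to December) for the chosen year."""
    counts = Counter()
    for text in date_strings:
        parsed = datetime.strptime(text[:10], "%m/%d/%Y")
        if parsed.year == year:
            counts[parsed.month] += 1
    # Months with no incidents get 0 so the series always has 12 points.
    return [counts.get(month, 0) for month in range(1, 13)]


series_2016 = monthly_counts(dates, 2016)
print(series_2016)
```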
Creating the Dashboard
To create our dashboard, we will click on the Dashboard tab, right beside the Worksheet tab. In the dashboard, we can add all the worksheets we have created. We will first add the interactive map, followed by the visualizations. To display the visualizations on top of the map, we need to make them float. So, we will select one of the visualizations, hover over More Options (shown as a downward arrow), and click on Floating; we then repeat this process for the other visualization. You can also change the size of the dashboard by going to the Size pane. The default size is Desktop Browser (1000 x 800); we will change it to Generic Desktop (1366 x 788). Last but not least, we will publish this dashboard by going to Server -> Tableau Public -> Save to Tableau Public As. Tableau Public allows anyone to view and download the dashboard, and specific permissions can be applied to it.
Limitations and Future Goals
One of the main limitations that occurred during the process of creating the dashboard was gathering the data. First, I downloaded the entire CSV file containing all the different types of crimes. However, when I filtered the Primary Type to HOMICIDE in the Filters card, a large amount of homicide data was missing. So, I then decided to connect the dataset directly to Tableau using the OData server. It took me a couple of hours to connect to the server, only to run into the same issue. I then tried exporting the data through the SODA API from the portal. I was able to find raw data for homicides; however, it contained only partial data. After a while, I figured out that I had to filter the table directly in the Chicago Data Portal in order to download the complete data for Chicago homicides.
Another limitation I faced with the data was in creating the visualizations. Originally, I intended to create a highlight table to show how homicides varied across police districts and community areas. However, because the data had null values for community areas, the visualization couldn’t be created. Furthermore, I was only able to create basic visualizations, as the data did not have any other variables that would help analyze the homicide distribution. For instance, if each homicide incident had included a Zip Code, it could have been used to explain the spatial pattern much better than police districts can.
If I were to expand on this project, I would try to incorporate all the different crime incidents from 2001 to 2021 to see Chicago’s overall crime history. In addition to this, I would find demographic data for Chicago, such as population, education, and average family income, to help understand the spatial pattern of the distribution of crimes.
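One possible explanation for the partial SODA export, not confirmed in this project, is that SODA endpoints return a limited number of rows per request unless $limit and $offset are set. The Python sketch below only builds paged request URLs and does not contact the portal. The dataset ID in the endpoint and the column names primary_type and id are assumptions that should be verified on the portal's API page.

```python
from urllib.parse import urlencode

# Assumed endpoint for the crimes dataset; verify the dataset ID on the portal.
BASE_URL = "https://data.cityofchicago.org/resource/ijzp-q8t9.csv"


def homicide_page_url(limit, offset):
    """Build the URL for one page of homicide rows (no request is sent)."""
    params = {
        "$where": "primary_type = 'HOMICIDE'",  # assumed API column name
        "$order": "id",  # assumed column; a stable order keeps pages consistent
        "$limit": limit,
        "$offset": offset,
    }
    return BASE_URL + "?" + urlencode(params)


# URLs for the first three pages of 1000 rows each.
page_urls = [homicide_page_url(1000, start) for start in range(0, 3000, 1000)]
for url in page_urls:
    print(url)
```

Fetching each URL in turn until a page comes back empty would, under these assumptions, collect the full set of homicide rows.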