Tracking the COVID-19 Pandemic in Toronto with R and Leaflet

By: Tavis Buckland

Geovisualization Project Assignment, SA8905, Fall 2020

Github Repository:


Over the course of the pandemic, the City of Toronto has implemented a COVID-19 webpage focused on providing summary statistics on the current extent of COVID-19 cases in the city. Since the beginning of the pandemic, this webpage has greatly improved, yet it still lacks the functionality to analyze spatio-temporal trends in case counts. Despite not providing this functionality directly, the City has released the raw data for each reported case of COVID-19 since the beginning of the pandemic. Using RStudio with the leaflet and shiny libraries, I designed a tool to automate the collection, cleaning and mapping of this raw case data.

Sample of COVID-19 case data obtained from the Toronto Data Portal


The raw case data was downloaded from the Toronto Open Data Portal in R, and added to a data frame using read.csv. As shown in the image below, this data contained the neighbourhood name and episode date for each individual reported case. As of Nov. 30th, 2020, this contained over 38,000 reported cases. Geometries and 2016 population counts for the City of Toronto neighbourhoods were also gathered from the Toronto Open Data Portal.


After gathering the necessary inputs, an extensive amount of cleaning was required to allow the case data to be aggregated to Toronto’s 140 neighbourhoods, and this process had to be repeatable for each new download of the COVID-19 case data. Hyphens, spaces and other minor inconsistencies between the names in the case and neighbourhood data were resolved. Approximately 2.5% of all COVID-19 cases in this dataset were also missing a neighbourhood name to join on. Instead of discarding these cases, a ‘Missing cases’ neighbourhood was created to hold them. The number of cases for each neighbourhood by day was then counted and transposed into a new data table. From there, using ‘rowSums’, the cumulative number of cases in each neighbourhood was obtained.

Example of some of the code used to clean the dataset and calculate cumulative cases

Unfortunately, in its current state, the R code will only gather the most recent case data and calculate cumulative cases by neighbourhood. Based on how the data was restructured, calculating cumulative cases for each day since the beginning of the pandemic was not achieved.
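The cleaning and counting steps described above can be illustrated with a minimal sketch. It is written in Python rather than R and is not the project's code: the function names, the sample rows and the choice to lower-case names are all assumptions made for illustration. It also shows one possible way to keep a running total per day, which the post notes the R code did not achieve.

```python
from collections import Counter, defaultdict
import re

MISSING = "Missing cases"  # catch-all bucket described in the post


def normalize(name):
    """Collapse hyphens and spaces and lower-case the name so that
    'Willowdale-East' and 'Willowdale East' join on the same key.
    Blank names go to the catch-all bucket."""
    if not name or not name.strip():
        return MISSING
    return re.sub(r"[\s\-]+", " ", name.strip()).lower()


def daily_counts(cases):
    """Count cases per (neighbourhood, episode date) pair."""
    return Counter((normalize(hood), day) for hood, day in cases)


def cumulative_by_day(counts):
    """Build a running total of cases per neighbourhood, ordered by date."""
    per_hood = defaultdict(dict)
    for (hood, day), n in counts.items():
        per_hood[hood][day] = n
    result = {}
    for hood, days in per_hood.items():
        total, series = 0, []
        for day in sorted(days):  # ISO dates sort chronologically as strings
            total += days[day]
            series.append((day, total))
        result[hood] = series
    return result


# Made-up sample rows: (neighbourhood name, episode date)
cases = [
    ("Willowdale East", "2020-03-25"),
    ("Willowdale-East", "2020-03-27"),
    ("Malvern", "2020-03-27"),
    ("", "2020-03-27"),
]
cumulative = cumulative_by_day(daily_counts(cases))
print(cumulative["willowdale east"])
```

The last entry of each series is the cumulative total that the map currently displays; the earlier entries are what a timeline view would need.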


Using leaflet, all this data was brought together into an interactive map. Raw case counts were converted to rates per 100,000 residents and classified into quintiles. The two screenshots below show the output and popup functionality added to the leaflet map.
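As a rough illustration of the rate and quintile steps, here is a short Python sketch (not the R code used in the project). The function names and sample numbers are hypothetical, and the rank-based class assignment is a simplification: statistical packages compute quantile breaks slightly differently, especially when values are tied.

```python
def rate_per_100k(cases, population):
    """Convert a raw case count to a rate per 100,000 residents."""
    return cases * 100_000 / population


def quintile_classes(values):
    """Assign each value a class from 1 (lowest) to 5 (highest) by rank,
    so that each class holds roughly the same number of observations."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    classes = [0] * n
    for rank, i in enumerate(order):
        classes[i] = rank * 5 // n + 1
    return classes


# Made-up neighbourhood counts and populations
counts = [50, 400, 120, 300, 90]
populations = [20_000, 25_000, 30_000, 15_000, 45_000]
rates = [rate_per_100k(c, p) for c, p in zip(counts, populations)]
print(rates)
print(quintile_classes(rates))
```

Classifying on rates rather than raw counts keeps large neighbourhoods from dominating the top class simply because more people live there.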

In its current state, the map is only produced on a local instance and requires RStudio to run. A number of challenges were faced when attempting to deploy this map application, and unfortunately, the map was not able to be hosted through the shiny apps cloud-server. As an alternative, the map code has been made available through a GitHub repository at the top of this blog post. This repository also includes a stand-alone HTML file with an interactive map.

Screenshot of HTML map produced by R Shiny App and Leaflet. Popups display neighbourhood names, population, raw count, and rate per 100,000 for the most recent case data.


There are a couple of notable limitations to mention considering the data and methods used in this project. For one, the case data only supports aggregation to Toronto neighbourhoods or forward sortation areas (FSA). At this spatial scale, trends in case counts are summarized over very large areas and are not likely to be represented accurately. A related issue is the modifiable areal unit problem (MAUP), which describes the statistical biases that can emerge from aggregating real-world phenomena into arbitrary boundaries. The reported cases derived from Toronto Public Health (TPH) are likely subject to sampling bias and do not provide a complete record of the pandemic’s spread through Toronto. Among these limitations, I must also mention my limited experience building maps in R and deploying them online.


With the power of R and its many libraries, there are a great many improvements to be made to this tool, but I will note a few of the significant updates I would like to implement over the coming months. Foremost is to use the ‘leaftime’ R package to add a timeline function, allowing map users to analyze changes over time in reported neighbourhood cases. Adding a function to quickly extract the map’s data into a CSV file, directly from the map’s interface, is another immediate goal for this tool. This CSV could contain a snapshot of the data based on a particular time frame identified by a user. The last functionality planned for this map is the ability to modify the classification method used. Currently, the neighbourhoods are classified into quintiles based on cumulative case counts per 100,000. Using the ‘leafletProxy’ function that the leaflet package provides for Shiny apps would allow map users greater control over map elements. It should be possible to allow users to define the number of classes and the classification method (e.g. natural breaks, standard deviation, etc.) directly from the map application.
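The planned CSV snapshot could work along the following lines. This is only a Python sketch of the idea, not part of the existing tool; the function name, column names and sample rows are hypothetical.

```python
import csv
import io
from datetime import date


def snapshot_csv(rows, start, end):
    """Return CSV text holding only the rows whose date falls within
    the user's chosen time frame (inclusive on both ends).

    rows: iterable of (neighbourhood, ISO date string, case count)
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["neighbourhood", "date", "cases"])
    for hood, day, cases in rows:
        if start <= date.fromisoformat(day) <= end:
            writer.writerow([hood, day, cases])
    return buffer.getvalue()


# Made-up daily counts
rows = [
    ("Malvern", "2020-03-30", 2),
    ("Malvern", "2020-04-02", 3),
]
print(snapshot_csv(rows, date(2020, 4, 1), date(2020, 4, 30)))
```

In a Shiny app, the resulting text would typically be served through a download control tied to the date range the user selected.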

Interactive Map and Border Travels

Given the chance to create a geovisualisation, I set out to bring in data at a scope that would need adjustment and interaction to understand the geography in increasing depth, while still letting the journey begin with an overview and general understanding of the topic at hand.

Introduction to the geovisualisation

This blog post doesn’t unveil a hidden gem on the theme of border crossing, but demonstrates how an interactive map can share the insights a user might seek, without being limited to the publisher’s extents or to printed information. Border crossing was selected as the topic of interest to observe the routes that may be chosen at borders, placing the user in a point of view similar to that of the people crossing at these points, by allowing them to look at the crossing options and consider preferences.

Giving the user this perspective meant beginning by locating and providing the crossing points. The borders selected were the US borders with Canada and with Mexico, a scope which could engage the viewer and provide detail, instead of limiting this surface transportation data to a single scale and extent determined by the creator rather than the user.

Border crossings are a matter largely determined by geography, and are best understood on a map rather than in any other data representation, unlike attributes such as sales data, which may still be suitable in an aspatial form, such as projected sales levels on a line graph.

To get specific, the data came from the U.S. Bureau of Transportation Statistics, and was cleaned to cover results from the beginning of January 2010 until the end of September 2020. The data was geocoded with multiple providers and results were selected based on consistency; however, some ports were listed in the data but their locations could not be identified.

Seal of the U.S. Bureau of Transportation Statistics

To start providing insights for you, the viewer, the first data set added to the map is the border crossing locations. These are points, and they begin to show the distribution of crossing opportunities between the North American countries. If a point could not be placed at the location of the particular office that processed the border entries, then the record was assigned to the city in which the office was located. An appropriate base layer was imported from Mapbox to best display the background map information.

The range in border crossings is represented by shifts in colour gradient and symbol size. With all the points and their proportions plotted, patterns begin to emerge from the attached border attributes. These can illustrate where entries are higher or lower, such as the points in California being larger than those in Montana.

Mapped Data

But is there a measure of how visited the state itself is, rather than each entry point? Yes! Indeed there is. In addition to the crossing points themselves, the states which they belong to have also been given a measurement. Each state with a crossing is shown on the map with a gradient for the average crossings which the state experienced. We knew that California had entry points with more crossings than the points shown in Montana, but now we can compare the states themselves, and see that California altogether still experienced more crossings at the border than Montana, despite having fewer border entry points.
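The step from port-level points to a state-level measure amounts to a simple aggregation. The Python sketch below illustrates it with invented numbers; the crossing values and the two "Hypothetical Port" entries are not from the Bureau of Transportation Statistics data, and whether the map uses the total or the per-port mean is an assumption, so both are computed.

```python
from collections import defaultdict
from statistics import mean

# (state, port, crossings) with made-up crossing values
ports = [
    ("California", "Calexico", 120),
    ("California", "Calexico East", 80),
    ("Montana", "Sweetgrass", 30),
    ("Montana", "Hypothetical Port A", 10),
    ("Montana", "Hypothetical Port B", 20),
]


def summarise_by_state(ports):
    """Group port-level crossings by state and report the number of
    ports, the total crossings and the mean crossings per port."""
    by_state = defaultdict(list)
    for state, _port, crossings in ports:
        by_state[state].append(crossings)
    return {
        state: {"ports": len(v), "total": sum(v), "mean": mean(v)}
        for state, v in by_state.items()
    }


summary = summarise_by_state(ports)
print(summary["California"], summary["Montana"])
```

With these sample values, California has fewer ports than Montana but a higher total, which mirrors the pattern described in the paragraph above.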

Could there be a way to milk just a bit more of this basic information? Yes. This is where the map begins to benefit from being interactive.

Each point and each state can be hovered over to show its calculated values, clarifying how much more or less one case had when compared to another. Two states may have a similar gradient, and two entry points may appear the same size, but by hovering over them you can see which place each belongs to, as well as its specific crossing value. Montana is one of the states with the most crossing points, and it experiences similar crossing frequencies across these entries. By hovering over the points we can discover that Sweetgrass, Montana is the most popular point along the Montana border.

Similar values along the Montana border

In fact, this is how we discover another dimension which belongs to the data. Hovering over these cases, we can see a list of transport modes that make up the total crossings: the sum is made up of transport by trucks, trains, automobiles, buses, and pedestrians.

To discover more data available should simply mean more available to learn, and to only state the transport numbers without their visuals would not be the way to share an engaging spatial understanding. With these 5 extra aspects of the border crossings available, the map can be made to display the distributions of each particular mode.
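In data terms, each port's total is the sum of its five mode counts, and switching the map to a single mode simply re-ranks the ports on that one column. A minimal Python sketch with invented counts (the field names and all numbers are hypothetical, not values from the dataset):

```python
MODES = ("trucks", "trains", "personal_vehicles", "buses", "pedestrians")

# Made-up counts per port and mode
records = [
    {"port": "Skagway", "trucks": 5, "trains": 90,
     "personal_vehicles": 10, "buses": 2, "pedestrians": 1},
    {"port": "Blaine", "trucks": 60, "trains": 40,
     "personal_vehicles": 200, "buses": 8, "pedestrians": 12},
    {"port": "Calexico", "trucks": 1, "trains": 0,
     "personal_vehicles": 150, "buses": 3, "pedestrians": 160},
]


def total_crossings(record):
    """Total crossings at a port: the sum over all transport modes."""
    return sum(record[mode] for mode in MODES)


def top_port(records, mode=None):
    """Name of the busiest port overall, or for one mode if given."""
    if mode is None:
        return max(records, key=total_crossings)["port"]
    return max(records, key=lambda r: r[mode])["port"]


print(top_port(records), top_port(records, "trains"))
```

A port that ranks low on the total can still lead for one mode, which is why the per-mode view reveals patterns the overall map hides.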

Despite the points in Alaska typically being among the least entered of all the border crossings, selecting the entries by train draws attention to Skagway, Alaska, as one of the most used border points for crossing into the US, even though it is not connected to the mainland. Of course, this mapped display gives a strong visual impression that the large number of entries at Skagway, Alaska is related to the border crossings at Blaine, Washington, likely through a train connection between Alaska and the continental USA.

Mapping truck crossing levels (above), crossings are made going east and past the small city of Calexico. Calexico East is seen having a road connection between the two boundaries facing a single direction, suggesting little interaction is intended along the way.

When mapping pedestrian crossings (above), these are much more popular in Calexico, an area which is likely dense enough to support the operation of the airport shown in its region, and which displays an interweaving network of roads associated with everyday usage.

Overall, this is where the interactive mapping applies. The borders and their entry points have relations largely influenced by geography. The total pedestrian or personal vehicle crossings do well to describe how attractive the region may be on one side rather than another. Where these locations become attractive, and even the underlying causes for a crossing being selected, can be discovered in a map that is interactive for the user, looking at the grounds which the user chooses.

While this theme data layered on top highlights the topic, the base map can help explain the reasons behind it, and both are better understood when interactive. It isn’t necessary to answer one particular thought here as a static map may do, but instead to help address a number of speculative thoughts, enabling your exploration.

Past 5 Years of Toronto Robberies

By: Niraginy Theivendram

Geovisualization Project, SA 8905, Fall 2020


Over the past years, Toronto has experienced an increase in robbery and assault. Robbery is the act of stealing from a person using violence or threats of violence. According to the Toronto Police Services, in the past 5 years there have been over 20,000 police-reported robberies in the city. Toronto Police provides various datasets online to the public with various types of crime information across the City of Toronto. This dataset can be used to visualize and analyze the distribution of Toronto crime. There have been many types of crime that Toronto has experienced over the years; however, this interactive dashboard will look at the different types of robberies in Toronto. With Tableau’s interactive time series map, you will be able to visualize the distribution of Toronto robberies over a span of 5 years.

The following dashboard was produced using Tableau Public, an interactive data visualization and analytics tool. I created a time series map visualizing the different types of robberies that Toronto has experienced over a 5-year span. In addition to the map, there are 3 visuals produced. The pie chart visualizes the percent of total offence type. The other 2 charts allow you to visualize the distribution of each type of offence over a 1-year period based on the count as well as the number of offences per neighbourhood.


The data used for this dashboard was acquired from Toronto Police Services and was downloaded as a shapefile. Toronto Police provides a variety of data types for all types of crime. However, this specific dashboard uses the Robbery 2014 to 2019 dataset. This data was a point file consisting of information about the type of offence, the date it occurred, the date it was reported, and the neighbourhood it happened in.

The following information will go through the steps in producing this dashboard in Tableau. The overall dashboard can be viewed on Tableau Public here.


Importing Data

Before getting started on the visuals, we will first need to import the data we are working with. Tableau works with a wide range of data types. Since I will be using a shapefile, we can import this data as a ‘Spatial file’ in the Connect section. This file will then open up in the Data Source tab where you will be able to sort and edit your data. The Sheet tab can then be used to individually create maps and charts which will then be rearranged into a dashboard using the Dashboard tab.

Creating the Time Series Map

First, we will be creating a time series map from 2014 to 2019, showing the number of robberies in Toronto as a dot density map. Go into the Sheet tab to create the first map. This dataset provides longitude and latitude coordinates for each robbery represented by a point. To create a dot density map, we will drag the ‘Longitude’ field into the Columns tab and the ‘Latitude’ field into the Rows tab. Right-click on the Longitude and Latitude fields and make sure they are set as a Dimension in order to produce this dot density map.

To make this a time series map, we will drag the field ‘Reported Year’ into the Pages card. This will produce a time slider which enables you to view the dot density map at any chosen reported year.

The time slider will allow you to view the map in a loop by adjusting the speed of the animation. This could be controlled by any user just by using the features on the legend.

Finally, in the upper-right Show Me tab, select the symbol map icon to produce your base map.

The Marks card provides you with control over how the data is displayed in the view. The options on the card allow you to change the level of detail as well as the appearance. For this map, we would like to display the Offence type, Neighbourhood, Reported Date, and Division of each Robbery point on the map. Make sure these fields are dragged into the Marks card as a Detail, so that they don’t affect the headers built by the fields on the Columns and Rows. The attributes that appear when you hover over one or more marks in the view can be controlled in the Tooltips tab. You can modify which fields to include in the tooltip and how to display them.

Creating Graphics and Visuals

Next, we will create a graph displaying the number of robberies by offence type for each month over the entire time series.

To produce this graph, drag and drop the ‘Reported Month’ field into the Columns tab and the ‘Offence’ field into the Rows tab. Make sure both fields are set as a Dimension.

Since this will also be a part of the time series, drag and drop the ‘Reported Year’ field into the Pages card.

Next, we add the ‘Offence’ field into the Marks card to quantify how many robberies are attributed to each type of offence. Since we want the number of offences, right-click on the field and under Measure, click Count. This will display the number of offences and will also enable you to make the symbol proportional to the number of offences by adding the field as a Size. As mentioned before, the attributes shown when a user hovers over a feature can be edited under the tooltip.
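Behind the scenes, this sheet is a count of records grouped by year (the Pages card), month (Columns) and offence type (Rows). For readers who want to check the numbers outside Tableau, here is a minimal Python sketch of that grouping. The offence labels and records are placeholders, not values from the Toronto Police dataset.

```python
from collections import Counter, defaultdict

# (reported year, reported month, offence type) with placeholder values
records = [
    (2014, 1, "Type A"),
    (2014, 1, "Type A"),
    (2014, 2, "Type B"),
    (2015, 1, "Type A"),
]


def frames_by_year(records):
    """Build one 'page' per reported year, each holding the count of
    robberies for every (month, offence type) combination."""
    frames = defaultdict(Counter)
    for year, month, offence in records:
        frames[year][(month, offence)] += 1
    return frames


frames = frames_by_year(records)
for year in sorted(frames):
    print(year, dict(frames[year]))
```

Each year's counter corresponds to one position of the time slider, and each count corresponds to the size of one symbol in the chart.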

Next, we will create the pie chart. This will display the percent of each offence type based on the total count.

Since this is also part of the time series, we will add the ‘Reported Year’ field in the Pages card. Next, to represent the count of offences as a pie chart, we will add the ‘Offence’ field as a count into the Marks card. Change the type to Angle or click ‘Pie Chart’ in the Show Me tab to create a pie chart.

We also want the percentage of the number of offences, so right-click on the field and go to ‘Quick Table Calculation’, where you will be able to select the Percent of Total calculation. This will then display the percent of each offence when you hover over the pie chart. Add another ‘Offence’ field to the Marks card to control the colour scheme of the pie chart.
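The Percent of Total calculation divides each category's count by the sum of all counts. A small Python sketch of the same arithmetic, using placeholder offence labels and counts:

```python
def percent_of_total(counts):
    """Convert a mapping of category -> count into category -> percent."""
    total = sum(counts.values())
    if total == 0:
        return {key: 0.0 for key in counts}
    return {key: 100 * value / total for key, value in counts.items()}


# Placeholder counts for one reported year
offence_counts = {"Type A": 3, "Type B": 1}
print(percent_of_total(offence_counts))
```

Because the Pages card filters by year, the total in the denominator is the total for the year currently shown, not for the whole 5-year span.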

Next, we will create the chart displaying the number of offences per neighbourhood. This will allow the users to get an understanding of which neighbourhoods experience a high number of robberies.

Similar to the previous visuals, drag and drop the ‘Reported Year’ field into the Pages card to be included into the time series. In the Show Me tab, select horizontal bars and then drag the ‘Neighbourhood’ field into the Marks card as a Detail. Since we want to look at the count of offences per neighbourhood, add the ‘Offence’ field into the Marks card as a Size. This will allow the squares representing the neighbourhoods in the chart to be proportional to the number of robberies that were reported in that location. To control the colours of the square, add another ‘Offence’ field (count) as Colour.

Creating the Dashboard

Now in the Dashboard tab, all the sheets that were created of the map and charts can now be added onto the dashboard using the toolbar on the left by simply dragging each individual sheet into the dashboard pane. This toolbar can also be used to change the size of the dashboard. You can then save the created dashboard which will be published to an online public portal.

Limitations and Future Works

This dashboard is produced using police-reported data, which provides only one particular view of the nature and extent of robbery. One of the major factors that can influence the police-reported crime rate is the willingness of the public to report a crime to the police. It is not certain that every robbery that has happened in Toronto has been reported. Criminologists have long noted that many crimes never come to the attention of the police, for example because the incident was not considered important enough.

Being a time series dashboard, examining the distribution of robberies over a larger time period dating back to the late 1900s or early 2000s would further our understanding of the distribution of robberies. The 5-year time series doesn’t show much of a difference in the patterns that were determined. However, Toronto Police only provides data for the last 5 years, dating from 2014-2019, making it impossible to look at a larger time period.

To expand further on this project in the future, it would be interesting to look at potential factors relevant to robbery and assault. Given that this is a quantitative analysis, it cannot take into account all potential factors of relevance to crime, due to the limited availability of data and the challenges in quantifying them. This model is limited in that it cannot consider the importance of many socio-demographic changes in Canadian society that are not available in a statistical time series. For the future of this project, exploring the statistical relationship between crime patterns and demographic and economic changes would allow us to draw better-founded conclusions about Toronto crime patterns today.

Renewable Energy Installations in The Greater Toronto Area

By: Athithja Arunagiri

Geo-Visualization Project @RyersonGeo, SA8905, Fall 2020

Project Link: Click here


Renewable energy is energy derived from natural processes that are replenished at a rate that is equal to or faster than the rate at which they are consumed. There are various forms of renewable energy, derived directly or indirectly from the sun, or from the heat that is generated deep within the earth. They include energy generated from wind, solar, hydropower and ocean resources, geothermal, solid biomass, biogas and liquid biofuels. Over time, a wide range of energy-producing technologies and equipment has been developed to take advantage of these natural resources. Consequently, the utilizable energy can be produced in many forms. These forms include industrial heat, electricity, thermal energy for space and water conditioning, and transportation fuels.

Canada has an abundance of renewable resources that can be used to produce energy due to its large landmass and diversified geography. Canada is a world leader in the production and use of energy from renewable resources. This project focuses on renewable energy installations in the Greater Toronto Area (GTA), Canada. There are 58 renewable energy installations in the GTA. Renewable energy resources currently provide 0.6% of GTA’s total renewable energy supply. Deep Lake Water Cooling, Geothermal, Solar Air Heating, Solar Hot Water, Solar Photovoltaic (PV) and Wind Turbine are the forms of renewable energy used in the GTA. Solar photovoltaic is the most important form of renewable energy used in the GTA. Solar hot water also contributes to the GTA’s renewable energy mix. Recently, more wind and solar photovoltaic energy is being used within the GTA.

Project Description

My geo-visualization project includes one interactive map and two graphs:

Fig. 1: Screenshot of my geo-visualization project.

My map illustrates renewable energy locations in the GTA. This map is a proportional symbol map where the size of the circles depends on the size of the installation. Depending on the year and type chosen, the size varies. Users can view the results for different years between 1986 and 2014. Users can select years to see how many installations were installed within that year, and select the type of installation to see how many installations of that type are within the GTA. The bar graph compares the type of installation and its size. The bars are stacked by the years each renewable energy was installed. The pie chart looks at the % of total counts of system owners. 79.31% of the system owners are generated.


Tableau is a data visualization software used to see and understand data. For my data visualization project, I used Tableau Public to create my dashboard. I chose to use Tableau because it has built-in visualizing and interacting tools and provides limitless data exploration. It allows you to import many different types of files such as shapefiles, text files and excel files to the maps.

Data & Methods

The data used for this project was downloaded from the Toronto Open Data Portal. I used the Renewable Energy Installations shapefile (Click here) for my map. This data consists of point data of the renewable energy locations in the GTA. It displays data from 1863 to 2014. The attributes for this data include Building Name, Location, Type, Year Installed, Size (ekW), etc. This data was imported into Tableau’s data source as a “Spatial file”.

Fig. 2: Adding a connection in Tableau

For my map, I added the “Geometry” field into the “Marks” card in Sheet. This added the generated Longitude field to the “Columns” tab and the generated Latitude field to the “Rows” tab. The background map was set to a dark theme and in the upper-right “Show Me” tab, the map icon can be selected to generate the base map.

Fig. 3 & 4: Geometry added into “Marks” to produce the points for the map.

From the Tables column, I added multiple fields to the sheet. The System Owners, Geometry, Building Name, Type, Year Installed, and Size fields were added to the “Marks” card. The Type field was set to Colour and the Sum Size field was set to Size. Then under the “Marks” card, I set the mark type to Circle to allow the Size field to be symbolized using a proportional symbol. A year filter was added to the map. Users can use the slider to look at the installations by year.

Fig. 5, 6 & 7: The settings behind the interactive map.
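As background on proportional symbols in general: a common cartographic convention is to scale the circle's area, not its radius, with the data value, which means the radius grows with the square root of the value. The Python sketch below shows that convention only; it is not a description of how Tableau sizes its marks internally, and the function name and the 30-pixel maximum are assumptions.

```python
import math


def symbol_radius(value, max_value, max_radius=30.0):
    """Radius for a proportional circle whose AREA scales linearly with
    the value; the largest value receives max_radius."""
    if value <= 0 or max_value <= 0:
        return 0.0
    return max_radius * math.sqrt(value / max_value)


# An installation a quarter the size of the largest one gets a circle
# with half the radius (and therefore a quarter of the area).
print(symbol_radius(25, 100), symbol_radius(100, 100))
```

Scaling the radius directly would exaggerate large installations, since doubling the radius quadruples the visible area.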

Next, I opened a second sheet. This was used for the bar graph. For the bar graph, under the “Marks” card, I set it to Bar. I used Type for the “Rows” tab and Size for the “Columns” tab. Under the “Marks” card, the Year Installed field was set to Colour, this produced a stacked bar graph.  

Fig. 8: The x-y axis for the bar graph.
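The stacked bar graph is equivalent to summing installation size by type and then by year installed: each type is one bar, and each year is one coloured segment within it. A Python sketch of that aggregation, using invented sizes (the rows are not from the Open Data file):

```python
from collections import defaultdict

# (type, year installed, size in ekW) with made-up values
installs = [
    ("Solar Photovoltaic", 2010, 10.0),
    ("Solar Photovoltaic", 2012, 5.0),
    ("Solar Photovoltaic", 2010, 2.5),
    ("Wind Turbine", 2003, 750.0),
]


def stack_by_type_and_year(installs):
    """Sum installation size per type (one bar) and per year installed
    (one coloured segment within the bar)."""
    stacks = defaultdict(lambda: defaultdict(float))
    for kind, year, size in installs:
        stacks[kind][year] += size
    return {kind: dict(years) for kind, years in stacks.items()}


stacks = stack_by_type_and_year(installs)
bar_lengths = {kind: sum(years.values()) for kind, years in stacks.items()}
print(stacks)
print(bar_lengths)
```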

Next, I opened the third sheet. This was used for the pie chart. For the pie chart, under the “Marks” card, I set it to Pie. I added the System Owners into the “Marks” card. For the System Owners, I right clicked on that field, and the “Measure” was set to “Count”. Next, I right clicked on that field again, and set the “Quick Table Calculation” to “Percent of Total”.  This computed the percentage for each System Owners count. 

Fig. 9 & 10: The settings behind getting % total count for the System Owners.

Finally, a new “dashboard sheet” was added and the 3 sheets were dragged into it. The legend for the map had “floating” items. This was done by right-clicking on the legend item in the dashboard and from the layout column on the left side, floating was clicked. The bar graph and pie chart were also floating items. They were placed at the bottom of the interactive map with their respective legends.

Limitations and Future Works 

One of my main limitations for this project was getting data. Initially, I planned to create a Canada-wide map. There are over 600 renewable energy locations across Canada. However, I was not able to find a Canada-wide dataset of renewable energy installations. This restriction made me change my focus to the GTA. As a result, my map only focuses on 58 installations. The downloaded file was a shapefile. Moreover, the downloaded data was incomplete. It included City divisions but not all Agencies or Corporations. When I imported the data into Tableau, it had a lot of null columns; it was missing ward names, etc. On the Open Data Portal, the same data was also available in .xlsx format, and this format had more fields within it (such as ward names, etc.). However, when I tried using that in Tableau, it was missing the geometry field, and as a result it did not display any data on the map. Additionally, this data was in months, so I was not able to connect this table and the shapefile table in Tableau.

Another limitation is with the proportional symbols for the size of the installations. The EditSize feature for the proportional symbols was very limited in its edit options. It does not allow you to select the number of divisions you want for your data. If Tableau enabled this feature, it would help users customize their symbols as they want.

To expand on this project, it would be more beneficial to add additional information/context on Renewable Energy Installations. For the future of this project, if I had information on the units and energy used for each installation, I would have been able to look at how efficient each installation is. This would state the measuring cost and the benefits of energy transitions. Moreover, using location data such as urban vs. rural would add more information to the base map. It would allow users to understand and see where these installations are located visually. As a result, with additional information and a complete dataset, this geo-visualization project can be expanded and improved. 

COVID-19 in Toronto: A Tale of Two Age Groups

By Meira Greenbaum

Geovis Project Assignment @RyersonGeo, SA8905, Fall 2020

Story Map Link


The COVID-19 pandemic has affected every age group in Toronto, but not equally (breakdown here). As of November 2020, the 20-29 age group accounts for nearly 20% of cases, which is the highest proportion compared to the other groups. The 70+ age group accounts for 15.4% of all cases. During the first wave, seniors were affected the most, as there were outbreaks in long-term care homes across the city. By the end of summer and early fall, a second wave appeared certain, and it was clear that an increasing number of cases were attributed to younger people, specifically those 20-29 years old. Data from after October 6th was not available at the time this project began, but since then Toronto has seen another outbreak in long-term care homes and an increasing number of cases each week. This story map will investigate the spatial distribution and patterns of COVID-19 cases in the city’s neighbourhoods using ArcGIS Pro and Tableau. Based on the findings, specific neighbourhoods with high rates can be analyzed further.

Why these age groups?

Although other age groups have seen spikes during the pandemic, the trends of those cases have been more even. Both the 20-29 and 70+ groups have seen significant increases and decreases between February and November. Seniors are more likely to develop extreme symptoms from COVID-19, which is why it is important to focus on identifying neighbourhoods with higher rates of seniors. 20-29 is an important age group to track because increases within that group are more unique to the second wave and there is a clear cluster of neighbourhoods with high rates.

Data and Methods

The COVID-19 data for Toronto was provided by the Geo-Health Research Group. Each sheet within the Excel file contained a different age group and the number of cases each neighbourhood had per week from January to early October. The format of the data had to be arranged differently for Tableau and ArcGIS Pro. I was able to table join the original Excel sheet with the columns I needed (rates during the week of April 14th and October 6th for the specific age groups) to a Toronto neighbourhood shapefile in Pro and map the rates. The maps were then exported as individual web layers to ArcGIS Online, where the pop-ups were formatted. After this was done, the maps were added to the Story Map. This was a simple process because I was still working within the ArcGIS suite, so the maps could be transported from Pro to Online seamlessly.

For animations with a time and date component, Tableau requires the data to be vertical (i.e. had to be transposed). This is an example of what the transformation looks like (not the actual values):
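In code terms, the reshaping from one column per week to one row per neighbourhood and week could be sketched as follows. This is a Python illustration with made-up values; the column names follow those mentioned in this post, while the function name is hypothetical. The sketch also appends the T00:00:00Z time placeholder that the post describes adding beside the date.

```python
def wide_to_long(rows, id_field="Neighbourhood"):
    """Turn one row per neighbourhood with one column per week into
    one row per (neighbourhood, week) pair."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key == id_field:
                continue  # every other column is a week
            long_rows.append({
                "Neighbourhood": row[id_field],
                "Date": key + "T00:00:00Z",  # time placeholder
                "TotalRated": value,
            })
    return long_rows


# Made-up rates, not the actual values
wide = [
    {"Neighbourhood": "Malvern", "2020-04-14": 1.5, "2020-10-06": 4.0},
    {"Neighbourhood": "Annex", "2020-04-14": 2.0, "2020-10-06": 3.0},
]
long_rows = wide_to_long(wide)
print(long_rows[0])
```

Two neighbourhoods with two weekly columns each become four rows, one per neighbourhood and week.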

A time placeholder was added beside the date (T00:00:00Z) and the excel file was imported into Tableau. The TotalRated variable was numeric, and put in the “Columns” section. Neighbourhoods was a string column and dragged to the “Colour” and “Label” boxes so the names of each neighbourhood would show while playing the animation. The row column was more complicated because it required the calculated field as follows:

TotalRatedRanking is the new calculation name. This produced a new numeric variable which was placed in the “Rows” box. 

If TotalRatedRanking is right-clicked, various options will pop up. To ensure the animation was formatted correctly, the “Discrete” option had to be chosen, as well as “Compute Using > Neighbourhoods.” The data looked like the screenshot below, with an option to play the animation in the bottom right corner. This process was repeated for the other two animations.
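The calculated field itself is not reproduced in this post, so the following is only a sketch of what a ranking computed across neighbourhoods could look like. It assumes the ranking orders neighbourhoods by TotalRated within each date, highest first, and it also appends the T00:00:00Z time placeholder mentioned earlier. The sample rows are invented.

```python
from collections import defaultdict

# Hypothetical long-format rows (values are made up for illustration).
rows = [
    {"Neighbourhoods": "A", "Date": "2020-04-14", "TotalRated": 12.5},
    {"Neighbourhoods": "B", "Date": "2020-04-14", "TotalRated": 30.0},
    {"Neighbourhoods": "C", "Date": "2020-04-14", "TotalRated": 8.0},
    {"Neighbourhoods": "A", "Date": "2020-10-06", "TotalRated": 40.0},
    {"Neighbourhoods": "B", "Date": "2020-10-06", "TotalRated": 22.0},
    {"Neighbourhoods": "C", "Date": "2020-10-06", "TotalRated": 25.0},
]


def add_time_placeholder(date_text):
    """Append a midnight UTC placeholder so the value reads as a date-time."""
    return date_text + "T00:00:00Z"


def rank_within_date(records):
    """Rank neighbourhoods by TotalRated inside each date (1 = highest)."""
    by_date = defaultdict(list)
    for record in records:
        by_date[record["Date"]].append(record)

    ranked = []
    for date_text, group in by_date.items():
        ordered = sorted(group, key=lambda r: r["TotalRated"], reverse=True)
        # Assumption: ties simply keep their sorted order and get distinct ranks.
        for position, record in enumerate(ordered, start=1):
            ranked.append(
                {
                    **record,
                    "Date": add_time_placeholder(date_text),
                    "TotalRatedRanking": position,
                }
            )
    return ranked


ranked = rank_within_date(rows)
for record in ranked:
    print(record["Date"], record["Neighbourhoods"], record["TotalRatedRanking"])
```

Because the rank is recomputed for every date, a neighbourhood can move up or down between frames, which is what makes the animated chart readable.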

Unfortunately, this workbook could not be imported directly into Tableau Public (where there would be a link to embed in the Story Map) because I was using the full version of Tableau. To work around this issue, I had to re-create the visualization in Tableau Public, which does not support animation, and then add the animation separately once the workbook was uploaded to my Tableau Public account. These animations had to be embedded into the Story Map, which does have an “Embed” option for external links. To do this, the “Share” button on Tableau Public had to be clicked and a link appeared. But when embedded in the Story Map, the animation is not shown because the link is not formatted correctly. To fix this, the link had to be altered manually (a quick Google search helped me solve it):

Limitations and Future Work

Creating an animation showing the rate of cases over time in each neighbourhood (for whichever age group or other category in the Excel spreadsheet) may have been beneficial. An animation in ArcGIS Pro would have been a nice addition, but there was not enough time to learn how ArcGIS animation works, and this is an avenue that could be explored further. The compromise was to focus on certain age groups, although patterns between the start (April) and end (October) points are less obvious. It would also be interesting to explore other variables in the spreadsheet, such as community spread and hospitalizations per neighbourhood. I tried using a powerful data visualization tool developed by Uber to create an animation from January to October for all cases, and this worked for the most part (video at the end of the Story Map). The neighbourhoods were represented as dots (not polygons), which is not very intuitive for the viewer because the shape of the neighbourhood cannot be seen. Polygons can be imported into that tool, but only as GeoJSON, and I am unfamiliar with that file format.

Where to Grow?

Assessing urban agriculture potentiality in the City of Edmonton

By Yichun Du
Geovisualization Project, @RyersonGeo, SA8905, Fall 2020


North America is one of the most urbanized areas in the world. According to the United Nations, about 82% of its population lives in cities today. This brings various issues, one of which is the sustainability of the food supply. Currently, the foods we consume are usually shipped domestically and internationally, and only a few of them are supplied locally. This is neither sustainable nor environmentally friendly, as a large amount of energy is burned through the logistics of the food supply chain. To address this issue, many cities are introducing urban agriculture and encouraging citizens to take part.

Urban agriculture is usually summarized as food production activities occurring in urban areas (Colasanti et al. 2012). Zezza & Tasciotti (2010) identified that urban farming and related sectors employ around 200 million workers and provide food products to more than 800 million residents. Urban agriculture can also benefit a city economically, socially, ecologically, and in terms of citizen health. In addition, implementing urban agriculture can help a city reinvent itself in an attractive, sustainable, and liveable way.

The City of Edmonton is North America’s northernmost major city (at about 53.5° N), with a population of 0.932 million (2016 Census). However, apart from some very general municipal strategies (a general vision for food and urban agriculture in The Way We Grow MDP, and a more detailed document, fresh: Edmonton’s Food & Urban Agriculture Strategy, introduced in 2012), there has been very little study of urban agriculture suitability for the City of Edmonton, which gives the incentive to develop this study. The Geo-Visualization of the outcome can inform Edmontonians where to sow the seeds, and also tell them that the place they live has great potential for growing food.


Edmonton is located on the Canadian Prairie, which means very minor topographic changes within the city but very cold and snowy winters. This limits food production to the snow-free seasons, but provides flat ground for agriculture. To conduct an assessment of urban agriculture potentiality for the city, I focused on two themes: ground and rooftop.

The ground part assesses the potential for food production that takes place directly on the ground, including the backyard of a house. The general concept is to use existing land uses that support urban agriculture activities while staying far from pollution and avoiding negative externalities. That is to say, current agricultural zoning, parkland, and vacant lots are favoured. On top of that, soil nutrient level is taken into consideration. The constraint is close proximity to sources of pollution. Meanwhile, urban agriculture activities can themselves introduce contaminants, such as water pollutants, into water bodies, so the activity should be kept at a distance from water bodies.

The rooftop part assesses places such as the rooftop of a large building, the balconies of suites in a condo building, or any other place on an artificial structure. The goal of implementing urban agriculture on rooftops is to encourage people to participate and to focus on proximity to the market, which here is people’s dining tables. However, pollution from the surrounding environment should be avoided.

The project presents scores at the neighbourhood level for both the Ground and Rooftop themes to show the potentiality of urban agriculture in the City of Edmonton.

Data Source

Based on the general concepts mentioned above, the following data were chosen for the analysis. The majority of the data were retrieved from the City of Edmonton’s Open Data Portal. The waterbody shapefile was obtained from the boundary files of Statistics Canada’s 2016 Census.


The preliminary part of this project was done in ArcMap. The visualization part was then carried out in Tableau.

The general methodology can be summarized in the following workflows. The first workflow below is for the Ground potentiality. The majority of the work was done in ArcMap. After that, the final results were brought into Tableau for Geo-Visualization together with the Rooftop part. The blue boxes represent the original shapefiles. The yellow boxes are the data criteria. The pink boxes show the secondary shapefiles that act as constraints, and the green boxes show the potential areas or the final results. Both the pink and green boxes are generated through the white boxes (geoprocessing steps). The final results were normalized and an average score was assigned: the total score in each neighbourhood was normalized by the neighbourhood’s total area.

Workflow for the Ground theme.
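The area normalization can be read as an area-weighted average. The sketch below assumes each neighbourhood is made up of scored pieces of land, each with an area and a score; the numbers are hypothetical and the function name is only illustrative, not part of the ArcMap workflow.

```python
def area_weighted_score(pieces):
    """Average score of a neighbourhood, weighting each piece by its area.

    pieces: list of (area, score) tuples for the land inside one neighbourhood.
    """
    total_area = sum(area for area, _ in pieces)
    if total_area == 0:
        return 0.0  # assumption: an empty neighbourhood scores zero
    weighted_total = sum(area * score for area, score in pieces)
    return weighted_total / total_area


# Hypothetical neighbourhood: 2 km2 scoring 3.0, 1 km2 scoring 0.0, 1 km2 scoring 1.0.
example_pieces = [(2.0, 3.0), (1.0, 0.0), (1.0, 1.0)]
example_score = area_weighted_score(example_pieces)
print(example_score)
```

Dividing by total area keeps large neighbourhoods from scoring higher simply because they contain more land.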

The second part is the Rooftop potentiality. It follows a process similar to the Ground part to obtain the results.

Workflow for the Rooftop theme.

A table of the weighting scheme for all the selected criteria is shown below. Constraints are assigned negative values, while potentials are assigned positive values. Stronger constraints or potentials receive heavier weights.

Weight Assignment Scheme.
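To make the sign convention concrete, here is a minimal sketch of a weighted score in which potentials add to the total and constraints subtract from it. The criterion names and weights are hypothetical placeholders, not the values from the project's weight table.

```python
# Hypothetical weights, NOT the values used in the project.
EXAMPLE_WEIGHTS = {
    "agricultural_zoning": 1.0,     # potential (positive)
    "parkland": 0.8,                # potential (positive)
    "near_pollution_source": -1.0,  # constraint (negative)
    "near_waterbody": -0.5,         # constraint (negative)
}


def criteria_score(flags, weights=EXAMPLE_WEIGHTS):
    """Sum the weights of every criterion that applies to a piece of land."""
    return sum(weights[name] for name, applies in flags.items() if applies)


good_site = {
    "agricultural_zoning": True,
    "parkland": True,
    "near_pollution_source": False,
    "near_waterbody": False,
}
polluted_site = {
    "agricultural_zoning": True,
    "parkland": False,
    "near_pollution_source": True,
    "near_waterbody": False,
}
print(criteria_score(good_site), criteria_score(polluted_site))
```

In this toy setup a site that is well zoned but sits next to a pollution source ends up with no net potential.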


The larger the number, the higher the potential for urban agriculture. The Ground theme has a maximum score of 3.8, while the Rooftop theme has a maximum score of 4 in this analysis.

The scatter plot below suggests that the majority of the neighbourhoods in Edmonton have potential for urban agriculture. For the Ground theme, only a few of the industrial zones have a score of 0. All types of neighbourhoods are spread across the score classes. However, the River Valley System tends to be associated with medium to high scores. For the Rooftop theme, more than half of the neighbourhoods have medium to high scores (>2). Nearly all the mature neighbourhoods have scores higher than 3. Only a few transportation and developing land uses have scores of 0.

Scatterplot for scores of Ground and Rooftop potentiality at neighbourhood level.

The next screenshot is the final output from the Tableau Dashboard. The audience can click on any element that represents a neighbourhood for an isolated view of that specific neighbourhood. For example, if you click a neighbourhood on the Ground map, the same neighbourhood is highlighted in the Rooftop map, and the point representing that neighbourhood in the scatterplot is zoomed in, with its score in each of the two themes. Conversely, the audience can select a point in the scatterplot, and the two maps zoom in on that neighbourhood. The audience can also explore neighbourhood typology and the scores associated with each typology by selecting a typology in the legend. All the neighbourhoods belonging to that typology are then displayed in the three views.

Tableau provides an interactive visualization of the urban agriculture potentiality in Edmonton at the neighbourhood level. Please click here to view the project.

Dashboard view from Tableau for the final output.

For example, I clicked the Oliver neighbourhood for its Ground score (1.270); the associated Rooftop score (2.010) and the detailed location of Oliver are then shown in the Rooftop view. The scatterplot for both scores is also provided below, with the neighbourhood typology of Central Core.

Example of selecting Oliver neighbourhood.


There are some limitations regarding this project’s data sources and methodology. With access to updated soil nutrition, solar radiation, and precipitation data related to the Ground theme, I would have a better assessment model and a more reliable result for Ground potentiality. An inventory of physical surface details would also help to determine where the impermeable surfaces are. Similarly, a comprehensive dataset of rooftop types, including the slope of the roofs and the individual use of each building, would help to eliminate unsuitable roofs. Moreover, a detailed zoning shapefile with potential land-use modifications for community gardens or backyard gardens would benefit future applications of this project. As for the methodology, the major concern is the weight assignments. Opinions from local experts or authorities could help the model fit the local context. Public consultation or surveys could also bring the general public into the project, forming a bottom-up approach to transforming Edmonton into an urban agriculture-friendly place. For the future development of this Geo-Visualization project, I would like to see more input data sources, as well as participation from the general public and local authorities.

To sum up, this assessment of urban agriculture potentiality in the City of Edmonton assigns every neighbourhood a score for Ground and Rooftop potentiality. These scores give Edmontonians a sense of where to sow the seeds on the ground, and which neighbourhoods are in the best locations for urban agriculture.


Colasanti KJA, Hamm MW, Litjens CM. 2012. The City as an “Agricultural Powerhouse”? Perspectives on Expanding Urban Agriculture from Detroit, Michigan. Urban Geography. 33(3):348–369. doi:10.2747/0272-3638.33.3.348

Zezza A, Tasciotti L. 2010. Urban agriculture, poverty, and food security: Empirical evidence from a sample of developing countries. Food Policy. 35(4):265–273. doi:10.1016/j.foodpol.2010.04.007

United States Presidential Election Results: 1976-2016 in Tableau

By: Vincent Cuevas

Geovisualization Project Assignment, SA8905, Fall 2020

Project link can be found here.


The United States presidential elections occur every four years and much attention is placed on the polarization of US politics based on voting for either of the major political parties, the Democratic Party and the Republican Party. This project aims to use visualization to show the results across many different elections over time to view how the American public is voting for these two parties.

Methodology and Data

Tableau was used for the data visualization due to its ability to integrate multiple data sheets and recognize spatial data to instantly create maps. It is also able to quickly generate different types of visualizations in cartographic maps, bar charts, line graphs, etc.

Data was collected from the Presidency Project website at the University of California, Santa Barbara. The repository contains data from elections as far back as 1789. This visualization goes back to 1976 and shows results up to 2016. Other data sources were considered for this visualization, namely MIT’s Election Lab dataset from 1976-2016. However, this dataset contained results for up to 66 different parties that votes were cast for from 1976 to 2016. Incorporating this level of detail would have produced inconsistent data fields across the different election years. Other political parties are omitted from this project due to the inconsistency of party entrants by year and the fact that Democrats and Republicans take up the vast majority of the national vote. The Presidency Project data was used as it provided simpler views of Democrat-Republican results.

Data Retrieval

The downside to using UCSB’s Presidency Project data is that it is not available as a clean data file!

The data was collected from each individual data page into an Excel sheet. One small piece of data that was collected elsewhere was the national voter turnout data, which was taken from the United States Election Project website.

Voting Margin Choropleth Map

Once the data was formatted, only two sheets needed to be imported into Tableau. The first was the state-level results, and the second was the national-level results. The relationship between the two is held together by a join on the state fields.

Tableau has a nice feature in that it instantly converts recognizable data fields into spatial data. In this case, the state field generates latitude and longitude points for each state. Drag the auto-generated Longitude field into Columns and the Latitude field into Rows, and then drag state under Marks to get this.

For one of the main sheets, a choropleth map shows the voting margin between the Democratic Party and the Republican Party. Polygon shapes are needed, which can be done by going to the drop-down menu in Marks and selecting Map. Next, the sheet needs to distinguish states won by the Democrats from states won by the Republicans. A variable ‘PartyWin’ was created for this and dragged under Marks, and colours were changed to represent each party.

The final step requires creating ranges based on the data. Ranges cannot be created manually and require either some programming logic and/or the use of bins. Bins were created by right-clicking the variable ‘VictoryMargin (%)’. The size of each bin is essentially a pre-determined interval (20 was chosen). VictoryMargin (%) was dragged under Marks in order to get a red/blue separation on top of the colours from PartyWin. The colours were edited under VictoryMargin (%) to get appropriate lighter and darker hues for each party. The specific bins were also labelled based on 20-point intervals.
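The same logic can be sketched outside Tableau. The functions below are an illustration under assumptions, not the Tableau calculation itself: the winner is whichever party has the larger vote share, the margin is the absolute difference in percentage points, and each margin is assigned to a fixed-width bin of 20 points. The vote shares passed in are placeholders.

```python
import math


def party_win(dem_pct, rep_pct):
    """Label the state by whichever party has the larger vote share."""
    if dem_pct == rep_pct:
        return "Tie"
    return "Democrat" if dem_pct > rep_pct else "Republican"


def victory_margin(dem_pct, rep_pct):
    """Margin of victory in percentage points."""
    return abs(dem_pct - rep_pct)


def margin_bin(margin, size=20):
    """Lower edge of the fixed-width bin that contains the margin."""
    return int(math.floor(margin / size) * size)


def bin_label(margin, size=20):
    """Readable label such as '0-20' for the bin that contains the margin."""
    low = margin_bin(margin, size)
    return f"{low}-{low + size}"


# Placeholder vote shares for a close state and a lopsided one.
close_state = (48.0, 49.2)
lopsided_state = (90.0, 4.0)
for dem, rep in (close_state, lopsided_state):
    margin = victory_margin(dem, rep)
    print(party_win(dem, rep), round(margin, 1), bin_label(margin))
```

Combining the party label with the bin gives the two-hue scheme: party decides red or blue, and the bin decides how dark the shade is.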

The screenshot shows that you can hover over the states and retrieve information on PartyWin, the percentage of Democrat and Republican votes that year, as well as the Victory Margin. The top-left corner also has Year in the Pages area, which allows for a time-series view with one page per Year.

Vote Size Dot Symbol Map

While the margin of victory in each state illustrates how strongly the state voted Democrat or Republican, the total number of Democrat and Republican votes is not equal across states because voting populations differ. Florida, for example, had 9,420,039 total votes cast and a 1.2% victory margin for the Republicans in 2016. Contrast that with the District of Columbia in the same year, which had 311,268 total votes but an 86.8% victory margin for the Democrats. For the next map, dot symbols are used to show the vote size (based on the variable Total State Votes) for each state.

The same longitude- and latitude-generated map from the choropleth is used here, only this time the dots and the surrounding OpenStreetMap basemap are kept intact. As with the choropleth map, PartyWin is used to differentiate between Republican and Democrat states. The Total State Votes variable is dragged into the Size area under Marks to create different dot sizes based on those numbers. Bins were created once again, this time with an interval break of 2.5 million votes per state. Ideally, there would be customized breaks, as many states, such as the District of Columbia, fall into the lower end of total votes. Once the labelled bins were edited, additional information for State, Total Democrat Votes and Total Republican Votes was added to the Tooltip.

Screenshot of Dot Symbol map based on Number of State Votes in Tableau Worksheet view
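The trade-off between fixed-width bins and customized breaks can be shown with the two totals quoted above. In this sketch the 2.5 million interval comes from the text, while the custom break values are hypothetical and only illustrate how finer classes at the low end would separate smaller states.

```python
import bisect


def fixed_bin(total_votes, size=2_500_000):
    """Lower edge of the 2.5-million-wide bin that contains the total."""
    return (total_votes // size) * size


# Hypothetical custom breaks with finer classes for smaller states.
CUSTOM_BREAKS = [500_000, 1_000_000, 2_500_000, 5_000_000]


def custom_class(total_votes, breaks=CUSTOM_BREAKS):
    """Index of the custom class (0 = smallest) that contains the total."""
    return bisect.bisect_right(breaks, total_votes)


# 2016 totals quoted in the text above.
totals = {"Florida": 9_420_039, "District of Columbia": 311_268}
for state, votes in totals.items():
    print(state, fixed_bin(votes), custom_class(votes))
```

With fixed bins, every state under 2.5 million votes lands in the same class, whereas the custom breaks split that group into three.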

Electoral College Seats Bar

American politics has the phrase “270 To Win”, based on a candidate needing 270 electoral seats (as of 2020) to win the presidency. As recently as 2016, the Democratic candidate Hillary Clinton won the popular vote over the Republican candidate Donald Trump. However, Trump won the majority of electoral seats, and therefore the presidency, by winning the vote in states that together held a greater number of seats.

A bar showing the number of electoral seats won can highlight the difference from the popular vote, and show that a greater margin of victory in a state matters less than winning a greater number of seats. To create this bar, the same setup is used, with PartyWin and State underneath Marks. This time, a SUM value of the number of seats is dragged to Columns. The drop-down list under Marks is then changed to Bar.
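The aggregation behind that bar is a sum of seats grouped by the winning party, compared against the 270 threshold. The rows and the "Seats" field name below are hypothetical stand-ins for the state-level sheet.

```python
from collections import Counter

# Hypothetical state rows; names and seat counts are placeholders.
state_rows = [
    {"State": "State A", "PartyWin": "Democrat", "Seats": 55},
    {"State": "State B", "PartyWin": "Republican", "Seats": 38},
    {"State": "State C", "PartyWin": "Republican", "Seats": 29},
    {"State": "State D", "PartyWin": "Democrat", "Seats": 3},
]


def seats_by_party(rows):
    """Total electoral seats won by each party."""
    totals = Counter()
    for row in rows:
        totals[row["PartyWin"]] += row["Seats"]
    return dict(totals)


def reaches_majority(seats, needed=270):
    """True when a seat total meets the 270-seat threshold."""
    return seats >= needed


party_totals = seats_by_party(state_rows)
print(party_totals)
print({party: reaches_majority(seats) for party, seats in party_totals.items()})
```

Note that the margin of victory never enters the calculation: a state contributes the same number of seats whether it was won by 1 point or 80.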

Dashboard and Nationwide Data Points

Since this data goes into a dashboard, there is a need to think about how these visualizations complement each other. The maps provide data at the level of individual states. The dynamic bar also shows the results of each state, though it is better at informing the viewer of the number of seats won by each party and how many more seats the winner took. The dynamic bar needs some context though, specifically the total number of seats won nationwide. This led to placing the maps at the middle and bottom of the dashboard and moving the electoral college bar to the top, alongside some key indicators for the overall election results.

The key data points included were the party names, party candidates, percentage of popular vote, total number of party votes, total number of electoral seats, as well as an indicator of whether the Democratic or Republican Party won. Secondary stats were included for the Other Party Vote (%), Total Number of Votes Casted, and Voter Turnout (%). Individual worksheets were created for each stat and imported into the dashboard. Space was also used to include Alaska and Hawaii. While the main maps are dynamic in Tableau and allow for panning, having an initial view of these states limits the need for the user to find them. All of the imported data had ‘Year’ dragged into the Pages area of the worksheet, allowing for a time-series view of all of the data points.

You can see what the time series from 1976 to 2016 looks like in a gif animation via this Google Drive link.


When looking at the results starting from 1976, an interesting point is that many Southern states that are Republican in 2016 were Democratic (in large part because the Democratic candidate, Jimmy Carter, was a former governor of Georgia). 1980 to 1984 was the Ronald Reagan era, when the former California governor was immensely popular throughout the country. Bill Clinton’s wins in 1992 and 1996 followed in Carter’s footsteps, with the Arkansas governor able to win seats in typically Republican states. Starting with George W. Bush’s presidential win in 2000, voting trends have stayed very similar, with Republican states in the Midwest and Southern regions, while Democrats take the votes in the Northeast and on the Pacific Coast. Many states around the Great Lakes, such as Wisconsin, Michigan and Pennsylvania, have traditionally been known as “swing states”, with Donald Trump winning many of those states in 2016. When it comes to number of votes by state, two states with larger populations (California, New York) have typically been Democratic in recent years, leading to a large number of total votes for Democrats. However, the importance of total votes is small compared to the number of electoral seats gained.

Future Considerations and Limitations

With the Democrats taking back many of those swing states in the most recent election, inputting the 2020 election data would highlight where Democrats were successful in 2020 vs. in 2016. Another consideration would be to add the results since 1854, when the Republican Party was first formed as the major opposition to the Democratic Party.

Two data limitations within Tableau are the use of percentages, and the lack of projections. Tableau can show data in percentages, but only as a default if it is part of a Row % or Column % total. The data file was structured in a way where this was not possible, meaning that whole numbers were used with (%) labelled wherever necessary. Tableau also is not able to project in a geographic coordinate system without necessary conversions. For the purposes of this map, the default Web Mercator layout was used. One previous iteration of this map was also done as a cartogram hex map. However, a hex map may be better in a static map as the sizing and zooming is much more forgiving when using the default basemap.

Health Care Access in the City of Toronto

By: Shabnam Sepehri

Geo-Visualization Project: SA8905, Fall 2020

Project Link: Final Map

Final Product: An Interactive Map


There are many factors that contribute to an individual’s access to health care. Statistics Canada has defined the ‘Social determinants of health and health inequalities’ as the 12 major factors that affect access. First on the list is income and social status, and near the bottom at number 11 is race and ethnicity. For this project, I was curious to see how these two variables are distributed across the census tracts in the City of Toronto, and whether there are any overlaps with the locations of healthcare institutions. The software of choice is CARTO, a Software as a Service (SaaS) cloud computing platform that enables the visualization and analysis of geographic data.

Data Acquisition

  • CHASS Data Center: used to collect census data by census tract (2016): Total population, total visible minority population, total aboriginal identity population, and median total income;
  • Statistics Canada: used to obtain census tract boundary files;
  • City of Toronto Open Data: Address Point files;
  • Geospatial Map & Data Centre: used to collect physicians data – Enhanced Points of Interest 3.1 (City of Toronto)
  • ArcMap (digitize): used to digitize hospital locations in the City of Toronto


After the data was acquired, the following steps were taken in ArcMap to organize the data before importing it into Carto. First, the census variables were joined to the census boundary shapefile. Then, a new column was created to calculate the sum of visible minority and aboriginal identity population density per 1,000 people.
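The new column can be read as a rate per 1,000 residents. The sketch below assumes that interpretation (combined visible minority and Aboriginal identity population divided by total population, times 1,000); the counts are made up and the function name is only illustrative.

```python
def combined_rate_per_1000(visible_minority, aboriginal_identity, total_population):
    """Combined population per 1,000 residents of a census tract."""
    if total_population <= 0:
        return None  # assumption: tracts with no population get no rate
    return (visible_minority + aboriginal_identity) / total_population * 1000


# Hypothetical census tract with 5,000 residents.
example_rate = combined_rate_per_1000(1200, 50, 5000)
print(example_rate)
```

Expressing the value per 1,000 people makes tracts with very different populations comparable on one choropleth scale.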

Next, the hospital locations were digitized using the ‘Editor Toolbar’. Following that, the physicians’ locations were geocoded using the address repository acquired from City of Toronto Open Data. Lastly, the non-spatial data (e.g. total median income) were joined to the spatial data (census tract boundaries) to enable the layer visualization. After all the necessary formatting was done, the data was uploaded onto Carto.

Once on Carto, I realized that the software allows the user to carry out different spatial functions such as geocoding. It also allows you to edit your dataset using SQL queries. This function is really useful in facilitating the data editing process and helps to reduce the back and forth between different mapping software packages.

CARTO: dataset dashboard

Carto allows you to import a total of four layers into your map. The hospital locations, physician offices, and census tracts were added as the four layers, with the census tract layer uploaded twice to show the two different census variables. The census variables were visualized as choropleth maps, and the health institutions were visualized as points on top of the choropleth layers.


The interactive aspect of this map is mainly the user’s ability to switch between layers and toggle each map component separately. Moreover, the ‘pop-up’ option was used for the hospital points to show the name and address of each location. Similarly, pop-ups were created for the choropleth maps to show the median income and population density of each individual census tract. The widget feature was used to create histograms that showcase the distribution of the two census variables among the census tracts. This feature allows the user to select tracts in different categories and zoom into those specific tracts on the map. For instance, someone may want to look at tracts with the highest median income and an average aboriginal and visible minority population density. Lastly, the choropleth layers are turned off by default and may be switched on as per the user’s interest.


The map shows that census tracts where the median income is relatively high tend to have a low aboriginal and visible minority population density. The distribution of hospitals appears to be uniform throughout the city, with a few more concentrated in the downtown core. Conversely, the physician offices appear to be more concentrated in tracts with higher income or close to the downtown core. That being said, this does not mean that higher income groups have better access than lower income groups. However, the map does identify areas where there is a low number of physician offices, and most often these areas tend to be classified as having a low to medium income. There are of course other variables that must be considered when identifying access; however, due to the limit on the number of layers, including them was not feasible for this project. Overall, this map can be used to identify ideal locations for future health facilities and to identify groups that have limited access to these resources.

Limitations & Future Work

Initially, I wanted to include more variables in the map; the goal was to map median income, visible minority & Aboriginal population density, education attainment, and employment conditions. However, Carto only allows for the addition of four variables. This limited the diversity of the visualized variables. Ideally, exploring other geo-visualization software such as Tableau, ArcGIS Online, or the Esri Dashboard would aid in creating a more nuanced map.

Ideally, I would also want to map the change of these variables over time. For instance, to show whether the distribution of median income, and visible minority & Aboriginal population density per census tract has always been the same or if there are slight changes in pattern. It would be interesting to capture which census tracts had experienced better access due to changes in health determinants over time.

Finding Your Ideal Toronto Neighbourhood

By Marian Mendoza
Geovisualization Project, @RyersonGeo, SA8905, Fall 2020

Project link: here. Best viewed in full screen.


Toronto is a diverse, exciting city with plenty to offer. When moving to a new city or finding a new neighbourhood to live in, it can be challenging to decide on what neighbourhood is best for you. This tool helps users filter their desired criteria based on a number of selected features.

Full screen view of the web app


Data for this project is collected from the following sources:

  • City of Toronto Open Data Portal for neighbourhood and green space shapefiles
  • for Walk, transit, and bike scores from
  • Metrolinx for subway stations (filtered from shapefile of Regional Transit Network).
  • Google Places API for museum/gallery, and mall (major shopping centres and district malls)


1. Collecting Google Places API data.
Querying the Google Places API returns a maximum of 60 results (as per Google’s Terms and Conditions). This query returns a list of results with extra features in a .json format. The desired results, specifically the name, point location, and address, are then reformatted into a usable .csv to be used in ArcGIS.
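As an illustration of the reformatting step, the sketch below parses a saved response and keeps only the name, address, and point location. It assumes the response layout of the legacy Places web service as I understand it (a `results` list whose items carry `name`, `formatted_address` or `vicinity`, and `geometry.location`); the sample response is invented, and no API call is made.

```python
import csv
import io
import json

# Invented sample shaped like a saved Places search response.
SAMPLE_RESPONSE = """
{
  "results": [
    {
      "name": "Example Gallery",
      "formatted_address": "123 Example St, Toronto, ON",
      "geometry": {"location": {"lat": 43.65, "lng": -79.38}},
      "rating": 4.5
    },
    {
      "name": "Example Mall",
      "vicinity": "456 Sample Ave, Toronto",
      "geometry": {"location": {"lat": 43.7, "lng": -79.4}},
      "types": ["shopping_mall"]
    }
  ],
  "status": "OK"
}
"""

FIELDS = ["name", "address", "lat", "lng"]


def places_to_rows(response_text):
    """Keep only name, address and coordinates from each result."""
    payload = json.loads(response_text)
    rows = []
    for place in payload.get("results", []):
        location = place.get("geometry", {}).get("location", {})
        rows.append(
            {
                "name": place.get("name", ""),
                # Fall back to 'vicinity' when no formatted address is present.
                "address": place.get("formatted_address", place.get("vicinity", "")),
                "lat": location.get("lat"),
                "lng": location.get("lng"),
            }
        )
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


place_rows = places_to_rows(SAMPLE_RESPONSE)
csv_text = rows_to_csv(place_rows)
print(csv_text)
```

The resulting table has one row per place with latitude and longitude columns, which is the form a GIS package can plot as points.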

2. Data cleaning
All files were cleaned to retain only relevant fields for this geovisualization. Some neighbourhood names had changed since the data was published, so this was cross referenced with the City of Toronto’s current neighbourhood names. Further, Google Places API returned some misclassified or duplicated results that had to be removed from the list.

3. Map preparation on ArcGIS Pro.
– Walk/bike/transit scores were joined to the neighbourhood shapefile using the common area name.
– All other features were counted in each neighbourhood using “Summarize Within.”
– For green_space, this returned the total area of green space in a neighbourhood. This was then computed as a percentage of the total area, producing the proportion of green space for each neighbourhood. Next, the proportions were classified into 6 quantiles of “green levels” in the city.
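The green space calculation above can be sketched in two small steps: a percentage, then a quantile classification. This is a simplified stand-in for the classification done in ArcGIS Pro; it assigns classes by rank so that each class holds roughly the same number of neighbourhoods, and the sample percentages are hypothetical.

```python
def green_percent(green_area, total_area):
    """Green space as a percentage of a neighbourhood's total area."""
    return 100.0 * green_area / total_area


def quantile_classes(values, classes=6):
    """Assign each value a class from 1 to `classes` with equal counts per class.

    Values are ranked from lowest to highest; ties keep their input order,
    which may differ from how a GIS package breaks ties.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank * classes // len(values) + 1
    return result


# Hypothetical green space percentages for twelve neighbourhoods.
sample_percentages = [5, 50, 10, 45, 15, 40, 20, 35, 25, 30, 1, 60]
green_levels = quantile_classes(sample_percentages)
print(green_percent(2.5, 10.0))
print(green_levels)
```

With twelve values and six classes, each "green level" contains exactly two neighbourhoods, from level 1 (least green) to level 6 (most green).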

4. Creating Widgets on ArcGIS Online Web App Builder + Using the Web App

All layers were uploaded to ArcGIS online and used as the web map for the web app builder. Several widgets were created to enable a user-interactive experience. Users are welcomed by a splash screen that explains how to use the app’s key functions.

FILTER. Since all features were joined to the neighbourhood layer, I created a filter widget that allows users to input values for any of the features. Most of these queries are set to “at least” since people who want higher values would be more selective. People who are not as particular about a feature can keep it at a low setting.

Setting the filter criteria

QUERIES. Several pre-set queries allow the user to see neighbourhoods with the top features in each category. The user can also engage further with the map layers by toggling on/off additional relevant features. For example, by querying “top transit scores” the user can then turn on the SUBWAY STATION layer and see where stations are in the city. They can also turn on “transit score” and see a choropleth map of transit scores across the city. This enables a richer understanding of the results of the queries.

Query for neighbourhoods with top transit scores, in descending order
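The logic behind the FILTER and QUERIES widgets can be expressed in a few lines. This sketch is not how Web AppBuilder implements them; it only mirrors the behaviour described above, with "at least" thresholds for filtering and a descending sort for the top results. Field names and values are hypothetical.

```python
# Hypothetical neighbourhood attributes.
neighbourhoods = [
    {"name": "Neighbourhood A", "walk_score": 92, "transit_score": 88, "green_pct": 6.0},
    {"name": "Neighbourhood B", "walk_score": 71, "transit_score": 90, "green_pct": 18.5},
    {"name": "Neighbourhood C", "walk_score": 55, "transit_score": 60, "green_pct": 30.2},
]


def filter_at_least(records, minimums):
    """Keep records that meet every 'at least' threshold."""
    return [
        record
        for record in records
        if all(record[field] >= minimum for field, minimum in minimums.items())
    ]


def top_by(records, field, count=10):
    """Return the highest-scoring records for one field, in descending order."""
    return sorted(records, key=lambda record: record[field], reverse=True)[:count]


matches = filter_at_least(neighbourhoods, {"walk_score": 70, "transit_score": 85})
top_transit = top_by(neighbourhoods, "transit_score", count=2)
print([record["name"] for record in matches])
print([record["name"] for record in top_transit])
```

Leaving a field out of the thresholds, or setting it low, has the same effect as a user who is not particular about that feature.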

CHARTS. The charts widget enables users to see a graphical representation of some of the features (scores and green space) and compare neighbourhoods.

Users have the option to use a spatial filter to limit the chart display to only some neighbourhoods. In this case, I zoomed into southwestern Toronto. The result is a bar graph where you can quickly compare the values for neighbourhoods in the set extent.

Chart comparing green space % in southwestern Toronto

Additional features in the web app:

SEARCH. Users can search for a place or an address and engage with any of the layers to see the features of that point’s neighbourhood.  

TRANSPARENCY. All layers have modifiable transparency. Users can layer several features and choropleth maps and identify neighbourhoods that may be most ideal based on the polygon’s saturation. For example, layering both walk score and transit score would show the darkest areas to be the most walkable and transit friendly.

ATTRIBUTE TABLE. All layers’ attribute tables are viewable for a user who would be interested in seeing the full details of each layer and use functions like “sort descending.” This is accessible in the “more details” menu for each layer, as well as the black tab in the bottom centre of the screen.


Originally, real estate data (such as rental prices) were to be included, but open data from the Canada Mortgage and Housing Corporation were not clean and complete for each Toronto neighbourhood. Additionally, restaurants, cafés, schools, and libraries were to be included, as these are attractive neighbourhood features. However, due to the Google Places API’s restriction of 60 requests, I decided to use smaller datasets. Alternatively, I could have used the Nominatim API to make more search requests against OpenStreetMap, but given time constraints I kept the scope of the project small.

There are limitations in using the widgets of ArcGIS Web AppBuilder. I would have liked the queries to display the “top results” with the colours assigned from the choropleth source layer. Instead, all results of a query are displayed with the same symbology. There is no option to group layers on ArcGIS Online; ideally, there would be groups for “feature layers” (subway station, mall, green space) and “classified layers” (walk score, green space %) that would help the user navigate the layers more simply.

Further development

Halfway through completing this project, I learned that Toronto Life magazine had created a similar tool. However, their tool scores each neighbourhood on an index and is interactive, using sliders to input the user’s ranking. The Toronto Life tool does not allow the user to see the details behind the index score for each neighbourhood, such as the types and locations of shopping experiences in the neighbourhood.

The Toronto Life tool gave me ideas on how to improve my tool. With more time and experience, I would create the app using JavaScript to avoid the limitations of ArcGIS Online’s widget functionalities. I would expand on the filter function and allow the user to weight each feature, then return the neighbourhoods ranked from best to worst based on those criteria. Further, additional neighbourhood qualities not included in this project, such as housing affordability, building types, restaurants, and job opportunities, are complex datasets that would improve the comprehensiveness of this tool. I would use open data from OpenStreetMap to include features with more than 60 records. I would also improve the complexity of each feature’s relationship with other features (such as weighting a feature’s attractiveness to a neighbourhood by assessing its walking/transit/driving distance).
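
As a rough illustration of the weighting idea, the sketch below ranks neighbourhoods by a weighted average of min-max normalized features. It is written in Python for brevity rather than JavaScript, it was not part of the project, and the field names, values and normalization choice are all assumptions.

```python
# Hypothetical feature values for three neighbourhoods.
records = [
    {"name": "A", "walk_score": 90, "transit_score": 60},
    {"name": "B", "walk_score": 60, "transit_score": 90},
    {"name": "C", "walk_score": 50, "transit_score": 50},
]


def rank_neighbourhoods(rows, weights):
    """Return (name, score) pairs sorted from best to worst.

    Each feature is rescaled to 0..1 across all rows (min-max normalization),
    then combined as a weighted average using the user's weights.
    """
    lows = {f: min(row[f] for row in rows) for f in weights}
    highs = {f: max(row[f] for row in rows) for f in weights}
    total_weight = sum(weights.values())

    def score(row):
        total = 0.0
        for field, weight in weights.items():
            spread = highs[field] - lows[field]
            # A feature with no variation cannot separate neighbourhoods.
            normalized = (row[field] - lows[field]) / spread if spread else 0.0
            total += weight * normalized
        return total / total_weight

    return sorted(((row["name"], score(row)) for row in rows),
                  key=lambda pair: pair[1], reverse=True)


walker = rank_neighbourhoods(records, {"walk_score": 2, "transit_score": 1})
commuter = rank_neighbourhoods(records, {"walk_score": 1, "transit_score": 3})
print(walker)
print(commuter)
```

A user who weights walkability more heavily sees neighbourhood A first, while a user who weights transit more heavily sees B first, which is the behaviour the sliders in such a tool would control.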

With more resources and access to clean and complete datasets, this tool can be expanded for broader use. Casual users can benefit from this tool, but it can also be used more precisely to complement real estate or housing research.

Visualizing Atlantic Tropical Storm Activity

by Christopher Rudolph

Fig. 1 – Hurricane Florence as recorded by NASA

Tropical storms are a category of weather events that create wind and rainfall conditions of varying intensity. These conditions can be highly destructive depending on intensity, and the storms are classified from tropical depression at the weakest to hurricane at the most intense. They occur between 5 and 20 degrees latitude when low atmospheric pressure systems cross warm ocean surface waters. Depending on conditions, winds can develop from as low as 23 mph to over 157 mph. When these storms meet land, they often cause property damage and threaten lives through flooding and wind force before dissipating.

The most dangerous of these storms are classified as hurricanes, which are characterized by exceedingly high wind speeds. Hurricanes are known across the south-eastern United States for the devastating effects they can have when they reach land. Examples include 2005’s Hurricane Katrina, with over $125 billion in damage and over 1,800 deaths, and 2012’s Hurricane Sandy, with $70 billion in damage and 233 deaths. Because of this, the study and prediction of tropical storm development have remained continually relevant.

Why track tropical storms?

Many of the processes surrounding hurricane development, such as ocean and atmospheric circulation, are poorly understood. To better understand these events, efforts have been made to form detailed histories of past tropical storm conditions. The National Oceanic and Atmospheric Administration (NOAA) has created detailed records of tropical storms as far back as the mid-1800s.

The atmosphere and ocean are two of the largest carbon and thermal sinks on Earth. With anthropogenic climate change altering the conditions of these two bodies, there is concern that tropical storm development will change with them, potentially intensifying these destructive events. A search for periods analogous to forecasted future conditions has emerged in an attempt to predict how tropical storm conditions may change. Paleotempestology is a scientific field that seeks to extend tropical storm records beyond the reach of modern monitoring technology using geological proxies and historical documentary records.

This visualization represents the frequency of tropical storm activity in the Atlantic as a heat map. Kernel density values are assigned based on proximity to tropical storm paths: the higher the value, the more tropical storm activity has passed near that location. Kernel density is visualized in 10-year windows, helping to show how storm activity changes over time and how frequently these storms may impact coastal communities.


Fig. 2 – Visualization of tropical storm activity density in the west Atlantic.

Data and Platform

For this project, tropical storm data comes from the International Best Track Archive for Climate Stewardship (IBTrACS), a tropical cyclone best track data collection published by NOAA.

ArcGIS was selected as the platform for the visualization. The software was familiar and effective for the project’s geoprocessing, and looked promising for the visualization product. ArcGIS features robust geoprocessing tools for creating the visualization, and has an animation feature that can produce video output and implement overlay features such as a timeline and text. As the project developed, however, the animation tool was abandoned in favor of Windows Video Editor for the video, as discussed later.


With data available in shapefile form, importing NOAA’s data into ArcGIS was simple. The data as first displayed is overwhelming, with over 120,000 records drawn as travel paths. Performance is low and there is little to no context for what is being viewed.

Fig. 3 – A map of all tropical storm tracks recorded

Using density geoprocessing and filtering the data by time range, this will be transformed into something interpretable.


Time was the first filter implemented. Time was enabled in the properties of the layer. Each row has corresponding time fields; in this case, year was used. Enabling time added an adjustable slider to the top right of the map area, which could be used to narrow down the range.

Fig. 4 – Layer Time properties and the resulting time range filter

While the slider is handy on the fly, more precise time filtering is available within the Map Time tab, which provides precision controls.

Creating the density view

Heatmaps are typically created with the heatmap symbology option, which produces effective density views that work with time-enabled filtering. For this visualization, that option was not available because it is incompatible with the line data type used. To create a density map, geoprocessing had to be done using the density toolset, from which the kernel density tool was selected. This tool uses a bivariate kernel function to form a weight range surrounding each point. These ranges are then summed to form cell density values for each raster grid point, resulting in a heatmap.
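
The summing of kernel weights into raster cells can be sketched as below. This is a simplified stand-in and not the ArcGIS tool itself: it works on points (for example, vertices sampled along storm tracks) rather than lines, it assumes planar coordinates, and it uses a quartic kernel as one common choice of bivariate kernel. The point coordinates, grid size and search radius are made up.

```python
import math


def quartic_kernel(distance, radius):
    """Quartic kernel weight: highest at the point, falling to 0 at the search radius."""
    if distance >= radius:
        return 0.0
    u = (distance / radius) ** 2
    return 3.0 / (math.pi * radius ** 2) * (1.0 - u) ** 2


def kernel_density(points, x_min, y_min, cell_size, n_cols, n_rows, radius):
    """Return a grid (list of rows) holding the summed kernel weights per cell centre."""
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        centre_y = y_min + (row + 0.5) * cell_size
        for col in range(n_cols):
            centre_x = x_min + (col + 0.5) * cell_size
            grid[row][col] = sum(
                quartic_kernel(math.hypot(centre_x - px, centre_y - py), radius)
                for px, py in points
            )
    return grid


# Hypothetical track vertices in projected (planar) coordinates.
track_points = [(1.0, 1.0), (1.2, 0.9), (3.5, 3.5)]
density = kernel_density(track_points, x_min=0.0, y_min=0.0,
                         cell_size=1.0, n_cols=4, n_rows=4, radius=1.5)

for row in reversed(density):  # print with the northernmost row first
    print(" ".join(f"{value:.3f}" for value in row))
```

Cells near the two clustered points receive contributions from both, cells near the isolated point receive one, and cells farther than the search radius from every point stay at zero.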

This approach carried some implementation issues, however. During geoprocessing, the tool does not take into account or assign any time data to the output. This meant that the processed layer could not be effectively filtered for the visualization. To work around this, the data was broken into layers by desired year range and then processed, creating a layer for each time window. These layers could still be used to make keyframes and scenes for the animation, though this solution would add some housekeeping in displaying details such as the time and legend within the video.
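
The workaround of splitting the records into one group per time window before running the density tool can be sketched like this. The `year` field name and the sample records are hypothetical, and the 10-year window follows the interval described earlier; in the project itself this step was done with layers in ArcGIS.

```python
from collections import defaultdict

# Hypothetical track records; only the year matters for the split.
track_records = [
    {"storm": "storm-1", "year": 1851},
    {"storm": "storm-2", "year": 1859},
    {"storm": "storm-3", "year": 1860},
    {"storm": "storm-4", "year": 2005},
]


def split_by_window(records, window_years=10):
    """Group records into consecutive windows keyed by (first_year, last_year)."""
    windows = defaultdict(list)
    for record in records:
        first_year = record["year"] - record["year"] % window_years
        windows[(first_year, first_year + window_years - 1)].append(record)
    return dict(sorted(windows.items()))


windows = split_by_window(track_records)
for (first_year, last_year), group in windows.items():
    # Each group would become its own layer and its own density raster.
    print(f"{first_year}-{last_year}: {len(group)} record(s)")
```

Keying each group by its year range also gives a label that can be reused for the title or legend of the corresponding frame.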


As mentioned earlier, the ArcGIS animation tools were planned for use as the delivery format. Working with the results generated so far proved problematic, however. The animation tool is focused on applications involving changes of view and time. Given the needs and constraints of the solutions taken for this project, neither of these would be an active component of this visualization, and both would complicate the creation of the animation. Issues with preview playback, overlays and exports complicated this further. Given the relatively simple needs, a different approach using other software was selected.

In researching this topic, much forum discussion was found surrounding similar projects. The consensus seemed to be that for a visualization using static views such as this, exporting to an external mainstream video-processing platform would be most effective. To do this, each time view would need to be refined and exported as an image through a layout. These layouts would then be arranged into a video with Windows Video Editor.

Elements such as the legend, title and attribution that had been causing issues under the animation tool were added to a layout. They automatically updated the relevant information as layers were swapped within the layout view. Each layer in turn was exported as a layout image representing its year range. Once these images were created, they were imported into Windows Video Editor, where they were composed into a timeline. Each layout was given a period of 3 seconds before transitioning to the next layout. The video was then exported in 1080p and published to YouTube. Once hosted on YouTube, it can be easily embedded into a site, as above, or shared via link.

Fig. 5 – Video editing in Windows Video Editor

Future Work

There are different factors and semi-regular phenomena that have impacts on tropical storm development. El Niño and the Pacific Decadal Oscillation are two such recurring events. Relating the timeline of these events, as well as ocean surface temperature, to the storm record could help interpret trends within this visualization. Developing a methodology behind the time ranges displayed could also have enhanced this visualization. For example, breaking this visualization into phases of the El Niño-Southern Oscillation rather than even time windows might have added considerable value.