Building digitization using Artificial Intelligence – an Open Source approach

By Nikita Markevich

Geovisualization Project Assignment, SA8905, Fall 2021

INTRO

With advances in automation and machine learning, a new approach to raw data acquisition has become available. QGIS is a popular open-source GIS application that supports custom plugins for all sorts of geoprocessing. One such plugin is Mapflow, developed by the Russia-based company GEOAlert. Mapflow is an easy-to-use plugin for extracting ground features such as buildings, roads, construction zones, and forest canopies from satellite imagery. This blog introduces how to use Mapflow in a browser environment. To learn how to use the plugin, please refer to the Esri Story Maps tutorial: https://storymaps.arcgis.com/stories/dfd88d7170c74f33a4dd5f7583cdc414

The difference between using Mapflow in the browser and through the plugin is that the browser only allows detection from web-based satellite services such as Mapbox, or from custom imagery supplied through a URL, while the plugin can process custom satellite imagery straight from the user’s device. The major advantage of the browser approach is that the processing runs on remote servers, which is faster than the plugin process.

Mapflow website project page

USER INTERFACE

The Mapflow online service is free to try: opening an account grants 500 free credits. Each process costs credits based on the size of the area that the user wishes to process. If the user runs out of credits, the balance can be topped up from the top right corner of the page at a price of 100 CAD per 1000 credits.

Let’s explore the project page. The project is organized in steps where the user chooses the data source, the type of AI model to run, and the post-processing operations for additional data gathering. The AI models available in the browser mirror those available in the QGIS plugin. They can digitize buildings, high-density housing, forests, roads, construction, and agricultural fields.

User Interface of the Data source tab in Mapflow. Mapbox API is used to display geographic data.

In the data source tab, a user can either use the embedded draw tool to choose the area for processing or upload polygon data in GeoJSON format. The draw rectangle tool is intuitive to use, and as soon as a rectangle is drawn, the website reports the area’s size in square kilometers. The website uses this number to determine how many credits are required to process the area. The larger the area, the more credits it costs to process.
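For readers who prefer to upload an area rather than draw it, the following is a minimal sketch of how a rectangular area of interest could be written as GeoJSON and how its size could be estimated beforehand. The function names and the sample bounding box are hypothetical and chosen only for illustration. The area is computed on a spherical Earth, so it may differ slightly from the number Mapflow reports.

```python
import json
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def bbox_to_geojson(west, south, east, north):
    """Return a GeoJSON FeatureCollection holding one rectangular polygon."""
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],  # a GeoJSON ring must end on its first point
    ]
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        ],
    }


def bbox_area_km2(west, south, east, north):
    """Area of a longitude/latitude rectangle on a sphere, in square km."""
    lon_span = math.radians(east - west)
    lat_term = math.sin(math.radians(north)) - math.sin(math.radians(south))
    return EARTH_RADIUS_KM ** 2 * lon_span * lat_term


if __name__ == "__main__":
    # Hypothetical bounding box (degrees), not the exact one drawn in this post
    box = (-78.80, 21.81, -78.73, 21.87)
    print(json.dumps(bbox_to_geojson(*box), indent=2))
    print(f"Approximate area: {bbox_area_km2(*box):.2f} square km")
```

The printed GeoJSON can be saved to a file and uploaded in the data source tab in place of a hand-drawn rectangle.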

DATA

The area of interest for this example is the same one used in the plugin tutorial in Esri Story Maps: the city of Ciego de Ávila in Cuba. The rectangle drawn over the city and its closest suburbs covers an estimated 45.31 square kilometers. The area first came to my attention during a research project for the company I work for, which explored the possibility of building fiber service in the Caribbean region. While searching for building and road data in open sources such as OpenStreetMap, I realized that some Caribbean countries, and especially Cuba, are missing the geographic data required to create a fiber map model. After exploring several options, the Mapflow plugin proved to be the most useful for generating geodata from freely available commercial satellite imagery.

Selected area of the city of Ciego de Ávila, chosen with the draw rectangle tool on the data source page of Mapflow

PROCESS

The chosen area is now loaded into our project. The next steps are to choose the model and the post-processing options. We will choose the buildings model to test the speed of the browser process and compare it with the plugin process. The big perk of the browser tool is its post-processing options. One such option is automatic polygon simplification, which simplifies the results of the model. In the plugin version, the model output some building polygons as broken or fuzzy shapes, which would mean additional work cleaning up polygons manually. The browser tool offers the simplification option for free.

Project window of Mapflow right before the beginning of the process.

The area of interest costs 227 credits to process, which works out to roughly 5 credits per square kilometer, or about 500 credits for every 100 square kilometers.
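The arithmetic behind that estimate can be sketched as follows. The rate is inferred from the single run described in this post (227 credits for 45.31 square kilometers), and the sketch assumes that the price scales linearly with area, which Mapflow’s actual pricing may not do. The function names are hypothetical.

```python
# Figures reported in this post
AREA_KM2 = 45.31
CREDITS_CHARGED = 227
FREE_CREDITS = 500
CAD_PER_1000_CREDITS = 100

# Assumption: cost is proportional to area (inferred from one run only)
CREDITS_PER_KM2 = CREDITS_CHARGED / AREA_KM2


def estimate_credits(area_km2):
    """Estimated credits needed to process an area, rounded to whole credits."""
    return round(area_km2 * CREDITS_PER_KM2)


def estimate_top_up_cad(area_km2, balance=FREE_CREDITS):
    """Estimated top-up cost in CAD once the current balance is used up."""
    shortfall = max(0, estimate_credits(area_km2) - balance)
    return shortfall * CAD_PER_1000_CREDITS / 1000


if __name__ == "__main__":
    for area in (45.31, 100, 200):
        print(area, estimate_credits(area), estimate_top_up_cad(area))
```

Under these assumptions, the 500 free credits cover an area of roughly 100 square kilometers before a top-up is needed.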

Once the Run processing button is pressed, all that remains is to wait for the process to finish and download the processed data. The process finished in 32 minutes, which is 15 minutes faster than the plugin process, which took 47 minutes.

After the process is finished, the user can view the results in the browser and download the file in GeoJSON format.

Data results in the browser window

The process assigns each shape an ID number as well as a shape type, such as rectangle, grid snap, or L-shape. This information can help with further post-processing and with fixing any automation mistakes.
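As an illustration of how these attributes could support post-processing, here is a minimal sketch that counts the downloaded shapes by type. The property names "id" and "shape_type" and the sample features are assumptions made for this example. The real attribute names should be checked in the downloaded file.

```python
import json
from collections import Counter

# Hypothetical sample standing in for the downloaded GeoJSON file.
# The property names "id" and "shape_type" are assumptions.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {"id": 1, "shape_type": "rectangle"},
     "geometry": {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]]}},
    {"type": "Feature", "properties": {"id": 2, "shape_type": "l-shape"},
     "geometry": {"type": "Polygon", "coordinates": [[[2, 0], [3, 0], [3, 1], [2, 1], [2, 0]]]}},
    {"type": "Feature", "properties": {"id": 3, "shape_type": "rectangle"},
     "geometry": {"type": "Polygon", "coordinates": [[[4, 0], [5, 0], [5, 1], [4, 1], [4, 0]]]}}
  ]
}
"""


def count_shape_types(feature_collection, key="shape_type"):
    """Count features per shape type; features without the key count as 'unknown'."""
    return Counter(
        (feature.get("properties") or {}).get(key, "unknown")
        for feature in feature_collection["features"]
    )


if __name__ == "__main__":
    data = json.loads(SAMPLE)
    for shape_type, count in count_shape_types(data).most_common():
        print(shape_type, count)
```

To run this on real results, the sample string would be replaced with the contents of the downloaded file, for example by opening it and passing the file object to `json.load`.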

LIMITATIONS AND FUTURE WORK

The most important limitation of this tool is its cost. However, if the user decides to process an area larger than 100 square kilometers, it is possible to create multiple accounts and use the free credits each time. Secondly, the processed results sometimes contain questionable shapes. Some polygons merge multiple buildings into one, others capture buildings only partially, and in other cases the orientation of the polygons is off. These problems can be fixed through manual post-processing by GIS professionals.

In the future, this tool could potentially be used to populate the OpenStreetMap dataset with building polygons and road data. Open-source data is very important for many GIS users, and AI automation is a strong companion that makes the work of GIS enthusiasts much easier by streamlining the most tedious processes in geographic analysis.