As a geospatial professional in 2020, it’s nearly impossible to avoid being exposed to the ideas of AI, machine learning, and deep learning. Projects utilizing these technologies have been at the forefront of articles, panels, and presentations — often displayed proudly on stage during conferences and other events (virtual or otherwise). What is interesting to me is that, despite their recent popularity, there are still many misconceptions about what these words really mean, and how they can be applied within a geospatial context.
“You mean they aren’t just all synonyms for the same thing?”
— G. O. Spatial, a purely hypothetical professional
Let’s help G. O. out a bit, and go over the differences between the terms here, so that they can be prepared to dip their own toes into the GeoAI waters. Whether they want to actively do data science themselves, or simply have a better understanding of what others are doing, a fundamental knowledge of these terms is a good place to start.
AI, or artificial intelligence, is the field of science concerned with creating hardware and software that can determine their own solutions to problems. Historically, only humans (and certain other animals!) have been able to leverage this type of creative problem-solving. Computers have always done exactly what they are instructed to do and nothing more, which is comforting in some ways but also places a limit on future potential.
Machine learning is a specialized subset of AI that focuses on giving the machine the ability to learn from data rather than follow explicitly programmed instructions. Dozens of different algorithms exist that analyze training data in order to build a model specialized for the problem you are trying to solve. These algorithms can be great for analyzing structured data, but what happens if you need to analyze data that is unstructured?
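Before moving on to that question, here is a minimal sketch of what "learning from training data" can look like in code. It uses a nearest-centroid classifier, one of the simplest possible algorithms, written in plain Python. The reflectance values, the labels, and the function names `train_centroids` and `predict` are all made up for illustration; a real project would more likely reach for one of the libraries discussed later.

```python
from collections import defaultdict
from math import dist


def train_centroids(samples):
    """Learn one centroid (mean feature vector) per label from labeled samples."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: tuple(total / counts[label] for total in totals)
        for label, totals in sums.items()
    }


def predict(centroids, features):
    """Return the label whose learned centroid is closest to the features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical training data: (near-infrared, red) reflectance pairs with labels.
training = [
    ((0.05, 0.04), "water"),
    ((0.03, 0.06), "water"),
    ((0.50, 0.08), "vegetation"),
    ((0.45, 0.10), "vegetation"),
]

model = train_centroids(training)
print(predict(model, (0.48, 0.09)))  # vegetation
print(predict(model, (0.04, 0.05)))  # water
```

Notice that nowhere in the code is there a rule such as "high near-infrared means vegetation". The program works that out from the examples it was given, which is the essence of machine learning.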
Deep learning is a subset of machine learning, just like machine learning is a subset of AI. Deep learning is powered by neural networks, which are systems loosely modeled on the way the human brain and nervous system work. These neural networks are able to analyze data that does not conform to a specific format or schema. As a result, they are useful for everything from detecting fraud and money laundering to complex geospatial tasks such as detecting damaged buildings and estimating the cost of repairs in an area after a large storm.
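To make the idea of a neural network a little more concrete, here is a toy sketch of one in plain Python: two inputs, a hidden layer of two "neurons", and a single output. The weights are picked by hand so that the network computes XOR (output 1 only when exactly one input is 1); in real deep learning the weights are learned from data and the networks are vastly larger. The names `dense` and `xor_network` are my own, not from any library.

```python
def relu(x):
    """Rectified linear unit: pass positive values through, clamp negatives to 0."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: a weighted sum per neuron, then an activation."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def xor_network(x1, x2):
    """Tiny two-layer network with hand-picked (not learned) weights."""
    hidden = dense(
        [x1, x2],
        weights=[[1.0, 1.0], [1.0, 1.0]],
        biases=[0.0, -1.0],
        activation=relu,
    )
    output = dense(hidden, weights=[[1.0, -2.0]], biases=[0.0], activation=lambda v: v)
    return output[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

XOR is a classic example because no single weighted sum of the two inputs can produce it; the hidden layer is what makes it possible. Stacking many such layers is what puts the "deep" in deep learning.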
One of the primary reasons that machine learning has been growing so rapidly is the accessibility of ready-to-use libraries which abstract away the inherent complexity of scaffolding and implementing data models. The vast majority of these machine learning libraries use the Python programming language. You might already be familiar with Python due to the recent inclusion of Jupyter Notebooks in ArcGIS Online, or from creating or using custom geoprocessing scripts and services. These libraries handle a lot of the matrix operations and pure mathematics (such as linear algebra, multivariate calculus, statistics, and probability) that would otherwise have to be handled manually.
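As a small illustration of the kind of math these libraries take care of, here is a hand-written sketch of gradient descent fitting a straight line to a handful of points. The data, the function name `fit_line`, and the learning rate and step count are all assumptions chosen for this toy example; a library would compute the gradients for you and scale to models with millions of parameters.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every point with the current parameters.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error (the calculus part).
        grad_slope = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2 / n * sum(errors)
        # Step a little way downhill along each gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Toy data that lies exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(round(slope, 3), round(intercept, 3))  # approximately 2.0 and 1.0
```

Even in this tiny case you need to derive the gradients and pick a learning rate small enough for the updates to converge. That is exactly the sort of bookkeeping the libraries automate.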
“Cool! So I don’t need to understand all the complicated math?”
— G. O. Spatial, a purely hypothetical professional
Good question, G. O.! These libraries are intended to save you time and effort, and as a result you can accomplish a lot without truly understanding the mathematics behind it all. So much so, in fact, that many amateur data scientists are asking whether there is any need to understand the math at all. Some of you might remember a time in school when you asked the teacher a similar question (or maybe it was just me): why do I need to learn algebra, calculus, and so on when I can just use a calculator to get my answer? In both scenarios, the tool you are using can get you the answer, but it takes knowledge of mathematics to ask the right question. So, if you want to perform real data science, you absolutely do need to understand the math that these libraries are helping you perform. There, I said it. What a weight off my chest!
Two of the most popular libraries for general-purpose machine learning are Google’s TensorFlow and PyTorch. These libraries are highly flexible, and allow for your own creativity as to how to interact with them. Additionally, both libraries are open source, which helps them continue to evolve and improve in order to stay relevant as technology advances. For geospatial analysis, there are some specialized libraries such as PySAL and GeoPandas, which extends pandas (one of the most popular data analysis libraries for Python).
Each of these libraries has its own strengths and weaknesses, and I encourage you to treat them all as different tools in your toolbox rather than a one-size-fits-all solution. We’ll go into more detail about these libraries, including specific examples, in a future article — so be on the lookout for that!
Applying any of the above concepts in a geospatial context is commonly referred to as GeoAI. The truly beautiful part about GeoAI is that its applications are as wide and diverse as our imaginations! Just a few of the areas where GeoAI is currently yielding positive results are:
- Using computer vision for remote sensing, image classification, and object detection
- Using super-resolution networks to increase visual clarity and allow higher zoom levels of existing imagery
- Using natural language processing to extract geospatial information from unstructured text in documents and images
- Applying deep learning to large 3D geospatial datasets, such as point clouds and 3D meshes
“All that theory is well and good, but I’m itching to see some examples of real-world applications for GeoAI!”
— G. O. Spatial, a purely hypothetical professional
Absolutely! After all, seeing what others are doing can lead to ideas of your own, and a better understanding of the concepts we discussed. With that in mind, let’s take a look at some examples of GeoAI in practice today.
Retail location planning is an industry that has long drawn intelligence from demographic and other geospatial data. Research has shown that the human mind is only capable of conceptualizing about 11 potential causal factors simultaneously. Machine learning has no such limit, which is one reason it offers notable advantages for this kind of analysis. With machine learning, you can evaluate as many potential factors as you want at the same time, reducing the need to make assumptions up front and revealing patterns you might not have originally thought to check for. Joel McCune has a great story map on Customer-Centric Analysis illustrating how machine learning can be used to create a much clearer picture of retail location impact assessment.
You’ll find that many geospatial machine learning examples focus on deep learning, largely because the geospatial data used for the model does not conform to a specific structure. During the virtual Esri User Conference earlier this year, Esri showcased a great example of deep learning used geospatially: detecting shipwrecks off the coast of Jamaica Bay, NY.
This example is particularly interesting because, in addition to using bathymetric data to perform underwater analysis (something I’ve not seen much of previously), it was also accomplished without writing any code by using the new deep learning tools in ArcGIS Pro.
In future articles in this series, we’ll dig into more detail about machine learning algorithms as well as the future of GeoAI. If you have a project that you think could benefit from GeoAI, we’d be happy to talk it over with you!