Geocoding is the bridge that links location strings to points on a map. This talk will discuss the state of open source and commercial geocoders, then present a solution based on the openaddresses.io data repo, running on a free micro instance on the Amazon Cloud.
Geocoding, as an information retrieval process, is divided into forward geocoding (turning a location described in words into a latitude/longitude point) and reverse geocoding (turning a point into a location description).
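To make the two directions concrete, here is a minimal sketch in Python. It is not the geocoder presented in the talk: the two sample records, the function names, and the exact-match lookup are all assumptions made for illustration, and a real geocoder would need fuzzy matching and a spatial index.

```python
import math

# Hypothetical sample records (address -> (latitude, longitude)).
# Real data would come from an address dataset, not a hard-coded dict.
ADDRESSES = {
    "100 main st, springfield": (39.7990, -89.6440),
    "200 oak ave, springfield": (39.8050, -89.6500),
}

def normalize(text):
    """Lowercase, drop periods, and collapse whitespace."""
    return " ".join(text.lower().replace(".", "").split())

def forward_geocode(query):
    """Words -> point: exact lookup on the normalized string (None if unknown)."""
    return ADDRESSES.get(normalize(query))

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def reverse_geocode(point):
    """Point -> words: nearest known address by a linear scan."""
    return min(ADDRESSES, key=lambda addr: haversine_km(ADDRESSES[addr], point))
```

Calling `forward_geocode("100 Main St., Springfield")` looks up the normalized string, while `reverse_geocode((39.8049, -89.6499))` returns whichever sample address lies closest to that point.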
Ever since the advent of the online map, this problem has attracted considerable attention. The main players today are commercial vendors such as Google Maps. Open source alternatives (e.g., OpenStreetMap's Nominatim) have usually fallen short. In fact, all geocoders, commercial and free, fall short in various ways, as I will demonstrate in this talk.
I'll also demonstrate how you can build your own geocoder if the data is available. I've built my own over the last 11 years (written in ModPerl), and you can too (in your language of choice) using openaddresses.io, the free and open global address collection, which currently provides over 200 million addresses.
My solution runs on minimal resources (just 1 GB of RAM and 1 vCPU on a free micro instance), so it may be a bit slow; if you need performance, use a faster server. I'll also show how to extract and standardize addresses from bodies of text very quickly, regardless of the amount of text.
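As a rough illustration of what extracting and standardizing addresses from free text can look like, here is a small Python sketch. It is an assumption-laden toy rather than the method shown in the talk: the suffix table, the regular expression, and the `extract_addresses` name are all hypothetical, and the pattern only covers simple "number, street name, suffix" forms.

```python
import re

# Hypothetical suffix table mapping spelled-out and abbreviated forms
# to one standard abbreviation. A real table would be far larger.
SUFFIXES = {
    "street": "ST", "st": "ST",
    "avenue": "AVE", "ave": "AVE",
    "road": "RD", "rd": "RD",
    "boulevard": "BLVD", "blvd": "BLVD",
}

# House number, one to three name words, then a known suffix.
# Longer suffixes are listed first so "street" is tried before "st".
ADDRESS_RE = re.compile(
    r"\b(\d{1,6})\s+((?:[A-Za-z]+\s+){1,3}?)("
    + "|".join(sorted(SUFFIXES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)

def extract_addresses(text):
    """Return standardized 'NUMBER NAME SUFFIX' strings found in text."""
    results = []
    for match in ADDRESS_RE.finditer(text):
        number, name, suffix = match.groups()
        standardized = f"{number} {name.strip().upper()} {SUFFIXES[suffix.lower()]}"
        results.append(standardized)
    return results
```

For example, `extract_addresses("Ship to 12 Oak Street, then visit 400 West Main St. tomorrow.")` scans the text in a single pass and yields the two addresses in a uniform upper-case form with abbreviated suffixes.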
Speakers: Ervin Ruci