Some Tableau Geocoding Tricks


Some Tableau Geocoding Tricks
A few weeks ago, I wrote a blog post and created a visualization showing 2016 self-reported UFO sightings. This visualization was, by far, the most complex I’ve ever created. In the post, I detailed some of the challenges that I encountered, all of which related to the Tableau work. But there were some key challenges I encountered before even opening Tableau, all of which had to do with geocoding the data.


Latitude and Longitude
As discussed in the blog post, I collected 2016 sightings data from the National UFO Reporting Center (NUFORC). One of the maps on the visualization showed the location of every one of these sightings, as shown below:


The NUFORC data set had city and state so I attempted to map it in Tableau. No dice—over 2,000 of the 5,000+ rows of data were not recognized. Tableau does a really good job of mapping city and state data, but the data was so messy that it simply couldn’t make sense of it. For example, one city reads “Emigrant Pass (near; foothills of)”. Tableau isn’t going to match that without manual intervention. When I’ve had this problem in the past, I’ve typically just dealt with the mismatches manually and moved on, but I couldn’t do that with over 2,000 mismatches, so I needed a different approach. I decided to leverage the Google Geocoding API.

Google Geocoding API
Any developer can tell you that Google has a great set of APIs available for public use. One of those is the geocoding API. If you were to search “Emigrant Pass, OR latitude and longitude” in Google, this API would be called and the search results would provide you with the latitude and longitude. So, I signed up for a Google API Key (there is no charge for up to 2,500 requests per day, but you do have to provide a credit card in case you exceed the free tier allowance).

Calling an API is relatively simple, but I never like to write code from scratch if I can avoid it. As my data was in Excel, I figured someone before me had written a VBA function to call the API. Sure enough, a simple search led me to a discussion on Stack Overflow, which included the code I needed. I saved my data as a Macro-Enabled Spreadsheet (xlsm), opened the VBA development environment and pasted the code. I then went back to the spreadsheet and used my new function to lookup the latitude and longitude of each sighting. All in all, the function worked great—it was able to find the latitude and longitude of almost every location, even when the data was messy. There were a few problems, however. First, the function is a bit slow—it takes a lot of time to look up over 5,000 latitudes and longitudes. Second, I found that it did not respond well when trying to execute thousands of calls at one time. So I had to apply the formula to chunks of a few hundred records at a time. Third, some data was so messy that not even the Google API could identify it. In those cases, I did my best to figure out what location was being referenced and did some manual lookups. Fortunately, these were few and far between.

Now that I had latitudes and longitudes for every sighting, I was able to pull the spreadsheet into Tableau and map each sighting based on latitude and longitude, instead of city and state. This, of course, worked perfectly.

County Lookup
But that was not the only geographic challenge I encountered. The visualization included three maps that showed some type of county-level data. The most prominently featured of these was a map showing per-capita sightings.


To create these maps, I needed to sum up my set of 5,000+ sightings by county. Unfortunately, the NUFORC database does not have any information about the county whatsoever. But, after overcoming the first challenge, I did have latitude and longitude and I theorized that there must be some way to lookup the county based on those. Sure enough, a Google search revealed an API from the FCC which does exactly this. Unlike the Google geocoding API, I could not find any pre-written VBA code, so I created it myself. I’ve included the code at the end of this post if you would like to use it. I then used the function to look up the county for each location. After that, I was able to easily map the county-level data in Tableau.

Wrap-Up
Tableau does an amazing job with maps. I’ve worked with a variety of analytics and visualization toolsets in the past and I’ve never encountered any that allows for such quick and easy map creation (and it's even better with the new spatial connector in 10.2!!). But, it does have its limitations, as I’ve detailed here. Fortunately, through use of a few relatively easy-to-use and mostly-free APIs, you can quickly and accurately geocode your data yourself, taking the pressure off of Tableau. And, because the geocoding is done outside of the visualization tool, it will work for any tool capable of mapping latitude and longitude.

The Code
In truth, I needed to get both the county name and the FIPS code for my visualization. This is because the method I used to create the Albers projection map linked each polygon to the county FIPS codes (see the original post for more details on this). And, of course, I wanted to include the county name on the tooltips. So, I created two functions—with the same basic guts—one which returns the county name and the other which returns the county FIPS code. The VBA code is below. You can, of course, easily convert this to the programming language of your choice.

'Get County Name Based on Latitude and Longitude
Function GetCounty(Latitude As String, Longitude As String) As String
    Dim strQuery As String, strTemp As String
   
    'Assemble the query string
    strQuery = "http://data.fcc.gov/api/block/find?"
    strQuery = strQuery & "latitude=" & Latitude
    strQuery = strQuery & "&longitude=" & Longitude
    strQuery = strQuery & "&showall=true"
   
    'define XML and HTTP components
    Dim Result As New MSXML2.DOMDocument
    Dim Service As New MSXML2.XMLHTTP
   
    'create HTTP request to query URL - make sure to have
    'that last "False" there for synchronous operation
    Service.Open "GET", strQuery, False
    Service.send
    Result.LoadXML (Service.responseText)
   
    'parse the response text to get the county name.
    Dim intPos As Integer, intPos2 As Integer, strError As String
    intPos = InStr(1, Service.responseText, "<County FIPS=")
    If intPos > 0 Then
        strTemp = Service.responseText
        strTemp = Mid(strTemp, intPos + 13, Len(strTemp) - (intPos + 13))
        intPos = InStr(1, strTemp, "/>")
        strTemp = Left(strTemp, intPos - 1)
        intPos = InStr(1, strTemp, " name=")
        strTemp = Mid(strTemp, intPos + 7, Len(strTemp) - (intPos + 7))
        GetCounty = strTemp
    Else
        GetCounty = "UNKNOWN ERROR OCCURRED"
    End If
   
    Set Service = Nothing
    Set Result = Nothing
End Function


'Get County FIPS Code Based on Latitude and Longitude
Function GetCountyFIPS(Latitude As String, Longitude As String) As String
    Dim strQuery As String, strTemp As String
   
    'Assemble the query string
    strQuery = "http://data.fcc.gov/api/block/find?"
    strQuery = strQuery & "latitude=" & Latitude
    strQuery = strQuery & "&longitude=" & Longitude
    strQuery = strQuery & "&showall=true"
   
    'define XML and HTTP components
    Dim Result As New MSXML2.DOMDocument
    Dim Service As New MSXML2.XMLHTTP
   
    'create HTTP request to query URL - make sure to have
    'that last "False" there for synchronous operation
    Service.Open "GET", strQuery, False
    Service.send
    Result.LoadXML (Service.responseText)
   
    'parse the response text to get the county name.
    Dim intPos As Integer, intPos2 As Integer, strError As String
    intPos = InStr(1, Service.responseText, "<County FIPS=")
    If intPos > 0 Then
        strTemp = Service.responseText
        strTemp = Mid(strTemp, intPos + 14, Len(strTemp) - (intPos + 14))
        intPos = InStr(1, strTemp, """")
        strTemp = Left(strTemp, intPos - 1)
        GetCountyFIPS = strTemp
    Else
        GetCountyFIPS = "UNKNOWN ERROR OCCURRED"
    End If
   
    Set Service = Nothing
    Set Result = Nothing
End Function


If you have any thoughts or feedback, please let me know what you think in the comments section.

Ken Flerlage, March 9, 2017
Twitter | LinkedIn | Tableau Public

3 comments:

  1. That is very interesting; you are a very skilled blogger. I have shared your website in my social networks! A very nice guide. I will definitely follow these tips. Thank you for sharing such detailed article.

    Tableau Online Training

    ReplyDelete
  2. Nice post,i found lot of information.I updated my knowledge with this blog.it can help me to crack GIS jobs in Hyderabad.

    ReplyDelete
  3. I am working with map and i have a dataset hving 640000 lat and log when i am ploting then the map
    is rendering a lot like its takes 20 mins to give the viz .and some times its hangs
    from 8 gb ram its takes 7.5 gb RAM .And also i use dual axis so it is good to take create dual axis in Map viz .????

    how to optimize the map in tableau i am using offline .
    please suggest some thing

    ReplyDelete

Powered by Blogger.