Updated blog!
This post has been updated to reflect that batch geocoding is now provided by the Geocode Dataflow API, documented at https://docs.microsoft.com/bingmaps/spatial-data-services/geocode-dataflow-api.
Introduction
Geocoding and reverse geocoding are services that Bing Maps provides in its SDKs (such as the Web Control, Windows, iOS and Android) as well as in its REST API services. Geocoding takes text descriptions or addresses and returns the geographic coordinates of the corresponding physical location; reverse geocoding goes the other way, from coordinates to an address. What happens if you are about to start a project and you have to geocode thousands of addresses? Or what if you have a requirement to batch-process data updates as a recurring task?
Of course, you could just call the geocoder again and again, but that is not a very efficient approach. With our June release we also launched a new batch geocoder and batch reverse geocoder as part of the Bing Spatial Data API in order to address just these scenarios. Chris Pendleton briefly touched on it in his blog post here.
Today, we would like to go into a bit more detail and build a little application that uses the Bing Spatial Data Services. During this walkthrough we will follow the process below.
As a prerequisite you will need a Bing Maps Key which you can create yourself at the Bing Maps Portal.
Using Batch Geocoding
Batch geocoding is a useful feature to have for a business operating in almost any industry. For logistics and delivery businesses, the ability to quickly geocode addresses into location coordinates and vice versa can be incredibly useful, particularly for compression and porting over location data between different devices.
Important geocoding updates
Updated 2022:
Before you create a job to geocode data, it’s worth pointing out that there have been significant updates to the batch-geocoders and batch reverse geocoders, which are now referred to as the Geocode Dataflow API. As a developer you’ll have the option to choose between staying with the older data schema (version 1.0) or porting over to the updated one (version 2.0).
The new data schema provides developers with much more useful information and hence can save large amounts of time otherwise spent on tasks that can now be automated. For example, version 2.0 includes different points for routing and display, as well as an easier method of creating location bounds.
As an overview, using the Geocode Dataflow API involves the following steps:
- Format your location data according to the data schema you have chosen. The data can be an XML file or a delimited text file of values.
- Create a geocode job. This essentially means uploading location data for batch geocoding, or point data for batch reverse geocoding.
- Monitor the job’s status. This step is simple and involves only two required parameters, with a third being optional. Make sure you use the same Bing Maps key that you used to create the geocoding job.
- Download the geocode job results. The results are ready for download once the value ‘Completed’ shows up in the job status field.
These four steps are all developers need to geocode and reverse geocode data. Now let’s have a look at how the batch geocoding process works step-by-step with a few examples.
Format Data
Your data can be in either XML or text files. In text files you can separate values with commas, tabs or pipes (|). The data can be:
- latitudes and longitudes which would be reverse geocoded
- query-strings such as place-names, postcodes or unformatted addresses
- formatted addresses with separate attributes for each address-part
You will find a full description of the data schema here and some sample data here. An interesting aspect of the service is that we can mix different types of information.
The sample data set below contains examples for batch geocoding: a formatted address, well-known places and a UK postcode. We have also included a latitude and longitude for reverse geocoding. There is also an empty entry, which we intentionally put in to demonstrate what happens if a record cannot be resolved.
<GeocodeFeed>
<GeocodeEntity Id="1" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="de-DE">
<Address AddressLine="Konrad-Zuse-Str. 1"
Locality="Unterschleißheim"
PostalCode="85716" />
</GeocodeRequest>
</GeocodeEntity>
<GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="en-GB"
Query="Tower of London">
</GeocodeRequest>
</GeocodeEntity>
<GeocodeEntity Id="5" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="en-GB"
Query="Angel of the North">
</GeocodeRequest>
</GeocodeEntity>
<GeocodeEntity Id="6" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="en-GB"
Query="RG6 1WG">
</GeocodeRequest>
</GeocodeEntity>
<GeocodeEntity Id="7" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<ReverseGeocodeRequest Culture="fr-FR">
<Location Longitude="2.265087118043766" Latitude="48.83431718199653"/>
</ReverseGeocodeRequest>
</GeocodeEntity>
<GeocodeEntity Id="8" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="en-US" Query="">
<Address AddressLine="" AdminDistrict="" />
</GeocodeRequest>
</GeocodeEntity>
</GeocodeFeed>
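For larger data sets you would normally generate such a feed rather than write it by hand. The following is a minimal sketch in Python (not the VB.NET of this walkthrough) that builds a feed shaped like the sample above using only the standard library. The function name build_geocode_feed and the record keys (id, culture, query, address, location) are hypothetical; the element names, attribute names and namespace are copied from the sample.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the sample feed above.
GEOCODE_NS = "http://schemas.microsoft.com/search/local/2010/5/geocode"


def build_geocode_feed(records):
    """Return a GeocodeFeed XML string for a list of record dicts.

    Each record is a hypothetical dict with "id" and "culture", plus either
      "location": (latitude, longitude)   -> reverse geocode request
    or any of
      "query": free text                  -> geocode request by query
      "address": dict of Address attributes (AddressLine, Locality, ...)
    """
    feed = ET.Element("GeocodeFeed")
    for record in records:
        # The sample declares the namespace on every entity, so it is
        # written here as a plain attribute to mirror that layout.
        entity = ET.SubElement(
            feed, "GeocodeEntity", {"Id": str(record["id"]), "xmlns": GEOCODE_NS}
        )
        if "location" in record:
            latitude, longitude = record["location"]
            request = ET.SubElement(
                entity, "ReverseGeocodeRequest", {"Culture": record["culture"]}
            )
            ET.SubElement(
                request,
                "Location",
                {"Latitude": str(latitude), "Longitude": str(longitude)},
            )
        else:
            attributes = {"Culture": record["culture"]}
            if "query" in record:
                attributes["Query"] = record["query"]
            request = ET.SubElement(entity, "GeocodeRequest", attributes)
            if "address" in record:
                ET.SubElement(request, "Address", dict(record["address"]))
    return ET.tostring(feed, encoding="unicode")
```

Calling build_geocode_feed with a list such as [{"id": 4, "culture": "en-GB", "query": "Tower of London"}] returns the feed as a string, which can then be written to a UTF-8 file and uploaded.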
There is a size limitation to consider, though. The file to upload must not exceed 100 MB. You can have up to 10 jobs at a time, but if you really need to go to the limits you should consider using a more compact file format such as a pipe (|) delimited text file. In this format the sample data above looks as shown below and is only a quarter of the size of the XML file.
1|de-DE||Konrad-Zuse-Str. 1|||||Unterschleißheim|85716||||||||||||||||||||||
4|en-GB|Tower of London||||||||||||||||||||||||||||
5|en-GB|Angel of the North||||||||||||||||||||||||||||
6|en-GB|RG6 1WG||||||||||||||||||||||||||||
7|fr-FR||||||||||||||||||||||||||||48.83431718199653|2.265087118043766
8|en-US|||||||||||||||||||||||||||||
Create a Job
In the SDK you will find sample code for a console application in C#. In this walkthrough we will build a WinForms application in VB.NET. The final application will look as shown below:
Once we have selected our source data file, we first set the content type.
' The 'Content-Type' header must be "text/plain" or "application/xml"
' depending on the input data format.
Dim contentType As String = "text/plain"
If txtSelectedFile.Text.EndsWith(".xml", StringComparison.OrdinalIgnoreCase) Then
    contentType = "application/xml"
End If
Next we build our HTTP-POST-request adding parameters for the source-data-format and the Bing Maps key. We also add our source-data-file as bytes from a file-stream. If the job was successfully submitted, we receive a job-ID as part of the response-header. Together with a desired output-format (JSON or XML) and the Bing Maps key we can use this job-ID to monitor the batch geocoding or batch reverse geocoding job status. We will start a timer to do just that every 30 seconds (or whatever time interval you think is appropriate).
Dim queryStringBuilder As New StringBuilder()
' The 'input' and 'key' parameters are required.
' EscapeDataString also escapes '&' and '=', so a value cannot break the query string.
queryStringBuilder.Append("input=").Append(Uri.EscapeDataString(cbInputFormat.Text))
queryStringBuilder.Append("&")
queryStringBuilder.Append("key=").Append(Uri.EscapeDataString(txtBMKey.Text))
' The 'description' parameter is optional.
If Not String.IsNullOrEmpty(txtDescription.Text) Then
    queryStringBuilder.Append("&")
    queryStringBuilder.Append("description=").Append(Uri.EscapeDataString(txtDescription.Text))
End If
Dim uriBuilder As New UriBuilder("http://spatial.virtualearth.net")
uriBuilder.Path = "/REST/v1/dataflows/geocode"
uriBuilder.Query = queryStringBuilder.ToString()
Using dataStream As FileStream = File.OpenRead(txtSelectedFile.Text)
    Dim request As HttpWebRequest = DirectCast(WebRequest.Create(uriBuilder.Uri), HttpWebRequest)
    ' The method must be 'POST'.
    request.Method = "POST"
    request.ContentType = contentType
    ' Copy the source file into the request body in 16 KB chunks.
    Using requestStream As Stream = request.GetRequestStream()
        Dim buffer As Byte() = New Byte(16383) {}
        Dim bytesRead As Integer = dataStream.Read(buffer, 0, buffer.Length)
        While bytesRead > 0
            requestStream.Write(buffer, 0, bytesRead)
            bytesRead = dataStream.Read(buffer, 0, buffer.Length)
        End While
    End Using
    Try
        Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
            ' If the job was created successfully, the status code should be
            ' 201 (Created) and the 'Location' header should contain the
            ' location of the new dataflow job.
            Dim dataflowJobLocation As String = response.GetResponseHeader("Location")
            If response.StatusCode <> HttpStatusCode.Created Then
                lblStatus.Text = "Unexpected status code."
            ElseIf String.IsNullOrEmpty(dataflowJobLocation) Then
                lblStatus.Text = "Expected the 'Location' header."
            Else
                ' Only build the status URL and start polling if the job was created.
                myStatusUrl = dataflowJobLocation & "?output=" & cbOutputFormat.Text & "&key=" & Uri.EscapeDataString(txtBMKey.Text)
                lblStatusUrl.Visible = True
                ' Start a timer to monitor the status.
                ' In this sample the timer ticks every 30 seconds.
                myTimer.Start()
            End If
        End Using
    Catch ex As Exception
        lblStatus.Text = ex.Message
    End Try
End Using
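For readers who do not work in VB.NET, the same request can be sketched in Python with the standard library. This is only an illustration under stated assumptions: the function names are hypothetical, the endpoint path is the one used in this walkthrough (with https assumed instead of http), and the input values "xml", "csv", "tab" and "pipe" are assumed to be what the service accepts for the 'input' parameter. Building the request is kept separate from sending it, so the URL and headers can be inspected without a network call.

```python
import urllib.parse
import urllib.request

# Endpoint path taken from the walkthrough; the https scheme is an assumption.
DATAFLOW_URL = "https://spatial.virtualearth.net/REST/v1/dataflows/geocode"


def content_type_for(input_format):
    """XML feeds are sent as application/xml, delimited text as text/plain."""
    return "application/xml" if input_format.lower() == "xml" else "text/plain"


def build_create_job_request(data, input_format, key, description=None):
    """Build (but do not send) the POST request that creates a geocode job.

    data is the file content as bytes; 'input' and 'key' are required,
    'description' is optional.
    """
    params = {"input": input_format, "key": key}
    if description:
        params["description"] = description
    # urlencode escapes every value, so '&' or '=' in a description is safe.
    url = DATAFLOW_URL + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": content_type_for(input_format)},
    )


def create_job(request):
    """Send the request and return the job URL from the 'Location' header."""
    with urllib.request.urlopen(request) as response:
        if response.status != 201:
            raise RuntimeError("Unexpected status code: %d" % response.status)
        location = response.headers.get("Location")
    if not location:
        raise RuntimeError("Expected the 'Location' header.")
    return location
```

A typical call would read the source file as bytes, pass it to build_create_job_request together with the format and key, and hand the result to create_job, which returns the URL to poll for the job status.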
Monitor Status
In the previous section we created our batch job, retrieved the job ID and started a timer that periodically checks the status of the batch geocoding job. The job status can be returned in either XML or JSON format and looks as shown below. As you can see, we can retrieve the status of the job as well as the URLs from which we can download the successfully geocoded data and the records that failed to geocode.
<?xml version="1.0" encoding="utf-8"?>
<Response xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
<Copyright>Copyright © 2010 Microsoft and its suppliers. All rights reserved. ...</Copyright>
<BrandLogoUri>http://spatial.virtualearth.net/Branding/logo_powered_by.png</BrandLogoUri>
<StatusCode>200</StatusCode>
<StatusDescription>OK</StatusDescription>
<AuthenticationResultCode>ValidCredentials</AuthenticationResultCode>
<TraceId>0508be9c784f4a9c898003942643b7f2|LTSM001003|02.00.136.1000|</TraceId>
<ResourceSets>
<ResourceSet>
<EstimatedTotal>1</EstimatedTotal>
<Resources>
<DataflowJob>
<Id>ce3548b360ca42d3adac0f7c4a26f392</Id>
<Link role="self">https://spatial.virtualearth.net/REST/...</Link>
<Link role="output" name="succeeded">https://...</Link>
<Link role="output" name="failed">https://...</Link>
<Description>My Batch Job 31/08/2010 00:00:00</Description>
<Status>Completed</Status>
<CreatedDate>2010-08-31T02:46:47.1744785-07:00</CreatedDate>
<CompletedDate>2010-08-31T02:47:36.7504986-07:00</CompletedDate>
<TotalEntityCount>12</TotalEntityCount>
<ProcessedEntityCount>12</ProcessedEntityCount>
<FailedEntityCount>1</FailedEntityCount>
</DataflowJob>
</Resources>
</ResourceSet>
</ResourceSets>
</Response>
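The status document above is namespaced, which is easy to trip over when reading it with a generic XML library. As a rough Python sketch (the function name parse_job_status and the keys of the returned dictionary are hypothetical; element names and the namespace come from the sample response), the interesting fields can be pulled out like this:

```python
import xml.etree.ElementTree as ET

# Default namespace of the status response shown above.
REST_NS = {"r": "http://schemas.microsoft.com/search/local/ws/rest/v1"}


def parse_job_status(xml_text):
    """Extract status, output links and counters from a job status document."""
    root = ET.fromstring(xml_text)
    job = root.find(".//r:DataflowJob", REST_NS)
    if job is None:
        raise ValueError("response contains no DataflowJob element")

    # Only links with role="output" point at downloadable results;
    # their 'name' attribute is "succeeded" or "failed".
    output_links = {
        link.get("name"): link.text
        for link in job.findall("r:Link", REST_NS)
        if link.get("role") == "output"
    }

    def count(tag):
        # Counters that are absent from the document are reported as 0.
        return int(job.findtext("r:" + tag, default="0", namespaces=REST_NS))

    return {
        "status": job.findtext("r:Status", namespaces=REST_NS),
        "succeeded_url": output_links.get("succeeded"),
        "failed_url": output_links.get("failed"),
        "total": count("TotalEntityCount"),
        "processed": count("ProcessedEntityCount"),
        "failed": count("FailedEntityCount"),
    }
```

As in the VB.NET code of this walkthrough, the Bing Maps key still has to be appended to the returned download URLs before they can be fetched.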
In the procedure that is executed when the timer ticks, we evaluate the job status. If the job has completed, we update our user interface with statistical information and download links.
Dim myXmlDocument As New XmlDocument
Dim numTotal As Integer = 0
Dim numProcessed As Integer = 0
Dim numFailed As Integer = 0
myXmlDocument.Load(myStatusUrl)
' Navigate once to the DataflowJob element; everything we need is a child of it.
Dim myResources As XmlNode = myXmlDocument.Item("Response").Item("ResourceSets").Item("ResourceSet").Item("Resources")
Dim myXmlNode As XmlNode = myResources.Item("DataflowJob")
Dim myJobStatus As String = myXmlNode.Item("Status").InnerText
If myJobStatus = "Completed" Then
    lblStatus.Text = "Job Complete"
    For Each child As XmlNode In myXmlNode.ChildNodes
        Select Case child.Name
            Case "Link"
                ' Only the output links carry both a 'role' and a 'name' attribute.
                If child.Attributes.Count > 1 AndAlso child.Attributes("role").Value = "output" Then
                    If child.Attributes("name").Value = "succeeded" Then
                        mySucessUrl = child.InnerText + "?key=" + txtBMKey.Text
                    ElseIf child.Attributes("name").Value = "failed" Then
                        myFailedUrl = child.InnerText + "?key=" + txtBMKey.Text
                    End If
                End If
            Case "TotalEntityCount"
                numTotal = CInt(child.InnerText)
            Case "ProcessedEntityCount"
                numProcessed = CInt(child.InnerText)
            Case "FailedEntityCount"
                numFailed = CInt(child.InnerText)
        End Select
    Next
    lblSummary.Text = "Summary" + vbCrLf _
        + "Total Entities: " + numTotal.ToString + vbCrLf _
        + "Processed Entities: " + numProcessed.ToString + vbCrLf _
        + "Failed Entities: " + numFailed.ToString
End If
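Outside a WinForms application there is no timer control, so the periodic check is usually written as a simple polling loop. The Python sketch below shows one way to do that. The names wait_for_job and fetch_status are hypothetical, and so is the cap on the number of checks. This walkthrough only confirms the 'Completed' status; treating 'Aborted' as a second final status is an assumption.

```python
import time


def wait_for_job(fetch_status, interval_seconds=30, max_checks=120, sleep=time.sleep):
    """Poll until the job reaches a final status and return that status.

    fetch_status is a hypothetical zero-argument callable that returns the
    current Status text, for example by downloading and parsing the status URL.
    The sleep function is injectable so the loop can be tested without waiting.
    """
    # "Completed" is confirmed by the walkthrough; "Aborted" is an assumption.
    final_statuses = {"Completed", "Aborted"}
    for attempt in range(max_checks):
        status = fetch_status()
        if status in final_statuses:
            return status
        # Wait between checks, but not after the last one.
        if attempt < max_checks - 1:
            sleep(interval_seconds)
    raise TimeoutError("Job did not finish after %d checks" % max_checks)
```

The 30 second default mirrors the timer interval used in this walkthrough; the cap keeps a job that never finishes from blocking the caller forever.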
Download Results
Batch geocoding and batch reverse geocoding results will remain available for download for up to 14 days. Again, a detailed description of the data schema is available here in the SDK, but let’s have a quick look at the results for our sample data in XML format:
<?xml version="1.0"?>
<GeocodeFeed >
<GeocodeEntity Id="1" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="de-DE">
<Address AddressLine="Konrad-Zuse-Str. 1"
Locality="Unterschleißheim"
PostalCode="85716" />
</GeocodeRequest>
<GeocodeResponse DisplayName="Konrad-Zuse-straße 1, 85716 Unterschleißheim"
EntityType="Address"
Confidence="Medium"
StatusCode="Success">
<Address AddressLine="Konrad-Zuse-straße 1"
AdminDistrict="BY"
CountryRegion="Germany"
FormattedAddress="Konrad-Zuse-straße 1, 85716 Unterschleißheim"
Locality="Unterschleißheim"
PostalCode="85716" />
<RooftopLocation Latitude="48.290643" Longitude="11.581654" />
<InterpolatedLocation Latitude="48.290542" Longitude="11.581076" />
</GeocodeResponse>
</GeocodeEntity>
<GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="en-GB"
Query="Tower of London" />
<GeocodeResponse DisplayName="Tower of London, United Kingdom"
EntityType="HistoricalSite"
Confidence="High"
StatusCode="Success">
<Address AdminDistrict="England"
CountryRegion="United Kingdom"
FormattedAddress="Tower of London, United Kingdom"
Locality="London" />
<RooftopLocation Latitude="51.5081448107958" Longitude="-0.0762598961591721" />
</GeocodeResponse>
</GeocodeEntity>
<GeocodeEntity Id="5" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="en-GB"
Query="Angel of the North" />
<GeocodeResponse DisplayName="Angel of the North, United Kingdom"
EntityType="LandmarkBuilding"
Confidence="High"
StatusCode="Success">
<Address AdminDistrict="England"
CountryRegion="United Kingdom"
FormattedAddress="Angel of the North, United Kingdom"
Locality="Gateshead" />
<RooftopLocation Latitude="54.9144704639912" Longitude="-1.58999472856522" />
</GeocodeResponse>
</GeocodeEntity>
<GeocodeEntity Id="6" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<GeocodeRequest Culture="en-GB"
Query="RG6 1WG" />
<GeocodeResponse DisplayName="RG6 1WG, Wokingham, United Kingdom"
EntityType="Postcode1"
Confidence="High"
StatusCode="Success">
<Address AdminDistrict="England"
CountryRegion="United Kingdom"
FormattedAddress="RG6 1WG, Wokingham, United Kingdom"
PostalCode="RG6 1WG" />
<RooftopLocation Latitude="51.461179330945" Longitude="-0.925943478941917" />
</GeocodeResponse>
</GeocodeEntity>
<GeocodeEntity Id="7" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
<ReverseGeocodeRequest Culture="fr-FR">
<Location Latitude="48.8343171819965" Longitude="2.26508711804377" />
</ReverseGeocodeRequest>
<GeocodeResponse DisplayName="Quai du Président Roosevelt, 92130 Issy-les-Moulineaux"
EntityType="Address"
Confidence="Medium"
StatusCode="Success">
<Address AddressLine="Quai du Président Roosevelt"
AdminDistrict="IdF"
CountryRegion="France"
FormattedAddress="Quai du Président Roosevelt, 92130 Issy-les-Moulineaux"
Locality="Issy-les-Moulineaux"
PostalCode="92130" />
<InterpolatedLocation Latitude="48.8343036174774" Longitude="2.26509869098663" />
</GeocodeResponse>
</GeocodeEntity>
</GeocodeFeed>
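Once downloaded, a result file like the one above typically has to be turned into rows for a database or spreadsheet. The following Python sketch reads the version 1.0 result layout shown above. The function name read_results and the keys of the returned dictionaries are hypothetical. Preferring the rooftop point over the interpolated one is a design choice of this sketch, not something the service prescribes.

```python
import xml.etree.ElementTree as ET

# Namespace declared on every GeocodeEntity in the result file above.
GEOCODE_NS = {"g": "http://schemas.microsoft.com/search/local/2010/5/geocode"}


def read_results(xml_text):
    """Return one dict per GeocodeEntity with its Id, status and a location."""
    rows = []
    root = ET.fromstring(xml_text)
    for entity in root.iter("{%s}GeocodeEntity" % GEOCODE_NS["g"]):
        row = {
            "id": entity.get("Id"),
            "status": None,
            "display_name": None,
            "latitude": None,
            "longitude": None,
        }
        response = entity.find("g:GeocodeResponse", GEOCODE_NS)
        if response is not None:
            row["status"] = response.get("StatusCode")
            row["display_name"] = response.get("DisplayName")
            # Use the rooftop point if present, else the interpolated one.
            for tag in ("g:RooftopLocation", "g:InterpolatedLocation"):
                point = response.find(tag, GEOCODE_NS)
                if point is not None:
                    row["latitude"] = float(point.get("Latitude"))
                    row["longitude"] = float(point.get("Longitude"))
                    break
        rows.append(row)
    return rows
```

Each returned row keeps the Id from the input file, so results can be joined back to the original records.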
Our customers continue to use geocoding and location intelligence to create powerful all-in-one geospatial mapping experiences. Workers benefit from Bing Maps API’s geocoding capabilities in their everyday activities, quickly receiving geographic coordinates by entering text-based addresses for locations of interest.
As mentioned previously in this article, the latest updates for using Geocode Dataflow API can be found here. The above code samples are now out of date and are for informational purposes only.