Contacting the Mapzen Geocoder and returning sensible results is a multi-step process…which we won’t complete just yet. So while nothing in woz.py changes, we’ll be adding code to utils/geocoding.py, including a function that will just “pretend” to get data from the Mapzen API.
Once we’re sure we can deal with Mapzen’s data format and extract the good parts from it, then we’ll finally contact Mapzen and have it geocode live locations.
The finished code for this exercise can be found at:
A more complicated version of this exercise can be found at:
This exercise assumes you've completed the previous exercise:
compciv-2016
└── projects
    └── show-me-where
        ├── woz.py
        └── utils
            ├── geocoding.py
            └── __init__.py
Nothing will change in woz.py. But we will be making these changes to geocoding.py:
- Add a fetch_mapzen_response() function
- Add a parse_mapzen_response() function
- Change the geocode() function to use the data that the fetch and parse functions return.
There are many ways to approach this problem, but I'm going to aim for a somewhat convoluted path, in which the geocoding functionality is split up into 3 separate functions in utils/geocoding.py.
Here are the changes, in the order we'll implement them, in the utils/geocoding.py file:
1. Add a fetch_mapzen_response() function that, for now, just returns the raw text data from a canned JSON response.
2. Add a parse_mapzen_response() function that takes the raw text response from Mapzen (canned or not), extracts the necessary data, and returns a dictionary.
Basically, fetch_mapzen_response() is going to do its thing, which is fetch some canned/stale – but legitimate – data from a given URL. And parse_mapzen_response() just doesn't care where the data came from. Its job is just to extract the geocoded data.
But how does fetch_mapzen_response() send its data to parse_mapzen_response()? That's geocode()'s job. Remember geocode()? It's just been returning a fake dictionary. But now it can do the work of coordinating the fetching and the parsing of geodata.
It's worth keeping in mind that we still aren't talking to Mapzen's API. That's OK. We're moving one step forward – which is working with data that, at one point, did come from Mapzen. In the next lesson, we'll do the relatively trivial step of contacting Mapzen and getting fresh data with every call.
It'll be a little confusing at first. But hopefully you'll see the reasoning and orderliness of writing separate functions that do their own thing.
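To make the division of labor concrete, here is a minimal offline sketch of the same shape. The names fetch_from_string, parse_label, and coordinate, and the one-feature JSON string, are stand-ins invented for illustration, not the functions we're about to write. The only point is that the parsing step never needs to know where its text came from.

```python
import json

# Hypothetical stand-in data: a tiny string shaped like a Mapzen response
CANNED_TEXT = '{"features": [{"properties": {"label": "Stanford, Santa Clara County, CA"}}]}'


def fetch_from_string(location):
    # stand-in fetcher: ignores `location` and returns the canned text
    return CANNED_TEXT


def parse_label(txt):
    # stand-in parser: only cares that `txt` is JSON with a features list
    data = json.loads(txt)
    return {'label': data['features'][0]['properties']['label']}


def coordinate(location, fetcher=fetch_from_string):
    # stand-in coordinator: hands whatever the fetcher returns to the parser
    return parse_label(fetcher(location))


print(coordinate("anywhere"))
```

Swapping in a different fetcher (say, one that really calls an API) would not require touching parse_label at all, which is the reasoning behind the split.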
Again, this will seem a little backwards…but let's not worry about contacting the actual Mapzen service yet, which would involve reading their documentation and also writing the function that reads the credentials file. Which is not that hard, but still…
Go into utils/geocoding.py and add this function – you don't have to write a verbose docstring, but you should write something:
def fetch_mapzen_response(location):
    """
    `location` is a string that will be passed onto Mapzen API for geocoding
    returns a text string containing JSON-formatted data from Mapzen
    """
No matter what, this function will require the requests library – whether it is fetching from Mapzen directly, or from some other stashed file online. You should add import requests to the top of utils/geocoding.py.
As for what stashed data file it should pull from…why not the one we used in the previous homework assignment?
http://www.compciv.org/files/datadumps/apis/mapzen/search-stanford.json
Here's what the body of the function will look like – nothing different from past assignments in which you had to use requests.get(), though this time, you don't have to write it to disk:
def fetch_mapzen_response(location):
    """
    `location` is a string that will be passed onto Mapzen API for geocoding
    returns a text string containing JSON-formatted data from Mapzen
    """
    # ignore the location string for now
    SAMPLE_DATA_URL = 'http://www.compciv.org/files/datadumps/apis/mapzen/search-stanford.json'
    resp = requests.get(SAMPLE_DATA_URL)
    return resp.text
Try it in ipython. Instead of running python woz.py, just run ipython. Then import the function from utils.geocoding:
>>> from utils.geocoding import fetch_mapzen_response
>>> x = fetch_mapzen_response("whatever it doesn't matter right now")
>>> type(x)
str
>>> len(x) # count number of text characters
6826
You should get a big block of JSON-formatted text – remember that fetch_mapzen_response returns a string.
OK, fetch_mapzen_response() works well enough for the thing we currently care about, which is parse_mapzen_response().
Add this skeleton of a function to utils/geocoding.py:
def parse_mapzen_response(txt):
    """
    `txt` is a string containing JSON-formatted text from Mapzen's API
    returns a dictionary containing the useful key/values from the most
    relevant result.
    """
This parse_mapzen_response() function basically does the hard work for the geocode() function (remember geocode()? We'll get back to it): deserializing the txt string into a dictionary, picking out the first result, then extracting that result's longitude, latitude, confidence, etc.
At this point, it's worth revisiting the past homework assignment, in which you had to parse that same canned Mapzen JSON response. The key points were:
- The deserialized response has a features key, which points to the list of actual results.
- Each result's properties key points to a dictionary, which contains confidence, e.g. 0.949, and label, e.g. "Stanford, Santa Clara County, CA"
- Each result's geometry has a list named coordinates, in which the desired longitude and latitude values are listed, in that order:
{
  "type": "Point",
  "coordinates": [
    -122.16608,
    37.42411
  ]
}
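As a quick refresher on navigating that structure, here is a small sketch that uses a hand-trimmed string in place of the full canned response. The sample_text value is an assumption made for illustration: it keeps only the keys described above, with the values quoted in this lesson, whereas the real response has many more fields.

```python
import json

# Assumption: a trimmed-down stand-in for the canned Mapzen response
sample_text = """
{
  "features": [
    {
      "properties": {
        "confidence": 0.949,
        "label": "Stanford, Santa Clara County, CA"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [-122.16608, 37.42411]
      }
    }
  ]
}
"""

data = json.loads(sample_text)   # text string -> dictionary
first = data['features'][0]      # the first result in the list
# coordinates lists longitude first, then latitude
longitude, latitude = first['geometry']['coordinates']
print(first['properties']['label'], longitude, latitude)
```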
It's actually not much different from the homework:
1. Initialize a dictionary, gdict, which we'll fill with result data.
2. Deserialize the text (txt) with json.loads(), which means adding import json at the top of the file.
3. Check the features key, which points to a list of results.
4. If there is at least one result, set gdict["status"] to "OK", because everything is just OK, and fill gdict with the copied attributes, e.g. latitude, longitude, etc.
5. Otherwise, set gdict['status'] to None.
Whether or not any results came back from Mapzen, we return gdict, which will at least have a status key.
Here's one way of doing it; again, this function goes into utils/geocoding.py:
def parse_mapzen_response(txt):
    """
    `txt` is a string containing JSON-formatted text from Mapzen's API
    returns a dictionary containing the useful key/values from the most
    relevant result.
    """
    gdict = {}  # start with an empty dict; status gets set below
    data = json.loads(txt)
    if data['features']:  # it has at least one feature...
        gdict['status'] = 'OK'
        feature = data['features'][0]  # pick out the first one
        props = feature['properties']  # just for easier reference
        gdict['confidence'] = props['confidence']
        gdict['label'] = props['label']
        # now get the coordinates
        coords = feature['geometry']['coordinates']
        gdict['longitude'] = coords[0]
        gdict['latitude'] = coords[1]
    else:
        gdict['status'] = None
    return gdict
That's a big function. Does it even work?
Only one way to find out.
Go into ipython. Then import both the fetch_mapzen_response and parse_mapzen_response functions from the utils.geocoding module.
Remember that fetch_mapzen_response() takes in a location string…but just returns the canned response (i.e. always "Stanford"). But that's OK, pass that response text directly to parse_mapzen_response and see what happens:
>>> from utils.geocoding import fetch_mapzen_response, parse_mapzen_response
>>> rawtext = fetch_mapzen_response("whatever")
>>> georesult = parse_mapzen_response(rawtext)
>>> type(georesult)
dict
>>> print(georesult)
The printed dictionary should be what we expect:
{'status': 'OK', 'latitude': 37.42411, 'label': 'Stanford, Santa Clara County, CA', 'longitude': -122.16608, 'confidence': 0.949}
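Since parse_mapzen_response() only needs a string, the no-results branch can be checked without the network too. The sketch below repeats the function so that it runs on its own, and feeds it a hand-written string with an empty features list. That string is an assumption about what an empty response looks like, not something captured from Mapzen.

```python
import json


def parse_mapzen_response(txt):
    # same logic as the function above, repeated so this runs standalone
    gdict = {}
    data = json.loads(txt)
    if data['features']:
        gdict['status'] = 'OK'
        feature = data['features'][0]
        props = feature['properties']
        gdict['confidence'] = props['confidence']
        gdict['label'] = props['label']
        coords = feature['geometry']['coordinates']
        gdict['longitude'] = coords[0]
        gdict['latitude'] = coords[1]
    else:
        gdict['status'] = None
    return gdict


# Assumption: an "empty" response is JSON with an empty features list
no_results = parse_mapzen_response('{"features": []}')
print(no_results)
```

The returned dictionary has only the status key, set to None, so any code that uses it should check status before looking up latitude or longitude.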
Once you're satisfied that it works, quit out of ipython and point your editor to utils/geocoding.py.
geocode()
Remember the geocode function inside of utils/geocoding.py?
Remember how it just returned a dictionary full of fake data?
def geocode(location):
    """
    docstring
    """
    mydict = {}
    mydict['query_text'] = location
    mydict['latitude'] = 99
    mydict['longitude'] = -42
    mydict['confidence'] = 0.01
    mydict['label'] = "HA HA JK"
    return mydict
Now it doesn't have to do that anymore. Replace the fake arbitrary values with our…well, less fake response from fetch_mapzen_response and parse_mapzen_response. We still want to return a dictionary, mydict, though, and it should still include the original location, now stored under the query_string key (the fake version called it query_text).
The geocode function, again, takes a location argument…which fetch_mapzen_response ignores for now. But that's OK. As long as rawtext is some kind of Mapzen data response, we'll be able to return the correct kind of object:
def geocode(location):
    """
    that giant docstring
    """
    rawtext = fetch_mapzen_response(location)
    mydict = parse_mapzen_response(rawtext)
    # add the location string to mydict
    mydict['query_string'] = location
    # return the dictionary
    return mydict
You can test this in two ways. Via ipython:
>>> from utils.geocoding import geocode
>>> x = geocode("doesn't matter")
>>> type(x)
dict
>>> print(x)
{'longitude': -122.16608, 'latitude': 37.42411, 'confidence': 0.949, 'query_string': "doesn't matter", 'label': 'Stanford, Santa Clara County, CA', 'status': 'OK'}
And you can run woz.py:
$ python woz.py
What do you want to do? geocode
What is your location? Anywhere
OK...geocoding: Anywhere
{'latitude': 37.42411, 'status': 'OK', 'longitude': -122.16608, 'query_string': 'Anywhere', 'label': 'Stanford, Santa Clara County, CA', 'confidence': 0.949}
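Because the dictionary always carries a status key, code that consumes geocode() results can check it before reaching for coordinates. Here is one hedged sketch of that idea; describe_result is a hypothetical helper made up for illustration, not part of woz.py or utils/geocoding.py.

```python
def describe_result(result):
    # hypothetical helper: only read the coordinates when status is 'OK'
    if result.get('status') == 'OK':
        return "{label} is at ({latitude}, {longitude})".format(**result)
    return "No results found"


found = {'status': 'OK', 'latitude': 37.42411, 'longitude': -122.16608,
         'label': 'Stanford, Santa Clara County, CA', 'confidence': 0.949}
print(describe_result(found))
print(describe_result({'status': None}))
```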