The finished code for this exercise can be found at:

https://github.com/compciv/2016.compciv.org/tree/master/source/files/code/practicum/projects/showme-examples/nearest-earthquakes-step-by-step/004

A more complicated version of this exercise can be found at:

https://github.com/compciv/show-me-earthquakes

Getting started

This exercise assumes you've completed the previous exercise:

    compciv-2016
    └── projects
        └── show-me-where
            └── woz.py
            └── utils
                └── geocoding.py
                └── __init__.py

Nothing will change in woz.py. But we will be making these changes to geocoding.py

Adding a fetch_mapzen_response() function
Adding a parse_mapzen_response() function
Changing the geocode() function to use the data that the fetch and parse functions return.

How to turn one problem into 3 separate problems

There's many ways to approach this problem, but I'm going to aim for a somewhat convoluted path, in which the geocoding functionality is split up into 3 separate functions in utils/geocoding.py.

Here are the following changes, and the order we'll implement them, in the utils/geocoding.py file.

Make a fetch_mapzen_response() function that, for now, just returns the raw text data from a canned JSON response.
Make a parse_mapzen_response() function that takes the raw text response from Mapzen (canned or not) and extracts the necessary data and returns a dictionary.

Basically, fetch_mapzen_response() is going to do its thing, which is fetch some canned/stale – but legitimate – data from a given URL. And parse_mapzen_response() just doesn't care where the data came from. Its job is just to extract the geocoded data.

But how does fetch_mapzen_response() send its data to parse_mapzen_response()? That's geocode()'s job. Remember geocode()? It's just been returning a fake dictionary. But now it can do the work of coordinating the fetching and the parsing of geodata.

It's worth keeping in mind that we still aren't talking to Mapzen's API. That's OK. We're moving one step forward – which is working with data that, at one point, did come from Mapzen. In the next lesson, we'll do the relatively trivial step of contacting Mapzen and getting fresh data with every call.

It'll be a little confusing at first. But hopefully you'll see the reasoning and orderliness of writing separate functions that do their own thing.

Fetching and returning a canned response with fetch_mapzen_response()

Again, this will seem a little backwards…but let's not worry about contacting the actual Mapzen service yet, which would involve reading their documentation and also writing the function that reads the credentials file. Which is not that hard, but still…

Go into utils/geocoder.py and add this function – you don't have to write a verbose Docstring, but you should write something:

def fetch_mapzen_response(location):
    """
    `location` is a string that will be passed onto Mapzen API for geocoding

    returns a text string containing JSON-formatted data from Mapzen
    """

No matter what, this function will require the requests library – whether it is fetching from Mapzen directly, or from some other online stashed file. You should add import requests to the top of utils/geocoder.py.

As for what stashed data file it should pull from…why not the one we used in the previous homework assignment?

http://www.compciv.org/files/datadumps/apis/mapzen/search-stanford.json

Here's what the body of the function will look like – nothing different from past assignments in which you had to use requests.get(), though this time, you don't have to write it to disk:

def fetch_mapzen_response(location):
    """
    `location` is a string that will be passed onto Mapzen API for geocoding

    returns a text string containing JSON-formatted data from Mapzen
    """
    # ignore the location string for now
    SAMPLE_DATA_URL = 'http://www.compciv.org/files/datadumps/apis/mapzen/search-stanford.json'
    resp = requests.get(SAMPLE_DATA_URL)
    return resp.text

Try it in ipython. Instead of running python woz.py, just run ipython. Then import the function from utils.geocoding:

>>> from utils.geocoding import fetch_mapzen_response
>>> x = fetch_mapzen_response("whatever it doesn't matter right now")
>>> type(x)
str
>>> len(x) # count number of text characters
6826

You should get a big block of JSON-formatted text – remember that fetch_mapzen_response returns a string.

Let's parse a canned Mapzen JSON response

OK, fetch_mapzen_response() works so far, as far as the current thing we care about – parse_mapzen_response() – is concerned.

Add this skeleton of a function to utils/geocoding.py:

def parse_mapzen_response(txt):
    """
    `txt` is a string containing JSON-formatted text from Mapzen's API

    returns a dictionary containing the useful key/values from the most
       relevant result.
    """

This parse_mapzen_response() function basically does the hard work of the geocode() (remember geocode()? We'll get back to it) function, which is deserializing the txt string into a dictionary, then picking out the first result, then extracting that result's longitude, latitude, confidence, et. al.

At this point, it's worth revisiting the past homework assignment, in which you had to parse that samed canned Mapzen JSON response. The key points were:

Mapzen's API response, when deserialized from JSON, becomes a dictionary.
That dictionary contains a features key, which consists of the actual results.
The results are listed in order of relevance. We just need to return the first result.
Each result is itself a dictionary with a few useful keys:
- properties points to a dictionary, which contains confidence, e.g. 0.949 and label, e.g. "Stanford, Santa Clara County, CA"
- geometry has a list named coordinates, in which the desired longitude and latitude values are listed, respectively:
```
    {
      "type": "Point",
      "coordinates": [
        -122.16608,
        37.42411
      ]
    }
```

Implementing parse_mapzen_response()

It's actually not much different from the homework:

Create an empty dictionary gdict, which we'll fill with result data.
Deserialize the raw text (txt) with json.loads()
- This means we have to include import json at the top of the file
The dictionary contains a features key, which points to a list of results.
If there is more than one result/feature:
- set gdict["status"] to "OK", because everything is just OK.
- pluck the first result/feature, then fill up gdict with the copied attributes, e.g. latitude, longitude, etc.
But if there aren't any results, then:
- set gdict['status'] to None

Whether or not results are fetched from Mapzen, we return gdict, which will at least have a status key

Here's one way of doing it; again, this function goes into utils/geocoding.py

def parse_mapzen_response(txt):
    """
    `txt` is a string containing JSON-formatted text from Mapzen's API

    returns a dictionary containing the useful key/values from the most
       relevant result.
    """
    gdict = {} # just initialize a dict for now, with status of None
    data = json.loads(txt)
    if data['features']: # it has at least one feature...
        gdict['status'] = 'OK' 
        feature = data['features'][0] # pick out the first one
        props = feature['properties']  # just for easier reference
        gdict['confidence'] = props['confidence']
        gdict['label'] = props['label']

        # now get the coordinates
        coords = feature['geometry']['coordinates']
        gdict['longitude'] = coords[0]
        gdict['latitude'] = coords[1]
    else:
        gdict['status'] = None

    return gdict

Testing parse_mapzen_response()

That's a big function. Does it even work?

Only one way to find out.

Go into ipython. Then import both the fetch_mapzen_response and parse_mapzen_response functions from the utils.geocoding package.

Remember that fetch_mapzen_response() takes in a location string…but just returns the canned response (i.e. always "Stanford"). But that's OK, pass that response text directly to parse_mapzen_response and see what happens:

>>> from utils.geocoding import fetch_mapzen_response, parse_mapzen_response
>>> rawtext = fetch_mapzen_response("whatever")
>>> georesult = parse_mapzen_response(rawtext)
>>> type(georesult)
dict
>>> print(georesult)

The printed dictionary should be what we expect:

{'status': 'OK', 'latitude': 37.42411, 'label': 'Stanford, Santa Clara County, CA', 'longitude': -122.16608, 'confidence': 0.949}

Once you're satisfied that it works, quit out of iPython and point your editor to utils/geocoding.py.

Putting the pieces together in `geocode()`

Remember the geocode function inside of utils/geocoding.py?

Remember how it just returned a dictionary full of fake data?

def geocode(location):
    """
    docstring
    """
    mydict = {}
    mydict['query_text'] = location
    mydict['latitude'] = 99
    mydict['longitude'] = -42
    mydict['confidence'] = 0.01
    mydict['label'] = "HA HA JK"
    return mydict

Now it doesn't have to do that anymore. Replace the fake arbitrary values with our…well, less fake response from fetch_mapzen_response and parse_mapzen_response. We still want to return a dictionary, mydict, though, and it should still include the original location as the query_text attribute.

The geocode function, again, takes a location argument…which is mostly ignored by fetch_mapzen_response. But that's OK. As long as rawtext is some kind of Mapzen data response, we'll be able to return the correct kind of object:

def geocode(location):
    """
    that giant docstring
    """
    rawtext = fetch_mapzen_response(location)
    mydict = parse_mapzen_response(rawtext)
    # add the location string to mydict
    mydict['query_string'] = location 
    # return the diccionary
    return mydict

Does the geocode function still work?

You can test this in two ways. Via ipython:

>>> from utils.geocoding import geocode
>>> x = geocode("doesn't matter")
>>> type(x)
dict
>>> print(x)
{'longitude': -122.16608, 'latitude': 37.42411, 'confidence': 0.949, 'query_string': "doesn't matter", 'label': 'Stanford, Santa Clara County, CA', 'status': 'OK'}

And you can run woz.py:

$ python woz.py 
What do you want to do? geocode
What is your location? Anywhere
OK...geocoding: Anywhere
{'latitude': 37.42411, 'status': 'OK', 'longitude': -122.16608, 'query_string': 'Anywhere', 'label': 'Stanford, Santa Clara County, CA', 'confidence': 0.949}

Parsing the Mapzen response data before actually calling the API

Summary

Getting started

How to turn one problem into 3 separate problems

Fetching and returning a canned response with fetch_mapzen_response()

Let's parse a canned Mapzen JSON response

Implementing parse_mapzen_response()

Testing parse_mapzen_response()

Putting the pieces together in `geocode()`

Does the geocode function still work?

References and Related Readings

Parsing the Mapzen response data before actually calling the API

Summary

Getting started

How to turn one problem into 3 separate problems

Fetching and returning a canned response with fetch_mapzen_response()

Let's parse a canned Mapzen JSON response

Implementing parse_mapzen_response()

Testing parse_mapzen_response()

Putting the pieces together in geocode()

Does the geocode function still work?

References and Related Readings

Putting the pieces together in `geocode()`