The Checklist
In your compciv-2016 Git repository create a subfolder and name it:
exercises/0010-map-json-responses
The folder structure will look like this (not including any subfolders such as `tempdata/`):
compciv-2016
└── exercises
    └── 0010-map-json-responses
        ├── a.py
        ├── b.py
        ├── c.py
        ├── d.py
        ├── e.py
        ├── f.py
        └── g.py
Background information
About the APIs
If you're interested in the documentation for what these JSON responses actually contain, you can read the docs online:
However, you don't need to know all the specifics of what the APIs return, or how to contact the APIs yourself, as I've contacted the APIs myself and saved their responses exactly as I received them.
Here is what I got in response for a query of simply, "Stanford"
I don't know how your web browser will actually render those files, so visit this webpage for a more web-friendly rendering of these files.
About the JSON data format
If these JSON files look just like text files to you, that's right, that's all they are. In fact, they should remind you of Python objects you've seen before, particularly the dictionary and the list.
The fact that the text content of a JSON file looks almost identical to various Python objects is a coincidence that we can deal with later. For now, all you have to know is that if you have a string object that contains purportedly JSON-formatted text, this is how you deserialize (a fancy word for "convert") that text string into whatever Python object it looks like, e.g. a list or a dictionary (usually the latter):
import json
# etc...
# assuming mystring points to a text string
mydata = json.loads(mystring)
That's it: you import the json module, and then you pass a string object into the json.loads() function.
You don't even need to know (yet) what it means to "serialize" something.
You can read about the json module if you'd like. But for now, just be assured that the Python docs promise us that json.loads(mystring) converts (i.e. deserializes) whatever mystring points to into some other kind of Python object, usually of type dict or of type list. Well, as long as mystring points to a JSON-formatted text string.
And what is that exactly? Sure, you could read the information at www.json.org. But if I've kept my promise, that the URLs provided for the exercise are JSON-formatted text files, then that's all you need to know to do these exercises.
Test this out in interactive Python if you don't believe me:
>>> import requests
>>> import json
>>> URL = 'http://www.compciv.org/files/datadumps/apis/googlemaps/geocode-stanford.json'
>>> txt = requests.get(URL).text
>>> type(txt)
str
# you can run: print(txt) to see what it actually looks like
>>> mydata = json.loads(txt)
>>> type(mydata)
dict
# And here's how to derive the answer to b.py
>>> print(mydata['status'])
OK
However, while you don't have to know a lot of new specifics for this set of exercises, you will have to know all about for-loops, lists, and dictionaries.
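Here is a minimal sketch of how for-loops, lists, and dictionaries come together once JSON text has been deserialized. The JSON string and its key names are made up for illustration; they are not taken from the exercise files.

```python
import json

# Hypothetical JSON text, invented for this sketch
sample_text = '{"status": "OK", "items": [{"name": "alpha"}, {"name": "beta"}]}'

data = json.loads(sample_text)   # a dict, because the text starts with {
items = data['items']            # a list of dicts

names = []
for item in items:               # loop over the list
    names.append(item['name'])   # read one key from each dict

print(type(data).__name__, type(items).__name__)
print(names)
```

The same pattern (look up a key, loop over the list it points to, look up a key in each item) covers most of the exercises below.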
The Exercises
0010-map-json-responses/a.py » Download and save the JSON files; print the count of lines and characters
Same routine as in past exercises:
1. Create a tempdata directory
2. Download the two files at the given source URLs and save them at the specific corresponding paths in your working directory:
- Source: http://www.compciv.org/files/datadumps/apis/googlemaps/geocode-stanford.json
  Save as: tempdata/googlemaps/stanford.json
- Source: http://www.compciv.org/files/datadumps/apis/mapzen/search-stanford.json
  Save as: tempdata/mapzen/stanford.json
Here’s what your file tree should look like:
compciv-2016
└── exercises
    └── 0010-map-json-responses
        ├── a.py
        └── tempdata
            ├── googlemaps
            │   └── stanford.json
            └── mapzen
                └── stanford.json
3. Print the number of lines and characters found in each file's text content.
You should not have to import the json module for this, as we are not converting the text into data. The len() function is all you need.
The string object has a splitlines() method, which should make it easier to convert a text string into a list of strings (just in case you wanted to, you know, count the number of lines with len()).
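As a rough sketch of the counting step, here is len() and splitlines() applied to a short made-up string rather than to the downloaded files. The variable names are arbitrary.

```python
# A made-up three-line string standing in for a file's text content
sample_text = 'first line\nsecond line\nthird line\n'

lines = sample_text.splitlines()   # list of strings, one per line
line_count = len(lines)            # number of items in the list
char_count = len(sample_text)      # number of characters, newlines included

print('Wrote', line_count, 'lines and', char_count, 'characters')
```

Note that len() on the string counts the newline characters too, while splitlines() drops them from each line.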
Please take notice of where exactly the files are being saved to, i.e. in subdirectories of tempdata, not just in tempdata. And notice how the filenames differ from the names used on the website.
There’s no specific technical reason except that that’s the requirement for the exercise. Although the bigger picture is to prep you for the reality of a big data project, in which sometimes you have to name files whatever you feel like naming them, and it has nothing to do with the URL that they came from.
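One possible way to save text under a nested path of your own choosing is sketched below. The helper name save_text and the demo path are hypothetical, and the demo writes into a throwaway directory so it does not touch your real tempdata folder.

```python
import os
import tempfile

def save_text(text, dest_path):
    """Create dest_path's parent directories if needed, then write text there."""
    parent = os.path.dirname(dest_path)
    if parent:
        os.makedirs(parent, exist_ok=True)   # no error if it already exists
    with open(dest_path, 'w') as f:
        f.write(text)
    return dest_path

# Demo inside a temporary directory that is deleted afterwards
with tempfile.TemporaryDirectory() as base:
    demo_path = os.path.join(base, 'tempdata', 'example', 'renamed.json')
    save_text('{"status": "OK"}', demo_path)
    with open(demo_path, 'r') as f:
        roundtrip = f.read()

print(roundtrip)
```

The destination filename is just a string you pick; nothing ties it to the URL the text came from.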
When you run a.py from the command-line:
0010-map-json-responses $ python a.py
- The program's output to screen should be:
---
Downloading from: http://www.compciv.org/files/datadumps/apis/googlemaps/geocode-stanford.json
Writing to: tempdata/googlemaps/stanford.json
Wrote 59 lines and 1751 characters
---
Downloading from: http://www.compciv.org/files/datadumps/apis/mapzen/search-stanford.json
Writing to: tempdata/mapzen/stanford.json
Wrote 273 lines and 6826 characters
- The program creates this file path: tempdata/googlemaps/stanford.json
- The program creates this file path: tempdata/mapzen/stanford.json
- The program accesses this remote file: http://www.compciv.org/files/datadumps/apis/googlemaps/geocode-stanford.json
- The program accesses this remote file: http://www.compciv.org/files/datadumps/apis/mapzen/search-stanford.json
JSON text files, when opened and read, are just string objects. It’s not until we use the json.loads() function that anything special happens.
0010-map-json-responses/b.py » Deserialize the Google Maps geocoder's JSON file and read its status code
The Google Geocoding API, along with a set of results, returns metadata, including a top-level status object, so that the requesting program has an easy way to check if the API was able to fulfill the request.
For this exercise, simply print out the value that the status key points to.
This is a situation in which you can just look at the actual file and find the corresponding object/key-value pair. But please try to do this programmatically.
Here’s some sample code to open and read the file, and then deserialize it into a Python dictionary:
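import json
# the path where a.py saved the Google Maps geocoder response
MYFILENAME = 'tempdata/googlemaps/stanford.json'
# the with-block closes the file for us once the text has been read
with open(MYFILENAME, 'r') as f:
    txt = f.read()
mydict = json.loads(txt)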
Now you just have to print its status key.
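Why a top-level status is handy can be sketched like this. The two JSON strings are hypothetical, trimmed-down stand-ins for geocoder responses, and has_usable_results is an illustrative helper name, not part of any API. ZERO_RESULTS is used here on the assumption that it is the status Google reports when nothing matches.

```python
import json

# Hypothetical, heavily trimmed response texts
ok_text = '{"results": [{"formatted_address": "Somewhere, USA"}], "status": "OK"}'
empty_text = '{"results": [], "status": "ZERO_RESULTS"}'

def has_usable_results(json_text):
    """Return True only when the response reports a status of OK."""
    response = json.loads(json_text)
    return response.get('status') == 'OK'

print(has_usable_results(ok_text))
print(has_usable_results(empty_text))
```

A program can make this one check before it bothers to dig through the results list.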
When you run b.py from the command-line:
0010-map-json-responses $ python b.py
- The program's output to screen should be:
OK
0010-map-json-responses/c.py » Print the formatted address of every result returned in the Google Maps geocoder's JSON response
The response object has a results key, which is a list of result objects. For each object, print to screen the formatted_address value.
If you eyeball our specific Google Maps geocoder’s JSON response, you’ll notice that the response’s results list actually only contains one item. Even so, you should write your program as if there could be more than 1 result (or even none at all), because when you actually use an API, you won’t be taking the time to manually eyeball the dense JSON text it returns.
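A for-loop handles zero, one, or many results without any special cases, as this sketch suggests. The two response dictionaries are invented for illustration, and formatted_addresses is a hypothetical helper name.

```python
def formatted_addresses(response):
    """Collect formatted_address from every result; works for 0, 1, or many."""
    addresses = []
    for result in response.get('results', []):
        addresses.append(result['formatted_address'])
    return addresses

# Hypothetical deserialized responses
no_results = {'results': [], 'status': 'ZERO_RESULTS'}
two_results = {
    'results': [
        {'formatted_address': 'Springfield, IL, USA'},
        {'formatted_address': 'Springfield, MA, USA'},
    ],
    'status': 'OK',
}

for address in formatted_addresses(two_results):
    print(address)
print(len(formatted_addresses(no_results)))
```

With an empty list the loop body simply never runs, so nothing is printed and nothing breaks.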
When you run c.py from the command-line:
0010-map-json-responses $ python c.py
- The program's output to screen should be:
Stanford, CA, USA
0010-map-json-responses/d.py » Print the verbose address of every result in the Google Maps geocoder JSON file
For every result in the Google Maps geocoder JSON file, print the text composed of the long_name form of each of the result’s address_components, delimited by a semicolon and a space.
This is similar to the previous exercise, except that extracting the long_name part of each of the address_components is slightly maddeningly complicated. However, if you take some time to think over the details, you might notice how having each address component be its own dictionary, with more than just a long_name key-value pair, might be very useful when trying to determine if a given result falls within certain geopolitical boundaries.
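The sketch below shows one way to pull a single key out of a list of dictionaries and join the values, plus why the extra keys can be useful. The result dictionary is invented, and it assumes each component also carries short_name and types keys, as in Google's documented response format.

```python
# A hypothetical result, shaped like one item of the results list
result = {
    'address_components': [
        {'long_name': 'Springfield', 'short_name': 'Springfield',
         'types': ['locality', 'political']},
        {'long_name': 'Example County', 'short_name': 'Example County',
         'types': ['administrative_area_level_2', 'political']},
        {'long_name': 'United States', 'short_name': 'US',
         'types': ['country', 'political']},
    ]
}

# Gather the long_name of every component, then join with '; '
long_names = []
for component in result['address_components']:
    long_names.append(component['long_name'])
verbose = '; '.join(long_names)
print(verbose)

# Because each component is its own dict, it can be filtered by type
country_names = []
for component in result['address_components']:
    if 'country' in component['types']:
        country_names.append(component['long_name'])
print(country_names)
```

The second loop is the payoff of the verbose structure: finding the country is a lookup, not a guess based on position in the address string.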
When you run d.py from the command-line:
0010-map-json-responses $ python d.py
- The program's output to screen should be:
Stanford; Santa Clara County; California; United States
0010-map-json-responses/e.py » Print the formatted address and longitude and latitude in the Google Maps geocoder JSON file
For each result in the Google Maps geocoder JSON file, print a semicolon-delimited list of:
- the formatted_address value
- the lng value in the location object within the geometry object
- the lat value in the location object within the geometry object
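Reaching a value nested inside several dictionaries is just a chain of key lookups. Here is a small sketch using an invented result; the address and coordinates are not from the exercise file.

```python
# A hypothetical result with the nesting described above
result = {
    'formatted_address': 'Springfield, USA',
    'geometry': {
        'location': {'lat': 12.5, 'lng': -45.25}
    }
}

location = result['geometry']['location']   # two lookups, one after the other
fields = [result['formatted_address'], str(location['lng']), str(location['lat'])]
line = ';'.join(fields)                      # join() needs strings, hence str()
print(line)
```

The numbers have to be converted with str() before joining, because join() only accepts strings.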
When you run e.py from the command-line:
0010-map-json-responses $ python e.py
- The program's output to screen should be:
Stanford, CA, USA;-122.1660756;37.42410599999999
0010-map-json-responses/f.py » Print the parameters used to query the Mapzen Search API and the type of data it responded with, as found in its JSON response
Like the Google Maps Geocoder, the Mapzen Search API returns metadata along with a set of geocoded results.
Print out the type of result returned, according to the JSON file.
Then read the geocoding object, which has a query object, and from that query object, print the key-value pairs for the following keys, in this exact order:
- text
- size
- boundary.country
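A key such as boundary.country may look like attribute access, but as a dictionary key it is an ordinary string. The sketch below uses an invented query dictionary to show that, and to show how a list of keys fixes the printing order.

```python
# A hypothetical query object; the values are made up
query = {'size': 5, 'boundary.country': 'XYZ', 'text': 'Example'}

# The list, not the dictionary, decides the order of the output
wanted_keys = ['text', 'size', 'boundary.country']

lines = []
for key in wanted_keys:
    # the dot is part of the key string; no nested lookup is involved
    lines.append(key + ': ' + str(query[key]))

for line in lines:
    print(line)
```

Looping over your own list of keys gives the same order on every run, whatever order the keys appear in the JSON text.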
When you run f.py from the command-line:
0010-map-json-responses $ python f.py
- The program's output to screen should be:
type: FeatureCollection
text: Stanford
size: 10
boundary.country: USA
0010-map-json-responses/g.py » Print the location, confidence score, and coordinates for each result in the Mapzen Search JSON response
For each Feature-type object in the Mapzen Search JSON file, print out the following values in a semicolon-delimited list:
- label
- confidence
- longitude (as found in the Point object’s coordinates, within geometry)
- latitude (as found in the Point object’s coordinates, within geometry)

- The longitude is the first value inside each coordinates object.
- The latitude is the second value inside each coordinates object.
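The coordinate order described above can be sketched as follows. The feature is invented, and the sketch assumes that label and confidence sit inside a properties dictionary, which is how GeoJSON features usually carry such values; check the actual file to confirm.

```python
# A hypothetical Feature object, shaped like a GeoJSON feature
feature = {
    'type': 'Feature',
    'properties': {'label': 'Springfield, Example County, ZZ', 'confidence': 0.9},
    'geometry': {'type': 'Point', 'coordinates': [-45.25, 12.5]},
}

# coordinates is a two-item list: longitude first, latitude second
longitude, latitude = feature['geometry']['coordinates']

props = feature['properties']   # assumed location of label and confidence
values = [props['label'], props['confidence'], longitude, latitude]
line = ';'.join(str(v) for v in values)
print(line)
```

Unpacking the list into two named variables makes it harder to mix up which number is which later on.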
When you run g.py from the command-line:
0010-map-json-responses $ python g.py
- The program's output to screen should be:
Stanford, Santa Clara County, CA;0.949;-122.16608;37.42411
Stanford, Lincoln County, KY;0.945;-84.66189;37.53119
Stanford, Allin, IL;0.941;-89.21786;40.43476
Stanford, Judith Basin County, MT;0.94;-110.21826;47.15358
Stanford, Santa Clara County, CA;0.737;-122.167340615422;37.4251401412163
Stanford, Oakland County, MI;0.731;-83.1792681531045;42.6751206714193
Stanford, Clay County, IL;0.725;-88.4167904448051;38.6696905856532
Stanford, Isanti County, MN;0.725;-93.407649642371;45.4442114012486
Stanford, Dutchess County, NY;0.725;-73.6917318290657;41.8885062048017
Stanford, Lincoln County, KY;0.725;-84.6605602220764;37.5349694856098