Most Web APIs require you to pass in configuration values via a URL query string. Creating these strings is a matter of reading the API’s documentation, and then either doing the mind-numbing work of manually creating the query strings or using Python’s urllib.parse module to do it for you.
A typical URL looks very much like a system file path, e.g.
http://www.example.com/index.html
A query string is a convention for appending key-value pairs to a URL. The standard URL for the New York Times's website's New York section is this:
http://www.nytimes.com/section/nyregion
However, if you click on the New York tab via the nytimes.com homepage, you'll notice that a whole bunch of characters are appended to the URL:
http://www.nytimes.com/section/nyregion?action=click&pgtype=Homepage
The question mark ? denotes the separation between the standard URL and the query string. Everything after that is a series of key-value pairs, with each pair separated by an ampersand, &. The equals sign = is used to separate key and value.
So the key-value pairs in the above query string are:
key | value |
---|---|
action | click |
pgtype | Homepage |
Or, more to our purposes, this is what those key-value pairs would look like as a Python dictionary:
{
'action': 'click',
'pgtype': 'Homepage'
}
What do those actually do? That's actually a question we can't answer, unless we're running the nytimes.com servers. Though it's safe to assume that the NYT uses the query string in its analytics, so that it can tell how many people visited http://www.nytimes.com/section/nyregion
via the homepage and by clicking on some button.
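Going the other direction, from a URL back to a dictionary, is something the standard library can do too. Here is a minimal sketch, assuming we only care about the path and the query string of the NYT URL above; urlsplit and parse_qsl both come from urllib.parse:

```python
from urllib.parse import urlsplit, parse_qsl

url = "http://www.nytimes.com/section/nyregion?action=click&pgtype=Homepage"

# urlsplit separates the URL into components; .query is everything after the "?"
parts = urlsplit(url)
print(parts.path)   # /section/nyregion
print(parts.query)  # action=click&pgtype=Homepage

# parse_qsl turns the query string into a list of (key, value) tuples,
# which dict() converts into the dictionary form shown above
pairs = dict(parse_qsl(parts.query))
print(pairs)        # {'action': 'click', 'pgtype': 'Homepage'}
```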
Other web services have query strings that serve a more obvious purpose. For example, DuckDuckGo, which has this URL endpoint:
https://www.duckduckgo.com/
However, if we append a key-value pair as a query string, with q being the key (think of it as an abbreviation for "query") and the value being the term we want to search for, e.g. Stanford, then DuckDuckGo will return search results for Stanford:
https://www.duckduckgo.com/?q=Stanford
It's hard to tell that a URL is being built behind the scenes these days, because modern web browsers allow us to type literally anything into the address bar. Unless we prepend the input with http://, the text is just sent as is to Google or DuckDuckGo or whatever your default search engine is.
However, this is just a convenient illusion. Say we type something with a whitespace character into the browser's address bar, e.g.
Stanford University
Before sending it to the search engine, the web browser will actually serialize it as:
Stanford%20University
This is because whitespace characters are not allowed in URLs, so the token %20 is used to represent a space. Basically, almost everything that is not an alphanumeric character needs to have this special encoding.
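Where do tokens like %20 come from? Each one is a percent sign followed by the hexadecimal value of a byte in the character's UTF-8 encoding. The sketch below illustrates this; percent_encode_char is a hypothetical helper written just for this example, not part of any library:

```python
def percent_encode_char(ch):
    """Hypothetical helper: percent-encode one character via its UTF-8 bytes."""
    return ''.join('%{:02X}'.format(b) for b in ch.encode('utf-8'))

# a space is byte 0x20, a comma is 0x2C, and so on;
# a non-ASCII character such as é takes two bytes, hence two tokens
for ch in [' ', ',', '!', '|', 'é']:
    print(repr(ch), '->', percent_encode_char(ch))
```

This is only meant to demystify the notation; it encodes every character it is given, including ones that are perfectly legal in a URL.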
In the olden days, you'd have to remember how to do these encodings or else the browser would throw you an error. Now, the browser just fixes it for you, not unlike an auto spellchecker.
Of course, when programming in Python, things still work like the olden days, i.e. we're forced to be explicit.
Here's what happens when you use the urlretrieve function that comes via Python's built-in module urllib.request:
from urllib.request import urlretrieve
thing = urlretrieve("https://www.duckduckgo.com/?q=Stanford University")
An error is raised (the exact exception can vary by Python version):
HTTPError: HTTP Error 400: Bad Request
We have to throw in the %20 ourselves to avoid the error:
thing = urlretrieve("https://www.duckduckgo.com/?q=Stanford%20University")
Trying to remember which characters are invalid, never mind manually escaping them with percent signs, is a maddening task. That's why there's a built-in Python module, urllib.parse, that contains an appropriate function: quote.
Try it out via interactive Python. Note that quote doesn't actually do any URL requesting itself; it is a function that does one thing and does it well: making strings safe for URLs:
>>> from urllib.parse import quote
>>> quote("Stanford University")
'Stanford%20University'
>>> quote("I go to: Stanford University, California!")
'I%20go%20to%3A%20Stanford%20University%2C%20California%21'
In combination with the previously-tried urlretrieve method:
from urllib.request import urlretrieve
from urllib.parse import quote
qstr = quote("Stanford University")
thing = urlretrieve("https://www.duckduckgo.com/?q=" + qstr)
In the previous example, having to type out that q= should also seem tedious to you. Once again, urllib.parse has a function for that: urlencode.
Try it out in interactive Python:
>>> from urllib.parse import urlencode
>>> mydict = {'q': 'Stanford University, whee!!!'}
>>> urlencode(mydict)
'q=Stanford+University%2C+whee%21%21%21'
Note that the urlencode function includes the functionality of the quote function, so you will rarely need to call quote on its own. Also note that urlencode uses the plus sign to encode a space character, which within a query string is basically as valid as using %20. Again, the confusing rules and standards are yet another reason to delegate this string handling to the proper Python libraries.
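To see the plus-versus-%20 difference side by side, here is a small comparison. It uses quote_plus (the sibling of quote that urlencode applies by default) and the quote_via argument of urlencode, both from urllib.parse; the sample text is the same one used above:

```python
from urllib.parse import quote, quote_plus, urlencode

text = 'Stanford University, whee!!!'

# quote encodes a space as %20
print(quote(text))       # Stanford%20University%2C%20whee%21%21%21

# quote_plus encodes a space as +
print(quote_plus(text))  # Stanford+University%2C+whee%21%21%21

# urlencode uses quote_plus unless told otherwise
print(urlencode({'q': text}))                   # q=Stanford+University%2C+whee%21%21%21
print(urlencode({'q': text}, quote_via=quote))  # q=Stanford%20University%2C%20whee%21%21%21
```

If a particular API turns out to be picky about plus signs, passing quote_via=quote is one way to get %20 without going back to manual string building.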
And, again, urlencode does not actually fetch the URL. We still have to use urlretrieve:
from urllib.request import urlretrieve
from urllib.parse import urlencode
mydict = {'q': 'whee! Stanford!!!', 'something': 'else'}
qstr = urlencode(mydict)
# qstr resolves to: 'q=whee%21+Stanford%21%21%21&something=else'
thing = urlretrieve("https://www.duckduckgo.com/?" + qstr)
Note that we also have to include the ?, which is always used to set off the query string from the first part of the URL.
Also note that programmatically fetching search queries via DuckDuckGo (or Google, for that matter) is not very effective. I just use it as an example so that you can see what the URL turns out to be and test it in your browser.
What about the Requests library, which we've been using to fetch URLs for the most part? Well, true to its slogan of being "HTTP for Humans", the Requests library neatly wraps up all that urllib.parse functionality for us.
Just use the requests.get function with a second argument (the name of the argument is params):
import requests
url_endpoint = 'https://www.duckduckgo.com'
mydict = {'q': 'whee! Stanford!!!', 'something': 'else'}
resp = requests.get(url_endpoint, params=mydict)
Let's work with a more fun, visual API: the Google Static Maps API
(For more information on Google Static Maps API, check out: Visualizing Geopolitical Sensitivities with the Google Static Maps API)
As with most APIs, Google Static Maps starts out with a URL endpoint:
https://maps.googleapis.com/maps/api/staticmap
At a minimum, it requires a size parameter, with a value in the format of WIDTHxHEIGHT:
https://maps.googleapis.com/maps/api/staticmap?size=600x400
Here's what that map looks like:
Let's add another parameter: zoom
https://maps.googleapis.com/maps/api/staticmap?size=600x400&zoom=4
And let's change where the map is centered around with the center parameter, which takes any string that describes a human-readable location:
https://maps.googleapis.com/maps/api/staticmap?size=600x400&zoom=8&center=Chicago
Or, we can pass in a latitude/longitude pair:
https://maps.googleapis.com/maps/api/staticmap?size=600x400&zoom=8&center=42,-70
Maybe you're thinking: this is easy, handcoding the URL parameters. Let's do something tricky. The markers parameter lets us add markers to the map. It takes a location string (just like center):
https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=Stanford,CA
However, the API allows for the marking of multiple points (hence the plural parameter name, "markers"). The usual convention for URL query strings when multiple values have the same key is to repeat the key. In other words, to show markers for both Stanford,CA and Chicago, we include this in the query string:
markers=Stanford,CA&markers=Chicago
e.g.
https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=Stanford,CA&markers=Chicago
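The repeated-key convention can be checked from Python. As a rough sketch: urlencode accepts a list of (key, value) tuples, which lets the same key appear more than once, and parse_qs collects repeated keys back into lists. The parameter values are the ones from the example above:

```python
from urllib.parse import urlencode, parse_qs

# a list of tuples (rather than a dict) allows the 'markers' key to repeat
pairs = [('size', '600x400'), ('markers', 'Stanford,CA'), ('markers', 'Chicago')]
query_string = urlencode(pairs)
print(query_string)  # size=600x400&markers=Stanford%2CCA&markers=Chicago

# parsing it back groups the repeated key's values into one list
print(parse_qs(query_string))
# {'size': ['600x400'], 'markers': ['Stanford,CA', 'Chicago']}
```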
Still seem simple? Let's try to add different colors. And icons. According to the documentation on markers, styling a marker involves setting "a series of value assignments separated by the pipe ( | ) character":
Because both style information and location information is delimited via the pipe character, style information must appear first in any marker descriptor. Once the Google Static Maps API server encounters a location in the marker descriptor, all other marker parameters are assumed to be locations as well.
By default, the map markers are red. To make a marker green, this is what the markers value is set to:
markers=color:green|Chicago
e.g.
https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=color:green|Chicago
We can also change the icon's size. And give it a letter. Here's the value for a blue icon for Chicago, with a label consisting of the letter "X":
markers=color:blue|label:X|Chicago
e.g.
https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=color:blue|label:X|Chicago
Technically, while the URL examples above will work in a modern browser, pipe characters aren't allowed in URLs. They should be escaped with %7C:
markers=color:blue%7Clabel:X%7CChicago
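That escaped value doesn't have to be typed by hand. A minimal sketch, assuming we want to leave the colons alone and escape only what must be escaped: the safe argument of urllib.parse.quote lists characters that should pass through unchanged.

```python
from urllib.parse import quote, unquote

marker = 'color:blue|label:X|Chicago'

# safe=':' keeps the colons readable; the pipes become %7C
escaped = quote(marker, safe=':')
print('markers=' + escaped)  # markers=color:blue%7Clabel:X%7CChicago

# unquote reverses the escaping
print(unquote(escaped))      # color:blue|label:X|Chicago
```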
If that's not convoluted/ugly enough for you, the Google Maps API lets you use custom icons. Here's how to make a marker using President Obama's face (found at the following URL):
http://www.compciv.org/files/images/randoms/obamaicon.png
The corresponding markers value:
markers=icon:http://www.compciv.org/files/images/randoms/obamaicon.png
However, as you can imagine, some of those characters in the URL are not allowed as is in the query string. Think about it; we're putting a URL inside another URL…how would a primitive browser easily parse that? The URL example above is pretty simple, but since URLs can contain any number of strange characters, we should assume many of those characters will have to be encoded in that percent-fashion. From the Google API's explanation:
Note: The multiple levels of escaping above may be confusing. The Google Chart API uses the pipe character (|) to delimit strings within its URL parameters. Since this character is not legal within a URL (see the note above), when creating a fully valid chart URL it is escaped to %7C. Now the result is embedded as a string in an icon specification, but it contains a % character (from the %7C mentioned above), which cannot be included directly as data in a URL and must be escaped to %25. The result is that the URL contains %257C, which represents two layers of encoding. Similarly, the chart URL contains an & character, which cannot be included directly without being confused as a separator for Google Static Maps API URL parameters, so it too must be encoded.
This is what the URL ends up being:
https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=icon:http%3A//www.compciv.org/files/images/randoms/obamaicon.png%7CChicago
And here's Obama's head, floating over Chicago:
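The "two layers of encoding" described in Google's note can be reproduced with quote, and the icon marker itself can be handed to urlencode. This is only a sketch: urlencode escapes more characters (the slashes, for instance) than the hand-written URL above does, so its output will not match that URL character for character, though it decodes to the same value.

```python
from urllib.parse import quote, unquote, urlencode, parse_qs

# one layer: the pipe becomes %7C
once = quote('|', safe='')
# second layer: the % in %7C becomes %25, giving %257C
twice = quote(once, safe='')
print(once, twice)              # %7C %257C
print(unquote(unquote(twice)))  # |

# a URL nested inside a query string value, handled by urlencode
icon_marker = 'icon:http://www.compciv.org/files/images/randoms/obamaicon.png|Chicago'
query_string = urlencode({'size': '600x400', 'markers': icon_marker})
print(query_string)

# decoding recovers the original marker value
print(parse_qs(query_string)['markers'][0] == icon_marker)  # True
```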
OK, now that we've seen how to do things the painful and old-fashioned way, let's use the urllib.parse.urlencode function to do the tedious work of creating the URL query strings.
First, let's import urlencode and set up the constants, i.e. the variables that won't change:
from urllib.parse import urlencode
import webbrowser
GMAPS_URL = 'https://maps.googleapis.com/maps/api/staticmap?'
Importing webbrowser is optional, but it allows you to conveniently test out the URLs from your Python code:
import webbrowser
url = "https://maps.googleapis.com/maps/api/staticmap?size=600x400"
webbrowser.open(url)
(You should be doing this in interactive Python)
Here we go.
Only the size parameter needs to be specified:
mydict = {'size': '600x400'}
url = GMAPS_URL + urlencode(mydict)
# https://maps.googleapis.com/maps/api/staticmap?size=600x400
The zoom parameter is simply another key-value pair in the dictionary:
mydict = {'size': '600x400', 'zoom': 4}
url = GMAPS_URL + urlencode(mydict)
# https://maps.googleapis.com/maps/api/staticmap?size=600x400&zoom=4
Again, center is just another key-value pair:
mydict = {'size': '600x400', 'zoom': 9, 'center': 'Stanford, California'}
url = GMAPS_URL + urlencode(mydict)
# https://maps.googleapis.com/maps/api/staticmap?size=600x400&zoom=9&center=Stanford%2C+California
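String concatenation works fine here, but it depends on remembering whether the base URL already ends with a question mark. As an alternative sketch, the URL can be assembled from its parts with urlsplit and urlunsplit; with_query is a hypothetical helper name introduced only for this example:

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

def with_query(base_url, params):
    """Hypothetical helper: return base_url with params as its query string."""
    parts = urlsplit(base_url)
    # _replace swaps in the new query; urlunsplit adds the "?" for us
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))

url = with_query('https://maps.googleapis.com/maps/api/staticmap',
                 {'size': '600x400', 'zoom': 9, 'center': 'Stanford, California'})
print(url)
# https://maps.googleapis.com/maps/api/staticmap?size=600x400&zoom=9&center=Stanford%2C+California
```

Note that this sketch replaces any query string already present on the base URL rather than merging with it.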
markers is just another key-value pair when we're adding a single marker:
(note that I've removed the center param; the Google Static Maps API is smart enough to auto-center the map around the marker)
mydict = {'size': '600x400', 'zoom': 14,
          'markers': 'CoHo Café at Stanford'}
url = GMAPS_URL + urlencode(mydict)
# https://maps.googleapis.com/maps/api/staticmap?size=600x400&zoom=14&markers=CoHo+Caf%C3%A9+at+Stanford
OK, adding multiple markers is where things get slightly complicated. We can represent a list of location strings by using, well, a list of strings:
locations = ['Stanford, CA', 'Berkeley, CA']
However, we must call urlencode with the doseq argument set to True (try omitting the argument on your own to see what results):
locations = ['Stanford, CA', 'Berkeley, CA']
mydict = {'size': '600x400', 'markers': locations}
url = GMAPS_URL + urlencode(mydict, doseq=True)
# https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=Stanford%2C+CA&markers=Berkeley%2C+CA
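For the curious, here is roughly what happens when doseq is left out. Without it, urlencode converts the whole list to its string representation (brackets, quotes and all) and encodes that as a single value, which is not what the Maps API expects:

```python
from urllib.parse import urlencode

locations = ['Stanford, CA', 'Berkeley, CA']

# default (doseq=False): the list is stringified, then encoded as one value
without_doseq = urlencode({'markers': locations})
print(without_doseq)  # starts with markers=%5B, i.e. an encoded "["

# doseq=True: one key=value pair per list element
with_doseq = urlencode({'markers': locations}, doseq=True)
print(with_doseq)     # markers=Stanford%2C+CA&markers=Berkeley%2C+CA
```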
Just for fun, serialize a bunch of locations into a Google Static Maps URL:
pac12 = ['University of Arizona, AZ',
'Arizona State University, AZ',
'University of California, Berkeley, CA',
'University of California, Los Angeles, CA',
'University of Colorado Boulder, CO',
'University of Oregon, OR',
'Oregon State University, OR',
'University of Southern California, CA',
'Stanford University, CA',
'University of Utah, UT',
'University of Washington, WA',
'Washington State University, WA']
mydict = {
'size': '600x400','markers': pac12
}
url = GMAPS_URL + urlencode(mydict, doseq=True)
The resulting image and URL:
It's not a lot of fun creating lists by hand. So let's use a list from an official government source: the USGS Earthquake Hazards program.
The example below is a demonstration of the csv module and its DictReader class, which can be used to create a list of dictionaries from a CSV file.
import csv
import requests
from urllib.parse import urlencode
import webbrowser
USGS_URL = 'http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/significant_month.csv'
resp = requests.get(USGS_URL)
lines = resp.text.splitlines()
earthquakes = csv.DictReader(lines)
coordinate_pairs = []
for quake in earthquakes:
    coordinate_pairs.append(quake['latitude'] + ',' + quake['longitude'])
# create a URL based on Google Static Maps API specs
GMAPS_URL = 'https://maps.googleapis.com/maps/api/staticmap'
query_string = urlencode(
{'size': '800x500', 'markers': coordinate_pairs},
doseq=True)
url = GMAPS_URL + '?' + query_string
webbrowser.open(url)
Here's the resulting URL (as of February 9th, 2016):
Here's a slightly more cleaned-up version of the code that uses the Requests library to "prepare" a URL:
# slightly more Pythonic, cleaner version
from csv import DictReader
import requests
import webbrowser
USGS_URL = 'http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/significant_month.csv'
GMAPS_URL = 'https://maps.googleapis.com/maps/api/staticmap'
# get the USGS data, create a list of lines
lines = requests.get(USGS_URL).text.splitlines()
# get the latitude/longitude pairs
coordinate_pairs = ["%s,%s" % (q['latitude'], q['longitude']) for q in DictReader(lines)]
# this is another way of serializing the URL
preq = requests.PreparedRequest()
preq.prepare_url(GMAPS_URL, {'size':'800x500', 'markers': coordinate_pairs})
webbrowser.open(preq.url)
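To see what DictReader contributes without touching the network, here is an offline sketch. The CSV rows below are made up for illustration (they are not real USGS data); only the latitude and longitude column names are taken from the code above:

```python
import csv
from urllib.parse import urlencode

# invented sample data standing in for resp.text.splitlines()
sample_lines = [
    'time,latitude,longitude,mag',
    '2016-01-01T00:00:00Z,10.5,-20.25,6.1',
    '2016-01-02T00:00:00Z,-33.0,150.75,5.8',
]

# DictReader uses the first line as the keys for each subsequent row
rows = list(csv.DictReader(sample_lines))
print(rows[0]['latitude'], rows[0]['longitude'])  # 10.5 -20.25

coordinate_pairs = ['{},{}'.format(r['latitude'], r['longitude']) for r in rows]
query_string = urlencode({'size': '800x500', 'markers': coordinate_pairs}, doseq=True)
print(query_string)
# size=800x500&markers=10.5%2C-20.25&markers=-33.0%2C150.75
```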
Before we get into the tricky business of styling the markers, let's wrap up the functionality that we've been using to create a proper Google Static Maps URL into a function:
(assuming you've read the short guide on making functions: Function fundamentals in Python)
Here's a bare-bones implementation, in which a user only has to specify a list of locations (or just a string, if there's only one location) and, optionally, width and height. The foogoo_url function does the work of serializing the input into proper Google Static Maps API format.
At the end, it returns the URL as a string object:
def foogoo_url(locations, width = 600, height = 400):
    from urllib.parse import urlencode
    gmap_url = 'https://maps.googleapis.com/maps/api/staticmap'
    size_str = str(width) + 'x' + str(height)
    query_str = urlencode({'size': size_str, 'markers': locations}, doseq=True)
    return gmap_url + '?' + query_str
Here's how you would use it, interactively:
>>> foogoo_url('New York, NY', height='200')
'https://maps.googleapis.com/maps/api/staticmap?size=600x200&markers=New+York%2C+NY'
>>> foogoo_url(['Wyoming', 'Alaska'])
'https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=Wyoming&markers=Alaska'
Testing out those URLs by pasting them into the web browser is time-consuming. So let's make another function. This one doesn't return anything. Instead, it takes the same parameters as foogoo_url, passes them directly into foogoo_url, and then passes the result of that into webbrowser.open, which opens the URL in a web browser:
(note that this definition assumes that foogoo_url has been defined earlier)
def foogoo_browse(locations, width=600, height=400):
    import webbrowser
    url = foogoo_url(locations, width, height)
    webbrowser.open(url)
(Note: This section ends up veering out of plain URL creation and into application and function design…I end up not quite finishing it…)
This is where it gets tricky. As a reminder, markers takes a pipe-delimited string to separate the style configuration, e.g.
markers=color:green|label:Z|Chicago
However, that's a convention of Google's own making, because they needed a way to do key-value pairs (e.g. label = Z) that is independent of (or rather, nested within) the way that key-value pairs are done in URL query strings.
So basically, we have to hand-create the string ourselves:
marker_styles = {
'color': 'orange',
'label': 'X'
}
marker_config = []
for k, v in marker_styles.items():
    s = str(k) + ':' + str(v)
    # e.g. 'color' + ':' + 'orange'
    marker_config.append(s)
# the location always comes last, as specified by the Google API
marker_config.append('Chicago')
# now join the list elements together as a pipe-delimited string
marker_string = '|'.join(marker_config)
The code above leaves the marker_string variable pointing to a string like this:
color:orange|label:X|Chicago
Which we can then pass into the locations argument of our previously defined foogoo_url function:
foogoo_url(marker_string)
Which generates a URL like this, with the colons and pipes percent-encoded by urlencode:
https://maps.googleapis.com/maps/api/staticmap?size=600x400&markers=color%3Aorange%7Clabel%3AX%7CChicago
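It's worth being clear about where the percent-encoding happens: the pipe-delimited string is built in plain form, and the escaping is applied only when the string passes through urlencode. A small round-trip sketch, using the same marker value as above:

```python
from urllib.parse import urlencode, parse_qs

marker_string = 'color:orange|label:X|Chicago'

# encoding happens here, not when the marker string is joined together
query_string = urlencode({'size': '600x400', 'markers': marker_string})
print(query_string)  # size=600x400&markers=color%3Aorange%7Clabel%3AX%7CChicago

# decoding recovers the plain pipe-delimited value
decoded = parse_qs(query_string)
print(decoded['markers'][0])  # color:orange|label:X|Chicago
```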
Defining the marker style certainly got complicated. It's so complicated that it probably deserves its own function.
I'm going to skip the full explanation, and I won't bother creating what I consider to be the best real-world implementation of a create_styled_marker function. We can cover it in another lesson, but the main takeaway is: look at how we can use functions and Python data structures, such as lists and dictionaries, to create text strings useful for communicating with other services.
Without further elaboration, here's how to turn the previous icon mapping code into a reusable function:
def create_styled_marker(location, style_options={}):
    mconfig = []
    for k, v in style_options.items():
        mconfig.append(str(k) + ':' + str(v))
    mconfig.append(location)
    return '|'.join(mconfig)
Note: if lists and dictionaries are old hat to you and you understand list comprehensions, as well as string formatting, here's a fancy pants Pythony-version:
def create_styled_marker(location, style_options={}):
    opts = ["%s:%s" % (k, v) for k, v in style_options.items()]
    opts.append(location)
    return '|'.join(opts)
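A quick way to check the function is to call it with and without styles. The sketch below repeats the definition so that it runs on its own; using None rather than {} as the default argument is this sketch's own variation (a common Python habit for avoiding shared mutable defaults), not something the original requires:

```python
def create_styled_marker(location, style_options=None):
    # None as the default is this sketch's choice; behavior is otherwise the same
    style_options = style_options or {}
    opts = ['{}:{}'.format(k, v) for k, v in style_options.items()]
    # the location always goes last, after any style assignments
    opts.append(location)
    return '|'.join(opts)

plain = create_styled_marker('Chicago')
styled = create_styled_marker('Chicago', {'color': 'green', 'label': 'C'})
print(plain)   # Chicago
print(styled)  # color:green|label:C|Chicago
```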
Here's all the relevant code to create a quickie-let's-make-a-Google-Static-Maps-API convenience wrapper, as one big script:
Note that I've drastically modified foogoo_url from the previous demonstration. See if you can untangle the reasoning, but it's not worth explaining in full since this isn't a lesson on application design.
from urllib.parse import urlencode
import webbrowser
def create_styled_marker(location, style_options={}):
    opts = ["%s:%s" % (k, v) for k, v in style_options.items()]
    opts.append(location)
    return '|'.join(opts)
def foogoo_url(locations, width = 600, height = 400, maptype='terrain'):
    gmap_url = 'https://maps.googleapis.com/maps/api/staticmap'
    size_str = str(width) + 'x' + str(height)
    # note: this is messy, and it has more to do with opinions about interface
    # design than the core lesson...
    if type(locations) is not list:
        markers_objects = locations
    else:
        markers_objects = []
        for loc in locations:
            if type(loc) is str:
                obj = create_styled_marker(loc)
            else:
                obj = create_styled_marker(loc[0], loc[1])
            markers_objects.append(obj)
    # finally, make the query string
    mapopts = {'size': size_str,
               'markers': markers_objects,
               'maptype': maptype}
    query_str = urlencode(mapopts, doseq=True)
    return gmap_url + '?' + query_str
def foogoo_browse(locations, width=600, height=400):
    import webbrowser
    url = foogoo_url(locations, width, height)
    webbrowser.open(url)
And when the functions are defined and loaded into the interpreter, this is how we call the functions:
foogoo_browse("Stanford University, California")
# a list of strings
foogoo_browse(['Stanford, CA', 'Chicago, IL'])
# multiple kinds of objects
mylist = []
mylist.append(['Stanford, CA', {'color': 'red'}])
mylist.append(['Berkeley, CA', {'color': 'yellow', 'size': 'small'}])
foogoo_browse(mylist)