I've created landing pages for your convenience:

COMM 113/213 - Winter 2016

Tuesday and Thursday, 1:30 to 2:50PM
Building 160, Room 322 [CourseExplorer link]

Getting in touch

Instructor: Dan Nguyen | @dancow | dun @ stanford
Office hours: Tuesdays and Thursdays, 3 to 6PM, or by appointment.
Slack Chat: stanfordcompciv.slack.com - Give this popular chat client a try if you haven't already. It might be an easier way to reach me and a much easier way for me to post code and so forth.

(The 2015 site is archived here)

About COMM 113/213

A winter elective on programming and journalism for the Stanford Computational Journalism Lab, taught by Dan Nguyen. The programming part involves modestly-written, simple programs which are then executed with simple brute force. The journalism part involves finding something important about the world.

This does not work as elegantly as we want, but once you've eliminated the boring, whatever remains, no matter how improbable, might be less boring.

You aren't expected to have much if any experience with programming. So we'll take advantage of the years of time you've spent reading and writing. You will be writing some code. And you will eventually be reading more code than you write.

Syllabus

Week 1: Base installation

An introduction to computers

We start out by being introduced to the modern personal computer (PC), and by installing a plain text editor and a text-based programming language for efficiently working with text. Then we learn how to use the web browser and its development tools to see how webpages of text are generated by sending strings of text (including cats, but described in text) between computers.

Reading: Who Controls Your Facebook Feed

Week 2: Text and Regular Expressions

Reading: Why do Nigerian Scammers Say They are from Nigeria?

Week 3: Python Lessons I - Conditional Statements and Control Flow

One of our goals in programming is to not have to be there when the program does its work. The concept of a "block" will have us writing code more deliberately, and almost exclusively in our text editors.
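Here is a minimal sketch of what a "block" looks like in Python. The headlines are made up for illustration. The indented lines under `for` and `if` each form a block, and Python runs them as a unit without anyone supervising each step.

```python
# Hypothetical sample data, made up for illustration
headlines = [
    "City council approves budget",
    "Police release shooting data",
    "Local cat wins award",
]

matches = []
for headline in headlines:
    # Everything indented under `for` is one block, run once per headline
    if "data" in headline.lower():
        # This nested block runs only when the condition above is true
        matches.append(headline)

print(matches)
```

Changing the indentation changes which block a line belongs to, which is one reason we write this kind of code in a text editor rather than typing it line by line.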

Week 4: Python Lessons II - Functions and Data Structures

Week 5: Standards and serializing text as data

Hard to believe, but data formats such as JSON were designed to make data efficient for both machines and humans to consume. If you can read JSON, you can basically do the kinds of interesting data mishmashes that make startups and apps seem magical. Airbnb is a startup that uses Facebook data, which you give it, combined with the data of its customers. Tinder does that too, probably in a much easier way. And Tinder also uses JSON under its hood, apparently.
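As a rough sketch of what "reading JSON" means in practice, the example below uses Python's built-in `json` module. The record is invented for illustration and does not come from any real service.

```python
import json

# A made-up record, written as JSON text (not from any real service)
raw_text = (
    '{"name": "Pat", "friends": ["Sam", "Alex"], '
    '"listings": [{"city": "Palo Alto", "price": 120}]}'
)

# json.loads turns the text into ordinary Python dicts and lists
record = json.loads(raw_text)
friend_count = len(record["friends"])
first_city = record["listings"][0]["city"]

# json.dumps goes the other way, here with indentation for human readers
pretty_text = json.dumps(record, indent=2, sort_keys=True)

print(friend_count, first_city)
print(pretty_text)
```

Once the text is a dict or a list, combining it with data from somewhere else is ordinary Python.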

Weeks 6 and 7: Application Programming Interfaces

Learning programming without being connected to other computers is like learning a foreign language from a book.

Learning programming through APIs is like learning a foreign language by visiting a foreign country with its permission, and sometimes its hospitality.

Something to think about:

APIs are a technical thing. But their existence, their design, and their availability reflect things about their owners and the data that they distribute.

Another way to put it: The Dallas Police Department has an API (via the Socrata portal) of police-involved shootings. Virtually no other police department does. Why?

Week 8: Web protocols

An overview of HTTP and the Python Requests library, and how the URL query string is used to define data resources, e.g. Google Static Maps and Google Street View.
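To make the query string idea concrete, here is a small sketch that builds a Google Static Maps URL using only the standard library, so nothing is actually sent over the network. The helper name `build_map_url` and the parameter values are my own choices for illustration, and a real request may also need an API key.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base endpoint for Google Static Maps; the parameter values below are illustrative
BASE_URL = "https://maps.googleapis.com/maps/api/staticmap"


def build_map_url(center, zoom=14, size="600x400"):
    """Return a URL whose query string describes the map we want (hypothetical helper)."""
    params = {"center": center, "zoom": zoom, "size": size}
    # urlencode escapes spaces and punctuation so the text is safe inside a URL
    return BASE_URL + "?" + urlencode(params)


url = build_map_url("Stanford, CA")
print(url)

# Going the other way: pull the query string apart again
query = parse_qs(urlsplit(url).query)
print(query)
```

With the Requests library, passing the same dictionary as `requests.get(BASE_URL, params=params)` builds this kind of URL for you and then fetches it.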

Week 9: Web publishing and command-line applications

HTML is a form of structured text and provides a convenient way for us to build user-facing data files and analysis. Designing command-line applications allows us to build scripts that provide more direct interaction with data functions.
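One possible shape for such a script, as a sketch only: a tiny command-line program that counts words and prints the result as an HTML table. The names `top_words_html` and `main` are hypothetical, and the last line passes arguments by hand so the example runs as written.

```python
import argparse
from collections import Counter
from html import escape


def top_words_html(text, limit=3):
    """Count words in text and return the most common ones as an HTML table."""
    counts = Counter(text.lower().split())
    rows = "".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(escape(word), n)
        for word, n in counts.most_common(limit)
    )
    return "<table>" + rows + "</table>"


def main(argv=None):
    """Parse command-line arguments and print the HTML table."""
    parser = argparse.ArgumentParser(description="Print the top words in some text as HTML")
    parser.add_argument("text", help="the text to analyze")
    parser.add_argument("--limit", type=int, default=3, help="how many words to show")
    args = parser.parse_args(argv)
    result = top_words_html(args.text, args.limit)
    print(result)
    return result


# Arguments are passed explicitly here for demonstration;
# calling main() with no arguments would read them from the command line instead
main(["the cat saw the dog", "--limit", "1"])
```

Keeping the data function (`top_words_html`) separate from the argument handling (`main`) means the same function can be reused from another script or a web page generator.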

Week 10: Project week

This week's class time is reserved for working on projects.

Grading

Attendance policy

If you need to miss class, give notice several days in advance. If necessary, we can arrange for you to do a short-term project.

Things to do (for Thursday, January 7)

I want you to install some things. Then ask me questions. There's not much else to do, so don't do anything you don't know how to do.

Learn how to switch applications with your keyboard

Email me (dun at stanford.edu) at least once

Install Anaconda 3

You may already have Python on your system. If you really, really know what you're doing (i.e. you used pyenv to set things up, or you think you could manage that, in which case you can ask me for help), then you can ignore this part.

Otherwise, this is like installing any other program on your computer, though keep in mind the file size is quite hefty. Email me if you have hard drive space limitations, e.g. fewer than a few gigabytes free.

The instructions are here. However, to make things consistent, don't download the most recent file. Download the most appropriate installer from the archive:

If you are unsure of anything, just email me.

Other Python things

Let's wait till Thursday before we try to install other packages. Although if you really know what you're doing, you can try to look up and install, in this order:

If you can do this, then you can try out the face detection script, which will be the most complicated kind of script we can copy from and execute. It's actually not that important; it's just an example of what we can do with just text.

Download and install Sublime Text 3

Sublime Text 3 is a plaintext editor. It will be your primary, and possibly only, tool for writing and organizing programs.

Join and add something to GitHub

Do some reading

Always bet on text by Graydon Hoare:

Text is the most socially useful communication technology. It works well in 1:1, 1:N, and M:N modes. It can be indexed and searched efficiently, even by hand. It can be translated. It can be produced and consumed at variable speeds. It is asynchronous. It can be compared, diffed, clustered, corrected, summarized and filtered algorithmically. It permits multiparty editing. It permits branching conversations, lurking, annotation, quoting, reviewing, summarizing, structured responses, exegesis, even fan fic. The breadth, scale and depth of ways people use text is unmatched by anything. There is no equivalent in any other communication technology for the social, communicative, cognitive and reflective complexity of a library full of books or an internet full of postings. Nothing else comes close.

So this is my stance on text: always pick text first.

We'll actually be reading a lot about Facebook over the course, so including it here is a bit of overkill. Though this recent development (as of yesterday) is worth talking about. Most of the news about Facebook's research concerns their data science, but check out their publications page for a long list of scientific papers. Expect to be able to find the bigger picture, even if the math escapes you.

Unicode: A story of corruption, connection, and smiling poo

Unicode is traditionally something programmers hate, because of the bugs it causes in programs that read text (i.e. basically all of them). It took me a while to realize that Unicode is one of the most amazing creations of

Aren't Nigerian Prince Scams such an Obvious Scam?

Yeah, those graphs seem intimidating…I don't know if I could easily explain them in English. But try to pick out the "bigger picture" reason – what is it about email, the cost of email, the number of scammers vs. number of victims, and most importantly, the type of victims that makes, "Hey, I'm a prince from an exotic place, send me money" seemingly effective?

Related:


Example Text problems

Text is a valuable programming interface for communicating with computers. It can lead to many fundamental calculations and algorithmic computations, including: