Originally published June 12, 2013; formatting updated
A bit of background about why I found this article so so so funny…
Yesterday I had occasion to learn a bit about Joseph Warren, who, among other things, helped craft the Suffolk Resolves of 1774, which denounced Parliament after the Boston Tea Party. Paul Revere carried the Suffolk Resolves to the Continental Congress in Philadelphia.
And, as longtime readers of the blog (and, I guess, now just about every civil servant) know, I like data. So this cracked me up.
The larger the score, the more connected a node is in the network, in this case a “Social Networke” of suspected terrorists. Note that Warren is third most connected. Here’s how the author of this really really really funny serious article describes it:
Once again, I remind you that I know nothing of Mr Revere, or his conversations, or his habits or beliefs, his writings (if he has any) or his personal life. All I know is this bit of metadata, based on membership in some organizations. And yet my analytical engine, on the basis of absolutely the most elementary of operations in Social Networke Analysis, seems to have picked him out of our 254 names as being of unusual interest. We do not have to stop here, with just a picture. Now that we have used our simple “Person by Event” table to generate a “Person by Person” matrix, we can do things like calculate centrality scores, or figure out whether there are cliques, or investigate other patterns. For example, we could calculate a betweenness centrality measure for everyone in our matrix, which is roughly the number of “shortest paths” between any two people in our network that pass through the person of interest. It is a way of asking “If I have to get from person a to person z, how likely is it that the quickest way is through person x?”
That’s pulled from near the end of the article. Don’t be put off by talk of tables and matrices. The author, Kieran Healy, does an awesome job of walking even the data-averse through an analysis of a sample of metadata (nothing more than the names of 254 men and their membership in one or more of seven organizations, that’s all) to uncover a treasure trove of information that culminates in a most interesting person of unusual interest.
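If you're curious what the “Person by Event” to “Person by Person” step and the betweenness score look like in practice, here's a minimal sketch in plain Python. To be clear, this is not Healy's code and not Fischer's data: the five names and two clubs are made up, the function names are mine, and the betweenness calculation treats every link as equal (an unweighted simplification).

```python
from collections import deque

# Made-up toy data (NOT the real membership lists): person -> set of groups.
memberships = {
    "Ann": {"ClubA"},
    "Bob": {"ClubA"},
    "Cat": {"ClubA", "ClubB"},
    "Dan": {"ClubB"},
    "Eve": {"ClubB"},
}

def co_membership(memberships):
    """Person-by-person table: how many groups each pair of people shares."""
    people = sorted(memberships)
    return {
        p: {q: len(memberships[p] & memberships[q]) for q in people if q != p}
        for p in people
    }

def betweenness(adj):
    """Unweighted betweenness: shortest paths between other pairs that pass
    through each person, with ties split evenly (Brandes-style counting)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths to each node.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, passing credit to predecessors.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Links have no direction, so every pair was counted twice.
    return {v: c / 2 for v, c in score.items()}

shared = co_membership(memberships)
# Two people are linked if they share at least one group.
adj = {p: [q for q, n in row.items() if n > 0] for p, row in shared.items()}
scores = betweenness(adj)

for person, value in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(person, value)
```

In this toy network Cat is the only person who belongs to both clubs, so every shortest path between a ClubA-only person and a ClubB-only person runs through her. She scores 4.0 and everyone else scores 0.0, which is the same kind of "person of unusual interest" signal, just on a much smaller scale.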
To give you a taste, here’s how the article (really a blog post) begins:
I have been asked by my superiors to give a brief demonstration of the surprising effectiveness of even the simplest techniques of the new-fangled Social Networke Analysis in the pursuit of those who would seek to undermine the liberty enjoyed by His Majesty’s subjects. This is in connection with the discussion of the role of “metadata” in certain recent events and the assurances of various respectable parties that the government was merely “sifting through this so-called metadata” and that the “information acquired does not include the content of any communications”. I will show how we can use this “metadata” to find key persons involved in terrorist groups operating within the Colonies at the present time. I shall also endeavour to show how these methods work in what might be called a relational manner.
The analysis in this report is based on information gathered by our field agent Mr David Hackett Fischer and published in an Appendix to his lengthy report to the government. As you may be aware, Mr Fischer is an expert and respected field Agent with a broad and deep knowledge of the colonies. I, on the other hand, have made my way from Ireland with just a little quantitative training—I placed several hundred rungs below the Senior Wrangler during my time at Cambridge—and I am presently employed as a junior analytical scribe at ye olde National Security Administration. Sorry, I mean the Royal Security Administration. And I should emphasize again that I know nothing of current affairs in the colonies. However, our current Eighteenth Century beta of PRISM has been used to collect and analyze information on more than two hundred and sixty persons (of varying degrees of suspicion) belonging variously to seven different organizations in the Boston area.
Please do take a couple of minutes to read the article. It’s loaded with stats jokes, but even if you aren’t a stats geek, you’ll still be entertained by the ideas of “bigge data” and “analytical engines” in 1772.
And after you’ve had a good laugh, contemplate this: all of the analysis Healy does, all of it (the tables, the matrices, the graphics, the conclusions) can be done on a laptop with a bare-bones statistics package.
Which is not to say that a Country Mouse Vegetable Farmer data geek could have easily done it. As you know, I don’t have a bare-bones stats package on my iComputer. But I do have crappy old stats books. And paper and a pencil or two. And if I had to, I could do this. Ha.