In Rudder’s new book, Dataclysm: Who We Are* (*When We Think No One’s Looking), he expands on his OKTrends blog, parsing Big Data and yielding some interesting insights regarding what we talk about when we’re looking for things like love and friendship online. He provides some fascinating information on race (Belle and Sebastian: statistically the whitest band alive), politics, and where we choose to live, among other things, and the unifying factor is Rudder’s lively, clear prose, which makes heady concepts understandable and transforms the book’s many charts into revealing truths. As someone who could be described as a skeptic regarding data journalism today, I was pleasantly surprised by how engaging this book was and how much I enjoyed it. Rudder teaches us a bit about how wonderfully peculiar humans are, and how we go about hiding it.
I spoke with Rudder in the basement of the Brooklyn bookstore Word, right before a presentation on Dataclysm. We talked about big data, along with some of his other fascinating projects: the indie band Bishop Allen, whose new album Lights Out was released last month; and Funny Ha Ha, Andrew Bujalski’s 2002 film in which Rudder starred, and which kicked off the so-called “mumblecore” genre.
Flavorwire: I have talked a lot of shit about data journalism online, and I was pleasantly surprised by how much I liked Dataclysm. For me, most data journalism — like FiveThirtyEight — feels vaguely sociopathic, and yours really didn’t. How’d you humanize it?
I think with FiveThirtyEight they’re trying to be objective from the very beginning and take a method and find places to apply it, which is a very tough way to work. Whereas with OkCupid the problems come to me, and I can decide to do whatever with them. The word choice is debatable.
I’m sympathetic to their problem. That is their imperative. That’s their reason to exist. That aside, I have a wide range of interests and appreciations, and I don’t feel like I have an axe to grind, one way or the other. I think that helps. It’s the axe part that gets tough.
I think that example in FiveThirtyEight, they were handed that axe by ESPN. It’s not like they’ve all been maniacal data journalists for their entire lives, it’s the theme of the thing.
You’re able to be more curious in Dataclysm.
Right! Look, I think anyone who works with this stuff will freely admit the limitations of it, especially where it is right now. And it’s cool to see how people come together on OkCupid, but I’m not going to tell you that I know how love works, or whatever. And that’s true not just of me but everyone in the office. I’m sure people in a lot of places would say the same thing. It’s just getting started, and it’ll be interesting to see where it all goes. I mean, I started the book before Snapchat was really like a thing, or that variety of Internet app or whatever you want to call it. And of course, that stuff is even further behind. It’s private again!
You got in trouble recently regarding your blog post that said, “We experiment on human beings!”
Yeah, that’s true, I did get in trouble. But mostly that’s because the tone of the blog is cavalier, it is there to entertain, often with humor, sometimes with sarcastic humor and so forth. It was really just a coincidence that it was topical at that point. It wasn’t like we waited to get this post out and it was ready since March or anything. And people who just dropped into this situation were like, “What the fuck is wrong with these people?” When really everything on the blog is treated in the same cavalier way. Part of the problem was a byproduct of the positioning of the explanation of what was going on.
What did OkCupid users think? Did they care?
No, no, OkCupid users did not care at all. In the intervening six weeks or whatever it is, probably four million people logged in, and nobody complains now, obviously, because it’s a dead issue. Even at the time I think the total number of complaints was somewhere in the low two digits.
Of course, if you’re trying to make a moral or ethical argument, the number of people who complain is irrelevant, I recognize that, but I think the core of the ethical argument is that these people were surprised somehow that this was going on and that it betrays some essential trust. I would point to the lack of complaints as evidence that that trust wasn’t necessarily breached, and that also, [considering] our tagline — we use math to get you a date — people expect a certain amount of analysis. We get more complaints, even in that short time when there was so much press about it, about other users, their credit card doesn’t work, my email address is wrong, just very mundane, banal type of things. Never got above that stuff. So.
I lived in Boston for a long time, so I saw Funny Ha Ha a million years ago at the Brattle.
That’s funny. The time between shooting the movie and the movie being completed — the person who plays Marnie [Kate Dollenmayer, one of the great one-off film performances] in the movie was my girlfriend, we broke up and I was like, “I don’t want to do this movie!” So I didn’t go to the release or any of that stuff. I rigorously avoided it.
Dollenmayer has that beautiful mystique of just doing one movie, wonderfully. Have you seen Funny Ha Ha again?
I’ve definitely seen it. I might’ve watched it like four or five years ago. One time I was flipping through IFC, or Sundance Channel or something, and it was on, and I was like, “This is really weird.”
So much of the movie is about how dreamy you are.
Well, in a weird way. We all were friends, and everyone ends up playing — and Andrew included — everyone ends up playing the weird, distorted-in-a-bad-way versions of themselves. Which is cool, but it’s hard to watch from the inside perspective, like, am I really like that? I’m kind of like that, but — shit!
What statistics and other crazy facts about human nature did you discover while researching this book?
Honestly, some of the craziest stuff were things where these guys in the UK looked at Facebook likes and — it’s insane, that from just your likes, forgetting your social network or pictures — that you can tell, with incredible degrees of certainty, shit about you, down to your race, to 95 percent. Which makes sense, if you’re really into Tyler Perry or whatever you can probably make a guess about your ethnicity. But you know, sexuality — it was at 85 percent, and kind of like all the way down to “were your parents divorced,” which is 60 percent.
Which is kind of intense, because it’s not a demographic fact about you, it’s just something that happened in your life history, especially because likes have only been around for five years. That’s not very much time. I’m 39, so I was starting to realize I knew kids whose parents had been divorced around maybe ’85 or whatever, and they were into Ozzy Osbourne, Judas Priest. I remember this one kid, I went to his house and he wanted to stay up all night and watch the Ozzy Osbourne concert on HBO… The kids I knew who were from stable, more normal households back then were into REM or whatever. You can see it in life, but it’s cool that they were able to actually pull it away from a “this one guy one time” into a thing that’s more legit.
The public’s relationship with data is complicated, especially these days. The NSA leak and government surveillance seem to have deepened distrust among the general populace. What is the most interesting thing you’ve been discussing about data with people?
The thing that’s been most interesting to me is to see that the conversation is evolving, whether it’s these experiments or Reddit’s self-reflection over this Jennifer Lawrence crap. People are kind of like, “What are we doing here?” I’m not trying to tell you that people suddenly are gonna be more self-aware on the Internet. I think from the experiment thing — I think — we don’t need to get into the weeds about whether it is or is not proper that these things happen, but people should understand that websites are changing and what you see as a user is not what these sites are taking from it.
I just see less conflation now than six months ago about issues of privacy and hacking. Now there’s an understanding… that data loss has nothing to do with hacking at all. I’ve been encouraged by the evolution of the public’s knowledge.