The conclusion to Seth Stephens-Davidowitz’s book might be something of a self-fulfilling prophecy. He argues – and I’m not spoiling the ending here – that data shows a disappointingly small number of people actually read the conclusion to books written by economists, contrasting the 3 per cent who apparently made it to the last page of Thomas Piketty’s ‘Capital in the Twenty-First Century’ with the more than 90 per cent who finished Donna Tartt’s novel ‘The Goldfinch’.

He says: “One of the points of this book is that we have to follow the Big Data wherever it leads and act accordingly. I may hope that most readers are going to hang on my every word and try to detect patterns linking the final pages to what happened earlier. But, no matter how hard I work on polishing my prose, most people are going to read the first fifty pages, get a few points, and move on with their lives.”

Full disclosure – I’ve read the conclusion and some other parts but haven’t finished the whole thing yet. But I wanted to review it as Christmas approaches, since it’s just a terrific read and the perfect gift for your favorite anti-corporatist conspiracy theorist uncle or, simply, anyone who wants to better understand our relationship to the increasingly vast, online information society at a time when “big data” can make us as individuals feel quite small.

Harvard-trained economist Stephens-Davidowitz (his father is noted NYU media scholar Mitchell Stephens) was a data scientist at Google. His work centers on how ever-bigger data sources, such as Google searches, can reveal truths about individuals’ behavior that are almost impossible to disguise – what he calls in the book a “digital truth serum.”

In an article for The Guardian at the time of the book’s release, he wrote:

The power in Google data is that people tell the giant search engine things they might not tell anyone else. Google was invented so that people could learn about the world, not so researchers could learn about people, but it turns out the trails we leave as we seek knowledge on the internet are tremendously revealing.

I have spent the past four years analysing anonymous Google data. The revelations have kept coming. Mental illness, human sexuality, abortion, religion, health. Not exactly small topics, and this dataset, which didn’t exist a couple of decades ago, offered surprising new perspectives on all of them. I am now convinced that Google searches are the most important dataset ever collected on the human psyche.

Everybody lies, he says. Public-facing social media allows us to present as honest – or dishonest – a picture of ourselves as we want people to see. We lie about ourselves and we lie to ourselves (perhaps that we’re not as prejudiced as we actually are). And then we’re surprised to find ourselves in a world where untruthfulness has limited consequences, swept aside in a catch-all description of “alternative facts” or even, yes, “fake news.”

As we approach the end of the first year of the Trump presidency (I’ll be taking a look back in a column next week), a new normal in the reporting of the administration has consisted of keeping various running totals of its exaggerations and outright falsehoods. Yet what does that accomplish, other than periodically reminding us that, as those instances grow, their listing in a newspaper is the only redress?

This, therefore, is probably the perfect book for helping with the ongoing process of forcing ourselves to consider what we know we know – what another Donald, former Bush defence secretary Rumsfeld, called the “known knowns” – about ourselves and about the world we inhabit.

In a section titled “The Truth About The Internet,” Stephens-Davidowitz takes aim at notions of confirmation bias (citing Cass Sunstein by name), arguing that while the assertion that the internet might be causing greater political segregation is a logical premise, “the data tells us that it is simply not true” and that “a surprising amount of the information people get on Facebook comes from people with opposing views.” He explains:

How can this be? Don’t our friends tend to share our political views? Indeed, they do. But there is one crucial reason that Facebook may lead to a more diverse political discussion than offline socializing. People, on average, have substantially more friends on Facebook than they do offline. And these weak ties facilitated by Facebook are more likely to be people with opposite political views.

In other words, Facebook exposes us to weak social connections – the high school acquaintance, the crazy third cousin, the friend of the friend of the friend you sort of, kind of, maybe know. These are people you might never go bowling with or to a barbecue with. You might not invite them over to a dinner party. But you do Facebook friend them. And you do see their links to articles with views you might never have otherwise considered.

Until, that is, you end up simply having to block them, if intra-family political discord is as polarizing and widespread as my personal experience seems to indicate.

Overall, this fascinating book opens a window onto people’s intimate fears, obsessions, anxieties, biases, self-absorption, curiosity, and shame.

But what it primarily concerns itself with is the compelling issue of the power of Big Data – prompting questions about what governmental and commercial entities know about us, how they harvest and organize that knowledge, and how they use it. (There are signs in China, for example, that surveillance and massive data collection by the state will have huge implications for its future relationship with its citizens.)

No Chill

While the book initially appeared in the summer, some of the issues it raises jumped to prominence a couple of weeks ago when Netflix appeared to pull back the curtain as a way of promoting one of its movies.

The viral ‘joke’ soon turned into a full-blown debate about not just the scale and scope of the data a company like Netflix holds on its customers, but whether it might be sound marketing practice to use it to shame some of its most loyal ones (probably not, but was it funny? Almost certainly – there’s probably data on that somewhere.)

In the New York Times, Sapna Maheshwari wondered if “data mining can make for cute ads?” and reminded us that:

Spotify ran a similar campaign last year, bidding farewell to 2016 with messages like “Dear person who played ‘Sorry’ 42 times on Valentine’s Day, what did you do?” One message even referred to a specific Manhattan neighborhood: “To the person in NoLiTa who started listening to holiday music way back in June, you really jingle all the way, huh?” The company said this year that it had more than 140 million regular users, with 50 million paying for monthly subscription plans and others using a free service that comes with targeted ads.

We have no clue – and here’s an important point – whether the data “revealed” by Netflix or Spotify is genuine or just the product of clever marketing. Either way, data collection, organization and mining, and the intelligence that flows from them, are clearly not going away. Neither, obviously, is the personalized targeting that allows the myriad data points you generate through your connected presence – from your browser history to your GPS to your selfie subjects – to build a more complete picture of your offline self.

Remember the variations on the old saying “if you don’t know what the product is, you’re the product”? Just think about that when you send off that swab of your partner’s saliva to some DNA company to find out their ancestry as a cute Christmas gift. You have no way of independently verifying the accuracy of whatever “analysis” they send you back, and now they have your DNA.

All told, this book would probably make a much less messy present.

Happy holidays.

They know when you’ve been sleeping, they know when you’re awake;

They know if you’ve been bad or good,

So be good, for goodness sake.

Everybody Lies – What the Internet Can Tell Us About Who We Really Are is published by Bloomsbury.


Also published on Medium.