Flood of digital data gives ‘God’s eye view,’ needs regulation


Scott Shumaker

In “Mitt,” a documentary streaming on Netflix, a camera follows former presidential candidate Mitt Romney and his family during the former governor’s 2008 and 2012 presidential bids. In the film, the Romneys can be seen looking up LaGuardia Airport restaurants on a phone, playing NPR’s “This American Life” from a phone while eating lunch, and hearing the bleep of text messages during the most intimate family moments.

It was probably not the film’s intent, but “Mitt” captures the importance of mobile devices today in a striking way.

Once you realize that everything the Romneys did with the film’s ubiquitous phones was stored away somewhere in cyberspace, you can see how mobile devices are also marvelous data collectors. This is the digital information that Dr. Alex (Sandy) Pentland, professor of computer science at MIT, has described to the online publication “Edge” as our “breadcrumbs.”

These digital breadcrumbs have accumulated into vast libraries, and they have lots of stories to tell. Search histories, purchases and GPS locations are just a few of the data sources now mined by researchers to understand and predict human behavior. Mobile devices have fulfilled what must have been many scientists’ wildest dream just a few years ago: hooking up sensors to millions of human beings.

Pentland, who was named one of Forbes Magazine’s “Top 7 Most Powerful Data Scientists,” is a leader in this field. To his credit, he promotes transparency in digital data collection and calls for a New Deal in data privacy, but he is also advancing the power and sophistication of “data-mining.” Some of the things Pentland told “Edge” are uncomfortable to hear.

“By analyzing that stuff, I can tell an enormous [amount] about you. I can tell whether you pay back loans, whether you are likely to get diabetes,” he says. “I can tell you all sorts of things about what you would like and not like. People are enmeshed in a social fabric that determines the sorts of things that they think are normal, that they’ve learned from each other. So if I can see some of the things you do, I can infer the rest.”

This is slightly scary stuff. Still, Big Data analysis could work miracles, such as detecting outbreaks of disease at their very beginning. Pentland uses the example of a digital network detecting that a lot of people living in the same apartment building have stayed home sick on the same day. And look at Google. The search engine scans your emails to help you find information much faster, but also to select better ads for your viewing.

In part, I welcome the Information Age and its promise to make our lives better with smarter designs of, well, everything. Without data, a lot of systems are just best guesses about what will work. But I fear that advertisers, private companies, and power groups like political parties could use Big Data to wield ever-greater influence over our perceptions and actions. The analytical and predictive power of Big Data could become a more powerful weapon than the layperson comprehends. Pentland himself wrote in a 2009 report to the World Economic Forum that “these new tools have the potential to make George Orwell’s vision of an all-controlled state into a reality.”

I have a proposal: Let’s decide today, at the birth of the Information Age, that our digital breadcrumbs, those seemingly insignificant traces of our online life, are something a bit sacred. We should think of our digital selves a bit like our nude selves: intimate and exposed. Therefore, all of our digital traces should by default be kept private.

This does not mean we won’t learn from the revolutionary flood of “people data” we now have, but two changes should be implemented in this brave new world. First, as Pentland himself urges in his “New Deal on Data,” ownership of digital breadcrumbs should be placed firmly in the hands of the person who generates them. When I type a search word into Google, I should own that information, not Google. If you want to keep your online behavior off-limits to analysis, or limit the ways it can be used, that should be simple and easy to do.

Second, the most sensitive or revealing digital data should be tightly regulated, and accessible only to qualified researchers or companies for certain purposes. Let’s face it: if asked tomorrow to hand over total control of our personal information (GPS locations, Amazon purchases, Yahoo searches), many people today might sign on the dotted line without too much thought. But even when permission is granted, we as a society need to draw ethical lines around large-scale digital data research, which Pentland describes as “providing us with a God’s eye view of ourselves.” Technology companies and computer scientists should not be left to self-regulate, but should also receive guidelines from governmental and ethical leaders.