Insights and Impact

Big (Bad?) Data



In 2018, users of a fitness tracking app discovered that mapping their casual runs could be used for something far more sinister than pinpointing the most popular trails in a city—the app’s data could reveal the location of secret military bases around the world, where service members were logging their daily exercise routines. The same year, the “Golden State Killer” was arrested after authorities matched decades-old DNA from a crime scene to genetic information that distant relatives of his had uploaded to a public genealogy website. 

These are just a few examples of how seemingly innocuous data can be used in ways for which it was not originally intended. What if your smart refrigerator sends data on your diet to your health insurance company? What if a menstrual tracking app makes data on women’s periods or pregnancies available to prosecutors in states that have banned abortion? 

“Today is just the first day of a very long life for your data,” says Aram Sinnreich, a professor and chair of communication studies in the School of Communication. “Whatever your data is being collected for right now, many other uses and abuses are going to happen downstream.”

For several decades, Sinnreich has been studying the intersection of technology, law, culture, and media. In 2019, he and longtime friend and collaborator Jesse Gilbert published an academic paper proposing a new theoretical idea: because of the exponential growth in new computational technologies, there is no limit to how much knowledge can ultimately be produced from any object, event, or interaction.

Sinnreich and Gilbert knew, though, that few people outside their small research field would read their paper. They began to grow concerned that the general public and mainstream media sources were missing important parts of the story about data privacy in our interconnected, always-online society. 

“We wanted people to start thinking more about the second- and third-order consequences of sharing their data,” Sinnreich says. “It’s not only about whether you’re comfortable sharing your biometric data with Apple or sharing pictures of your face with Facebook; it’s whether you are OK with where that data ultimately ends up after its initial use.”

So the duo began interviewing technology, law, ethics, privacy, and public health experts around the world and writing a book for popular consumption. The Secret Life of Data: Navigating Hype and Uncertainty in the Age of Algorithmic Surveillance was published on April 30, a few days after Sinnreich and Gilbert spoke with an audience at AU about the book. 

“Data have no expiration date,” they told students, faculty, and other attendees at the event. “This book isn’t necessarily [about] the initial uses of data, but [rather] what happens after those applications, and what happens when the data becomes added to a universal network that gets correlated, corresponded, and co-analyzed with other forms of data.”

Gilbert, an interdisciplinary artist who uses software and technology to design installations that change in real time, says that most technology is “a double-edged sword.”

“Maybe one of your parents uploads their genetic information to a database and is found to have some kind of disease predisposition,” Gilbert says. “That information could save their life, but it also could affect your insurance coverage.”

In The Secret Life of Data, Sinnreich and Gilbert encourage readers to have a healthy dose of caution when it comes to sharing data—but they admit that it is nearly impossible to be completely cut off from the interconnected, ever-present world of data that we live in. 

As they write in the book’s introduction: “Whatever we think we’re sharing when we upload a selfie, write an email, shop online, stream a video, look up driving directions, track our sleep, ‘like’ a post, write a book, or spit into a test tube, that’s only the tip of the proverbial iceberg. Both the artifacts we produce intentionally and the data traces we leave in our wake as we go about our daily lives can—and likely will—be recorded, archived, analyzed, combined, and cross-referenced with other data and used to generate new forms of knowledge without our awareness or consent.”

Rather than give up all technology, the authors want readers to be cognizant of what they’re sharing and to what end. That means reading the fine print on apps and websites and being aware of the ways—both positive and negative—that data can be used. It also means gaining a better understanding of how personal decisions about data can impact people around you. 

“You might be OK with the idea of a voice-activated sensor in the privacy of your own home, but are all your visitors OK with it? We need to become more comfortable with developing an ethos of transparency around things like this,” Gilbert says.

A common refrain is that if you have nothing nefarious to hide, there is no reason to worry about digital surveillance. But Sinnreich says that view is shortsighted because of how hard it is to predict the future of data.

“The idea that you have nothing to hide is contextual to the current moment and to a technological and legal framework that could change in the future,” he says.

A woman using a fertility tracking app in Texas in 2022 had nothing to hide; by 2023 the data from that app could be used to prove an illegal abortion. When the US military collected biometric data like fingerprints, facial IDs, and retinal scans from 25 million Afghans, it was doing so to help support aid programs. But when that database was seized by the Taliban, it became a dangerous tool to expose citizens who had aided the American war effort.

While researching and writing their book, Sinnreich and Gilbert say they were both surprised at how rarely experts in different fields communicate with one another about these issues. Technology developers often don’t think about the long-term regulations surrounding the products they are making, for instance, and regulatory agents struggle to incorporate thinking about culture and society into their legal frameworks.

“It just seems like nobody is really thinking about the higher-level impact of their contribution to the system,” Sinnreich says. “And if nobody—even the biggest decision makers—is thinking about what could go wrong, then it’s almost inevitable that everything will go wrong.”

That warning is part of what they want readers to take away—especially those who work in the tech industry or policy arena. 

Long term, preventing data misuse and invasions of privacy can’t be solely in the hands of consumers exercising caution, they say. Policymakers and companies must take responsibility for shaping the future of technology.

“We don’t currently have adequate regulation in this country, which has to do with the influence of corporations in the political process, as well as this unfettered growth mentality—we want tech companies to keep being on the leading edge of growth,” says Gilbert. “I think there is also a lack of ethical training within fields like computer science.”

In the years they were researching the book, and the months since they finalized a manuscript, technology has continued to evolve at a dizzying pace. But Sinnreich and Gilbert say they don’t expect their ideas to be outdated any time soon; even as new technologies and devices hit the market, the issues surrounding data privacy and sharing remain largely the same.

At AU, Sinnreich tries to promote the same kind of critical thinking about technology that he endorses in The Secret Life of Data. He has hosted cybersecurity speakers and mentors graduate students who work on issues surrounding digital privacy and data security. He also tries to model technological caution in his everyday life: he doesn’t have an Alexa device in his house, he cancelled his Dropbox cloud storage account after learning that his documents could be used to train AI systems, and he doesn’t use networked printers at the university because he couldn’t verify how data sent to the printers was protected.

“Since the day I came to AU, I’ve been involved in trying to further the public’s understanding of data,” he says. “This book is one more way to do that.”

How much data is collected from internet users each year?

Says Sinnreich: “It’s literally incalculable, but a reasonable figure is probably hundreds of zettabytes. A hundred zettabytes is 100 sextillion bytes, or 10²³ (100,000,000,000,000,000,000,000) bytes. But as our book argues, even if this number was accurate, it wouldn’t tell the whole story because it’s the connections between data that give tech its social power. The number of connections between 10²³ bytes is so high, we probably don’t have a word for it.”
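
For a rough sense of scale, the short Python sketch below counts only the simplest kind of connection: unordered pairs of bytes. This is an illustrative assumption, not a definition from the book, which does not specify how connections should be counted; the function name `pairwise_connections` is likewise made up for this example. Even under this narrow reading, the count is already about 5 × 10⁴⁵.

```python
# Illustrative assumption: a "connection" is an unordered pair of distinct bytes.
# The book does not define the term this way; this is only a lower-bound sketch.

def pairwise_connections(n: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# A hundred zettabytes, per the figure quoted above: 10^23 bytes.
HUNDRED_ZETTABYTES = 10**23

pairs = pairwise_connections(HUNDRED_ZETTABYTES)

# Python integers are arbitrary precision, so the count is exact.
print(f"Bytes:              {HUNDRED_ZETTABYTES:.3e}")
print(f"Pairwise connections: {pairs:.3e}")
print(f"Digits in the count:  {len(str(pairs))}")
```

Counting groups of three or more bytes, rather than pairs, would push the total far higher still, which fits Sinnreich's point that the connections dwarf the raw volume.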