At any given time, all of us are being watched. It may not feel that way, and if you spent time looking over your shoulder and checking down every path, you would not find anyone following. Instead of being tailed by surveillance agents, we are persistently tracked by our technology, both in action and inaction, and without our direct involvement.
This surveillance is not directly malicious, or at least it is not supposed to be. Your phone needs to know where you are, to within 2,000 feet, so that any messages or calls you receive can be properly delivered. The addition of GPS, accurate to 16 or 17 feet, to cell phones is meant as an additional service. But these services are not without trade-offs. Our information and our privacy are forms of capital that are often traded away, perhaps without appropriate consideration. It is important to understand the value of our data and, to that end, it is necessary to know how data are used.
Take Facebook, Twitter and Gmail, for instance. What do these have in common? They are free services operated by large companies with significant overhead costs needed to keep the systems running. These services make money by selling ads that are then displayed to the users. When users click on an ad, a small fee is paid to the company running the service, such as Facebook. In other words: if the service is free, then you are the product.
This in and of itself is not particularly revolutionary; after all, ads have been broadcast on the radio, displayed on television and printed in magazines for ages. However, online ad delivery is different in both scope and personalization. Ads can be delivered to a far wider audience online than they ever could in other media; as of the second quarter of 2017, Facebook has two billion monthly active users, over a quarter of the world’s population.
But the true revolution in advertising is not in reach but in precision. With radio, for instance, a single ad for diapers was played to all listeners. This advertisement was only relevant to parents who happened to be listening; for all others it was wasted time, and thus wasted resources for the company. This imprecision was coupled with a lack of data on listener engagement. At best, a company could see that sales might have gone up after airing a radio ad, but precise diagnostic information was lacking. The best targeting advertisers could manage was based on expected audience, like advertising toys during cartoons likely to attract a young audience.
Online engagement with ads can immediately and accurately be quantified—either a user clicks the ad or they do not. Not only this, but ad engagement can be measured on a user-by-user basis. This might not seem like a significant step, but it makes all the difference because advertisers can then employ machine-learning algorithms to iteratively determine what each person responds to most and display ads that are likely to get the highest engagement.
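To make "iteratively determine what each person responds to" a little more concrete, here is a minimal sketch of one simple approach, an epsilon-greedy strategy: mostly show the ad a user has clicked most often, and occasionally try a different one to keep learning. This is an illustration under stated assumptions, not a description of how any real ad platform works. The class name AdSelector, the ad names and the click probabilities are all made up for the example.

```python
import random


class AdSelector:
    """Epsilon-greedy ad picker with per-user click statistics (illustrative only)."""

    def __init__(self, ads, epsilon=0.1, seed=None):
        self.ads = list(ads)
        self.epsilon = epsilon          # fraction of the time we try a random ad
        self.rng = random.Random(seed)
        self.shown = {}                 # (user, ad) -> times the ad was displayed
        self.clicked = {}               # (user, ad) -> times the ad was clicked

    def click_rate(self, user, ad):
        """Observed clicks per display for this user and ad (0.0 if never shown)."""
        shown = self.shown.get((user, ad), 0)
        if shown == 0:
            return 0.0
        return self.clicked.get((user, ad), 0) / shown

    def choose(self, user):
        """Usually pick the ad with the best observed click rate; sometimes explore."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.ads)
        return max(self.ads, key=lambda ad: self.click_rate(user, ad))

    def record(self, user, ad, did_click):
        """Store the only feedback needed: the ad was shown, and clicked or not."""
        key = (user, ad)
        self.shown[key] = self.shown.get(key, 0) + 1
        if did_click:
            self.clicked[key] = self.clicked.get(key, 0) + 1


def simulate(rounds=5000, seed=0):
    """Simulate one hypothetical user whose true click probabilities are invented."""
    true_rates = {"diapers": 0.02, "textbooks": 0.30, "home construction": 0.01}
    selector = AdSelector(true_rates, epsilon=0.1, seed=seed)
    user_rng = random.Random(seed + 1)
    for _ in range(rounds):
        ad = selector.choose("student")
        selector.record("student", ad, user_rng.random() < true_rates[ad])
    return selector


if __name__ == "__main__":
    result = simulate()
    for ad in result.ads:
        print(ad, result.shown.get(("student", ad), 0))
```

The only input the selector ever receives is whether each displayed ad was clicked, yet over many rounds it ends up showing this simulated student mostly the ad they respond to. Real systems are far more elaborate, but the feedback loop is the same in spirit.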
This process is refined by the addition of specific personal information, which can be used to identify other targets for advertisements, such as a user’s friends. Because interests are often shared within interpersonal networks, this allows targeted ads to reach even individuals who rarely engage with ads themselves.
These strategies are often referred to as “data driven” or “big data” due to the typically large databases employed. But, despite the seemingly miraculous results, big-data techniques are far from magic; rather, they are an expansion of existing methods to draw upon a larger and larger amount of information. Modern targeted advertising is what advertisers have been doing forever, but now they can market directly to the individual.
We have all probably experienced targeted advertising at least once. When the ads are obviously wrong—like offering home construction consultation to college students—it can be humorous, but more often than not these systems are fairly accurate and rather quickly self-correct. In the course of regular use, companies can accurately determine our interests with little more than how much we react to ads and what information we offer. This can be unsettling and companies know that, so we can never be certain if incorrect ads are actual errors or deliberate diversions.
Our personal information gives potential advertisers ripe targets to increase profit margins, so companies whose services we use have immense incentives to collect as much information as possible, even if it is not used. This is why phone apps often demand more information than it makes any sense for them to have. (Why does my alarm app need access to my identity?) It turns out that our data can be sold just as easily as they can be used, and many apps do this, even when they ostensibly respect user privacy. For instance, AccuWeather, a forecasting service, was recently caught selling user location data in violation of its statements to the contrary.
Data are often collected widely without direct regard for content. As Bruce Schneier notes in “Data and Goliath,” it is generally easier to track everyone and keep all the data than it is to specifically track individuals or only keep relevant data. With advances in computing, storage costs have plummeted, making massive data trawling far more viable. In 2014, the NSA opened a new data center in Utah for just this reason. Companies and agencies can be far more effective simply by gathering as much information as they can about as many people as they can, only searching through it for specifics when necessary.
To get a sense of how this works, consider your email inbox. If you are anything like me, you probably do not delete many emails, not because there are too few emails, but because there are too many. It would be exhausting to try to delete every irrelevant email while only keeping the important ones. Some practice the “inbox zero” technique for email management, but for most of us, it is just easier to keep everything and find specific emails by starring them or using the search function.
Data stick around, and in our data, we are immortalized. It is likely that many companies have a data file on you specifically, created from various trails of data left online. These files do exist, and a few people have successfully sued to see them. I can name one database you are already a part of: AskBanner.
By default, the student directory displays a considerable quantity of information, including room number, any phone numbers and hometown. But, unlike most databases, you have some control over what AskBanner shows and, if nothing else, you should take the time to make sure it shares only what you are comfortable with it sharing.
None of this is to encourage paranoia, simply awareness. Your data are valuable, an incarnation of you, and should only be handed out with understanding. Regardless of our efforts, data are going to be collected about us, so it behooves us to make sure we understand the trades being made for the services we use. By signing up for services like Gmail or Facebook, we agree, both implicitly and explicitly, to hand over our data for whatever purposes the service providers deem necessary, which normally means consenting to targeted advertising or to having our information sold.
Whether or not this is problematic is for the individual to decide: perhaps targeted advertising is a useful way to find important and helpful products, or perhaps it is deeply unsettling. Regardless, it is important to be aware of what information you are giving away and to make sure that you want to be doing that.
Big-data techniques are not going away, and at present there is little regulation, so it is up to us to understand how our data are being used and to make sure that their use is appropriate. You may be more than your data, but your data can paint a striking portrait. Who are you comfortable showing it to?