An abstract illustration of a person, with connected wires behind and on them and two arms hugging them from behind.
Sara Wong (2020)

You are not your data but your data is still you

Deep Dives

by Tricia Wang

What happens when your own data becomes your enemy?

Just shy of his 42nd birthday, Robert Julian-Borchak Williams found out.

In January 2020, the Detroit police uploaded a photo captured from a store robbery video to DataWorks. DataWorks — a mugshot management company turned facial recognition software company — then matched the video still with an outdated DMV photo of Williams, a Black man. That led to Williams’ arrest for felony larceny. But the match was wrong. And Williams became the first documented case in America of a wrongful arrest due to a mistaken identity match from facial recognition algorithms.

In an op-ed, Williams writes that prior to his experience, he was a supporter of facial recognition technology. He rhetorically asks: ‘What’s so terrible if they’re not invading our privacy and all they’re doing is using this technology to narrow in on a group of suspects?’

A digital artwork of a colourful face made of small scaly parts. Some of these parts are flying away from the face.
‘Void System’, by Maxime DES TOUCHES (2015) Image courtesy: https://www.deviantart.com/elreviae/art/Void-System-538340110

Williams goes on to answer his own question. Because of the wrongful arrest, he had to miss several days of work. Despite proving that he wasn’t the person in the photo, his case was only conditionally dismissed. This means he could still be charged again for the same crime — pending further evidence. Even though he was released without charges, he now has a police record for a crime he didn’t commit. This could adversely impact his chances of getting a job or finding a home. What’s more, his neighbours saw him getting arrested. His daughters saw him getting arrested. While his op-ed does not go into detail about the emotional trauma an event like this can cause, we know from other cases that this kind of trauma is real.

If what happened to Williams was only a threat to privacy, then we would be talking only about Williams’ control over his data. We might be talking about the State’s right to access his old DMV photo without his consent. But instead, what we’re talking about is how processes that involve algorithmic decision-making over human beings can become fundamental threats to our lives and our communities. The wrongful arrest compromised Williams’ agency to live his life freely without harm or fear, to do a job, and to be part of a community.

Williams’ experience is not just an invasion of privacy. Something even more terrible happened to him that we don’t have a widely shared language for yet: the invasion of personhood. If privacy is about our ability to control information about ourselves, personhood is about our fundamental agency as human beings.

***

In the digital age, individual privacy in the broadest sense is about control over protecting one’s personally identifiable information (PII), such as information about health, credit, shopping, or communication. But the types of information deemed ‘personally identifiable’ and the amount of control one has over them varies around the world.

I’ve done a lot of work to try to document concepts of privacy and identity, and found that conceptions of privacy vary radically across cultures. It’s quite common for Chinese internet users to share their blood types on social media platforms, whereas Westerners would see that as deeply personal health information. In Peru, people give out their DNI, the equivalent of a social security number, without a second thought when asked by store cashiers.

Regardless of the types of information it encompasses, privacy is about controlling which pools of data are protected. If the operative word for privacy is ‘control’, then personhood is all about agency. Personhood is the agency to determine one’s own life decisions and outcomes. Personhood is tied to the qualities that make us people. It’s about making decisions about everything: our careers, personal lives, homes, and relationships. It’s about having the freedom to determine where we live, who we live with, and how we live. It’s about self-determination.

An illustration of a short-haired person with several colours on their face, wearing a blue shirt, looking at us.
‘Guy #1’, by Ben Tam (2017) Image courtesy: https://www.artistsinourmidst.com/artists-gallery/byng-arts-mini-school/

Today, privacy and personhood are both mediated digitally. Our primary language for conceptualising the data we produce is through privacy, which treats our personal information as separate from us, a piece of property that can be measured, negotiated over, sold, and reused. But data doesn’t just belong to you in the way that your house or car might; it is also you. It is like a quantum particle that can exist in two places at the same time, as both a representation of who you are and also a commodity that can be sold.

Violations of privacy are violations of data as a commodity. Violations of personhood are violations of the actual person represented by that data. That is why violations of personhood lead to unintended and debilitating effects that last beyond the violation itself. So it is important to talk not only about the privacy violations that come with algorithmic decision-making, but also about the bigger threat to personhood. Evaluating the threats of automated decision-making tools through the lens of privacy obscures the nature and scale of the threat, and it obscures who will be most negatively impacted: people and communities who are still fighting for their personhood.

Even though we don’t use the concept of personhood widely, we may actually understand it better than we realise. The current Black Lives Matter protests in the USA did not start because protesters were demanding privacy around their data. They were, and are, demanding the right of Black people to live without discrimination — in other words, the right to their personhood.

Perhaps the value of personhood is most intuitively understood by those who have had it taken away, either within their lifetimes or inter-generationally. Removal of personhood comes in many forms: abuse, surveillance, racism, genocide, homophobia, slavery, trafficking, colonialism, torture, and human rights violations. Immigrants who are detained, ethnic groups who are discriminated against, and prisoners kept in solitary confinement are all systemically subjected to dehumanisation, which is one way of removing personhood. The poet Joshua Bennett describes how Black people have historically been deemed as human nonpeople. Essentially, anyone whose sense of self has been reduced to a single identifier or data point by a majority group has experienced violations of personhood — and viscerally knows what it’s like to be denied agency and autonomy over their lives and decisions.

The violation of personhood is not a new experience. It is, however, now happening in unfamiliar ways because the entire process is invisible, automated, and designed to obfuscate.

An abstract illustration of a big stack of books/diaries and some record players.
‘Memoria colectiva’, by Rosa Delgado Leyva (2004–2006) Image courtesy: https://artenet.es/en/paintings/collective-memory-

For example, the typical smartphone app user or Amazon shopper is aware that their behaviours are being tracked and purchases catalogued. But most people usually don’t understand the extent to which data is being gathered, by whom, or how it is being or could be used. People often don’t account for less obvious use cases like transit companies recording conversations on trains and buses, or advertisers tracking consumer travel patterns through billboards and smartphones. Even something seemingly unrelated to data, like high school students taking the SAT, results in data being sent to tech advertising companies. When people do see the extent to which their preferences and activities are being monitored, they feel overwhelmed.

When it comes to privacy and data ownership, most of us feel downright helpless. High-profile stories like the Equifax data breach or the Facebook-Cambridge Analytica scandal are made public — and yet, here we are, likely still using credit cards and a Facebook product. We feel compelled to agree to terms and conditions designed to protect corporations so we can use the technologies we’ve come to rely upon to live our lives. When we demand better alternatives, technologists will often echo messages such as ‘If you care about your privacy, just stop using social media…and your phone…oh, and the internet!’ or ‘Privacy is obsolete anyway, so stop worrying about it!’

These two extremes offer neither helpful advice nor peace of mind. They are also two extremes that only address surface-level privacy without examining its link to personhood. The more our personal data is being collected automatically and acted upon by institutions, the more our privacy and our personhood are compromised. I believe we need to level up conversations around our digital humanity before we can collectively advance solutions to protect it.

***

Let’s be real: the word ‘data’ feels cold and lifeless. Perhaps because of this, many of us believe our data is somehow separate from our true selves. Some people just aren’t worried, thinking: ‘Oh, sure, my data is being collected, but it’s fine. It has never gotten in the way of my life.’ The reality is that as we begin to live more and more of our lives online, and institutions surveil and interact with us digitally, our data selves become just as real as our physical selves. That information you got from 23andMe is you. Those TikTok videos are you. That Grindr profile is you. The GPS mobility data from your phone is you. That chatbot conversation is you. In his book We Are Data, John Cheney-Lippold documents how algorithms are used to construct digital versions of ourselves. Companies and institutions can then act upon us in the physical world just by acting upon our digital proxies.

A black, white and grey abstract digital art of a human face with closed eyes.
Untitled by pareidoloop. Image courtesy: https://rb.gy/07d80y

Because your data is you, and because your data is acted on without your consent, and because that action can impact how much agency you have over your life and livelihood, data is at the core of digital personhood. This is why any proposed solution of abandoning connected technologies is bound to be ineffectual. For starters, our digital selves already exist, and quitting social media won’t change that. But more importantly, these apps have become so central to our lives that eradicating them would make it difficult to be a person.

Facebook and its suite of applications have become many people’s primary means of communicating with friends and family. Without it, they’d not only be without work, but also isolated from their social networks. Abandon Instagram and lose knowledge about the experiences of your friends and family. Abandon WhatsApp and lose the ability to contact people you know. Abandon Lyft and Uber and in certain cities you’ll either pay three times as much for your ride or be forced to walk because there are no cabs in sight. Imagine being unable to email your employer, or check your bank balance, or find the fastest and cheapest flight to see your family. You wouldn’t cease to exist on the spot, but the agency you have over your life would change drastically. Your sudden loss of personhood would be overwhelming.

Orange background. There are small illustrations of several faces in an asymmetrical arrangement.
Untitled by Jean-Manuel Duvivier (2017) Image courtesy: https://rb.gy/gyzcei

We carry out so much of our personhood digitally that we don’t even think about it. Every text message and email we send, every map and browser search, every photo we upload to the cloud — all of this is data, all of this is us. But our language around our data creates the illusion that it is external to us. And since our data is physically distributed across other people’s computers or companies’ databases, it’s hard to conceptualise that our data is just as much a part of who we are as our hair, bodily organs, arms, or legs.

***

A few years ago, Google released a new machine learning algorithm that automatically tagged photos to describe their content. Jacky Alcine discovered that his friend, who is Black, was tagged as a gorilla. Google claimed it was a computational error, but the real problem is twofold. First, they didn’t have a diverse enough photo set on which to train the algorithm. Second, they likely didn’t have a diverse enough team to raise the issue of diverse photo sets and approaches in the first place. In 2011, the National Institute of Standards and Technology (NIST) did a study that revealed facial recognition algorithms performed better on faces that looked like the teams developing them. Meaning: Japanese teams produced better results at recognising Japanese faces while Caucasian teams produced better results at recognising Caucasian faces.

In 2019, NIST performed another study on 189 algorithms and concluded that they were highly susceptible to bias. The algorithms misidentified Black and Asian faces ten to a hundred times more than Caucasian faces. Two of the DataWorks algorithms that misidentified Williams and led to his wrongful arrest were included in this study.

What is worrisome is that despite studies showing how automated decision-making systems are inaccurate and extend bias, institutions are still deploying them. Julia Angwin’s work on courtrooms using COMPAS, a machine learning program that gives risk scores, shows that the programme disproportionately rates Black defendants as higher risk. This one score doesn’t reflect real risk, but does keep thousands of people, mostly Black, in jail.

An abstract graphic art comprising of elements such as an eye, CCTV cameras and arrows.
Untitled by Oliver Munday (2013) Image courtesy: https://omunday.tumblr.com/post/69601272050/vollmann-feature-on-the-nsa-foreign-policy

Outside of courtrooms, companies are using software to make hiring more efficient. Amazon’s HR hiring tool was discovered to privilege male applicants over female applicants because it was trained on past hiring practices that were sexist. Employers are also paying software companies to perform background checks that include a sweep of all of the public online interactions of potential employees. Companies like Fama Technology claim that they can ‘identify problematic behaviour among potential hires and current employees by analysing publicly available online information.’ Soon after a job interview, Twitter user @bruisealmighty received a package of 300 pages of her tweets printed out. All tweets with the word ‘fuck’ were categorised as ‘bad.’ @bruisealmighty didn’t know her tweets could be pulled out of context and used in ways that might put her job prospects at risk. In South Korea, companies are using AI software to sort interviewees based on facial expressions. There are now facial coaches to help candidates prepare for algorithmically run interviews.

In 2013, Dr. Latanya Sweeney demonstrated how the algorithms in Google’s Ad Delivery are racist. Searches for Black-sounding names served ads for criminal background checks, while searches for White-sounding names did not produce the same results. Imagine the psychological impact on the millions of Black people, as well as their potential employers, who see their names next to racist ads. Many Native Americans, such as Shane Creepingbear and Dana Lone Hill, have written about how Facebook makes it hard for people with non-Western names that mix adjectives and nouns to get an account.

All these stories involve individuals or entire communities whose personhood is threatened by systems that were neither designed by them nor with them in mind. These tools have become embedded in existing institutions, creating entanglements that have evolved faster than the law or policymakers can keep up with. These software spaces will increasingly be home to what Madeleine Clare Elish calls moral crumple zones, instances where responsibility and liability are obscured in complex, automated systems.

We have crossed the uncanny valley of digital personhood, and if we want to have agency over that personhood, we have to have agency over our data.

This is why the work of groups like Our Data Bodies is so important — they help marginalised communities understand and value their data. For this five-person team, community-hood is the basis of personhood, an idea underscored by the work of Sabelo Mhlambi. As more communities completely rely on social platforms to communicate, they are in effect centralising all their data, memories and history into one place — putting their entire community-hood at risk if it were to all be misused or disappear.

***

Tech companies like Facebook have long argued that they own our data because they’ve created free platforms for us to use. At first glance, we may agree: ‘Sure, companies can claim my data is theirs. What would I do with it anyway? They’re the ones collecting the data and paying for it to be stored in some server.’ This logic serves the bigger argument that social media companies have the right to do with your data record as they please. And it potentially sets up the perversion that any entity creating a record of us could ‘own’ us because they own a part of our personhood.

An abstract graphic art of two silver human figures standing in a starry frame.
‘Homos Luminosos’, by Roseline de Thélin (2013) Image courtesy: https://rb.gy/ungtvm

Companies collecting our data know that our personally identifiable information (PII) is valuable. They often use the metaphor that our data is oil, which reflects their conception that our data is at once valueless without their refinement and valuable as a tradeable commodity. Seen as a natural, feral and raw resource, our data then belongs to the first company to create a record of us, capture us in their dataset, and create economic value out of it. Much as oil companies stockpile land and colonisers claim territory as their property, companies are now staking claims on data, gathering it even when the use case isn’t clear yet. Companies have long defaulted to collecting as much PII as possible instead of collecting the least amount of data needed to get the job done. Data hoarding is rampant because companies are confident that if the data is not valuable now, it will become valuable someday. This is compounded by the technology logic that says machine learning is only possible on large enough datasets, so it is better to collect more rather than less.

This business model is predicated upon a company’s ability to sell data repeatedly, extracting value from it over and over again. An entire shadow industry of data brokers has emerged to facilitate the trading of our PII to pharmaceutical companies, finance, healthcare, marketers, advertisers, and governments looking to act upon our data. The data brokerage industry is estimated to be valued at around $200 billion. It is designed to be opaque and untraceable.

The purpose of this obfuscated market is less data brokerage and more data arbitrage, because companies can acquire your data for far less than its actual trading price. It is estimated that a personal data record with an email address, social media and credit card transactions attached to it has a value of $8 per month. If companies can get that email address from you for free in exchange for signing up for a service or a marketing newsletter, then they have reduced the cost of acquisition of this valuable asset to near zero.

A colourful abstract illustration of a human face with long hair & elements in boxes: speech bubbles, night sky, hands.
Untitled by Catalina Alzate (2019) Image courtesy: https://rb.gy/mtrc8g

Companies are taking advantage of the lack of shared, collective understanding around personal data by hiding their actions under complex legal terms and conditions, making it difficult to track their actions. While there is nothing illegal about arbitrage, a more accurate way to describe their use of our personal data might be data embezzlement. Companies from Facebook to all the unknown players in the data brokerage industry are siphoning off as much information as they legally can, and selling it to the highest bidder without giving a thought to how it might affect us; how it might interrupt our personhood.

When questioned about their data practices, companies will argue that their actions fall well within the law. But an entity can be legally compliant with privacy laws and still flagrantly violate personhood. And for now, most companies will continue on their current path of orientating data policies towards privacy — which they perceive as easier to measure and execute than personhood.

***

Our data is being collected all the time. We might think opting out is an option if we don’t want our data collected. Like staying off social media, for example. But it’s not that simple. Just by existing, we expose ourselves to violations of our personhood on a daily basis. If we walk down a city street, surveillance cameras may capture our image and cellphone towers may log our movements. If we need to catch a flight, we’ll be asked to give up biometric data to pass through security. If we buy groceries, the supermarket’s loyalty programme will carefully tabulate how much we spent, when, and how often. These examples demonstrate that we’re no longer in the position to opt out when data collection is so automated and ingrained into our everyday lives.

If you take a walk around the block with your dog, your neighbour’s Amazon Ring camera might capture you. If your local police department is one of the 630 that Ring has a partnership with, and if they ask to view footage from your neighbour’s Ring, then your likeness could make it into their database without you ever knowing. The increasing use of facial recognition technologies, from cameras to matching software, complicates the possibility of opting out. Cities, border patrol and police officers around the world have installed video cameras with facial recognition technologies where capturing is automatic and always on. As the costs to capture and store data have decreased, law enforcement is continuously grabbing data — not because they are looking for someone specific, but just because they can.

A graphic digital art of a shiny silver ball melting into the void.
‘Melting Memories’, by Refik Anadol (2019) Image courtesy: https://rb.gy/jqfrzq

In a study on facial recognition technologies deployed by cities, Clare Garvie and Laura Moy warn that technology meant to make residents feel more secure can do the opposite by violating neighbourhood privacy and potentially suppressing free speech. These types of warnings are starting to make their way to local, city and state leaders. Over the last few years, a patchwork of policies has emerged against the use of facial recognition at the city and state level. San Francisco, Oakland, Somerville (Massachusetts), and Portland have all banned, or are in the process of banning, its use in law enforcement. Illinois has had one of the most forward-thinking data protection laws, the Biometric Information Privacy Act (BIPA), in effect for over a decade. Illinois residents recently filed a lawsuit against Amazon, Microsoft and Google for violating the BIPA by using their faces without permission in a database used to train facial recognition systems.

This flurry of bans on facial recognition technology across the country is a step forward in curtailing automated data gathering. Susan Crawford notes that the annoyance tech companies experience in having to comply with the variations in local and state laws is often what brings them to the table to negotiate with policymakers. In June 2020, Amazon, Microsoft and IBM introduced a call for a national law on facial recognition (the same week they all agreed to temporarily stop selling facial recognition to the police).

The advent of social media has hastened the trend of passive, automatic data collection wielded against U.S. citizens. In 2015 the Baltimore police labelled two Black Lives Matter organisers, Deray Mckesson and Johnetta Elzie, as ‘high threat actors’ by monitoring their Twitter, Instagram, and email. US Immigration and Customs Enforcement (ICE) targeted suspected undocumented immigrants by purchasing access to state DMV data in order to obtain high resolution photographs. Most recently, federal agents used YouTube live streams to identify, arrest, and charge protesters with alleged crimes against federal buildings.

A 3D digital art of numerous concentric circles made of small rectangular and square parts.
‘Virtual Archive’, by Refik Anadol (2017) Image courtesy: https://rb.gy/3h6olq

Much of the technology powering the government’s data surveillance infrastructure is provided by private Silicon Valley companies. One company, Palantir, has over $1.5 billion in federal contracts and several more with local police departments around the US. Palantir’s role is to create the data infrastructure and software used to predict who to detain in ICE raids and who to target in drone strikes. While Palantir describes itself as a mission-based company committed to privacy and civil liberties, many critics and activists have argued that Palantir’s software enables the U.S. government to create a surveillance society and abuse human rights.

This isn’t surprising, because corporations and governments have a long history of working together to define and enforce control over personhood.

***

In January 2020, Zachary McCoy became a suspect in a home burglary case in Gainesville, Florida. About a year earlier, in December 2018, the police department of Avondale, Arizona arrested Jorge Molina as a murder suspect. These were two very different crimes, committed at two different ends of the U.S. But despite the fact that Molina and McCoy never met, their cases had much in common.

First, they were both proven innocent. Second, they both became suspects only after Google fulfilled geofence warrants from police departments. A geofence warrant is a new type of warrant that enables police departments to request from Google the data of individuals who were near the site of a crime. In McCoy’s case, he had biked past the victim’s home several times while using a fitness app that tracked his location and sent it to his Android device. In Molina’s case, the geofence warrant showed that a cellphone linked to Molina was at the location of the murder, along with his car. In the end, neither case could prove that the accused was physically at the crime scene. McCoy had biked by the victim’s home, but that didn’t mean he was the robber. In Molina’s case, it was his stepfather who had murdered the victim; the stepfather had used one of Molina’s old cellphones and driven his car without permission.

Neither McCoy nor Molina was aware that their data could be used in such a way. Since the passage of the Patriot Act in 2001, corporations have been compelled to comply with government requests for user data, whether they ethically agree with it or not, and irrespective of the consequences.

Both Molina and McCoy’s lives were disrupted by these wrongful accusations. Both of them were mentally traumatised, too scared to leave their homes for fear of being tracked again. McCoy’s parents used their savings to hire a lawyer to clear the warrant. Molina lost his job and was unable to get a new one. His car was repossessed and he dropped out of school.

An illustration of a person holding a briefcase and walking. There are two huge CCTV cameras pointing at them from above.
Untitled (2016) Image courtesy: https://rb.gy/ukg1tl

Molina and McCoy join Williams in a common story. All three had their personal data used by institutions without their consent. In all three cases, data wrongly tied them to a crime they did not commit. This experience negatively affected not just their privacy, but their personhood.

But Nathaniel Raymond argues that when data is being gathered remotely, such as through social media or geospatial data, informed consent is not practical. His essay on this topic provides many useful recommendations for how to deploy such technologies. The first suggestion is that organisations need to take the time to understand the potential harm before even seeking to do no harm. And organisations need to work on data preparedness before data collection, meaning they need to plan in advance how the data will be used, shared, and protected. Raymond’s guidance speaks to why GDPR — the General Data Protection Regulation — was needed as a reset button. Despite its limited scope and impact, the policy at least mandates that organisations explain what they do with the data they collect. The implementation of GDPR could pave the way for a more progressive approach by governments to link data practices to our personhood.

***

Stephanie Dinkins is an artist who brings digital personhood to life in a very real way. Her work seeks to demystify AI and make it accessible to everyone. In one project, ‘Not The Only One (NTOO)’, Dinkins builds a device that captures three generations of family stories. A locally stored AI is trained on the stories to share them with future generations. In another project, ‘Project al-Khwarizmi (PAK)’, Dinkins leads workshops that teach communities of colour to become more aware of the ways that their data is used to inform the algorithms that shape their daily life.

When I speak to Dinkins about her work, she says, ‘For me, art has always been a space for me to ask about value, internally and externally. In NTOO, I am showing people that my family history has value to me, my community and society at large. I put all that history in a form that people can interact with, hopefully making technologies like algorithms seem more approachable so that it can spark people to ask: “What kind of value does my data have to myself?”’

A photo of a Black person facing a statue, as if having a conversation with it.
‘Conversations with Bina48’, by Stephanie Dinkins (2014-ongoing) Image courtesy: https://rb.gy/3sct0x

For Dinkins, people need to see themselves in the product of these technologies in order to understand and even recognise the value in their own data. She goes on to explain that ‘an internal understanding of your own value is important for personhood. But personhood is a radical idea because it means you, as a person, just for being, are valuable. This extends to everything we do. But what happens when others don’t think of you as a being with full personhood?’ Dinkins raises an important question: if on some level we don’t view people as deserving of personhood, then we treat their data that way too. We slide back into thinking that data is not the person, it is just another thing produced by them that could be sold. When we don’t have a shared language to address a new social context, artists are often the ones who help society materialise abstract ideas.

New worlds need new language. There are new things to name. And one of those things to name is what is happening to ourselves and our data proxies. Expanding our language from privacy to personhood allows us to have conversations in which we can see that our data is us, our data is valuable, and our data is being collected automatically. We must expand our language to keep up with technology. Rights, policies, and laws are crafted in response to new needs, but we first must be able to describe those needs.

What would happen if we redesigned systems to protect personhood? Dinkins, along with Williams, McCoy and Molina, remind us that we must keep fighting for both privacy and personhood. This is not a binary decision. Apps, facial recognition software, surveillance cameras, social media and the internet don’t just affect our lives, they also affect what we believe we are capable of accomplishing in life. Personhood should not be for the privileged, it should be for everyone. And that’s something worth fighting for.

This work was carried out as part of the Big Data for Development (BD4D) network supported by the International Development Research Centre, Ottawa, Canada. Bodies of Evidence is a joint venture between Point of View and the Centre for Internet and Society (CIS).

