I am usually not a fast blogger. This post, however, has been rather long in the making, even by my standards. I first started to explore the topic of post privacy – i.e. the notion that the (almost) total loss of privacy is inevitable – in 2010. My interest was based on two observations. But before we get to these observations, let’s look at the term privacy first.
Recently, there was an interesting discussion on the W3C mailing list on the definition of privacy. It quickly emerged that data protection and confidentiality (“the right to be left alone”) are two important concepts in that context. But as Kasey Chappelle put it, privacy is more than that. Chappelle defined privacy as informational self-determination, meaning the individual right to decide which information is shared about oneself and under what circumstances. On top of that, I would put Seda Gürses’s definition of privacy as a practice: the use of personal information is not only decided by the individual, but is also governed by a social convention on what is acceptable and what is not. This “what is acceptable and what is not” is fluid and subject to a social negotiation process.
The loss of privacy
Now for the observations that ignited my interest:
- All data about us is stored in digital form. Most of this data is with third parties, such as the state, insurance companies and so on. There is a lot of data about us which we would never think of: location data collected by cashback cards and digital traffic surveillance for example, connection data in telecommunication… And this is not even taking into account the data about us that we or others put into the world – such as photos, tweets etc.
- Digital data is fugitive. It is in the nature of digital data that it can easily be copied and replicated, and we have a hard time protecting it. Countermeasures such as encryption are not widely adopted. Also, different entities have different interests; Facebook is not in the data protection business after all.
In a highly interconnected world, these two factors spell trouble. In a recent keynote at WWW 2012 (the World Wide Web Conference), Tim Berners-Lee addressed further issues. One of them is jigsaw identification. Jigsaw identification relates to the fact that while information from one source might not be suitable to identify someone, the combination of information from different sources might well be. For example, if one source publishes post code and age of a person, then that information will not be enough to identify a person: these characteristics usually apply to more than one person. But if another source publishes gender and profession of the same person (which by themselves apply to several people as well), the combination of these four characteristics might be enough to uniquely identify a person. And with the eternal memory of the web, the dates of publication might be far from each other.
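The jigsaw effect can be illustrated with a small Python sketch. Everything in it is made up for illustration: the five people, their attributes, and the helper name `matching_count` are assumptions, not real data or anything taken from the keynote. It simply counts how many people in a toy population share the same attribute values as one target person.

```python
# Hypothetical toy population; all records are invented for illustration.
PEOPLE = [
    {"name": "Alice", "post_code": "10115", "age": 34, "gender": "f", "profession": "nurse"},
    {"name": "Bob",   "post_code": "10115", "age": 34, "gender": "m", "profession": "teacher"},
    {"name": "Carol", "post_code": "20095", "age": 41, "gender": "f", "profession": "nurse"},
    {"name": "Dave",  "post_code": "10115", "age": 34, "gender": "m", "profession": "nurse"},
    {"name": "Erin",  "post_code": "20095", "age": 41, "gender": "f", "profession": "teacher"},
]


def matching_count(records, target, attributes):
    """Count the records that share the target's values for all given attributes."""
    return sum(
        all(record[attr] == target[attr] for attr in attributes)
        for record in records
    )


target = PEOPLE[0]  # Alice

# What source A publishes: post code and age.
source_a = ["post_code", "age"]
# What source B publishes: gender and profession.
source_b = ["gender", "profession"]

print(matching_count(PEOPLE, target, source_a))             # 3 people match
print(matching_count(PEOPLE, target, source_b))             # 2 people match
print(matching_count(PEOPLE, target, source_a + source_b))  # 1 person matches
```

Each source on its own leaves Alice hidden in a small crowd, but the combination of all four attributes matches exactly one record. In a real population the numbers would of course be different, but the mechanism is the same.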
All of that leads me to the conclusion that data protection and confidentiality are a lost cause in a digital and highly interconnected world. And with all the data out there, informational self-determination will become impossible. As Tim said, we cannot know which data will be published about us in the future. If a potential employer can buy my health records from a data provider, then the whole notion of privacy as we know it is bound to fail. Furthermore, the data that we publish voluntarily is only the tip of the iceberg. More important is the data that we expose involuntarily (e.g. connection data), data that others expose voluntarily or involuntarily about us (see a data loss scandal near you), and information that can be inferred based on data from various sources (e.g. our social graph). As time passes by, the evidence grows for me that we are headed towards the loss of privacy.
What will happen?
Interestingly enough, most discussions that I had on the consequences of these developments followed the same pattern. The two most prominent views are: a) you are wrong, because I am the one who controls which data is out there (by setting everything private on Facebook, disallowing photo tagging etc.), and b) let’s simply abandon privacy. If all of the data is out there, we actually level the playing field for everyone. As we discover that everyone has faults, we will attribute less importance to these faults, and society as a whole will come out better.
I think that both of these statements are wrong. Regarding the former statement, I pointed out earlier that the main problem is not the data that we voluntarily put on the web, but rather what others expose about us, what we expose involuntarily and what can be inferred from different data sources. I do not believe that anyone will have everybody’s data at their fingertips. But I do think that all data will be somehow obtainable. The gray market for data that was lost or stolen from third parties is already huge, and it will continue to grow. It will be supplemented by companies who explicitly seek to infer data from various sources, exploiting the jigsaw effect.
With the latter, the argument is not so easy. I think it is an intriguing idea. But I have a hard time believing that abandoning privacy will make the world a better place, mainly for two reasons: 1. Even though all the data is out there, we will not have a level playing field. We will still have different capabilities in processing the data to get something meaningful out of it. After all, we need to make sense of the data first, and even though everyone can theoretically access it, processing capabilities will not be evenly distributed. Therefore there will be parties with a competitive advantage that they can use to exert power over those with fewer processing capabilities. 2. Even assuming that society becomes more tolerant, there will still be things that are more frowned upon than others on a moral scale. A lot will also depend on how facts are presented to others. The “shitstorms” that we already witness on social media are often based on incomplete or outright false facts.
What can we do?
Now we get to the question that I think is the really important one: How can we deal with the loss of privacy? The only concept that I know of so far is information accountability. It was proposed by Weitzner et al. and builds on informational self-determination. Information accountability is a different paradigm: instead of protecting the sender of the data, it watches what the receiver does with it. That means that you only take action in case something happens (you do not get a job or an insurance policy because of leaked data). In that event, the offending party would have to disclose which data they used to make the decision. Bearing that in mind, one of the major questions is: how can we ensure accountability on a technological level?
One proposal comes from Oshani Seneviratne. In her PhD at MIT, she is developing HTTPa, an accountability-aware web protocol. In essence, the protocol enables you to tell the receiver what they are allowed to do with the transmitted data. This is a kind of Creative Commons for personal data. A network of provenance trackers stores logs of those permissions and can be consulted in case something goes wrong. There is a lot more to Oshani’s work, and I suggest checking out this presentation as a start.
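To make the idea a bit more tangible, here is a toy model in Python. It is not HTTPa itself and does not reflect its actual messages or data formats: the class names, fields and purpose labels are all invented for this sketch. It only shows the general pattern described above, namely that permissions are logged when data is handed over and that the log can be consulted later.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agreement:
    """Hypothetical log entry: who received which data for which purposes."""
    resource: str
    sender: str
    receiver: str
    allowed_purposes: frozenset


@dataclass
class ProvenanceTracker:
    """Hypothetical stand-in for a provenance tracker that keeps the log."""
    log: list = field(default_factory=list)

    def record(self, resource, sender, receiver, allowed_purposes):
        # Store the permissions the sender attached to the transmitted data.
        agreement = Agreement(resource, sender, receiver, frozenset(allowed_purposes))
        self.log.append(agreement)
        return agreement

    def was_permitted(self, resource, receiver, purpose):
        # Consulted only after the fact, when a specific use is in dispute.
        return any(
            a.resource == resource
            and a.receiver == receiver
            and purpose in a.allowed_purposes
            for a in self.log
        )


tracker = ProvenanceTracker()
tracker.record("health-record", "alice", "insurer", {"claims-processing"})

print(tracker.was_permitted("health-record", "insurer", "claims-processing"))  # True
print(tracker.was_permitted("health-record", "insurer", "hiring-decision"))    # False
```

Note that nothing here prevents the receiver from misusing the data. The log only makes it possible to check afterwards whether a given use was covered by the permissions, which is the point of accountability as opposed to access control.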
So should we abolish data protection and confidentiality now, and move solely to information accountability? I do not think so. I see accountability as a good addition which may be suitable to deal with the new requirements of a digital and heavily interconnected world. As Oshani points out, accountability is quite compatible with anonymity, because you only have to reveal your identity in case something goes wrong. Apart from the technical solutions, we also need to discuss legal frameworks for accountability to work. Otherwise there will be no way to hold people accountable in court. Therefore, we need a broad debate at the social level on what is acceptable and what is not, just as Seda Gürses suggests with privacy as a practice. Thankfully, that debate has already started and is getting more and more attention. After all, technology can only give us the tools, but what we want to do with them is up to us.
If you made it this far, thanks for reading. Below are a few slides that are meant as a short summary. Of course, I would love to hear your comments and ideas on the subject! Does it make sense to you? Which concepts am I missing?