Note: This is a reblog from the OKFN Science Blog. To my excitement and delight, I was recently awarded a Panton Fellowship. As part of my duties, I will be blogging there regularly about my activities concerning open data and open science.

Peter Kraker at Barcamp Graz 2012. Photo by Rene Kaiser


Hi, my name is Peter Kraker and I am one of the new Panton Fellows. After an exciting week at OKCon, I was asked to introduce myself and what I want to achieve during my fellowship, which I am very happy to do. I am a research assistant at the Know-Center of Graz University of Technology and a late-stage PhD student at the University of Graz. Like many others, I believe that an open approach is essential for science and research to make progress. Open science to me is about the reproducibility and comparability of scientific output. Research data should therefore be put into the public domain, as called for in the Panton Principles.

In my PhD, I am looking into research practices on the web and how academic literature search can be improved with overview visualizations. I have developed and open-sourced a knowledge domain visualization called Head Start. Head Start is based on altmetrics data rather than citation data. Altmetrics are indicators of scholarly activity and impact on the web; have a look at the altmetrics manifesto for a thorough introduction.

In my evaluation of Head Start, I noticed that altmetrics are prone to sample biases. It is therefore important that analyses based on altmetrics are transparent and reproducible, and that the underlying data is openly available. Contributing to open and transparent altmetrics will be my first objective as a Panton Fellow. I will establish an altmetrics data repository for the upcoming open access journal European Information Science. This will allow the information science community to analyse the field based on this data, and provide an additional data source for the growing altmetrics community. My vision is that in the long run, altmetrics will not only help us to evaluate science, but also to connect researchers around the world.

My second objective as a Panton Fellow is to promote open science based on an inclusive approach. The case of the Bermuda Rules, which state that DNA sequences should be rapidly released into the public domain, has shown that open practices can be established if the community stands together. In my opinion, it is therefore necessary to get as many researchers on board as possible. From a community perspective, it is the commitment to openness that matters, and the willingness to promote this openness. The inclusive approach puts the researcher in his or her many roles at the center of attention. This approach is not intended to replace existing initiatives but to make researchers aware of these initiatives and to help them choose their own approach to open science. You can find more on that on my blog.

Locally, I will be working with the Austrian Chapter of the Open Knowledge Foundation to promote open science based on this inclusive approach. Together with the Austrian Students’ Union, we will hold workshops with students, faculty, and librarians. I will also make the case for open science in the research communities that I am involved in. For the International Journal on Technology Enhanced Learning, for example, I will develop an open data policy.

I am very honored to be selected as a Panton Fellow, and I am excited to get started. If you want to work with me on one or the other objective, please do not hesitate to contact me. You can also follow my work on Twitter and on my blog. Looking forward to furthering the cause of open data and open science with you!

Image by Alan Cleaver

I am usually not a fast blogger. This post, however, has been rather long in the making, even by my standards. I first started to explore the topic of post privacy – i.e. the notion that the (almost) total loss of privacy is inevitable – in 2010. My interest was based on two observations. But before we get to these observations, let’s look at the term privacy first.

Defining privacy

Recently, there was an interesting discussion on the W3C mailing list about the definition of privacy. It quickly emerged that data protection and confidentiality (“the right to be left alone”) are two important concepts in that context. But as Kasey Chappelle put it, privacy is more than that. He defined privacy as informational self-determination: the individual right to decide which information is shared about oneself and under what circumstances. On top of that, I would add Seda Gürses’s definition of privacy as a practice: not only can the individual decide on the use of personal information, there is also a social convention on what is acceptable and what is not. This convention is fluid and subject to an ongoing social negotiation process.

The loss of privacy

Now for the observations that ignited my interest:

  1. All data about us is stored in digital form. Most of this data is held by third parties, such as the state, insurance companies and so on. There is a lot of data about us that we would never think of: location data collected by cashback cards, digital traffic surveillance, connection data in telecommunication… And this is not even taking into account the data about us that we or others put into the world – such as photos, tweets etc.
  2. Digital data is hard to contain. It is in the nature of digital data that it can easily be copied and replicated, and we have a hard time protecting it. Countermeasures such as encryption are not widely adopted. Also, different entities have different interests; Facebook is not in the data protection business, after all.

In a highly interconnected world, these two factors spell trouble. In a recent keynote at WWW 2012 (the World Wide Web Conference), Tim Berners-Lee addressed further issues. One of them is jigsaw identification: while information from one source might not be enough to identify someone, the combination of information from different sources might well be. For example, if one source publishes the post code and age of a person, that information will usually not be enough to identify them: these characteristics apply to more than one person. But if another source publishes the gender and profession of the same person (which by themselves apply to several people as well), the combination of these four characteristics might be enough to identify that person uniquely. And with the eternal memory of the web, the two publications can be combined even if they appeared years apart.
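The jigsaw effect can be illustrated with a toy sketch. The records and attribute values below are entirely hypothetical; the point is only that each published attribute set matches several people, while their combination matches exactly one.

```python
# Toy illustration of jigsaw identification: two coarse data releases
# combine into a unique match. All records here are hypothetical.

population = [
    {"name": "A", "postcode": "8010", "age": 34, "gender": "f", "profession": "teacher"},
    {"name": "B", "postcode": "8010", "age": 34, "gender": "m", "profession": "engineer"},
    {"name": "C", "postcode": "8010", "age": 34, "gender": "m", "profession": "teacher"},
    {"name": "D", "postcode": "8020", "age": 34, "gender": "m", "profession": "engineer"},
]

def matches(records, **attrs):
    """Return all records that share the given attribute values."""
    return [r for r in records if all(r[k] == v for k, v in attrs.items())]

# Source 1 publishes post code and age: three candidates remain.
print(len(matches(population, postcode="8010", age=34)))            # 3

# Source 2 publishes gender and profession: two candidates remain.
print(len(matches(population, gender="m", profession="engineer")))  # 2

# Combined, the four attributes single out exactly one person.
combined = matches(population, postcode="8010", age=34,
                   gender="m", profession="engineer")
print(len(combined))                                                # 1
```

This is the same mechanism that k-anonymity research formalizes: attributes that are harmless in isolation act as quasi-identifiers once they are joined.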

All of that leads me to the conclusion that data protection and confidentiality are a lost cause in a digital and highly interconnected world. And with all the data out there, informational self-determination will become impossible. As Tim said, we cannot know which data will be published about us in the future. If a potential employer can buy my health records from a data provider, then the whole notion of privacy as we know it is bound to fail. Furthermore, the data that we publish voluntarily is only the tip of the iceberg. More important is the data that we expose involuntarily (e.g. connection data), the data that others expose voluntarily or involuntarily about us (see a data loss scandal near you), and the information that can be inferred from data of various sources (e.g. our social graph). As time passes, I see growing evidence that we are headed towards the loss of privacy.

What will happen?

Interestingly enough, most discussions that I had on the consequences of these developments followed the same pattern. The two most prominent views are: a) you are wrong, because I am the one who controls which data is out there (by setting everything to private on Facebook, disallowing photo tagging etc.), and b) let’s simply abandon privacy. If all of the data is out there, we actually level the playing field for everyone. As we discover that everyone has faults, we will attribute less importance to these faults, and society will come out better as a whole.

I think that both of these statements are wrong. Regarding the former, I pointed out earlier that the main problem is not the data that we voluntarily put on the web, but rather what others expose about us, what we expose involuntarily, and what can be inferred from different data sources. I do not believe that anyone will have everybody’s data at their fingertips. But I do think that all data will be somehow obtainable. The gray market for data that was lost or stolen from third parties is already huge, and it will continue to grow. It will be supplemented by companies who explicitly seek to infer data from various sources, exploiting the jigsaw effect.

With the latter, the argument is not so easy. I think it is an intriguing idea, but I have a hard time believing that abandoning privacy will make the world a better place, mainly for two reasons:

  1. Even though all the data is out there, we will not have a level playing field. We will still have different capabilities in processing the data to get something meaningful out of it. After all, we need to make sense of the data first, and even though everyone can theoretically access it, processing capabilities will not be evenly distributed. There will therefore be parties with a competitive advantage that they can use to exert power over those with fewer processing capabilities.
  2. Even assuming that society will become more tolerant, there will still be things that are more frowned upon than others on a moral scale. A lot will also depend on how facts are presented to others. The “shitstorms” that we already witness on social media are often based on incomplete or outright false facts.

What can we do?

Now we get to the question that I think is the really important one: how can we deal with the loss of privacy? The only concept that I know of so far is information accountability. It was postulated by Weitzner et al. and builds on informational self-determination. Information accountability is a different paradigm: instead of protecting data at the sender’s side, the receiver is held accountable for how the data is used. That means that you only take note in case something happens (you do not get a job or an insurance policy because of leaked data). In that event, the offending party would have to present which data they used to make the decision. Bearing that in mind, one of the major questions is: how can we ensure accountability on a technological level?

One proposal comes from Oshani Seneviratne. In her PhD at MIT, she is developing HTTPa, an accountability-aware web protocol. In essence, the protocol enables you to tell the receiver what they are allowed to do with the transmitted data – a kind of Creative Commons for personal data. A network of provenance trackers stores logs of these permissions and can be consulted in case something goes wrong. There is a lot more to Oshani’s work, and I suggest checking out this presentation as a start.
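To make the idea concrete, here is a minimal sketch of the provenance-tracker concept. This is not HTTPa itself – all class and method names below are hypothetical – but it shows the basic shape: a transfer is logged together with its usage permissions, and the log is only consulted when misuse is suspected.

```python
# Hypothetical sketch of an accountability log in the spirit of HTTPa:
# the sender attaches usage restrictions to transmitted data, a provenance
# tracker records each transfer, and the log can be audited after the fact.

from dataclasses import dataclass, field

@dataclass
class ProvenanceTracker:
    log: list = field(default_factory=list)

    def record_transfer(self, sender, receiver, item, allowed_uses):
        """Log that 'item' went from sender to receiver with these permissions."""
        self.log.append({"sender": sender, "receiver": receiver,
                         "item": item, "allowed_uses": set(allowed_uses)})

    def audit(self, receiver, item, actual_use):
        """Was the receiver's actual use of the item permitted?"""
        for entry in self.log:
            if entry["receiver"] == receiver and entry["item"] == item:
                return actual_use in entry["allowed_uses"]
        return False  # no recorded transfer at all

tracker = ProvenanceTracker()
tracker.record_transfer("alice", "insurer", "health-record",
                        allowed_uses={"treatment"})

# The audit only happens when something goes wrong, e.g. a denied policy:
print(tracker.audit("insurer", "health-record", "risk-scoring"))  # False
print(tracker.audit("insurer", "health-record", "treatment"))     # True
```

Note how this matches the accountability paradigm: nothing stops the transfer up front, but the log makes a later violation demonstrable.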

So should we abolish data protection and confidentiality now and move solely to information accountability? I do not think so. I see accountability as a good addition that may be suitable for dealing with the new requirements of a digital and heavily interconnected world. As Oshani points out, accountability is quite compatible with anonymity – because you only have to reveal your identity in case something goes wrong. Apart from the technical solutions, we also need to discuss legal frameworks for accountability to work. Otherwise, there will be no way to hold people accountable in court. Therefore, we need a broad debate on what is acceptable and what is not on a social level. Thankfully, that debate has already started and is getting more and more attention – exactly in the sense of Seda Gürses’s privacy as a practice. After all, technology can only give us the tools; what we want to do with them is up to us.

If you made it that far, thanks for reading. Below are a few slides that are meant as a short summary. Of course, I would love to hear your comments and ideas on the subject! Does it make sense to you? Which concepts am I missing?

In the spirit of the upcoming RDSRP’11, I decided to list a few Research 2.0 communities that I check in with more or less frequently. That means communities specifically on the topic of Research 2.0, not just Web 2.0 tools for science. Without further ado:

I am sure, I missed tons of places here. What are your favourite Research 2.0 hangouts?

Welcome back in 2011! I haven’t written too many posts lately (due to a lot of work), so it is a nice incentive that the stats team from WordPress.com thinks this blog did quite well last year. Below is a high-level overview of the blog’s stats since its inception in March 2010 – courtesy of Andy, Joen, Martin, Zé, and Automattic at WordPress.com, with some amendments and reformulations from me:

Healthy blog!

The Blog-Health-o-Meter™ reads This blog is doing awesome! – as you can see this is a rather close call though 😉

Crunchy numbers


A helper monkey made this abstract painting, inspired by your stats.

The Leaning Tower of Pisa has 296 steps to reach the top. This blog was viewed about 1,100 times in 2010. If those were steps, it would have climbed the Leaning Tower of Pisa 4 times.

In 2010, there were 11 new posts – not bad for the first year! (Seeing post counts from various other scientific bloggers leaves me with some doubt about that statement.)

The busiest day of the year was October 11th with 37 views. The most popular post that day was Blinded peer reviews – a thing of the past?.

Where did they come from?

The top referring site in 2010 was twitter.com (by far). This is not surprising, as I announce all new posts there.

People who came via search engines mostly searched for science 2.0, for me, or for a combination of both. Content-wise, the most popular searches related to conducting a group discussion.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

  1. Blinded peer reviews – a thing of the past? (October 2010, 1 comment)
  2. Barcamp Graz 2010 – A weekend in review (May 2010, 1 comment)
  3. A Publication Feed Ecosystem for Technology Enhanced Learning [UPDATED] (July 2010, 1 comment)
  4. Reminder: Research 2.0 Workshop at ECTEL 2010 (June 2010)
  5. IJTEL Young Researcher Special Issue CfP and CfR (September 2010)
With that little overview I would like to say “Thank you!” to my readers. I wish all of you a successful year 2011!

Hi, my name is Peter Kraker and I am a research assistant at Know-Center (Graz University of Technology). Currently, I am involved in STELLAR, an EU-funded Network of Excellence revolving around Technology Enhanced Learning. My main research interest and the topic of my PhD thesis is “Science 2.0”: the way in which researchers use Web 2.0 for their work and the effects this has on science itself.

I will use this blog to report on my research, to cover important developments in the area, and to publish interesting stuff I come across. I am looking forward to your input and I sincerely hope that this will lead to a fruitful exchange!
