Archive

Open Science

We have now connected Open Knowledge Maps to one of the largest academic search engines in the world: BASE. This means that you can visualize a research topic from 100+ million documents. And for the first time, you can search within different types of resources, including datasets and software. I would like to thank our collaborators BASE and rOpenSci for their outstanding support in making this happen!

We have also spent a lot of time improving the naming of the sub-areas to make the concepts in a field more visible – which means that this update improves our existing PubMed integration too.

In addition, we have added much more information to the site about the project and our approach. Open Knowledge Maps follows the motto “open science, all the way”. From our roadmap to our source code and our data, we publish everything under an open license that is compatible with the Open Definition.

Try it out now and let me know of any feedback you may have!

Create a visualization based on 28 million articles

Today, I am very proud to announce a milestone for Open Knowledge Maps. Thanks to an outstanding team and continued support by our partners and advisors, we have added two major content sources: the Directory of Open Access Journals (DOAJ) with more than 2.3 million articles and PubMed with more than 26.5 million articles. Taking into account a certain overlap between the two sources, we can now credibly state that one can create maps based on 28 million papers. That’s a content pool that is 175 times larger than in the previous iteration using PLOS (about 160,000 articles).

We have also completely overhauled our design & overall presentation and improved the user experience considerably. In addition, we have included the open annotation software Hypothes.is in our PDF preview.

We believe that this is a major step towards revolutionizing the way we discover research. There are many new things to try out and explore – we are looking forward to your feedback and suggestions!

Try it out now!

A little longer than a month ago, I posted an Open Call for Collaborators for an Open Science Prize Proposal on Discovery on this blog and to various open science mailing lists. The call has been very fruitful and I am happy to announce that we have submitted a proposal. In the spirit of open science, you can find the full proposal and the supplementary materials on Github. See below for the executive summary and our video.

Team Open Discovery: Peter Kraker, Mike Skaug, Scott Chamberlain, Maxi Schramm, Michael Karpeles, Omiros Metaxas, Asura Enkhbayar & Björn Brembs

Executive Summary: Discovery is an essential task for every researcher, especially in dynamic research fields such as biomedicine. Currently, however, there are very few discovery tools that can be used by a mainstream audience, most notably search engines. The problem with search engines is that they present resources in a linear, one-dimensional way, making it necessary to sift through every item in a list. Another problem is that the results of the traditional discovery process are usually closed. Therefore, the discovery process is repeated over and over again by different researchers, taking away valuable time and resources from the actual research. To solve these challenges and bring the discovery process into the open science era, we propose BLAZE, the comprehensive open science discovery tool. BLAZE will leverage the existing open science ecosystem to provide multi-dimensional topical maps of research fields, involving not only publications, but also datasets, presentations, source code and media files. BLAZE will provide a single, intuitive interface for researchers to explore, edit and share maps. The edit history of a map will be preserved to allow Wikipedia-style collaboration. The maps themselves will be open, so users can embed them on their own websites and export the structure into other open science tools. Opening the discovery process will enable researchers to reuse maps, saving valuable time and effort because they can build on top of each other’s work. Furthermore, they will be able to identify collaborators long before the research is usually communicated. There is an existing, early-stage prototype for BLAZE and with the Open Science Prize, we plan to develop this prototype into a comprehensive tool. BLAZE will show the enormous potential of open science for innovation in scholarly communication by providing a structured, open and multi-dimensional approach to discovery.

I am currently preparing a proposal for the Open Science Prize in the field of open discovery, and I am looking for motivated collaborators who want to join the project and change the way we do discovery. Here is the current summary:

Discovery is an essential task for every researcher, especially in dynamic research fields such as biomedicine. Currently, however, there are only a limited number of tools that can be used by a mainstream audience. We propose BLAZE, an open discovery tool that goes far beyond the functionality of search engines and social reading lists. The tool builds on PubMed Central and other open content sources and will provide topical maps for a given list of papers, e.g. a search result or a journal volume. The maps are created automatically using fulltexts to calculate similarities and derive topical structures among papers. Furthermore, they will be enriched with features that are extracted from the papers (e.g. all papers with the same species are highlighted). BLAZE will enable users to do their discovery in a single interface. Users can interact with the maps, explore different topical areas, filter and read individual papers in the same interface. An edit mode will be provided for users to make changes to the maps and to introduce new papers and topical areas. Users can openly share maps with others and export the structure in various open formats. BLAZE will be based on the existing open source visualization Head Start, and make extensive use of the digital open science ecosystem, including, but not limited to, open content, content mining services, open source solutions, and open metrics data. With this tool, we want to show the potential of open science for innovation in scholarly communication and discovery. In addition, we believe that this tool will increase the visibility of and awareness for open content and open science in general.

A first draft is also available.

I am looking for backend and frontend web developers who code in JavaScript and/or PHP and R. We will be extending an existing tool for creating web-based knowledge domain visualizations that uses D3.js on the frontend, and R content mining packages on the backend, in particular rOpenSci and tm, so you should have experience with at least one of these libraries. A background in biomed would be nice but it’s not mandatory.

Everything about this project will be open: we will prepare the proposal in the open, the development will take place on a public Github repository, and all project outputs will be published under an open license.

So if you want to join the project and create an awesome open science tool together with me, please send an e-mail to opendiscovery@gmx.at outlining which part of the project interests you most, what you’d be able to contribute and how many hours you could devote to the project over the coming months. Please also include a link to your Github repository. It would be great if you could let me know whether you are a citizen of, or permanent resident in, the United States (US), as we will need to have at least one team member who satisfies this criterion. I am looking forward to your messages!

With “Ich bin Open Science!”, we want to raise public awareness for open science in Austria and beyond. The project, a collaboration between Know-Center and FH Joanneum, has been submitted to netidee 2015. In the video (German only for the moment) we explain the project idea, and you can see first testimonials who lend a face to open science. Why are you committed to openness in science and research?

Note: This is a reblog from the OKFN Science Blog.

It’s hard to believe that it has been over a year since Peter Murray-Rust announced the new Panton fellows at OKCon 2013. I am immensely proud that I was one of the 2013/14 Panton Fellows and the first non-UK-based fellow. In this post, I will recap my activities during the last year and give an outlook on things to come after the end of the fellowship. At the end of the post, you can find all outputs of my fellowship at a glance. My fellowship had two focal points: the work on open and transparent altmetrics and the promotion of open science in Austria and beyond.

Open and transparent altmetrics

Peter Kraker on stage at the Open Science Panel Vienna (Photo by FWF/APA-Fotoservice/Thomas Preiss)

The blog post entitled “All metrics are wrong, but some are useful” sums up my views on (alt)metrics: I argue that no single number can determine the worth of an article, a journal, or a researcher. Instead, we have to find those numbers that give us a good picture of the many facets of these entities and put them into context. Openness and transparency are two necessary properties of such an (alt)metrics system, as this is the only sustainable way to uncover inherent biases and to detect attempts at gaming. In my comment on the NISO whitepaper on altmetrics standards, I therefore maintained that openness and transparency should be strongly considered for altmetrics standards.

In another post on “Open and transparent altmetrics for discovery”, I laid out that altmetrics have a largely untapped potential for visualization and discovery that goes beyond rankings of top papers and researchers. In order to help uncover this potential, I released the open source visualization Head Start that I developed as part of my PhD project. Head Start gives scholars an overview of a research field based on relational information derived from altmetrics. In two blog posts, “New version of open source visualization Head Start released” and “What’s new in Head Start?”, I chronicled the development of a server component, the introduction of the timeline visualization created by Philipp Weißensteiner, and the integration of Head Start with Conference Navigator 3, a nifty conference scheduling system. With Chris Kittel and Fabian Dablander, I took first steps towards automatic visualizations of PLOS papers. Recently, Head Start also became part of the Open Knowledge Labs. In order to make the maps created with Head Start openly available to all, I will set up a server and website for the project in the months to come. The ultimate goal would be to have an environment where everybody can create their own maps based on open knowledge and share them with the world. If you are interested in contributing to the project, please get in touch with me, or have a look at the open feature requests.

Evolution of the UMAP conference visualized in Head Start. More information in Kraker, P., Weißensteiner, P., & Brusilovsky, P. (2014). Altmetrics-based Visualizations Depicting the Evolution of a Knowledge Domain. 19th International Conference on Science and Technology Indicators (STI 2014), 330-333.

Promotion of open science and open data

Regarding the promotion of open science, I teamed up with Stefan Kasberger and Chris Kittel of openscienceasap.org and the Austrian chapter of Open Knowledge for a series of events that were intended to generate more awareness in the local community. In October 2013, I was a panelist at the openscienceASAP kick-off event at University of Graz entitled “The Changing Face of Science: Is Open Science the Future?”. In December, I helped organize an OKFN Open Science Meetup in Vienna on altmetrics. I also gave an introductory talk on this occasion that got more than 1000 views on Slideshare. In February 2014, I was interviewed for the openscienceASAP podcast on my Panton Fellowship and the need for an inclusive approach to open science.

In June, Panton Fellowship mentors Peter Murray-Rust and Michelle Brook visited Vienna. The three-day visit, made possible by the Austrian Science Fund (FWF), kicked off with a lecture by Peter and Michelle at the FWF. On the next day, the two led a well-attended workshop on content mining at the Institute of Science and Technology Austria. The visit ended with a hackday organized by openscienceASAP, and an OKFN-AT meetup on content mining. Finally, last month, I gave a talk on open data at the “Open Science Panel” on board the MS Wissenschaft in Vienna.

I also became active in the Open Access Network Austria (OANA) of the Austrian Science Fund. Specifically, I am contributing to the working group “Involvement of researchers in open access”. There, I am responsible for a visibility concept for open access researchers. Throughout the year, I have also contributed to a monthly sum-up of open science activities in order to make these activities more visible within the local community. You can find the sum-ups (only available in German) on the openscienceASAP stream.

I also went to a lot of events outside Austria where I argued for more openness and transparency in science: OKCon 2013 in Geneva, SpotOn 2013 in London, and Science Online Together 2014 in Raleigh (NC). At the Open Knowledge Festival in Berlin, I was session facilitator for “Open Data and the Panton Principles for the Humanities. How do we go about that?”. The goal of this session was to devise a set of clear principles which describe what we mean by Open Data in the humanities, what these should contain and how to use them. In my role as an advocate for reproducibility, I wrote a blog post on why reproducibility should become a quality criterion in science. The post sparked a lot of discussion, and was widely linked and tweeted.

by Martin Clavey

What’s next?

The Panton Fellowship was a unique opportunity for me to work on open science, to visit open knowledge events around the world, and to meet many new people who are passionate about the topic. Naturally, the end of the fellowship does not mark the end of my involvement with the open science community. In my new role as a scientific project developer for Science 2.0 and open science at Know-Center, I will continue to advocate openness and transparency. As part of my research on altmetrics-driven discovery, I will also pursue my open source work on the Head Start framework. With regard to outreach work, I am currently busy drafting a visibility concept for open access researchers in the Open Access Network Austria (OANA). Furthermore, I am involved in efforts to establish a German-speaking open science group.

I had a great year, and I would like to thank everyone who got involved. Special thanks go to Peter Murray-Rust and Michelle Brook for administering the program and for their continued support. As always, if you are interested in helping out with one or the other project, please get in touch with me. If you have comments or questions, please leave them in the comments field below.

All outputs at a glance

Head Start – open source research overview visualization
Blog Posts
Audio and Video
Slides
Reports
Open Science Sum-Ups (contributions) [German]

Note: This is a reblog from the OKFN Science Blog. As part of my duties as a Panton Fellow, I will be regularly blogging there about my activities concerning open data and open science.

by AG Cann

Altmetrics are a hot topic in the scientific community right now. Classic citation-based indicators such as the impact factor are complemented by alternative metrics generated from online platforms. Usage statistics (downloads, readership) are often employed, but links, likes and shares on the web and in social media are considered as well. The altmetrics promise, as laid out in the excellent manifesto, is that they assess impact more quickly and on a broader scale.

The main focus of altmetrics at the moment is evaluation of scientific output. Examples are the article-level metrics in PLOS journals, and the Altmetric donut. ImpactStory has a slightly different focus, as it aims to evaluate the oeuvre of an author rather than an individual paper.

This is all well and good, but in my opinion, altmetrics have a huge potential for discovery that goes beyond rankings of top papers and researchers. A potential that is largely untapped so far.

How so? To answer this question, it is helpful to shed a little light on the history of citation indices.

Pathways through science

In 1955, Eugene Garfield created the Science Citation Index (SCI) which later went on to become the Web of Knowledge. His initial idea – next to measuring impact – was to record citations in a large index to create pathways through science. Thus one can link papers that are not linked by shared keywords. It makes a lot of sense: you can talk about the same thing using totally different terminology, especially when you are not in the same field. Furthermore, terminology has proven to be very fluid even in the same domain (Leydesdorff 1997). In 1973, Small and Marshakova realized – independently of each other – that co-citation is a measure of subject similarity and therefore can be used to map a scientific field.
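
To make the co-citation idea concrete, here is a minimal sketch in Python. It assumes that each citing paper comes with a set of the papers it references; the function name and the toy data are made up for illustration and do not come from any of the cited studies. Two papers are co-cited once for every citing paper that references both of them, and a higher count is read as a hint of greater subject similarity.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of papers is cited together.

    reference_lists maps a citing paper to the set of papers it cites.
    Returns a Counter keyed by alphabetically sorted (paper, paper) tuples.
    """
    counts = Counter()
    for cited in reference_lists.values():
        # Every unordered pair in one reference list is one co-citation.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing papers and their reference lists.
toy_references = {
    "citing_1": {"A", "B", "C"},
    "citing_2": {"A", "B"},
    "citing_3": {"B", "C"},
}

counts = cocitation_counts(toy_references)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

In this toy example, A and B are cited together twice, as are B and C, while A and C share only one citing paper. A real map would be built from such pair counts, usually after some normalization, which this sketch leaves out.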

Because citations are considerably delayed, however, co-citation maps are often a look into the past and not a timely overview of a scientific field.

Altmetrics for discovery

In come altmetrics. Similarly to citations, they can create pathways through science. After all, a citation is nothing else but a link to another paper. With altmetrics, it is not so much which papers are often referenced together, but rather which papers are often accessed, read, or linked together. The main advantage of altmetrics, as with impact, is that they are available much earlier.

Bollen et al. (2009): Clickstream Data Yields High-Resolution Maps of Science. PLOS One. DOI: 10.1371/journal.pone.0004803.

One of the efforts in this direction is the work of Bollen et al. (2009) on click-streams. Using the sequences of clicks to different journals, they create a map of science (see above).

In my PhD, I looked at the potential of readership statistics for knowledge domain visualizations. It turns out that co-readership is a good indicator for subject similarity. This allowed me to visualize the field of educational technology based on Mendeley readership data (see below). You can find the web visualization called Head Start here and the code here (username: anonymous, leave password blank).
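
As a rough illustration of co-readership, the sketch below assumes that for each paper we have the set of readers who added it to their library. The reader ids and paper names are hypothetical, and the Jaccard normalization is only one possible choice made for this example; it is not meant to describe how Head Start computes similarities.

```python
def co_readership(readers_a, readers_b):
    """Number of readers who have both papers in their library."""
    return len(readers_a & readers_b)


def jaccard_similarity(readers_a, readers_b):
    """Shared readers divided by all readers of either paper (0.0 if none)."""
    union = readers_a | readers_b
    if not union:
        return 0.0
    return len(readers_a & readers_b) / len(union)


# Hypothetical toy data: reader ids per paper.
paper_x = {"u1", "u2", "u3"}
paper_y = {"u2", "u3", "u4"}
paper_z = {"u5"}

shared_xy = co_readership(paper_x, paper_y)
sim_xy = jaccard_similarity(paper_x, paper_y)
sim_xz = jaccard_similarity(paper_x, paper_z)
print(shared_xy, sim_xy, sim_xz)
```

Papers x and y share two of four distinct readers, so they would be placed closer together on a map than x and z, which have no readers in common.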

Why we need open and transparent altmetrics

The evaluation of Head Start showed that the overview is indeed more timely than maps based on citations. However, it also provided further evidence that altmetrics are prone to sample biases. In the visualization of educational technology, the computer science driven areas such as adaptive hypermedia are largely missing. Bollen and Van de Sompel (2008) reported the same problem when they compared rankings based on usage data to rankings based on the impact factor.

It is therefore important that altmetrics are transparent and reproducible, and that the underlying data is openly available. This is the only way to ensure that all possible biases can be understood.

As part of my Panton Fellowship, I will try to find datasets that satisfy these criteria. There are several examples of open bibliometric data, such as the Mendeley API and the figshare API, which have adopted CC BY, but most of the usage data is not publicly available or cannot be redistributed. In my fellowship, I want to evaluate the goodness of fit of different open altmetrics data. Furthermore, I plan to create more knowledge domain visualizations such as the one above.

So if you know any good datasets please leave a comment below. Of course any other comments on the idea are much appreciated as well.
