Thursday, October 8, 2015

The Open Science Pyramid

I made a mental promise to myself to write a blog post here every week or so, and I find myself breaking this promise in the second week. After I attended the BIG4 kick-off meeting in Copenhagen, Denmark, I traveled to Moscow on vacation (if you friend me on Facebook you might find some pictures from the trip there), and after that I spent most of my blogging time contributing to the Pensoft blog. I'll alert the readers of this blog and my Twitter followers (@vsenderov) when the post that I contributed to appears there, and I will discuss the subject matter here as well. It will be about data papers.


What happened a few weeks ago in Copenhagen? The first and second days of the symposium were dedicated to presentations by PIs and students, the third day was an excursion, and Thursday and Friday were spent in the museum looking at and sorting specimens, and hearing presentations about museum stuff. It was all pretty neat, as you can see from the photo below!


In my first post I introduced the Open Biodiversity Knowledge Management System. The system in itself - what it means, what it is, etc. - needs to be discussed much further, but for now I want to say that my PhD project will be dedicated to building this system, and this blog is dedicated to my PhD project. The logical structure of the project, agreed upon after discussions with my advisor, Prof. Penev, is as follows:
  1. Chapter One: Introduction of the open science principles, work on universal identifiers.
  2. Chapter Two: New forms of digital publications.
  3. Chapter Three: New forms of displaying genomics data.

So the posts here will follow this scheme somewhat. The next few posts will be about open science, and after that everything else will follow.

Therefore I will begin now by introducing the open science pyramid, a simple visual aid to illustrate some of the aspects of open science:





The idea behind it is that your digital publication, which is not behind a paywall (open access), is only the tip of the iceberg when it comes to dissemination of knowledge. Your paper usually consists of a nice story plus results in the form of figures, tables and statements. While the story is what appeals to the reader and is what our brains have evolved to process, it is actually the least scientific part of your paper, since it is neither verifiable nor reproducible. What is verifiable and reproducible are the figures, tables and statements, which presumably have been computed by an algorithm. So, if another author is to be able to collect more evidence in favor of your statements, or even better, disprove them, you have to give them access to more than the story plus the results - you have to give them access to the algorithms that are behind the computation (open source). And lastly, if you want other people to be able to reproduce your computations, you have to open up the data that your algorithm worked on (open data). A caveat: they will of course have to subsample the population again to eliminate biases.
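
As a minimal sketch of what this could look like in practice, assume a paper reports a single number (a mean) and publishes both the data and the code behind it. Everything in the snippet is hypothetical and only for illustration: the measurements, the function names, and the choice of a SHA-256 fingerprint to identify the dataset.

```python
import hashlib
import statistics

# Hypothetical open data: made-up measurements standing in for a published dataset.
OPEN_DATA = [4.1, 3.9, 4.4, 4.0, 4.2]


def data_fingerprint(values):
    """Hash the data so a reader can confirm they hold the same dataset."""
    text = ",".join(f"{v:.3f}" for v in values)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def compute_result(values):
    """The 'open algorithm': here simply the mean, rounded to two decimals."""
    return round(statistics.mean(values), 2)


def reproduces(values, reported_result, reported_fingerprint):
    """True only if the reader has the same data and recomputes the same result."""
    same_data = data_fingerprint(values) == reported_fingerprint
    same_result = compute_result(values) == reported_result
    return same_data and same_result


if __name__ == "__main__":
    # What the (hypothetical) paper would publish alongside the story.
    reported_result = compute_result(OPEN_DATA)
    reported_fingerprint = data_fingerprint(OPEN_DATA)

    # What a reader with access to the code and the data can check.
    print(reproduces(OPEN_DATA, reported_result, reported_fingerprint))
```

With only the story and the reported number, a reader can do none of this. With the code but not the data, they can read the algorithm but not rerun it. Only with both can they repeat the computation.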

That is it for my second post: it is too short, but as time and the project progress I will make these posts longer. Since my thesis is starting to shape up as an open thesis, the next post will be about open theses.
