Wednesday, June 29, 2016

Latest Developments as of 2016-06-29

I haven’t updated my blog since January 25th, wow! Memo to self: update your blog more often!

A lot of time has passed and a lot has happened, more than I can cover fully here. For example, BIG4 had its second workshop in Havraniky, in the Czech Republic, apparently now called Czechia.

The following pictures should sum up the trip pretty well.

Here, I am pictured examining a small insect (probably a Coleopteran) under a microscope in the field station in Havraniky. This work was fun but very time-consuming, and I learned to appreciate the expertise that professional taxonomists have. I sincerely thank Emanuel from BIG4 for teaching me about Coleoptera identification and Prof. Ximo Mengual for teaching me about Diptera recognition. Unfortunately, I didn't have time to learn Lepidoptera or Hymenoptera.
After collecting in the field and identifying in the station, we visited local wineries. The area around Podyji is known for its white wines.

In the meantime, work on the Open Biodiversity Knowledge Management System (OBKMS) is well underway. We already have a working prototype of the system, thanks to some help from Plazi, Ontotext, Kiril Simov, and Eamon O'Tuama. Rod Page wrote an article about his vision of the biodiversity knowledge graph, which I reviewed via RIO’s post-publication peer-review mechanism. The review itself has the DOI "10.3897/rio.2.e8767.r25935". I'll re-publish an updated version of the review, and my thoughts around it, as one of my next blog posts.

In a nutshell, his article is technological in focus and an excellent read for people interested in the OBKMS. At the end of the day, however, one can realize the vision of the OBKMS with a number of technologies, each having its advantages and disadvantages.

My work over the last six months or so has concentrated on using RDF stores, GraphDB in particular, to implement OBKMS. I chose it for a number of reasons after having played around a little with Neo4j at the end of last year. One of the things that I like about RDF is that there is a huge body of both data (linked open datasets) and “data schemas,” known in the RDF world as ontologies, which give you ideas about how to model your data.


As OBKMS will be based mostly on scholarly papers, we intend to make heavy use of the SPAR Ontologies. For the biodiversity part, we will draw from Darwin Core Filtered Push, Darwin Core for the Semantic Web, and the Treatment Ontology. In the last week, I have come up with a conceptual model of the relationships between Taxa, Names, Treatments, and Taxon Concepts, which I plan to express in OWL on top of these existing ontologies. I also have an idea for an R package to abstract RDF4J queries, which I will publish next on the blog.
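
To give a rough feel for what "abstracting an RDF4J query" could mean, here is a minimal sketch in Python rather than R. It is not the planned package, only an illustration: the function names, the repository name "obkms", and the local server address are all assumptions. It relies on the RDF4J REST convention that a repository answers SPARQL queries at <server>/repositories/<repository-id>, and it only builds the HTTP request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(server_url: str, repository: str, query: str) -> Request:
    """Build (but do not send) a SPARQL query request for an RDF4J-style repository."""
    # RDF4J's REST API exposes a repository's query endpoint at
    # <server>/repositories/<repository-id>; the query goes in the "query" parameter.
    endpoint = f"{server_url.rstrip('/')}/repositories/{repository}"
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for the results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def names_query(limit: int = 10) -> str:
    """Hypothetical helper: a SPARQL query listing Darwin Core scientific names."""
    return (
        "PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>\n"
        "SELECT ?s ?name WHERE { ?s dwc:scientificName ?name } "
        f"LIMIT {limit}"
    )


# Assumed values for illustration: a local GraphDB instance and a repository called "obkms".
request = build_sparql_request("http://localhost:7200", "obkms", names_query(5))
```

The point of such a wrapper is that the caller works with small, named functions such as names_query and never has to assemble endpoint URLs or HTTP headers by hand. Passing the resulting request to urllib.request.urlopen would actually run the query against a live server.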
