Tuesday, June 26th, 2007

In the latest effort to make myself obsolete by the end of this year, we are looking for a software developer to help us make our data more accessible to both humans and machines. The main responsibilities of this position will be the further development of the UniProt web site and the UniProt RDF distribution.

Compare Ontology Versions

Thursday, June 7th, 2007

There is a rather useful (but perhaps somewhat hidden) plug-in for Protege that can be used to compare two versions of an OWL file: PROMPT.

N-Triple Converter Comparison

Monday, March 12th, 2007

In order to bulk-load RDF data into Oracle (Spatial) 11g, the data needs to be converted to N-Triples first. If the data set is large, this step can add quite a bit of overhead, which is why I decided to benchmark and compare several options.
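For readers unfamiliar with the format, here is a minimal sketch of what the conversion step has to produce: one triple per line, with literals escaped to plain ASCII. The function names `escape_literal` and `to_ntriple` are made up for this illustration, and this is not one of the converters that were benchmarked.

```python
def escape_literal(text):
    """Escape a string for use inside an N-Triples literal (ASCII only)."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= code <= 0x7E:
            # Printable ASCII is written as-is.
            out.append(ch)
        elif code <= 0xFFFF:
            # Everything else becomes a \uXXXX escape ...
            out.append("\\u%04X" % code)
        else:
            # ... or \UXXXXXXXX outside the Basic Multilingual Plane.
            out.append("\\U%08X" % code)
    return "".join(out)


def to_ntriple(subject, predicate, obj, obj_is_uri=False):
    """Serialize one triple as a single N-Triples line (hypothetical helper)."""
    if obj_is_uri:
        obj_term = "<%s>" % obj
    else:
        obj_term = '"%s"' % escape_literal(obj)
    return "<%s> <%s> %s ." % (subject, predicate, obj_term)


if __name__ == "__main__":
    # The example.org URIs are placeholders, not real UniProt identifiers.
    print(to_ntriple("http://example.org/protein/1",
                     "http://example.org/name",
                     'Cell division "control" protein'))
```

A real converter also has to parse the input syntax (RDF/XML, for example) and handle blank nodes, datatypes and language tags, which is where most of the overhead comes from.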

Metalink for UniProt RDF

Wednesday, March 7th, 2007

The UniProt RDF distribution is over 5 GB in size. To help people retrieve the data more efficiently, we now mirror the data and provide a Metalink file that describes all the file locations.


Saturday, April 1st, 2006

We’ve been looking for a decent ontology editor for a while now. The problem is that most editors are either too technical or too cumbersome to use for entering a lot of data. But it looks like we have finally found something suitable!

Data and Reality

Monday, February 6th, 2006

Brief review of Data and Reality by William Kent. This book was written in 1978, but is still remarkably relevant in many ways.

Call for Better Information Retrieval Systems

Friday, January 27th, 2006

From a recent review article in Nature Genetics:

[…] current ad hoc IR systems are not able to retrieve our example sentence when they are given the query ‘yeast cell cycle’. Instead, this could be achieved by realizing that ‘yeast’ is a synonym for S. cerevisiae, that ‘cell cycle’ is a Gene Ontology term, that the word ‘Cdc28’ refers to an S. cerevisiae protein and finally, by looking up the Gene Ontology terms that relate to Cdc28 to connect it to the yeast cell cycle. Although this will not be easy, we see this form of query expansion as the next logical step for ad hoc IR.
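The kind of query expansion described in the quote can be sketched in a few lines. The lookup tables below are tiny hand-made stand-ins (an assumption for illustration only); a real system would draw synonyms, Gene Ontology terms and protein annotations from curated resources.

```python
# Hypothetical lookup tables, filled in by hand for this example only.
SYNONYMS = {"yeast": ["S. cerevisiae"]}
GO_TERMS = ["cell cycle"]
PROTEINS_BY_ANNOTATION = {("S. cerevisiae", "cell cycle"): ["Cdc28"]}


def expand_query(query):
    """Return the original query plus related organism and protein names."""
    text = query.lower()
    terms = [query]

    # Step 1: replace common names with the organism names they stand for.
    organisms = []
    for word, names in SYNONYMS.items():
        if word in text:
            organisms.extend(names)
    terms.extend(organisms)

    # Step 2: recognize Gene Ontology terms mentioned in the query.
    go_terms = [term for term in GO_TERMS if term in text]

    # Step 3: add proteins of those organisms annotated with those terms.
    for organism in organisms:
        for go_term in go_terms:
            terms.extend(PROTEINS_BY_ANNOTATION.get((organism, go_term), []))
    return terms


if __name__ == "__main__":
    print(expand_query("yeast cell cycle"))
```

With these tables, the query ‘yeast cell cycle’ picks up both ‘S. cerevisiae’ and ‘Cdc28’, so a sentence mentioning only Cdc28 becomes retrievable.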


Saturday, January 7th, 2006

Vivisimo has set up a new site for searching content from life-science-related journals and databases – though none of ours so far.