Linking World Digital Library Data

Posted by Michael Giarlo on August 10, 2009

As I mentioned earlier, I've been learning about linked data in the context of dropping it into the World Digital Library project. I am hopeful we'll be able to deploy the RDF views[1] before too long. In advance of that, I thought it might be helpful to share a sample of what our RDF would look like. The RDF below represents the WDL item for the U.S. Constitution. I appreciate constructive criticism.

A few things to note:

  • Mmm, Unicode.
  • Item types are from the Bibliographic Ontology.
  • Most of the properties are from the Dublin Core Metadata Element Set ontology, which I use especially where the objects are literals rather than resources identified by URI.
  • Where I could dig up URIs for the objects, I used the Dublin Core Metadata Terms ontology instead.
  • An item is modeled as an aggregation of its constituent files, as defined in OAI-ORE. The notion here is that an ORE aggregation of an item, as expressed in a resource map which is discoverable via a link header in each item detail page, is a "whole" item, including all of its files[2], metadata, and translations.
  • I'm also making light use of the NEPOMUK File Ontology to express that constituent files are files, and to be explicit about file sizes so that folks know how large a file is before retrieving it.
  • Links out to DDC (Dewey Decimal Classification), Lingvoj, DBpedia, and Library of Congress Authorities & Vocabularies (e.g., LC Subject Headings) are included where possible.[3] I'd be especially stoked to hear of other vocabs I might link to. The more linked the data, the better.
  • The output below is Turtle for readability, but the application will offer up RDF/XML.
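To give a rough flavor of the modeling described in the list above, here is a minimal Python sketch that builds the triples for one item and prints them as N-Triples. It is not the WDL application's code or its actual output: the example.org URIs, the title, the file sizes, the choice of bibo:Document as the item type, the xsd:integer datatype on the file size, and the helper names (`uri`, `literal`, `describe_item`, `to_ntriples`) are all assumptions made for illustration.

```python
# Namespace URIs for the vocabularies named in the list above.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bibo": "http://purl.org/ontology/bibo/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ore": "http://www.openarchives.org/ore/terms/",
    "nfo": "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def uri(curie):
    """Expand a prefixed name such as 'ore:aggregates' to an N-Triples URI."""
    prefix, local = curie.split(":", 1)
    return "<" + NS[prefix] + local + ">"


def literal(value, datatype=None):
    """Format a simple (single-line) value as an N-Triples literal."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    text = '"' + escaped + '"'
    if datatype:
        text += "^^" + uri(datatype)
    return text


def describe_item(item, rem, title, files):
    """Return triples for a resource map describing an item aggregation.

    `files` is a list of (file URI, size in bytes) pairs.
    """
    item_ref, rem_ref = f"<{item}>", f"<{rem}>"
    triples = [
        (rem_ref, uri("rdf:type"), uri("ore:ResourceMap")),
        (rem_ref, uri("ore:describes"), item_ref),
        (item_ref, uri("rdf:type"), uri("ore:Aggregation")),
        (item_ref, uri("rdf:type"), uri("bibo:Document")),  # assumed item type
        (item_ref, uri("dc:title"), literal(title)),  # literal object, so dc elements
    ]
    for file_uri, size in files:
        file_ref = f"<{file_uri}>"
        triples += [
            (item_ref, uri("ore:aggregates"), file_ref),
            (file_ref, uri("rdf:type"), uri("nfo:FileDataObject")),
            (file_ref, uri("nfo:fileSize"), literal(size, "xsd:integer")),
        ]
    return triples


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical item with two constituent files (all values are made up).
EXAMPLE = describe_item(
    item="http://example.org/item/1",
    rem="http://example.org/item/1.rdf",
    title="Constitution of the United States",
    files=[
        ("http://example.org/item/1/page1.png", 204800),
        ("http://example.org/item/1/item.pdf", 1048576),
    ],
)
print(to_ntriples(EXAMPLE))
```

A real deployment would more likely lean on an RDF library than on hand-rolled string formatting; the point here is only the shape of the graph, with the item as an ORE aggregation of its files.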

The data after the jump:

Notes
  1. Sadly, the URIs are uglyish due to some constraints from our caching configuration. I figure we can redirect uglyish URIs to cool ones and make use of owl:sameAs if those constraints go away.
  2. Sans certain low-quality derivatives, such as small thumbnails and tiles for the zoom interface.
  3. I was poking through the DBpedia output for Geonames URIs as well, but my method was way too slow and clunky, so that's disabled for the time being. Clients can always follow their noses from the DBpedia output.


ORE plugin updated

Posted by Michael Giarlo on July 25, 2008

I've been using my time at RepoCamp today to get the OAI-ORE plugin for WordPress validating again. I'm having some trouble using the validator, so I say that with some diffidence. But the latest code, which is now checked in to the WordPress plugins svn repo, ought to be close to conforming, if not fully conformant, to the 0.9 version of the ORE spec.

I'm not sure the plugin is really useful; it's just an Atom feed of all posts and pages in a WP instance. I can think of some ways to make it more useful, such as allowing blog authors to create their own aggregations and to pull in content from outside the particular instance. I am certain that others can come up with even better uses. I'm open to suggestions.

Thanks to Jay Datema for prodding me a bit, if indirectly.

OAI-ORE ResourceMap for WordPress

Posted by Michael Giarlo on December 14, 2007

This is very rough, but here's a WordPress plugin that provides a resource map for the aggregation of all posts within an installation of WordPress. I'll be working on this some more, but for now, it does appear to work and validate (as Atom). Useful? If so, I'll zip it up and commit it to the wp-plugins svn.

Note: Ed reminds me that xsltproc can be used to transform the Atom-based resource map into RDF via GRDDL:

xsltproc http://www.openarchives.org/ore/atom-grddl.xsl http://lackoftalent.org/michael/blog/wp-content/plugins/oai-ore/rem.php
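
If you would rather drive that transformation from a script, a minimal Python sketch might look like the following. It simply shells out to the same xsltproc command shown above; the function names `grddl_command` and `atom_to_rdf` are made up for illustration, and it assumes xsltproc is installed and on the PATH and that both URLs still resolve.

```python
import shutil
import subprocess

# Stylesheet URL taken from the command above.
GRDDL_XSL = "http://www.openarchives.org/ore/atom-grddl.xsl"


def grddl_command(resource_map_url, stylesheet=GRDDL_XSL):
    """Build the xsltproc argument list: stylesheet first, then the input."""
    return ["xsltproc", stylesheet, resource_map_url]


def atom_to_rdf(resource_map_url):
    """Run xsltproc on an Atom resource map and return its output as text.

    Raises RuntimeError if xsltproc is missing, and
    subprocess.CalledProcessError if the transformation fails.
    """
    if shutil.which("xsltproc") is None:
        raise RuntimeError("xsltproc not found on PATH")
    result = subprocess.run(
        grddl_command(resource_map_url),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `atom_to_rdf` with the plugin's rem.php URL should hand back whatever xsltproc writes to standard output, ready to be saved or passed to an RDF parser.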

Update: The plugin has its own page.

RESTful Fedora?

Posted by Michael Giarlo on June 19, 2007

Matt Zumwalt of MediaShelf, LLC has been hard at work thinking about how to make Fedora RESTful. There is now a proposal on the Fedora wiki based on a PDF he sent to the fedora-commons-developers list.

It's an interesting proposal. I've read over the PDF version quickly but it does bear a closer read.

Whether SOAP or REST is more appropriate for a Fedora API is something I'm not sure about, though I don't mean to imply it's an either/or situation.

Fedora marches forward

Posted by Michael Giarlo on December 22, 2006

I was pleased to see the note that Sandy Payette sent to the fedora-users mailing list earlier today, updating the community on the Fedora 2.2 release date. Version 2.2 is going to include a bunch of features, some of which have been long-awaited and are quite, well, sexy. Some of the highlights:

  • Database support has been extended to include Postgres, which should make all the MySQL-haters happy
  • Fedora may now be deployed via a .war file in an existing servlet container, such as Tomcat, rather than requiring its very own Tomcat server
  • A Lucene- or Zebra-backed search service has been included, which is more robust than the previous search service that used the built-in Dublin Core-populated database

These are but a few of the enhancements, and I can't wait to put it through its paces when it's released on January 19th.

For a more complete set of feature enhancements, click on the link above to read Sandy's message.

Now if we can come together as a community and work on some more UIs, and get them used in some high-profile projects, many of the gripes against Fedora may be silenced. It's still not a perfect product, but what is? That it uses XML as a storage format and exposes its functions via web-services APIs and allows use of any metadata schema, in my humble opinion, puts it head and shoulders above many other library repository solutions. And for that, it's at least worth consideration.