OAI-ORE ResourceMap for WordPress
This is very rough, but here's a WordPress plugin that provides a resource map for the aggregation of all posts within an installation of WordPress. I'll be working on this some more, but for now, it does appear to work and validate (as Atom). Useful? If so, I'll zip it up and commit it to the wp-plugins svn.
Note: Ed reminds me that xsltproc can be used to transform the Atom-based resource map into RDF via GRDDL:
xsltproc http://www.openarchives.org/ore/atom-grddl.xsl http://lackoftalent.org/michael/blog/wp-content/plugins/oai-ore/rem.php
Update: The plugin has its own page.
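For readers who haven't seen a resource map, here is a minimal sketch in Python of the general idea: one Atom entry that describes an aggregation and links to each post it aggregates. It is loosely modeled on the ORE 1.0 Atom serialization as I understand it and may not match what the plugin actually emits. The function name, the example.org URIs, and the title are all made up for illustration, and required Atom elements such as id, updated, and author are left out for brevity.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
ORE_TERMS = "http://www.openarchives.org/ore/terms/"


def build_resource_map(rem_uri, aggregation_uri, post_uris, title):
    """Return an Atom entry (as a string) linking an aggregation to its posts.

    Hypothetical helper for illustration only; not the plugin's code.
    """
    ET.register_namespace("atom", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title

    # The resource map points at itself...
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {
        "rel": "self",
        "type": "application/atom+xml",
        "href": rem_uri,
    })
    # ...names the aggregation it describes...
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {
        "rel": ORE_TERMS + "describes",
        "href": aggregation_uri,
    })
    # ...and lists one link per aggregated post.
    for uri in post_uris:
        ET.SubElement(entry, f"{{{ATOM_NS}}}link", {
            "rel": ORE_TERMS + "aggregates",
            "href": uri,
        })
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    print(build_resource_map(
        "http://example.org/blog/rem",
        "http://example.org/blog/rem#aggregation",
        ["http://example.org/blog/?p=1", "http://example.org/blog/?p=2"],
        "All posts",
    ))
```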
Use cases for Handle identifiers?
Reading Adam Smith's D-Lib article has got me thinking about identifiers again. I don't agree with some of the assertions in the section titled "A Persistent Identifier Primer" — URIs are in fact persistent; we just break them through poor management — and so I'm led to a fundamental question: what are the good use cases for Handle (or ARK, or PURL) identifiers?
I get the need for persistent and globally unique identifiers; I'm just wondering why one needs special software with a separate URI namespace to gain persistence.
One potential use case might be resources that are outside of the organization's control — i.e., licensed content from vendors — but surely folks are using Handles for many resources that are created and managed within the organization. And I'm curious why they have decided that Handles are more durable than native URIs (the URIs to which Handles redirect), and how they deal with the problem of downstream (post-redirection) citation and bookmarking. How useful is this sort of identifier scheme if your users never even see the supposedly more persistent URI for a resource?
Coming from a former proponent of Handles and ARKs, this may seem like a hypocritical question. If I had to answer it myself, I would say that Handles seem like a good option because they save you some work and headaches in the short term: you don't need to get together with your web team and come up with a scalable and sustainable URI policy. You just assign native URIs in the usual haphazard way and generate Handles to compensate for the lack of identifier policies.
But if you're already making an organizational commitment to identifier persistence (and if you're rolling out Handles, I'd wager that's likely), why not do so by minting carefully considered cool URIs? Less management and technology overhead and less confusion for your users are two good reasons to consider it.
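To make the cool-URI alternative a bit more concrete, here is a rough Python sketch of one way to read it: the public path is the identifier, and the only thing that changes when content moves is a private lookup table behind it. Because the resource is served at the public path rather than redirected, users never see a second URI to cite or bookmark. The paths, storage keys, and function names are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical table: stable public path -> wherever the bytes live today.
# Only this table changes when the backend is reorganized.
STORAGE_LOCATIONS: Dict[str, str] = {
    "/id/reports/2007-001": "old-cms/node/4921",
    "/id/photos/0042": "old-cms/node/5110",
}


def locate(public_path: str) -> Optional[str]:
    """Return the internal storage key for a public path, or None if unknown."""
    return STORAGE_LOCATIONS.get(public_path)


def migrate(public_path: str, new_location: str) -> None:
    """Record that a resource moved internally; the public path is untouched."""
    if public_path not in STORAGE_LOCATIONS:
        raise KeyError(f"unknown identifier: {public_path}")
    STORAGE_LOCATIONS[public_path] = new_location


if __name__ == "__main__":
    print(locate("/id/reports/2007-001"))
    migrate("/id/reports/2007-001", "repository/objects/ab/cd/2007-001")
    print(locate("/id/reports/2007-001"))
```

The indirection here is the same kind a Handle provides; the difference is that it lives inside your own HTTP namespace rather than in a separate resolver.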
OAI-PMH in XQuery
I seem to be having issues successfully submitting comments to certain WordPress blogs lately. Or perhaps Akismet has finally decided to (rightly) classify my comments as spam? Anyone know of any Firefox / WordPress comment bugs? My comments seem to be submitted (there are no errors) and Firefox winds up on a link like "http://example.org/blog/foo-bar-whatever/#comment-12309". Any ideas? At any rate, I'm left to comment via trackback for now.
Thanks for the nod, Winona. Hopefully you folks will get some good use out of the XQuery-based OAI-PMH data provider I've been working on.
I just want to clarify that only one small bit of the code is specific to X-Hive, and that's a call to an extension that gets last-modified dates from the X-Hive service. We do not reliably store this information in the metadata itself, and so I needed to go this route. Some folks do store this in MODS or elsewhere in descriptive or administrative metadata. It should be a two-line change to short-circuit this behavior (xhive-exts:last-update() is only invoked in two places, I believe).
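The short-circuit amounts to "use a date from the metadata if there is one, otherwise ask the store." Here is a rough sketch of that logic in Python rather than XQuery, so it is not the provider's actual code. It assumes MODS records that may carry a recordInfo/recordChangeDate element, and the store_last_update callable is a stand-in for whatever your database offers (the role xhive-exts:last-update() plays for X-Hive).

```python
import xml.etree.ElementTree as ET
from typing import Callable

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}


def record_datestamp(mods_xml: str, store_last_update: Callable[[], str]) -> str:
    """Pick a datestamp for an OAI-PMH record header.

    Prefers a change date stored in the MODS record itself and only falls
    back to the (hypothetical) store lookup when none is present.
    """
    root = ET.fromstring(mods_xml)
    node = root.find(".//mods:recordInfo/mods:recordChangeDate", MODS_NS)
    if node is not None and node.text and node.text.strip():
        return node.text.strip()
    # No usable date in the metadata: ask the database instead.
    return store_last_update()


if __name__ == "__main__":
    sample = (
        '<mods xmlns="http://www.loc.gov/mods/v3">'
        "<recordInfo><recordChangeDate>2008-03-01T12:00:00Z</recordChangeDate></recordInfo>"
        "</mods>"
    )
    print(record_datestamp(sample, lambda: "1970-01-01T00:00:00Z"))
```

A real provider would also need to normalize whatever it finds to the UTC datestamp format OAI-PMH expects; the sketch skips that step.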
I'm currently working on adding EAD support, modularizing things a bit more, and streamlining configuration. resumptionTokens will come after that, I hope.
I'll be interested to hear more of UVM's implementation and how I can make this thing more useful to others.
Digital preservation for archivists
At long last, the paper that Ron Jantz and I wrote for the Journal of Archival Organization has been published in a special double issue. It's titled "Digital Archiving and Preservation: Technologies and Processes for a Trusted Repository" and is intended to be a fairly nitty-gritty piece on digital preservation (in the context of trusted repositories) for archivists. The abstract:
This article examines what is implied by the term "trusted" in the phrase "trusted digital repositories." Digital repositories should be able to preserve electronic materials for periods at least comparable to existing preservation methods. Our collective lack of experience with preserving digital objects and consensus about the reliability of our technological infrastructure raises questions about how we should proceed with digital-based preservation practices, an emerging role for academic libraries and archival institutions. This article reviews issues relating to building a trusted digital repository, highlighting some of the issues raised and possible solutions proposed by the authors in their work of implementing and acculturating a digital repository at Rutgers University Libraries.
This special double issue of JAO will also be released as a monograph, "Archives and the Digital Library."
Thanks to editors Bill Landis, Robin Chandler, Tom Frusciano, and Caryn Radick for seeing this through. And of course to Ron Jantz for getting me interested in this crazy stuff at a time when I had no direction or interest in my career.
NJLA 2007 Talk
This is a slightly modified (read: rough) transcription of the talk I gave at this year's NJLA conference, called "Library Revolution."
