Jythons and Javas and bears, oh my!
It's hard to believe but I've been at the new job for six months already, a full half-year come the 29th. Some days it seems like I've been here forever; others like I'm still a rank newb. I haven't written terribly much about what I've been up to (but I assure you I've been busy). Let me rectify that.
The Transfer Problem
Two of the projects I've been working on relate to a fairly general problem that we like to call "transfer," which revolves around, well, transferring files to and fro. Sounds simple. Is simple. That is, until you start thinking about preservation and accounting for a highly heterogeneous network with idiosyncratic nodes, esoteric storage software, and differential firewall rules. And that's where it gets interesting (and problematic).
Rails Deployment
Deploying Rails (to Apache servers) is about to get much easier. Hopefully.
Deployment has long been the bugaboo with Rails, so this should bode well for the framework.
OAI-ORE ResourceMap for WordPress
This is very rough, but here's a WordPress plugin that provides a resource map for the aggregation of all posts within an installation of WordPress. I'll be working on this some more, but for now, it does appear to work and validate (as Atom). Useful? If so, I'll zip it up and commit it to the wp-plugins svn.
Note: Ed reminds me that xsltproc can be used to transform the Atom-based resource map into RDF via GRDDL:
xsltproc http://www.openarchives.org/ore/atom-grddl.xsl http://lackoftalent.org/michael/blog/wp-content/plugins/oai-ore/rem.php
Update: The plugin has its own page.
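If you'd rather script that transformation than type it by hand, here is a minimal Python sketch that wraps the same xsltproc call. The function names (grddl_command, atom_to_rdf) are mine, purely for illustration, and the sketch assumes xsltproc is installed and is allowed to fetch both URLs over the network.

```python
import shutil
import subprocess

# The two URLs from the xsltproc command above.
STYLESHEET = "http://www.openarchives.org/ore/atom-grddl.xsl"
RESOURCE_MAP = "http://lackoftalent.org/michael/blog/wp-content/plugins/oai-ore/rem.php"


def grddl_command(stylesheet=STYLESHEET, source=RESOURCE_MAP):
    """Build the argument list: xsltproc <stylesheet> <source document>."""
    return ["xsltproc", stylesheet, source]


def atom_to_rdf(stylesheet=STYLESHEET, source=RESOURCE_MAP):
    """Run xsltproc and return whatever it writes to stdout (the RDF/XML).

    Hypothetical helper: it assumes xsltproc is on the PATH and raises
    if it is missing or exits with a non-zero status.
    """
    if shutil.which("xsltproc") is None:
        raise RuntimeError("xsltproc not found on PATH")
    result = subprocess.run(
        grddl_command(stylesheet, source),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling atom_to_rdf() with no arguments runs exactly the command shown above; pass a different source URL to transform another resource map.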
OAI-PMH in XQuery
I seem to be having issues successfully submitting comments to certain WordPress blogs lately. Or perhaps Akismet has finally decided to (rightly) classify my comments as spam? Anyone know of any Firefox / WordPress comment bugs? My comments seem to be submitted (there are no errors) and Firefox winds up on a link like "http://example.org/blog/foo-bar-whatever/#comment-12309", but the comment never shows up. Any ideas? At any rate, I'm left to comment via trackback for now.
Thanks for the nod, Winona. Hopefully you folks will get some good use out of the XQuery-based OAI-PMH data provider I've been working on.
I just want to clarify that only one small bit of the code is specific to X-Hive, and that's a call to an extension that gets last-modified dates from the X-Hive service. We do not reliably store this information in the metadata itself, and so I needed to go this route. Some folks do store this in MODS or elsewhere in descriptive or administrative metadata. It should be a two-line change to short-circuit this behavior (xhive-exts:last-update() is only invoked in two places, I believe).
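To make the short-circuit idea concrete, here is a rough sketch in Python rather than XQuery. It is not the provider's actual code: the record layout, the "metadata_last_modified" key, and the store_lookup callable (standing in for the xhive-exts:last-update() call) are all assumptions for illustration. The only part taken from the OAI-PMH spec is the UTC datestamp format.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def last_modified(
    record: dict,
    store_lookup: Optional[Callable[[str], datetime]] = None,
) -> Optional[datetime]:
    """Prefer a date kept in the metadata; otherwise ask the storage layer.

    `record` is a hypothetical dict with an "id" and, optionally, a
    "metadata_last_modified" datetime. `store_lookup` plays the role of
    the X-Hive extension call; leave it as None to skip that lookup.
    """
    stamp = record.get("metadata_last_modified")
    if stamp is not None:
        return stamp
    if store_lookup is not None:
        return store_lookup(record["id"])
    return None


def oai_datestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an OAI-PMH UTC datestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

A site that does keep dates in MODS (or elsewhere) would fill in the metadata value and never touch the storage call; a site like ours passes the lookup function and gets the date from the store.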
I'm currently working on adding EAD support, modularizing things a bit more, and streamlining configuration. resumptionTokens will come after that, I hope.
I'll be interested to hear more about UVM's implementation and how I can make this thing more useful to others.
Library Camp NYC 2007
I proposed an NJ Library BarCamp some months ago, not realizing that efforts were already under way to do the same in NYC. In retrospect, I'm glad I didn't do anything to get things moving; I wouldn't have pulled things together nearly as well as the NYC folks did. The event was excellent. It was my first camp, and I'd definitely try another. A big thanks to Stephen Francoeur et al.
Here are the three sessions I attended, with links to the "official" wiki pages for summaries:
- Solr and Lucene (session moderated by AIP's Mark Matienzo and NYU's Jason Casden) seem to be gaining momentum in the library world. Having gone to the last Code4Lib conference, I arrived with my head already chock full of relevant tidbits, but the moderators did a great job of showing examples, evangelizing, and keeping the discussion going.
- Grid Services (session moderated by OCLC Openly Informatics' Eric Hellman) might have been very interesting if I hadn't kept receiving phone calls from an insurance company. I had to take the calls, and so this session was difficult to follow. The basic idea was to think of networked library services like the power grid. What would libraries want from the grid? What would they be willing to contribute back?
- Semantic Web (session moderated by NYU's Corey Harper and CUNY's Sunny Yoon) was the most widely attended session I went to: standing room only! When I first added the topic to the wiki, I had no idea it would draw this many people. It's odd that I suggested the topic, since I had little to offer on it, but that meant I gleaned an awful lot. The discussion was spirited and, as you might expect, the RDF vs. microformats arguments flew fast and furious across the room. I'm left wondering whether the RDFa/GRDDL approach might be a good middle road between the "everything must be represented as RDF in a triplestore" camp and the "just embed microformats in XHTML" people.
And now, the requisite name-dropping. I got to reconnect with a bunch of people I hadn't seen in a while, like Terry Catapano, Jay Datema, Nicole Engard, Valerie Forrestal, Kevin Reiss, and Sunny Yoon. And I got to meet LibLime's Chris Cormack, NYPL's Josh Greenberg, Corey Harper, Mark Matienzo, Jenkins Law's RayAna Min Park, and Stevens Tech's Linda Scanlon, among other people.
It was about as good as any camp without kayaks and archery can be. Check out some more summaries.
