Command-line shuffle
Being a nerd, I tend to like the command line. When I'm working on my laptop at home, I tend to like listening to music. Before I discovered that mplayer had a really convenient shuffle idiom, I would invoke it thusly (to listen to all my Pavement tracks in shuffle mode):
export IFS=$'\n'
for track in $(find /mnt/upnp/MediaTomb/Audio/Artists/Pavement -name \*.mp3 | ~/bin/shuffle.py); do
    mplayer $track
done
And the wee shuffle script I whipped together looks like this:
#!/usr/bin/env python
# shuffle.py
import sys
import random

args = list(sys.stdin)
random.shuffle(args)
sys.stdout.writelines(args)
And here's the convenient shuffle idiom that renders my arg-shuffling script somewhat useless:
find /mnt/upnp/MediaTomb/Audio/Artists/Pavement -name \*.mp3 | mplayer -playlist - -shuffle -loop 0
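For players that lack a built-in shuffle, the find and shuffle steps could also be folded into one small Python module. What follows is only a sketch under a few assumptions: the function names `find_tracks` and `shuffled` and the optional `seed` parameter are made up for illustration, and the suffix match is case-sensitive, like `find -name \*.mp3`.

```python
"""Sketch: collect audio files under a directory and shuffle them."""
import os
import random


def find_tracks(root, extension=".mp3"):
    """Return paths of all files under root whose names end with extension."""
    tracks = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Case-sensitive suffix match, roughly like find -name \*.mp3
            if name.endswith(extension):
                tracks.append(os.path.join(dirpath, name))
    return tracks


def shuffled(items, seed=None):
    """Return a shuffled copy of items, leaving the input untouched.

    Passing a seed (a hypothetical convenience) makes the order repeatable.
    """
    rng = random.Random(seed)
    result = list(items)
    rng.shuffle(result)
    return result
```

Printing each path from `shuffled(find_tracks(some_directory))` on its own line would give a playlist that could be piped to `mplayer -playlist -`, the same way the find output is piped above.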
I2: Survey results
I wrote in June that the I2 subgroup surveyed "repository managers to determine the current practices and needs of the repository community regarding institutional identifiers. Results from the survey will inform a set of use cases that will be shared with the community, and that are expected to drive the development of a new standard for institutional identifiers."
The survey closed in July, and the subgroup spent August writing a report on the survey results. That report is now final and available to the public. Feedback may be sent to our (woefully underutilized) public i2info mailing list, left as a comment on this post, or e-mailed to me privately, in which case I can forward it to our internal list.
The next step is to build upon the report to draw yet more conclusions from the data — there's an awful lot there — and flesh out some repository use cases for institutional identifiers. The I2 core group is moving quickly towards finalizing identifier metadata elements so that a standard may be drafted, and I think having some use cases documented will help drive the standard in a direction the community can get behind.
Onward and upward.
JSONovich emerges
JSONovich has now emerged from the Mozilla Add-ons sandbox and is available to the masses: http://addons.mozilla.org/en-US/firefox/addon/10122.
Linking World Digital Library Data
As I mentioned earlier, I've been learning about linked data in the context of dropping it into the World Digital Library project. I am hopeful we'll be able to deploy the RDF views[1] before too long. In advance of that, I thought it might be helpful to share a sample of what our RDF would look like. The RDF below represents the WDL item for the U.S. Constitution. I appreciate constructive criticism.
A few things to note:
- Mmm, Unicode.
- Item types are from the Bibliographic Ontology.
- Most of the properties are from the Dublin Core Metadata Element Set ontology, especially used where literals are objects rather than resources identified by URI.
- Where I could dig up URIs for the objects, I used the Dublin Core Metadata Terms ontology instead.
- An item is modeled as an aggregation of its constituent files, as defined in OAI-ORE. The notion here is that an ORE aggregation of an item, as expressed in a resource map which is discoverable via a link header in each item detail page, is a "whole" item, including all of its files[2], metadata, and translations.
- I'm also making light use of the NEPOMUK File Ontology to express that constituent files are files, and to be explicit about file sizes so that folks know how large a file is before retrieving it.
- Links out to DDC (Decimalised Database of Concepts), Lingvoj, DBpedia, and Library of Congress Authorities & Vocabularies (e.g., LC Subject Headings) are included where possible. [3] I'd be especially stoked to hear of other vocabs I might link to. The more linked the data, the better.
- The output below is Turtle for readability, but the application will offer up RDF/XML.
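To make the aggregation modeling in the list above a bit more concrete, here is a rough Python sketch that emits N-Triples (a line-based subset of Turtle) for an item and its files. It is not the WDL application's code or its actual output: the example.org URIs, the file size, and the helper names are hypothetical, and typing the size as xsd:integer is an assumption. Only the ORE, NFO, and RDF vocabulary URIs are real.

```python
"""Sketch: an item modeled as an ORE aggregation of its files, as N-Triples."""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ORE = "http://www.openarchives.org/ore/terms/"
NFO = "http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"  # assumed datatype


def uri(value):
    """Wrap a URI in angle brackets, N-Triples style."""
    return "<%s>" % value


def triple(subject, predicate, obj):
    """Format one N-Triples statement from already-serialized terms."""
    return "%s %s %s ." % (subject, predicate, obj)


def item_triples(item_uri, files):
    """Yield N-Triples lines for an item and its constituent files.

    files maps each file URI to its size in bytes.
    """
    yield triple(uri(item_uri), uri(RDF_TYPE), uri(ORE + "Aggregation"))
    for file_uri, size in sorted(files.items()):
        # The item aggregates the file.
        yield triple(uri(item_uri), uri(ORE + "aggregates"), uri(file_uri))
        # The file is declared to be a file.
        yield triple(uri(file_uri), uri(RDF_TYPE), uri(NFO + "FileDataObject"))
        # Its size is stated up front so clients know before retrieving it.
        size_literal = '"%d"^^%s' % (size, uri(XSD_INTEGER))
        yield triple(uri(file_uri), uri(NFO + "fileSize"), size_literal)


# Hypothetical item with a single hypothetical file.
for line in item_triples(
    "http://example.org/item/1",
    {"http://example.org/item/1/page1.tif": 2048},
):
    print(line)
```

A real deployment would more likely lean on an RDF library for serialization; plain string formatting is used here only to keep the sketch self-contained.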
The data after the jump:
Continue reading…
Notes
[1] Sadly, the URIs are uglyish due to some constraints from our caching configuration. I figure we can redirect uglyish URIs to cool ones and make use of owl:sameAs if those constraints go away.
[2] Sans certain low-quality derivatives such as small thumbnails and tiles for the zoom interface.
[3] I was poking through the DBpedia output for Geonames URIs as well, but my method was way too slow and clunky, so that's disabled for the time being. Clients can always follow their noses from the DBpedia output.
Is MARC a data model?
I posted a status update to Twitter, identi.ca, and Facebook late last night hoping to suss out two questions:
- Is MARC a data model?
- But really: what qualifies something as a data model?
I'd poked around looking for clues to the latter and was left cold by the long Wikipedia entry. Maybe I've been doing the micro-blog thing for too long and my ability to parse information that comes in greater-than-140-character chunks has been damaged. Plus I like learning from examples, and what better example for the library geek than MARC?
The feedback I received was pretty impressive, and not all of it consistent with the rest. I found it an interesting example of crowdsourcing, so to speak. As each response came in, I would read it, check it for accuracy against sources such as Wikipedia articles, and revise my own answers to the above questions. I'm homing in on an answer to the former question. The latter question is still a bit murky.
I thought I'd share the responses, too. Responses from Twitter are included in full w/ links to the original. Responses from quasi-public Facebook have been anonymized. You can see my replies interspersed as well and watch the evolution of the (admittedly short) discussion. After the jump:
Continue reading…
