<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>τεχνοσοφια &#187; Cataloging and Metadata</title>
	<atom:link href="http://lackoftalent.org/michael/blog/category/libraries/cataloging-and-metadata/feed/" rel="self" type="application/rss+xml" />
	<link>http://lackoftalent.org/michael/blog</link>
	<description>The occasional rambling of a digital library artisan</description>
	<lastBuildDate>Sun, 24 Jan 2010 18:30:30 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Linking World Digital Library Data</title>
		<link>http://lackoftalent.org/michael/blog/2009/08/10/linking-world-digital-library-data/</link>
		<comments>http://lackoftalent.org/michael/blog/2009/08/10/linking-world-digital-library-data/#comments</comments>
		<pubDate>Mon, 10 Aug 2009 22:44:29 +0000</pubDate>
		<dc:creator>Michael Giarlo</dc:creator>
				<category><![CDATA[APIs]]></category>
		<category><![CDATA[Cataloging and Metadata]]></category>
		<category><![CDATA[Digital Libraries and Archives]]></category>
		<category><![CDATA[Linked Data]]></category>
		<category><![CDATA[OAI-ORE]]></category>
		<category><![CDATA[World Digital Library]]></category>

		<guid isPermaLink="false">http://lackoftalent.org/michael/blog/?p=457</guid>
		<description><![CDATA[
As I mentioned earlier, I&#039;ve been learning about linked data in the context of dropping it into the World Digital Library project.  I am hopeful we&#039;ll be able to deploy the RDF views[1] before too long.  In advance of that, I thought it might be helpful to share a sample of what our [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="oai:lackoftalent.org:technosophia:457"><!-- &nbsp; --></abbr>
<p>As I <a href="/michael/blog/2009/07/31/validating-ore-from-the-command-line/">mentioned earlier</a>, I&#039;ve been learning about linked data in the context of dropping it into the <a href="http://www.wdl.org">World Digital Library</a> project.  I am hopeful we&#039;ll be able to deploy the RDF views[1] before too long.  In advance of that, I thought it might be helpful to share a sample of what our RDF would look like.  The RDF below represents the WDL item for the U.S. Constitution.  I appreciate constructive criticism.</p>
<p>A few things to note:</p>
<ul>
<li>Mmm, Unicode.</li>
<li>Item types are from the <a href="http://bibliontology.com/">Bibliographic Ontology</a>.</li>
<li>Most of the properties are from the <a href="http://dublincore.org/documents/dces/">Dublin Core Metadata Element Set</a> ontology, used mainly where the objects are literals rather than resources identified by URI.</li>
<li>Where possible I dug up or found URIs and used the <a href="http://dublincore.org/documents/dcmi-terms/">Dublin Core Metadata Terms</a> ontology.</li>
<li>An item is modeled as an aggregation of its constituent files, as defined in <a href="http://www.openarchives.org/ore/">OAI-ORE</a>.  The notion here is that an ORE aggregation of an item, as expressed in a resource map which is discoverable via a link header in each item detail page, is a &#034;whole&#034; item, including all of its files[2], metadata, and translations.</li>
<li>I&#039;m also making light use of the <a href="http://www.semanticdesktop.org/ontologies/nfo/">NEPOMUK File Ontology</a> to express that constituent files are files, and to be explicit about file sizes so that folks know how large the files are before retrieving them.</li>
<li>Links out to <a href="http://purl.org/NET/decimalised#">DDC</a> (Decimalised Database of Concepts), <a href="http://www.lingvoj.org/">Lingvoj</a>, <a href="http://dbpedia.org/">DBpedia</a>, and <a href="http://id.loc.gov/authorities/">Library of Congress Authorities &amp; Vocabularies</a> (e.g., LC Subject Headings) are included where possible. [3] I&#039;d be especially stoked to hear of other vocabs I might link to.  The more linked the data, the better.</li>
<li>The output below is Turtle for readability, but the application will offer up RDF/XML.</li>
</ul>
<p>The data after the jump:<br />
<span id="more-457"></span></p>

<div class="wp_syntax"><div class="code"><pre class="ttl" style="font-family:monospace;">@prefix rdf: &lt;http://www.w3.org/1999/02/22-rdf-syntax-ns#&gt; .
@prefix dc: &lt;http://purl.org/dc/elements/1.1/&gt; .
@prefix dcterms: &lt;http://purl.org/dc/terms/&gt; .
@prefix nfo: &lt;http://www.semanticdesktop.org/ontologies/nfo#&gt; .
@prefix ore: &lt;http://www.openarchives.org/ore/terms/&gt; .
@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
&nbsp;
&lt;http://localhost/static/c/2708/service/00303_2003_001_pr.jpg&gt;
    dc:format &quot;image/jpeg&quot; ;
    nfo:fileSize &quot;259485&quot;^^&lt;http://www.w3.org/2001/XMLSchema#long&gt; ;
    a nfo:FileDataObject .
&nbsp;
&lt;http://localhost/static/c/2708/service/00303_2003_003_pr.jpg&gt;
    dc:format &quot;image/jpeg&quot; ;
    nfo:fileSize &quot;267031&quot;^^&lt;http://www.w3.org/2001/XMLSchema#long&gt; ;
    a nfo:FileDataObject .
&nbsp;
&lt;http://localhost/static/c/2708/reference/00303_2003_004_pr_thumb_item.gif&gt;
    dc:format &quot;image/gif&quot; ;
    nfo:fileSize &quot;56620&quot;^^&lt;http://www.w3.org/2001/XMLSchema#long&gt; ;
    a nfo:FileDataObject .
&nbsp;
&lt;http://localhost/static/c/2708/service/00303_2003_004_pr.jpg&gt;
    dc:format &quot;image/jpeg&quot; ;
    nfo:fileSize &quot;233875&quot;^^&lt;http://www.w3.org/2001/XMLSchema#long&gt; ;
    a nfo:FileDataObject .
&nbsp;
&lt;http://localhost/static/c/2708/service/00303_2003_002_pr.jpg&gt;
    dc:format &quot;image/jpeg&quot; ;
    nfo:fileSize &quot;245809&quot;^^&lt;http://www.w3.org/2001/XMLSchema#long&gt; ;
    a nfo:FileDataObject .
&nbsp;
&lt;http://localhost/item/2708/about.rdf&gt;
    dcterms:created &quot;2009-08-10T18:11:25-04:00&quot;^^dcterms:W3CDTF ;
    dcterms:creator &lt;http://dbpedia.org/resource/World_Digital_Library&gt; ;
    dcterms:modified &quot;2009-08-10T18:11:25-04:00&quot;^^dcterms:W3CDTF ;
    ore:describes &lt;http://localhost/item/2708/about.rdf#item&gt; ;
    a ore:ResourceMap .
&nbsp;
&lt;http://localhost/item/2708/about.rdf#item&gt;
    dc:created &quot;17 Septembre 1787&quot;@fr, &quot;17 de septiembre de 1787&quot;@es, &quot;17 de setembro de 1787&quot;@pt, &quot;17 сентября 1787 г.&quot;@ru, &quot;1787年9月17日&quot;@zh, &quot;September 17, 1787&quot;@en, &quot;&quot;&quot;١٧ ايلول ١٧٨٧
&quot;&quot;&quot;@ar ;
    dc:creator &quot;Constitutional Convention, United States&quot;@en, &quot;Convención Constituyente, Estados Unidos&quot;@es, &quot;Convention constitutionnelle, États-Unis&quot;@fr, &quot;Convenção Constitucional, Estados Unidos&quot;@pt, &quot;Конституционная Конвенция, Соединенные Штаты&quot;@ru, &quot;الاتفاقية الدستورية، الولايات المتحدة&quot;@ar, &quot;制宪会议，美国&quot;@zh ;
    dc:extent &quot;Manuscript (4 pages of parchment)&quot;@en, &quot;Manuscrit (4 pages de parchemin)&quot;@fr, &quot;Manuscrito (4 páginas de pergamino)&quot;@es, &quot;Manuscrito (4 páginas em pergaminho)&quot;@pt, &quot;Рукопись (4 пергаментных страницы)&quot;@ru, &quot;مخطوطة (٤ صفحات من الورق النفيس)&quot;@ar, &quot;手草本（4 页羊皮纸）&quot;@zh ;
    dc:language &quot;Anglais&quot;@fr, &quot;English&quot;@en, &quot;Inglés&quot;@es, &quot;Inglês&quot;@pt, &quot;Английский язык&quot;@ru, &quot;الإنجليزية&quot;@ar, &quot;英语&quot;@zh ;
    dc:publisher &quot;Administração de Registros e Arquivos Nacionais&quot;@pt, &quot;Archives Nationales et Administration des documents (NARA) des États-Unis d'Amérique &quot;@fr, &quot;Los Archivos Nacionales y Administración de Documentos (NARA) de los Estados Unidos de América&quot;@es, &quot;National Archives and Records Administration&quot;@en, &quot;Управление национальных архивов и документов&quot;@ru, &quot;الإدارة الأمريكية للوثائق والسجلات الوطنية&quot;@ar, &quot;美国国家文件与档案管理局&quot;@zh ;
    dc:subject &quot;Constituciones&quot;@es, &quot;Constituições&quot;@pt, &quot;Constitutional &amp; administrative law&quot;@en, &quot;Constitutions&quot;@en, &quot;Constitutions&quot;@fr, &quot;Derecho constitucional y administrativo&quot;@es, &quot;Direito constitucional e administrativo&quot;@pt, &quot;Droit constitutionnel et administratif&quot;@fr, &quot;Politics and government&quot;@en, &quot;Politique et gouvernement&quot;@fr, &quot;Política e governo&quot;@pt, &quot;Política y gobierno&quot;@es, &quot;Конституции&quot;@ru, &quot;Конституционное и административное право&quot;@ru, &quot;Политика и правительство&quot;@ru, &quot;الدساتير&quot;@ar, &quot;السياسة والحكومة&quot;@ar, &quot;القانون الدستوري والإداري.&quot;@ar, &quot;宪法&quot;@zh, &quot;宪法 &amp; 行政法&quot;@zh, &quot;政治和政府&quot;@zh ;
    dc:title &quot;Constitución de los Estados Unidos&quot;@es, &quot;Constituição dos Estados Unidos&quot;@pt, &quot;Constitution des États-Unis&quot;@fr, &quot;Constitution of the United States&quot;@en, &quot;Конституция Соединенных Штатов&quot;@ru, &quot;دستور الولايات المتحدة&quot;@ar, &quot;美国宪法&quot;@zh ;
    dcterms:DDC &quot;342&quot; ;
    dcterms:LCSH &lt;http://id.loc.gov/authorities/label/Constitutions&gt; ;
    dcterms:alternative &quot;Constitution of the United States&quot;@en ;
    dcterms:dateSubmitted &quot;2009-05-07T06:45:21-04:00&quot;^^dcterms:W3CDTF ;
    dcterms:description &quot;1787 年 5 月 14 日，制宪会议在费城的议会大楼（独立厅）召开，目的是修订《邦联条例》。 由于开始时只有两个州的代表团出席，成员不得不一天天地休会，直到 5 月 25 日与会人数达到法定的七个州。 通过讨论和争辩，6 月中旬时明确显示大会与其修改现有的《联邦条例》不如为政府重新起草一份全新的框架。 整个夏季，代表们都在非公开会议中辩论、起草、重新起草新宪法的条款。 主要的争论问题包括要赋予中央政府多大权利、允许各州在国会中有多少个代表席位以及这些代表应该如何选举产生——由人民直接选举还是由各州立法人员选举产生。 这部宪法是很多人智慧的结晶，是合作政治运作和妥协艺术的典范。&quot;@zh, &quot;A Convenção Federal reuniu-se na Casa de Estado (Hall da Independência), em Filadélfia, em 14 de maio de 1787 para revisar os Artigos da Confederação. Em virtude de estarem presentes, inicialmente, as delegações de apenas dois estados, os membros suspenderam os trabalhos, dia após dia, até que fosse atingido o quórum de sete estados em 25 de maio. Através de discussões e debates ficou claro, em meados de junho que, em vez de alterar os atuais artigos da Confederação, a convenção deveria elaborar uma estrutura inteiramente nova para o governo. Ao longo de todo o verão, os delegados debateram, elaboraram e reelaboraram os artigos da nova Constituição em sessões fechadas. Entre os principais pontos em questão estavam o grau de poder permitido ao governo central, o número de representantes no Congresso para cada Estado, e como estes representantes deveriam ser eleitos - diretamente pelo povo ou pelos legisladores do estado. A Constituição foi o trabalho de muitas mentes e permanece como um modelo de cooperação entre lideranças políticas e da arte da condescendência.&quot;@pt, &quot;La Convención Federal se reunió en la Cámara del Estado (Salón de la Independencia) en Filadelfia el 14 de mayo de 1787, para revisar los artículos de la Confederación. Debido a que las delegaciones de sólo dos estados estuvieron presentes inicialmente, los miembros levantaron sesión de un día para el siguiente hasta que se obtuvo un quórum de siete estados el 25 de mayo. 
A través de la discusión y el debate se hizo evidente a mediados de junio que, en lugar de modificar los actuales artículos de la Confederación, la convención prepararía un marco totalmente nuevo para el gobierno. Durante todo el verano, los delegados debatieron, prepararon y redactaron nuevamente los artículos de la nueva Constitución en sesiones a puerta cerrada. Entre los principales puntos en cuestión estuvieron cuánto poder otorgar al gobierno central, el número de representantes en el Congreso que se iban a permitir a cada Estado y la forma en que estos representantes debían ser elegidos, directamente por el pueblo o por los legisladores estatales. La Constitución fue el resultado del trabajo de muchas mentes y se erige como modelo de cooperación política y del arte del compromiso.&quot;@es, &quot;La Convention Fédérale s'assembla dans la Chambre Législative (Independence Hall) à Philadelphie le 14 mai 1787, pour réviser les articles de la Confédération. En raison de la seule présence initiale des délégations de deux États, les membres ajournèrent d'un jour à l'autre jusqu'à ce que le quorum de sept États soit obtenu le 25 mai. Â travers les discussions et les débats, il devint clair dès la mi-juin que, plutôt que de modifier les articles existants de la Confédération, la convention allait plutôt ébaucher un cadre entièrement nouveau pour le gouvernement. Tout au long de l'été, les délégués débattirent, élaborèrent, et remanièrent les articles de la nouvelle Constitution, à huis clos. Les principaux points litigieux portaient sur la puissance à accorder au gouvernement central, sur le nombre de représentants au Congrès pour chaque État, et sur le mode d'élection de ces représentants - directement par le peuple ou par les législateurs de l'état. 
La Constitution fut l'œuvre de nombreux esprits et reste un modèle de coopération politique et de l'art du compromis.&quot;@fr, &quot;The Federal Convention convened in the State House (Independence Hall) in Philadelphia on May 14, 1787, to revise the Articles of Confederation. Because the delegations from only two states were present initially, the members adjourned from one day to the next until a quorum of seven states was obtained on May 25. Through discussion and debate it became clear by mid-June that, rather than amend the existing Articles of Confederation, the convention would draft an entirely new framework for the government. All through the summer, the delegates debated, drafted, and redrafted the articles of the new Constitution in closed sessions. Among the chief points at issue were how much power to allow the central government, how many representatives in Congress to allow each state, and how these representatives should be elected--directly by the people or by the state legislators. The Constitution was the work of many minds and stands as a model of cooperative statesmanship and the art of compromise.&quot;@en, &quot;Федеральное собрание собралось на заседание в Доме правительства (зал Независимости) 14 мая 1787 года для пересмотра Статей Конфедерации. Поскольку вначале на заседании присутствовали представители только двух штатов, Собрание было распущено на несколько дней до тех пор, пока 25 мая не был обеспечен кворум из представителей семи штатов. В ходе дискуссий и дебатов к середине июня стало понятно, что собрание было намерено скорее составить новый вариант структуры правительства, нежели чем пересматривать существующие Статьи Конфедерации. В течение всего лета делегаты обсуждали, составляли черновые варианты статей новой Конституции и тут же их пересматривали в ходе закрытых заседаний. 
Среди основных обсуждавшихся вопросов были вопросы степени власти и полномочий, которыми должно быть наделено центральное правительство, количества представителей в Конгрессе от каждого штата, а также процедуры переизбрания этих представителей — непосредственно жителями штатов или законодательными собраниями штатов. Конституция была плодом работы многих политиков и является ярким примером сотрудничества государственных деятелей и искусства компромисса.&quot;@ru, &quot;اجتمع ممثلو الاتحاد الفدرالي في قصر الدولة (قاعة الاستقلال) في فيلادلفيا يوم ١٤  أيار ١٧٨٧ لتعديل النظام الأساسي للاتحاد. وحيث حضر وفدان اثنان فقط من وفود الولايات في البداية، رفع الأعضاء الحضور الجلسة من يوم إلى آخر حتى اكتمل النصاب القانوني بحضور وفود سبع ولايات في ٢٥ أيار. وقد اتضح خلال المناقشات والحوار بحلول منتصف حزيران أنه بدلا من تعديل مواد الاتحاد الكونفدرالي القائمة، كان على المؤتمرين صياغة إطار جديد تماما بالنسبة للحكومة. وطوال ذلك الصيف، ناقش المندوبون وصاغوا ثم أعادوا صياغة مواد الدستور الجديد في جلسات مغلقة. ومن بين النقاط الرئيسية التي دار حولها الجدل مدى صلاحيات الحكومة المركزية وعدد الممثلين في الكونغرس لكل ولاية ، وكيفية انتخاب هؤلاء ممثلين -- بالانتخاب المباشر من الشعب أو من قبل مشرّعي الولايات. لقد كان الدستور من عمل عقول كثيرة وهو يمثل نموذجا لفن الحكم التعاوني حنكة التوصل إلى الحلول الوسط.&quot;@ar ;
    dcterms:identifier &quot;http://localhost/item/2708/about.rdf#item&quot; ;
    dcterms:language &lt;http://www.lingvoj.org/lang/en&gt; ;
    dcterms:publisher &lt;http://dbpedia.org/resource/National_Archives_and_Records_Administration&gt; ;
    dcterms:spatial &lt;http://dbpedia.org/resource/North_America&gt;, &lt;http://dbpedia.org/resource/United_States_of_America&gt;, &quot;América del Norte&quot;@es, &quot;América do Norte&quot;@pt, &quot;Amérique du Nord&quot;@fr, &quot;Estados Unidos da América&quot;@pt, &quot;Estados Unidos de América&quot;@es, &quot;North America&quot;@en, &quot;United States of America&quot;@en, &quot;États-Unis d'Amérique&quot;@fr, &quot;Северная Америка&quot;@ru, &quot;Соединенные Штаты Америки&quot;@ru, &quot;أمريكا الشمالية&quot;@ar, &quot;الولايات المتحدة الأمريكية&quot;@ar, &quot;北美&quot;@zh, &quot;美国&quot;@zh ;
    dcterms:subject &lt;http://dbpedia.org/resource/Constitutions&gt; ;
    dcterms:temporal &quot;1700 AD - 1799 AD&quot;@en, &quot;1700 ap. J.-C. - 1799 ap. J.-C.&quot;@fr, &quot;1700 d.C. - 1799 d.C.&quot;@es, &quot;1700 d.C. - 1799 d.C.&quot;@pt, &quot;1700 н.э. - 1799 н.э.&quot;@ru, &quot;1700 公元 - 1799 公元&quot;@zh, &quot;١٧٠٠ م - ١٧٩٩ م&quot;@ar ;
    dcterms:title &lt;http://dbpedia.org/resource/Constitution_of_the_United_States&gt; ;
    ore:aggregates &lt;http://localhost/static/c/2708/reference/00303_2003_004_pr_thumb_item.gif&gt;, &lt;http://localhost/static/c/2708/service/00303_2003_001_pr.jpg&gt;, &lt;http://localhost/static/c/2708/service/00303_2003_002_pr.jpg&gt;, &lt;http://localhost/static/c/2708/service/00303_2003_003_pr.jpg&gt;, &lt;http://localhost/static/c/2708/service/00303_2003_004_pr.jpg&gt; ;
    ore:isDescribedBy &lt;http://localhost/item/2708/about.rdf&gt; ;
    a &lt;http://purl.org/ontology/bibo/Manuscript&gt; ;
    rdfs:seeAlso &lt;http://hdl.loc.gov/loc.wdl/dna.2708&gt; .</pre></div></div>
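
<p>For readers who prefer code to Turtle, here is a minimal sketch of the aggregation pattern in the data above.  It is not the WDL application code: the function name and the tuple layout are hypothetical, and the prefixed names are left as plain strings rather than expanded to full URIs.</p>

```python
def aggregation_triples(item_uri, rem_uri, files):
    """Return (subject, predicate, object) tuples for one item.

    files is a list of (uri, mime_type, size_in_bytes) tuples.
    Hypothetical helper for illustration only.
    """
    triples = [
        # The resource map describes the aggregation, and vice versa.
        (rem_uri, "rdf:type", "ore:ResourceMap"),
        (rem_uri, "ore:describes", item_uri),
        (item_uri, "ore:isDescribedBy", rem_uri),
    ]
    for uri, mime_type, size in files:
        # Each constituent file is typed as a file, with format and size.
        triples.append((uri, "rdf:type", "nfo:FileDataObject"))
        triples.append((uri, "dc:format", mime_type))
        triples.append((uri, "nfo:fileSize", int(size)))
        # The item aggregates each constituent file.
        triples.append((item_uri, "ore:aggregates", uri))
    return triples
```

<p>Calling it with the item URI, the resource map URI, and one (URI, MIME type, size) tuple per file yields the same shape of statements as the sample, ready to hand to whatever RDF serializer you like.</p>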

<h5>Notes</h5><ol class="footnotes"><li id="footnote_0_457" class="footnote">Sadly, the URIs are uglyish due to some constraints from our caching configuration.  I figure we can redirect uglyish URIs to cool ones and make use of owl:sameAs if those constraints go away.</li><li id="footnote_1_457" class="footnote"><em>sans</em> certain low-quality derivatives such as small thumbnails and tiles for the zoom interface</li><li id="footnote_2_457" class="footnote">I was poking through the DBpedia output for <a href="http://www.geonames.org/">Geonames</a> URIs as well, but my method was way too slow and clunky, so that&#039;s disabled for the time being.  Clients can always follow their noses from the DBpedia output.</li></ol><br/>
<hr/>]]></content:encoded>
			<wfw:commentRss>http://lackoftalent.org/michael/blog/2009/08/10/linking-world-digital-library-data/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Is MARC a data model?</title>
		<link>http://lackoftalent.org/michael/blog/2009/08/10/is-marc-a-data-model/</link>
		<comments>http://lackoftalent.org/michael/blog/2009/08/10/is-marc-a-data-model/#comments</comments>
		<pubDate>Mon, 10 Aug 2009 12:49:55 +0000</pubDate>
		<dc:creator>Michael Giarlo</dc:creator>
				<category><![CDATA[Cataloging and Metadata]]></category>

		<guid isPermaLink="false">http://lackoftalent.org/michael/blog/?p=452</guid>
		<description><![CDATA[
I posted a status update to Twitter, identi.ca, and Facebook late last night hoping to suss out two questions:

Is MARC a data model?
But really: what qualifies something as a data model?

I&#039;d poked around looking for clues to the latter and was left cold by the long Wikipedia entry.  Maybe I&#039;ve been doing the micro-blog [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="oai:lackoftalent.org:technosophia:452"><!-- &nbsp; --></abbr>
<p>I posted a status update to <a href="http://twitter.com/mjgiarlo/statuses/3215173861">Twitter</a>, <a href="http://identi.ca/notice/7827179">identi.ca</a>, and <a href="http://facebook.com/mjgiarlo?story_fbid=255213260600">Facebook</a> late last night hoping to suss out two questions:</p>
<ol>
<li>Is MARC a data model?</li>
<li>But really: what qualifies something as a data model?</li>
</ol>
<p>I&#039;d poked around looking for clues to the latter and was left cold by the long Wikipedia entry.  Maybe I&#039;ve been doing the micro-blog thing for too long and my ability to parse information that comes in greater-than-140-character chunks has been damaged.  Plus I like learning from examples, and what better example for the library geek than MARC?</p>
<p>The feedback I received was pretty impressive, though not all of it was consistent.  I found it an interesting example of crowdsourcing, so to speak.  As each response came in, I would read it, check it for accuracy against, e.g., Wikipedia articles, and revise my own answers to the above questions.  I&#039;m homing in on an answer to the former question.  The latter question is still a bit murky.</p>
<p>I thought I&#039;d share the responses, too.  Responses from Twitter are included in full w/ links to the original.  Responses from quasi-public Facebook have been anonymized.  You can see my replies interspersed as well and watch the evolution of the (admittedly short) discussion.  After the jump:<br />
<span id="more-452"></span></p>
<blockquote><p><a href="http://twitter.com/bangpound/statuses/3215214058">@bangpound</a>: @mjgiarlo MARC is a markup language. It makes no declarations about how data is stored only how it&#039;s formatted.</p></blockquote>
<blockquote><p><a href="http://twitter.com/ranginui/statuses/3215591211">@ranginui</a>: @mjgiarlo a piece of crap, cue neil young and crazy horse</p></blockquote>
<blockquote><p><a href="http://twitter.com/anarchivist/statuses/3216566687">@anarchivist</a>: @mjgiarlo not a data model, it&#039;s a transmission format</p></blockquote>
<blockquote><p><a href="http://twitter.com/vphill/statuses/3216984096">@vphill</a>: @mjgiarlo I&#039;ve heard that said about MARC too, let me know if you get an answer</p></blockquote>
<blockquote><p>A container for a data model, such as AACR2</p></blockquote>
<blockquote><p><a href="http://twitter.com/mjgiarlo/statuses/3217501084">@mjgiarlo</a>: @bangpound, @anarchivist, @vphill: So. let&#039;s see: MARC21 bib is a profile of a serialization/transmission format w/ AACR2 as the data model? </p></blockquote>
<blockquote><p><a href="http://twitter.com/anarchivist/statuses/3219349208">@anarchivist</a>: @mjgiarlo wouldn&#039;t even assume AACR2 if I was you.</p></blockquote>
<blockquote><p><a href="http://twitter.com/mjgiarlo/statuses/3223365237">@mjgiarlo</a>: @anarchivist: Okay. Something says &#034;authors go in 100; contributors go in 700,&#034; though, right? Is that not a data model? Sorry if dense.</p></blockquote>
<blockquote><p>MARC is not a data model (and neither is AACR2) in the sense that neither of them explicitly describes entities and relationships among entities. The relationships in these two non-relational frameworks are implicit, and the semantics of the model must be supplied in the end by the people who use these frameworks. RDA/FRBR is a move toward an actual data model &#8212; it makes some relationships explicit and can properly be represented in an Entity-Relationship diagram (with all those relationship words that explicitly express the semantics &#8212; words like, for example, &#034;is realized through&#034; or &#034;is embodied in&#034; or &#034;is exemplified by&#034;), but even RDA/FRBR does not fully express all of the relationships/semantics and must be translated into an actual data model in order to be implemented &#8212; librarians have been irresponsible, in my opinion, in refusing to learn about relational database concepts, mostly because of their slavish adherence to the old flat-file style that MARC represents.</p></blockquote>
<blockquote><p><a href="http://twitter.com/gmcharlt/statuses/3223446556">@gmcharlt</a>: @mjgiarlo MARC is many things at once, which is part of the problem. Not just transmission standard; embodies current cataloging worldview</p></blockquote>
<blockquote><p><a href="http://twitter.com/edsu/statuses/3224290838">@edsu</a>: @mjgiarlo i think there are aspects of data modeling in Z39.2 &#038; ISO 2709, and certainly in MARC21 ; that said, i think @gmcharlt is right.</p></blockquote>
<p>So, based on all the responses I&#039;ve gotten (on Facebook, on Twitter, around the office), here&#039;s my current thinking:</p>
<ul>
<li>MARC means more than one thing.</li>
<li>One meaning of MARC is MARC the binary format. A format is not a data model.</li>
<li>Another meaning of MARC is, e.g., MARC21 Bibliographic.</li>
<li>MARC21 Bibliographic is a profile of MARC, which is serialized in the MARC binary format.</li>
<li>MARC21 Bibliographic defines semantics for fields and subfields and indicators, which makes it feel like a data model.  This gets at some of the assumptions I&#039;ve internalized about data models.</li>
<li>The MARC21 Bibliographic data model thus has well-defined entities, but otherwise is a poor data model, primarily because:
<ol>
<li>It does not have well-defined relationships between the entities;</li>
<li>It conflates different conceptual models: it blurs the FRBR Group 1 entities together and also mixes them with Group 2 and 3 entities.</li>
</ol>
</li>
<li>I&#039;m not sure where this leaves AACR2, but it feels like it just fell out of the discussion.</li>
</ul>
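<p>To make the distinction concrete, here is a small sketch, assuming a toy flat record with made-up values.  The tag meanings (100, 245, 700) follow MARC21 Bibliographic, but the record, the helper names, and the creator/contributor labels are hypothetical.  The first function only reads off what each field means; the second spells out the relationships that the flat record leaves implicit.</p>

```python
# A flat, MARC-like record: each field is (tag, {subfield code: value}).
# The values are placeholders, not taken from a real catalog record.
RECORD = [
    ("100", {"a": "Author, Example"}),
    ("245", {"a": "An Example Title"}),
    ("700", {"a": "Contributor, Example"}),
]

# Field semantics as a profile defines them: what each tag means.
TAG_MEANING = {
    "100": "main entry, personal name",
    "245": "title statement",
    "700": "added entry, personal name",
}


def describe(record):
    """Pair each field's subfield a with the meaning assigned to its tag."""
    return [(TAG_MEANING.get(tag, "unknown"), sub.get("a")) for tag, sub in record]


def explicit_relationships(record):
    """State outright the relationships the flat record only implies."""
    title = next(sub["a"] for tag, sub in record if tag == "245")
    roles = {"100": "creator", "700": "contributor"}
    return [(sub["a"], roles[tag], title) for tag, sub in record if tag in roles]
```

<p>Nothing in the record itself says that the 100 name created the 245 title; that knowledge lives in the second function, which is to say in the heads of the people using the format.</p>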
<p>I&#039;d be pleased if the discussion continued.  If nothing else, it really satisfies my curiosity and gets my brain going (which is useful on a Monday morning).</p>
]]></content:encoded>
			<wfw:commentRss>http://lackoftalent.org/michael/blog/2009/08/10/is-marc-a-data-model/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>WDL metadata mapping, and, parsing TEI in Python</title>
		<link>http://lackoftalent.org/michael/blog/2009/07/13/wdl-metadata-mapping-and-parsing-tei-in-python/</link>
		<comments>http://lackoftalent.org/michael/blog/2009/07/13/wdl-metadata-mapping-and-parsing-tei-in-python/#comments</comments>
		<pubDate>Mon, 13 Jul 2009 22:27:46 +0000</pubDate>
		<dc:creator>Michael Giarlo</dc:creator>
				<category><![CDATA[Cataloging and Metadata]]></category>
		<category><![CDATA[Metadata Evaluation Toolkit]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[World Digital Library]]></category>

		<guid isPermaLink="false">http://lackoftalent.org/michael/blog/?p=430</guid>
		<description><![CDATA[
Context
Early on in the effort to develop the first public version of the World Digital Library web application, we developed a (non-public) Django-based cataloging application where Library of Congress catalogers could manage metadata for WDL items.  Management in this sense includes creation of records, editing of records, versioning of edits, mapping of source records, [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="oai:lackoftalent.org:technosophia:430"><!-- &nbsp; --></abbr>
<h2>Context</h2>
<p>Early on in the effort to develop the first public version of the World Digital Library <a href="http://www.wdl.org/">web application</a>, we developed a (non-public) Django-based cataloging application where Library of Congress catalogers could manage metadata for WDL items.  Management in this sense includes creation of records, editing of records, versioning of edits, mapping of source records, and some light workflow for assignment of records to individual catalogers and for hooking into translation processes[1].  </p>
<p>I worked primarily on the source record mapping tools.  They take a number of formats as input and are called by the cataloging application to map metadata from these formats into the WDL domain model.  Several, though not all, of these formats are XML-based and thus easily dealt with in Python via the <a href="http://codespeak.net/lxml/api.html">etree module in the lxml package</a>.  </p>
<p><a href="http://onebiglibrary.net/">Dan</a> recently kicked off a new R&#038;D project for evaluating (any) metadata against any number of metadata profiles and mapping it into a generic data dictionary.  The goal is to determine how feasible it would be to develop a toolset for aiding remediation of metadata across any number of digital collections.  I have been working on this project with Dan, and got started by seeing how generalizable the WDL metadata mapping tools are.  Turns out they&#039;re fairly generalizable once you tweak the various format-specific mapping rules to map into the generic data dictionary model rather than the WDL model (around 15 elements, and somewhere between Dublin Core and MODS in terms of specificity but flatly structured like DC).</p>
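<p>As a rough illustration of what a format-specific mapping rule can look like, here is a sketch that is not the actual WDL or toolkit code.  It assumes un-namespaced, XML-serialized input, uses the standard library ElementTree as a stand-in for lxml.etree, and the rule table and generic element names are hypothetical.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: a generic data dictionary element, then the
# ElementTree path to pull it from in one source format.
TEI_RULES = {
    "title": ".//titleStmt/title",
    "creator": ".//titleStmt/author",
    "date": ".//publicationStmt/date",
}


def map_record(xml_text, rules):
    """Apply path rules to one record and return a flat dict of lists."""
    root = ET.fromstring(xml_text)
    mapped = {}
    for element, path in rules.items():
        values = []
        for node in root.findall(path):
            # Join nested text so inline markup does not truncate a value.
            text = "".join(node.itertext()).strip()
            if text:
                values.append(text)
        if values:
            mapped[element] = values
    return mapped
```

<p>Retargeting the tools from one model to another then comes down to swapping the keys of the rule table, while the paths stay tied to the source format.</p>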
<p>Some of the test data I am working with now, which has nothing to do with WDL, is SGML-based <a href="http://quod.lib.umich.edu/t/tei/">TEI 2</a> markup.  The closest thing I worked with on WDL was <a href="http://www.tei-c.org/release/doc/tei-p5-doc/html/MS.html">TEI P5 for manuscript description</a>, which is serialized in XML.  Turns out my TEI mapping rules from before blew up on this TEI 2 stuff, as lxml.etree (naturally) wasn&#039;t digging the non-XML input.  I googled around a bit for how best to parse TEI (or any SGML) in Python and then discovered it&#039;s actually simple as pie.</p>
<h2>Code</h2>
<p>If you&#039;ve got the <a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a> module installed[2]:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">from</span> BeautifulSoup <span style="color: #ff7700;font-weight:bold;">import</span> BeautifulSoup
<span style="color: #66cc66;">&gt;&gt;&gt;</span> tei = <span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'foo.sgm'</span><span style="color: black;">&#41;</span>.<span style="color: black;">read</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> BeautifulSoup<span style="color: black;">&#40;</span>tei<span style="color: black;">&#41;</span>.<span style="color: black;">findAll</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'title'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>.<span style="color: #dc143c;">string</span>
u<span style="color: #483d8b;">'[Memorandum to Dr. Botkin]: a machine readable transcription.'</span></pre></div></div>

<p>If not, the <a href="http://codespeak.net/lxml/lxmlhtml.html">lxml.html</a> module works too:</p>

<div class="wp_syntax"><div class="code"><pre class="python" style="font-family:monospace;"><span style="color: #66cc66;">&gt;&gt;&gt;</span> <span style="color: #ff7700;font-weight:bold;">from</span> lxml <span style="color: #ff7700;font-weight:bold;">import</span> html
<span style="color: #66cc66;">&gt;&gt;&gt;</span> h = html.<span style="color: black;">parse</span><span style="color: black;">&#40;</span><span style="color: #008000;">open</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'foo.sgm'</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>
<span style="color: #66cc66;">&gt;&gt;&gt;</span> h.<span style="color: black;">xpath</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'//title'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>.<span style="color: black;">text</span>
<span style="color: #483d8b;">'[Memorandum to Dr. Botkin]: a machine readable transcription.'</span></pre></div></div>

<h2>Data</h2>
<p>And here&#039;s what the sample data looks like:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;">&lt;!doctype tei2 public <span style="color: #ff0000;">&quot;-//Library of Congress - Historical Collections (American Memory)//DTD ammem.dtd//EN&quot;</span> </span>
<span style="color: #009900;"><span style="color: #66cc66;">&#91;</span></span>
<span style="color: #009900;">&lt;!entity % images system <span style="color: #ff0000;">&quot;07010101.ent&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span> %images;
]&gt;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;tei2<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;teiheader</span> <span style="color: #000066;">type</span>=<span style="color: #ff0000;">&quot;text&quot;</span> <span style="color: #000066;">date.created</span>=<span style="color: #ff0000;">&quot;1994/03/15&quot;</span> <span style="color: #000066;">date.updated</span>=<span style="color: #ff0000;">&quot;2002/04/05&quot;</span> <span style="color: #000066;">status</span>=<span style="color: #ff0000;">&quot;updated&quot;</span> <span style="color: #000066;">creator</span>=<span style="color: #ff0000;">&quot;National Digital Library Program</span>
<span style="color: #009900;">, Library of Congress&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;filedesc<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;titlestmt<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;amid</span> <span style="color: #000066;">type</span>=<span style="color: #ff0000;">&quot;aggitemid&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>wpa0-07010101<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/amid<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;title<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>[Memorandum to Dr. Botkin]: a machine readable transcription.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/title<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;amcol<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;amcolname<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Life Histories from the Folklore Project, WPA Federal Writers<span style="color: #ddbb00;">&amp;apos;</span> Project, 1936-1940; American Memory, Library of Congress.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/amcolname<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;amcolid</span> <span style="color: #000066;">type</span>=<span style="color: #ff0000;">&quot;aggid&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span><span style="color: #000000; font-weight: bold;">&lt;/amcolid<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/amcol<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;respstmt<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;resp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Selected and converted.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/resp<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>American Memory, Library of Congress.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/respstmt<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/titlestmt<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;publicationstmt<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Washington, DC, 1994.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Preceding element provides place and date of transcription only.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>For more information about this text and this American Memory collection, refer to accompanying matter.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/publicationstmt<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;sourcedesc<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;lccn<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/lccn<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;sourcecol<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>U.S. Work Projects Administration, Federal Writers<span style="color: #ddbb00;">&amp;apos;</span> Project (Folklore Project, Life Histories, 1936-39); Manuscript Division, Library of Congress.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/sourcecol<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;copyright<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Copyright status not determined; refer to accompanying matter.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/copyright<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/sourcedesc<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/filedesc<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;encodingdesc<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;projectdesc<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>The National Digital Library Program at the Library of Congress makes digitized historical materials available for education and scholarship.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/projectdesc<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;editorialdecl<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>This transcription is intended to have an accuracy of 99.95 percent or greater and is not intended to reproduce the appearance of the original work.  The accompanying images provide a facsimile of this work and represent the appearance of the original.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/editorialdecl<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;encodingdate<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>1994/03/15<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/encodingdate<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;revdate<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>2002/04/05<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/revdate<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/encodingdesc<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/teiheader<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;text</span> <span style="color: #000066;">type</span>=<span style="color: #ff0000;">&quot;manuscript&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;body<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;div<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;pageinfo<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;controlpgno</span> <span style="color: #000066;">entity</span>=<span style="color: #ff0000;">&quot;I07010101&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>0001<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/controlpgno<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;printpgno<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/printpgno<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/pageinfo<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Memorandum to Dr. Botkin from G. B. Roberts, May 26, 1941<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Subject:  Alabama Material<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>This material has not yet been accessioned and has only 
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;del</span> <span style="color: #000066;">rend</span>=<span style="color: #ff0000;">&quot;overstrike&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>beeen<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/del<span style="color: #000000; font-weight: bold;">&gt;</span></span></span> been roughly classified as life histories, folklore, and miscellaneous data and copy save in the case of the 2 ex-slave items and the essay on Jesse Owens, each of which was recommended.<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;p<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>Total no. of items recommended:  3 (14 pp.) 
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;handwritten<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>In progress<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/handwritten<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/p<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/div<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/body<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/text<span style="color: #000000; font-weight: bold;">&gt;</span></span><span style="color: #000000; font-weight: bold;">&lt;/tei2<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<h5>Notes</h5><ol class="footnotes"><li id="footnote_0_430" class="footnote">Catalogers cataloged stuff in English, but every metadata record needed to be translated into the project&#039;s other six languages: Spanish, Russian, French, Arabic, and Chinese (the remaining official U.N. languages), plus Portuguese.</li><li id="footnote_1_430" class="footnote">And you are but one <code>sudo easy_install BeautifulSoup</code> away from that.</li></ol><br/>
<hr/>]]></content:encoded>
			<wfw:commentRss>http://lackoftalent.org/michael/blog/2009/07/13/wdl-metadata-mapping-and-parsing-tei-in-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Cataloging and institutional repositories</title>
		<link>http://lackoftalent.org/michael/blog/2009/02/09/cataloging-and-institutional-repositories/</link>
		<comments>http://lackoftalent.org/michael/blog/2009/02/09/cataloging-and-institutional-repositories/#comments</comments>
		<pubDate>Mon, 09 Feb 2009 14:30:33 +0000</pubDate>
		<dc:creator>Michael Giarlo</dc:creator>
				<category><![CDATA[Cataloging and Metadata]]></category>
		<category><![CDATA[Digital Libraries and Archives]]></category>
		<category><![CDATA[Libraries]]></category>
		<category><![CDATA[Management]]></category>
		<category><![CDATA[Repositories]]></category>

		<guid isPermaLink="false">http://lackoftalent.org/michael/blog/?p=288</guid>
		<description><![CDATA[
While doing some reading for a little talk my colleague, Ed Summers, and I are giving at code4lib 2009, I came across a paragraph that sparked a crazy thought.  So crazy that it&#039;s not crazy at all.  So not crazy that I am sure other people have thought of it.  But nonetheless, [...]]]></description>
			<content:encoded><![CDATA[<abbr class="unapi-id" title="oai:lackoftalent.org:technosophia:288"><!-- &nbsp; --></abbr>
<p>While doing some reading for a little <a href="http://code4lib.org/conference/2009/schedule#hcal10">talk</a> my colleague, <a href="http://inkdroid.org/ehs">Ed Summers</a>, and I are giving at <a href="http://code4lib.org/conference/2009">code4lib 2009</a>, I came across a paragraph that sparked a crazy thought.  So crazy that it&#039;s not crazy at all.  So not crazy that I am sure other people have thought of it.  But nonetheless, here I am writing about it just in case.</p>
<p>From Sarah Currier&#039;s <a href="http://www.elearning.ac.uk/features/sword">paper</a> on <a href="http://www.swordapp.org/">SWORD</a> (emphasis mine):</p>
<blockquote><p>One of the most frequently cited barriers to academics depositing their teaching materials into repositories is the keystroke-count involved in logging into a repository, uploading the resource, creating metadata, perhaps selecting a licence, and publishing the resource. It was a quick win, therefore, to create a drag-and-drop desktop tool to allow a single keystroke deposit of resources, including multiple resources in one action. For a repository that supports <b>automatic metadata generation</b>, administrative metadata can be created at the point of entry to the repository without the user needing to create any.</p></blockquote>
<p>And I wondered how many repositories supported automatic metadata generation.  I wondered how many repositories supported automatic generation of <em>rich</em> metadata.  And lastly I wondered, might this be a more or less natural role for catalogers: augmenting stub metadata records or doing original cataloging for institutional repository deposits?  Especially at a time when many of them are being reclassified as acquisitions specialists or digital projects managers?</p>
<p>Potential issues and questions:</p>
<ul>
<li>Author ignorance: Maybe catalogers are already doing this and I&#039;m a moron?</li>
<li>Scale: Is it realistic to expect to be able to &#034;keep up&#034; with repository deposits?</li>
<li>Granularity: Does cataloging at the level of articles, and perhaps at even finer granularities, introduce challenges?</li>
<li>Duplication: If pre-prints are cataloged in the IR, for instance, will they need to be cataloged again later?</li>
<li>&#8230; and others I thought of on my commute this morning but have since forgotten.  Feel free to add comments.</li>
</ul>
<p>I will admit here that I&#039;ve been somewhat out of the (academic) institutional repository space for a while, and cataloging is something I don&#039;t share thoughts about very often because my exposure is limited to having taken one course a couple of years ago.</p>
<p>I assume there&#039;s a body of research about this out there somewhere, but I figured I&#039;d post this anyway.</p>
]]></content:encoded>
			<wfw:commentRss>http://lackoftalent.org/michael/blog/2009/02/09/cataloging-and-institutional-repositories/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
