Linking World Digital Library Data

Posted by Michael Giarlo on August 10, 2009

As I mentioned earlier, I've been learning about linked data in the context of dropping it into the World Digital Library project. I am hopeful we'll be able to deploy the RDF views[1] before too long. In advance of that, I thought it might be helpful to share a sample of what our RDF would look like. The RDF below represents the WDL item for the U.S. Constitution. I appreciate constructive criticism.

A few things to note:

  • Mmm, Unicode.
  • Item types are from the Bibliographic Ontology.
  • Most of the properties are from the Dublin Core Metadata Element Set ontology, used mainly where the objects are literals rather than resources identified by URI.
  • Where I could dig up URIs for the objects, I used properties from the Dublin Core Metadata Terms ontology.
  • An item is modeled as an aggregation of its constituent files, as defined in OAI-ORE. The notion here is that an ORE aggregation of an item, as expressed in a resource map which is discoverable via a link header in each item detail page, is a "whole" item, including all of its files[2], metadata, and translations.
  • I'm also making light use of the NEPOMUK File Ontology to express that constituent files are files, and to be explicit about file sizes so that folks know how large the files are before retrieving them.
  • Links out to DDC (Decimalised Database of Concepts), Lingvoj, DBpedia, and Library of Congress Authorities & Vocabularies (e.g., LC Subject Headings) are included where possible. [3] I'd be especially stoked to hear of other vocabs I might link to. The more linked the data, the better.
  • The output below is Turtle for readability, but the application will offer up RDF/XML.

The data after the jump:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix nfo: <http://www.semanticdesktop.org/ontologies/nfo#> .
@prefix ore: <http://www.openarchives.org/ore/terms/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
 
<http://localhost/static/c/2708/service/00303_2003_001_pr.jpg>
    dc:format "image/jpeg" ;
    nfo:fileSize "259485"^^<http://www.w3.org/2001/XMLSchema#long> ;
    a nfo:FileDataObject .
 
<http://localhost/static/c/2708/service/00303_2003_003_pr.jpg>
    dc:format "image/jpeg" ;
    nfo:fileSize "267031"^^<http://www.w3.org/2001/XMLSchema#long> ;
    a nfo:FileDataObject .
 
<http://localhost/static/c/2708/reference/00303_2003_004_pr_thumb_item.gif>
    dc:format "image/gif" ;
    nfo:fileSize "56620"^^<http://www.w3.org/2001/XMLSchema#long> ;
    a nfo:FileDataObject .
 
<http://localhost/static/c/2708/service/00303_2003_004_pr.jpg>
    dc:format "image/jpeg" ;
    nfo:fileSize "233875"^^<http://www.w3.org/2001/XMLSchema#long> ;
    a nfo:FileDataObject .
 
<http://localhost/static/c/2708/service/00303_2003_002_pr.jpg>
    dc:format "image/jpeg" ;
    nfo:fileSize "245809"^^<http://www.w3.org/2001/XMLSchema#long> ;
    a nfo:FileDataObject .
 
<http://localhost/item/2708/about.rdf>
    dcterms:created "2009-08-10T18:11:25-04:00"^^dcterms:W3CDTF ;
    dcterms:creator <http://dbpedia.org/resource/World_Digital_Library> ;
    dcterms:modified "2009-08-10T18:11:25-04:00"^^dcterms:W3CDTF ;
    ore:describes <http://localhost/item/2708/about.rdf#item> ;
    a ore:ResourceMap .
 
<http://localhost/item/2708/about.rdf#item>
    dc:created "17 Septembre 1787"@fr, "17 de septiembre de 1787"@es, "17 de setembro de 1787"@pt, "17 сентября 1787 г."@ru, "1787年9月17日"@zh, "September 17, 1787"@en, """١٧ ايلول ١٧٨٧
"""@ar ;
    dc:creator "Constitutional Convention, United States"@en, "Convención Constituyente, Estados Unidos"@es, "Convention constitutionnelle, États-Unis"@fr, "Convenção Constitucional, Estados Unidos"@pt, "Конституционная Конвенция, Соединенные Штаты"@ru, "الاتفاقية الدستورية، الولايات المتحدة"@ar, "制宪会议,美国"@zh ;
    dc:extent "Manuscript (4 pages of parchment)"@en, "Manuscrit (4 pages de parchemin)"@fr, "Manuscrito (4 páginas de pergamino)"@es, "Manuscrito (4 páginas em pergaminho)"@pt, "Рукопись (4 пергаментных страницы)"@ru, "مخطوطة (٤ صفحات من الورق النفيس)"@ar, "手草本(4 页羊皮纸)"@zh ;
    dc:language "Anglais"@fr, "English"@en, "Inglés"@es, "Inglês"@pt, "Английский язык"@ru, "الإنجليزية"@ar, "英语"@zh ;
    dc:publisher "Administração de Registros e Arquivos Nacionais"@pt, "Archives Nationales et Administration des documents (NARA) des États-Unis d'Amérique "@fr, "Los Archivos Nacionales y Administración de Documentos (NARA) de los Estados Unidos de América"@es, "National Archives and Records Administration"@en, "Управление национальных архивов и документов"@ru, "الإدارة الأمريكية للوثائق والسجلات الوطنية"@ar, "美国国家文件与档案管理局"@zh ;
    dc:subject "Constituciones"@es, "Constituições"@pt, "Constitutional & administrative law"@en, "Constitutions"@en, "Constitutions"@fr, "Derecho constitucional y administrativo"@es, "Direito constitucional e administrativo"@pt, "Droit constitutionnel et administratif"@fr, "Politics and government"@en, "Politique et gouvernement"@fr, "Política e governo"@pt, "Política y gobierno"@es, "Конституции"@ru, "Конституционное и административное право"@ru, "Политика и правительство"@ru, "الدساتير"@ar, "السياسة والحكومة"@ar, "القانون الدستوري والإداري."@ar, "宪法"@zh, "宪法 & 行政法"@zh, "政治和政府"@zh ;
    dc:title "Constitución de los Estados Unidos"@es, "Constituição dos Estados Unidos"@pt, "Constitution des États-Unis"@fr, "Constitution of the United States"@en, "Конституция Соединенных Штатов"@ru, "دستور الولايات المتحدة"@ar, "美国宪法"@zh ;
    dcterms:DDC "342" ;
    dcterms:LCSH <http://id.loc.gov/authorities/label/Constitutions> ;
    dcterms:alternative "Constitution of the United States"@en ;
    dcterms:dateSubmitted "2009-05-07T06:45:21-04:00"^^dcterms:W3CDTF ;
    dcterms:description "1787 年 5 月 14 日,制宪会议在费城的议会大楼(独立厅)召开,目的是修订《邦联条例》。 由于开始时只有两个州的代表团出席,成员不得不一天天地休会,直到 5 月 25 日与会人数达到法定的七个州。 通过讨论和争辩,6 月中旬时明确显示大会与其修改现有的《联邦条例》不如为政府重新起草一份全新的框架。 整个夏季,代表们都在非公开会议中辩论、起草、重新起草新宪法的条款。 主要的争论问题包括要赋予中央政府多大权利、允许各州在国会中有多少个代表席位以及这些代表应该如何选举产生——由人民直接选举还是由各州立法人员选举产生。 这部宪法是很多人智慧的结晶,是合作政治运作和妥协艺术的典范。"@zh, "A Convenção Federal reuniu-se na Casa de Estado (Hall da Independência), em Filadélfia, em 14 de maio de 1787 para revisar os Artigos da Confederação. Em virtude de estarem presentes, inicialmente, as delegações de apenas dois estados, os membros suspenderam os trabalhos, dia após dia, até que fosse atingido o quórum de sete estados em 25 de maio. Através de discussões e debates ficou claro, em meados de junho que, em vez de alterar os atuais artigos da Confederação, a convenção deveria elaborar uma estrutura inteiramente nova para o governo. Ao longo de todo o verão, os delegados debateram, elaboraram e reelaboraram os artigos da nova Constituição em sessões fechadas. Entre os principais pontos em questão estavam o grau de poder permitido ao governo central, o número de representantes no Congresso para cada Estado, e como estes representantes deveriam ser eleitos - diretamente pelo povo ou pelos legisladores do estado. A Constituição foi o trabalho de muitas mentes e permanece como um modelo de cooperação entre lideranças políticas e da arte da condescendência."@pt, "La Convención Federal se reunió en la Cámara del Estado (Salón de la Independencia) en Filadelfia el 14 de mayo de 1787, para revisar los artículos de la Confederación. Debido a que las delegaciones de sólo dos estados estuvieron presentes inicialmente, los miembros levantaron sesión de un día para el siguiente hasta que se obtuvo un quórum de siete estados el 25 de mayo. A través de la discusión y el debate se hizo evidente a mediados de junio que, en lugar de modificar los actuales artículos de la Confederación, la convención prepararía un marco totalmente nuevo para el gobierno. 
Durante todo el verano, los delegados debatieron, prepararon y redactaron nuevamente los artículos de la nueva Constitución en sesiones a puerta cerrada. Entre los principales puntos en cuestión estuvieron cuánto poder otorgar al gobierno central, el número de representantes en el Congreso que se iban a permitir a cada Estado y la forma en que estos representantes debían ser elegidos, directamente por el pueblo o por los legisladores estatales. La Constitución fue el resultado del trabajo de muchas mentes y se erige como modelo de cooperación política y del arte del compromiso."@es, "La Convention Fédérale s'assembla dans la Chambre Législative (Independence Hall) à Philadelphie le 14 mai 1787, pour réviser les articles de la Confédération. En raison de la seule présence initiale des délégations de deux États, les membres ajournèrent d'un jour à l'autre jusqu'à ce que le quorum de sept États soit obtenu le 25 mai. Â travers les discussions et les débats, il devint clair dès la mi-juin que, plutôt que de modifier les articles existants de la Confédération, la convention allait plutôt ébaucher un cadre entièrement nouveau pour le gouvernement. Tout au long de l'été, les délégués débattirent, élaborèrent, et remanièrent les articles de la nouvelle Constitution, à huis clos. Les principaux points litigieux portaient sur la puissance à accorder au gouvernement central, sur le nombre de représentants au Congrès pour chaque État, et sur le mode d'élection de ces représentants - directement par le peuple ou par les législateurs de l'état. La Constitution fut l'œuvre de nombreux esprits et reste un modèle de coopération politique et de l'art du compromis."@fr, "The Federal Convention convened in the State House (Independence Hall) in Philadelphia on May 14, 1787, to revise the Articles of Confederation. Because the delegations from only two states were present initially, the members adjourned from one day to the next until a quorum of seven states was obtained on May 25. 
Through discussion and debate it became clear by mid-June that, rather than amend the existing Articles of Confederation, the convention would draft an entirely new framework for the government. All through the summer, the delegates debated, drafted, and redrafted the articles of the new Constitution in closed sessions. Among the chief points at issue were how much power to allow the central government, how many representatives in Congress to allow each state, and how these representatives should be elected--directly by the people or by the state legislators. The Constitution was the work of many minds and stands as a model of cooperative statesmanship and the art of compromise."@en, "Федеральное собрание собралось на заседание в Доме правительства (зал Независимости) 14 мая 1787 года для пересмотра Статей Конфедерации. Поскольку вначале на заседании присутствовали представители только двух штатов, Собрание было распущено на несколько дней до тех пор, пока 25 мая не был обеспечен кворум из представителей семи штатов. В ходе дискуссий и дебатов к середине июня стало понятно, что собрание было намерено скорее составить новый вариант структуры правительства, нежели чем пересматривать существующие Статьи Конфедерации. В течение всего лета делегаты обсуждали, составляли черновые варианты статей новой Конституции и тут же их пересматривали в ходе закрытых заседаний. Среди основных обсуждавшихся вопросов были вопросы степени власти и полномочий, которыми должно быть наделено центральное правительство, количества представителей в Конгрессе от каждого штата, а также процедуры переизбрания этих представителей — непосредственно жителями штатов или законодательными собраниями штатов. Конституция была плодом работы многих политиков и является ярким примером сотрудничества государственных деятелей и искусства компромисса."@ru, "اجتمع ممثلو الاتحاد الفدرالي في قصر الدولة (قاعة الاستقلال) في فيلادلفيا يوم ١٤  أيار ١٧٨٧ لتعديل النظام الأساسي للاتحاد. 
وحيث حضر وفدان اثنان فقط من وفود الولايات في البداية، رفع الأعضاء الحضور الجلسة من يوم إلى آخر حتى اكتمل النصاب القانوني بحضور وفود سبع ولايات في ٢٥ أيار. وقد اتضح خلال المناقشات والحوار بحلول منتصف حزيران أنه بدلا من تعديل مواد الاتحاد الكونفدرالي القائمة، كان على المؤتمرين صياغة إطار جديد تماما بالنسبة للحكومة. وطوال ذلك الصيف، ناقش المندوبون وصاغوا ثم أعادوا صياغة مواد الدستور الجديد في جلسات مغلقة. ومن بين النقاط الرئيسية التي دار حولها الجدل مدى صلاحيات الحكومة المركزية وعدد الممثلين في الكونغرس لكل ولاية ، وكيفية انتخاب هؤلاء ممثلين -- بالانتخاب المباشر من الشعب أو من قبل مشرّعي الولايات. لقد كان الدستور من عمل عقول كثيرة وهو يمثل نموذجا لفن الحكم التعاوني حنكة التوصل إلى الحلول الوسط."@ar ;
    dcterms:identifier "http://localhost/item/2708/about.rdf#item" ;
    dcterms:language <http://www.lingvoj.org/lang/en> ;
    dcterms:publisher <http://dbpedia.org/resource/National_Archives_and_Records_Administration> ;
    dcterms:spatial <http://dbpedia.org/resource/North_America>, <http://dbpedia.org/resource/United_States_of_America>, "América del Norte"@es, "América do Norte"@pt, "Amérique du Nord"@fr, "Estados Unidos da América"@pt, "Estados Unidos de América"@es, "North America"@en, "United States of America"@en, "États-Unis d'Amérique"@fr, "Северная Америка"@ru, "Соединенные Штаты Америки"@ru, "أمريكا الشمالية"@ar, "الولايات المتحدة الأمريكية"@ar, "北美"@zh, "美国"@zh ;
    dcterms:subject <http://dbpedia.org/resource/Constitutions> ;
    dcterms:temporal "1700 AD - 1799 AD"@en, "1700 ap. J.-C. - 1799 ap. J.-C."@fr, "1700 d.C. - 1799 d.C."@es, "1700 d.C. - 1799 d.C."@pt, "1700 н.э. - 1799 н.э."@ru, "1700 公元 - 1799 公元"@zh, "١٧٠٠ م - ١٧٩٩ م"@ar ;
    dcterms:title <http://dbpedia.org/resource/Constitution_of_the_United_States> ;
    ore:aggregates <http://localhost/static/c/2708/reference/00303_2003_004_pr_thumb_item.gif>, <http://localhost/static/c/2708/service/00303_2003_001_pr.jpg>, <http://localhost/static/c/2708/service/00303_2003_002_pr.jpg>, <http://localhost/static/c/2708/service/00303_2003_003_pr.jpg>, <http://localhost/static/c/2708/service/00303_2003_004_pr.jpg> ;
    ore:isDescribedBy <http://localhost/item/2708/about.rdf> ;
    a <http://purl.org/ontology/bibo/Manuscript> ;
    rdfs:seeAlso <http://hdl.loc.gov/loc.wdl/dna.2708> .

Notes
  1. Sadly, the URIs are uglyish due to some constraints from our caching configuration. I figure we can redirect uglyish URIs to cool ones and make use of owl:sameAs if those constraints go away.
  2. sans certain low-quality derivatives such as small thumbnails and tiles for the zoom interface
  3. I was poking through the DBpedia output for Geonames URIs as well, but my method was way too slow and clunky, so that's disabled for the time being. Clients can always follow their noses from the DBpedia output.


Comments

  1. Ed Summers Tue, 11 Aug 2009 09:04:19 UTC

    Hey Mike. It's been fun hearing this stuff brewing at $work–and it's awesome to read about it here in your blog. If we can get more linked data bubbling up at loc.gov in different applications I think it'll be increasingly interesting to shift the focus from publishing linked data to consuming linked data, to see what sorts of views can be built on top of the various pools of web resources at the Library of Congress.

    Most of all it's really fun to be part of a grassroots / bottom up movement at LC for thinking of the resources we are publishing, and letting vocabulary elements emerge, and be mixed and matched–rather than trying to force everything into a particular metadata schema.

    A few things that stood out to me in your turtle:

    • That unicode sure looks awesome eh? It's really cool to see the multi-lingual metadata all hanging off of the same resource description. I know the World Digital Library has focused a lot of effort on the translation of metadata, so seeing it laid out so usefully and prettily is doing justice to all the work.
    • It's also really nice to see the use of ORE, so people can actually harvest out not only the metadata, but the digital objects themselves.
    • I like the use of NEPOMUK File Ontology to characterize the aggregated resources. I hadn't seen it before! Knowing how big the file is before harvesting will be very valuable. Also there's a lot of potential for nfo:hasHash for giving the file a fixity value that can be used to verify that a harvest was successful.
    • I think I may have promulgated some flawed use of dcterms:LCC and dcterms:LCSH in some early versions of lcsh.info which are present in this chunk of turtle. Or maybe we both made the same mistake independently, which points to some ambiguity in the Dublin Core docs. The problem is that these are of type dcam:VocabularyEncodingScheme which in turn is of type rdfs:Class. So dcterms:LCC and dcterms:LCSH really ought to be used to characterize the type of a resource, not be used as a property. You could really use dcterms:subject though. Perhaps we could talk more about this in here, or on IRC. I only found out I was using it wrong because Tom Baker of DCMI emailed me, after a colleague in Japan noticed my use of dcterms:LCC in lcsh.info had the same problem.
    • It's really awesome to see the links out to dbpedia and id.loc.gov. My linking up of data to dbpedia from Chronicling America was pretty naively done, but it seemed good enough. I was wondering if you could talk more about how you are doing the linking in WDL. It seems relatively easy to link to resources within the context of a particular web application; but when you start to link out to resources outside of your particular web application things get harder, and driftier. Might be a good section on that linked data article we've been threatening to write, even though it got passed up by iPRES?
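The size and fixity checks mentioned in the NEPOMUK bullet above could look roughly like the following on the harvesting side. This is only a sketch: `verify_harvest` is a hypothetical helper, the choice of SHA-1 is an assumption, and the expected values are presumed to have been read already from `nfo:fileSize` and from a hash attached via `nfo:hasHash` (which the sample Turtle does not yet include).

```python
import hashlib
from typing import Optional


def verify_harvest(data: bytes, expected_size: int,
                   expected_sha1: Optional[str] = None) -> bool:
    """Return True if harvested bytes match the advertised size and hash.

    expected_size: the value published as nfo:fileSize.
    expected_sha1: hex digest published via nfo:hasHash (assumed SHA-1);
                   the hash check is skipped when it is None.
    """
    if len(data) != expected_size:
        return False
    if expected_sha1 is None:
        return True
    return hashlib.sha1(data).hexdigest() == expected_sha1.lower()


# Usage with stand-in bytes rather than a real WDL file.
payload = b"stand-in for a harvested JPEG"
digest = hashlib.sha1(payload).hexdigest()
print(verify_harvest(payload, len(payload), digest))      # size and hash agree
print(verify_harvest(payload, len(payload) + 1, digest))  # size mismatch
```

Checking the size first is cheap and catches truncated downloads before any hashing is done.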

    So hopefully that wasn't too much feedback. I'm going to tweet this to get some more feedback from the Linked Data community. The really fun thing about this work is how it feeds into a larger community of web practitioners and compsci folks.

  2. Michael Giarlo Tue, 11 Aug 2009 13:01:01 UTC

    Thanks for the great feedback, Ed. The LCSH/DDC terms did smell funny to me. Now I've got something like the following, just using dcterms:subject to link out to id.loc.gov for LCSH and the Decimalised ontology for top-level DDC codes:

    <dcterms:subject rdf:resource="http://id.loc.gov/authorities/label/Women"/>
    <dcterms:subject rdf:resource="http://purl.org/NET/decimalised#391"/>
    <dcterms:subject rdf:resource="http://dbpedia.org/resource/Women"/>

    And folks can follow their noses in the linked data fashion to discover that a subject is LCSH or DDC. I guess an alternative would be to include dcam:memberOf pointing at dcterms:{LCSH,DDC} within dcterms:subject, but it's not clear to me whether that's helpful or what it'd do to the resulting RDF, e.g., would it create blank nodes (a linked data no-no, apparently)?

    As for how I'm doing the linking, it's very crude. I went through the various elements within the WDL domain model and thought about how appropriate it'd be for each to point at a URI rather than a literal. For those where it's appropriate — for instance, a date range or an extent/physicalDescription might not be appropriate in contrast to institution names, geographic terms, and subject headings, or even titles — the code cleans up the string in question and probes at potential URIs in a small number of vocabularies. If the URI exists, returns a 200 HTTP status code, and returns an RDF graph, I use the constructed URI.

    It's pretty crude, like I mentioned, and there are obvious inefficiencies and faulty assumptions here. Baby steps.
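The probing approach described above might be sketched along these lines. It is not the WDL code: the URI templates, the list of RDF media types, and every function name here are assumptions made for illustration, and "returns an RDF graph" is approximated by checking the response content type rather than parsing the body. The probe is passed in as a parameter so the logic can be exercised without touching the network.

```python
import re
import urllib.request
from typing import Callable, List, Tuple
from urllib.parse import quote

# Hypothetical URI patterns: (template, word separator used by that vocabulary).
VOCABULARIES = [
    ("http://dbpedia.org/resource/{}", "_"),
    ("http://id.loc.gov/authorities/label/{}", " "),
]

# Media types treated here as "an RDF graph came back" (an assumption).
RDF_TYPES = {"application/rdf+xml", "text/turtle", "text/n3"}

Probe = Callable[[str], Tuple[int, str]]


def candidate_uris(label: str) -> List[str]:
    """Normalise whitespace in a label and build one candidate URI per vocabulary."""
    cleaned = re.sub(r"\s+", " ", label).strip()
    return [template.format(quote(cleaned.replace(" ", sep)))
            for template, sep in VOCABULARIES]


def http_probe(uri: str) -> Tuple[int, str]:
    """Request RDF/XML from a URI; return (status, content type) or (0, '') on failure."""
    request = urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.headers.get_content_type()
    except OSError:  # covers URLError, HTTPError, and timeouts
        return 0, ""


def linkable_uris(label: str, probe: Probe = http_probe) -> List[str]:
    """Keep only the candidates that answer 200 with an RDF media type."""
    found = []
    for uri in candidate_uris(label):
        status, content_type = probe(uri)
        if status == 200 and content_type in RDF_TYPES:
            found.append(uri)
    return found


def fake_probe(uri: str) -> Tuple[int, str]:
    """Stand-in probe for offline use: pretends only DBpedia knows the label."""
    if uri.startswith("http://dbpedia.org/"):
        return 200, "application/rdf+xml"
    return 404, "text/html"


print(linkable_uris("National  Archives and Records Administration", probe=fake_probe))
```

One obvious inefficiency is that every label costs one HTTP request per vocabulary, so caching probe results by cleaned label would be a natural next step.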

  3. Karen Coyle Tue, 11 Aug 2009 18:29:06 UTC

    Pete Johnston gave me this code snippet as the correct way to make use of LCC and other vocabularies in the DC terms metadata:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dcterms="http://purl.org/dc/terms/"
             xmlns:dcam="http://purl.org/dc/dcam/">
     <rdf:Description rdf:about="http://example.org/book/123">
      <dcterms:subject>
        <rdf:Description>
          <dcam:memberOf rdf:resource="http://purl.org/dc/terms/LCC"/>
          <rdf:value>HV3709</rdf:value>
        </rdf:Description>
      </dcterms:subject>
     </rdf:Description>
    </rdf:RDF>

    I have used it this way for the RDF export from the Open Library (openlibrary.org). Soon to be blogged at http://blog.openlibrary.org
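For consumers, the blank-node pattern in the snippet above can be read back out with a few lines of code. The following is a rough sketch using only the standard library: it matches the XML shape of that one snippet rather than parsing RDF in general, and `subject_values` is a hypothetical helper name. A real application would more likely use a proper RDF parser.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "dcam": "http://purl.org/dc/dcam/",
}
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# The snippet quoted in the comment above, unchanged.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dcterms="http://purl.org/dc/terms/"
         xmlns:dcam="http://purl.org/dc/dcam/">
 <rdf:Description rdf:about="http://example.org/book/123">
  <dcterms:subject>
    <rdf:Description>
      <dcam:memberOf rdf:resource="http://purl.org/dc/terms/LCC"/>
      <rdf:value>HV3709</rdf:value>
    </rdf:Description>
  </dcterms:subject>
 </rdf:Description>
</rdf:RDF>"""


def subject_values(rdf_xml: str) -> List[Tuple[str, str]]:
    """Return (encoding scheme URI, value string) pairs for blank-node subjects."""
    root = ET.fromstring(rdf_xml)
    pairs = []
    for node in root.findall(".//dcterms:subject/rdf:Description", NS):
        scheme = node.find("dcam:memberOf", NS)
        value = node.find("rdf:value", NS)
        # Skip subjects that do not follow the memberOf + value pattern.
        if scheme is not None and value is not None:
            pairs.append((scheme.get(RDF_RESOURCE), value.text))
    return pairs


print(subject_values(SAMPLE))
```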

  4. Michael Giarlo Tue, 11 Aug 2009 20:50:06 UTC

    Much obliged for the snippet, Karen! I was wondering.

    I'd love to see more examples like this up on the DC website itself, too. Or maybe they're already there and I just couldn't find them.

  5. Bruce Tue, 11 Aug 2009 23:28:29 UTC

    Definitely seems much nicer to use the lcsh URIs for subjects, rather than the blank nodes + strings.

  6. Karen Coyle Wed, 12 Aug 2009 10:25:58 UTC

    Bruce, the OL subject field is not exclusively LCSH — it also includes subjects from Amazon, ONIX, user input, etc. Unfortunately, provenance of the data wasn't retained in an easy-to-determine way, so ….

    The other thing about LCSH URIs is that they do not represent complete subject headings. LCSH authority data is a kind of pattern dictionary for the creation of actual subject headings. So while there is an authority entry for:
    African Fiction (French)
    the subject headings in books are:
    African Fiction (French) — 20th century
    African Fiction (French) — 20th century — History and criticism
    etc.

    Each of these would be assigned the same LCSH URI, but they are not the same subject heading, and there is no URI for the heading in the bibliographic record. I'm not quite sure how to represent this "based on" relationship that exists between the subject heading and the LCSH entry. It is a particular property that we need to carefully define.
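To make the distinction above concrete, a heading string can be split into the part that has an authority entry and the subdivisions added by the cataloger. This sketch does not define the "based on" property; it only keeps the two parts separate so that a URI can be attached to the authority label while the full heading stays distinct. The `Heading` type, the function name, and the assumption that subdivisions are separated by a spaced dash (as in the examples above) are all illustrative.

```python
from typing import List, NamedTuple

# Spaced dash used between subdivisions in the examples above (an assumption;
# other displays of subject headings use different separators).
SUBDIVISION_SEPARATOR = " \u2014 "


class Heading(NamedTuple):
    authority_label: str      # the part with an LCSH authority entry
    subdivisions: List[str]   # chronological, topical, and other subdivisions


def split_heading(heading: str) -> Heading:
    """Split a full subject heading into its authority label and subdivisions."""
    base, *rest = [part.strip() for part in heading.split(SUBDIVISION_SEPARATOR)]
    return Heading(base, rest)


full = "African Fiction (French) \u2014 20th century \u2014 History and criticism"
print(split_heading(full))
```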

  7. Bruce Thu, 13 Aug 2009 11:49:23 UTC

    @Karen: it seems there's a URI for the broader concept, and also (a different) one for African fiction (French)–History and criticism. But there's not one that I see that narrows it to the 20th century.

    So I guess there are just some missing headings, and missing links?

    Still, this sort of linked hierarchy is potentially much more useful for OL users.

  8. PeteJ Thu, 13 Aug 2009 12:25:59 UTC

    Hi Michael,

    Re: having more examples up on the DCMI Web site, the pattern Karen referred to (and others like it) is listed in

    http://dublincore.org/documents/dc-rdf/

    That document uses the terminology/concepts of the DCMI Abstract Model (e.g. "Vocabulary Encoding Scheme"), but since DCMI uses that categorisation in its list of terms in

    http://dublincore.org/documents/dcmi-terms/

    then hopefully it is enough to give an indication of their intended use.

    I tend to agree with your suggestion that, where a URI for the concept is used, and that URI is dereferenceable to more information about the concept, and in a "linked data" context, it may be debatable how much value there is in including the dcam:memberOf triple, but I guess it may be helpful to some applications to have that information included in the "instance".

    Re DCMI documentation more generally, yes, I'm afraid a good deal of the documentation on the DCMI Web site is somewhat out-dated and sorely in need of a substantial overhaul [err, personal opinion only, I hasten to add!]

  9. Bohdan Kantor Thu, 13 Aug 2009 15:39:23 UTC

    Good to see you're working with human languages and Unicode characters.

    Regarding your wish to hear of other vocabularies to link to, Lingvoj rdf file currently describes 522 languages. The Library of Congress – UNESCO World Digital Library project will potentially have many more human language descriptors for content. Currently UNESCO lists 193 member states (and 6 associate members) that have cultural content in many living and extinct languages. The current Ethnologue database lists 7,357 distinct language identifiers. Of these, 421 represent extinct languages, 396 are nearly extinct, 29 are a second language only, and the remainder are listed with "living" status.

    Lexvo.org rdf file (lexvo_2008-12-27.rdf) currently describes more than 7000 languages in the form http://www.lexvo.org/id/iso639-3/eng

  10. Michael Panzer Fri, 21 Aug 2009 12:23:46 UTC

    Hi Michael,

    great to see this happening. Some comments from my end:

    I would favor the use of dcam:memberOf even for dereferenceable URIs, because 1. it makes processing and sorting out mappings much easier without having to rely on an external data store every time and 2. I don't see dcam:memberOf used much in external data sets. So even if a user agent would follow its nose, it is doubtful that by inspecting a single concept it could reliably determine which vocabulary it is part of.

    Nothing in http://id.loc.gov/authorities/sh95000541.rdf tells you that this is an LCSH (except for the skos:inScheme assertion which I find a little ambiguous if LC wants to publish more than just LCSH at this URI).

    In the end it might come down to whether you find it useful to follow Dublin Core conventions. It might even be useful to state the preferred heading or caption of the subject identifier. I think there are at least two ways to do this.

    You could basically reiterate a couple of triples found at the other end (which might be problematic because those have to be kept in sync with the original data source) or use DCAM conventions to indicate that you are not really making assertions about a concept that is not your own. Rather, you are just giving additional information that is relevant in your context but may be disregarded when graphs are merged. So basically:

    World Wide Web

    versus

    World Wide Web

    I don't know which one is better (if any). The rdf:value approach could make processing easier because rdf:value is typically used for identifying the main value of several values. This scenario seems to be a prime use case for that: “What is the value of dcterms:subject?”

    Re other data sets to link to: We now have dewey.info, ;-) so perhaps it might make sense for you to consider using Dewey URIs in your data? The data currently available there aligns mostly with what you already have in the first subject assertion. It could be rendered like this:

    Constitutional & administrative law
    Derecho constitucional y administrativo
    Droit constitutionnel et administratif

    Or the rdf:values could be omitted altogether to spare the pain of synching and updating …

  11. Michael Giarlo Wed, 23 Sep 2009 09:51:15 UTC

    Those comments are very helpful.

    Bo, I've just added lexvo links. Thanks for that.
