In this block:

  • Overview metadata formats
  • Overview container formats
  • Overview protocols
  • Overview SPARQL
  • Example: SRU (2.1)
  • Example: data harvesting OAI-PMH (2.2)
  • Example: SPARQL (2.3)

Overview metadata formats

  • Dublin Core
    • set of vocabulary terms to describe digital resources
    • 15 classic metadata terms, known as the Dublin Core Metadata Element Set (DCMES)
    • DCMI Metadata Terms: terms of the DCMES + qualified terms
    • Dublin Core Metadata Initiative
  • MARC
    • MARC (MAchine-Readable Cataloging) standards
      • developed in the 1960s to create records that could be read by computers and shared among libraries
    • MARC 21, MARC record format for the 21st century
    • Full MARC 21 Record Examples
  • Dublin Core Metadata Element Set (DCMES) 1.1

    1. Title: The name of the object
    2. Creator: An entity primarily responsible for making the resource
    3. Subject: The topic addressed by the work
    4. Description: An account of the resource
    5. Publisher: The agent or agency responsible for making the object available
    6. Contributor: An entity responsible for making contributions to the resource
    7. Date: A point or period of time in the lifecycle of the resource, for example the date of publication
    8. Type: The nature or genre of the resource
    9. Format: The file format, physical medium, or dimensions of the resource
    10. Identifier: String or number used to uniquely identify the object
    11. Source: Objects, either print or electronic, from which this object is derived, if applicable
    12. Language: Language of the intellectual content
    13. Relation: Relationship to other objects
    14. Coverage: The spatial locations and temporal durations characteristic of the object
    15. Rights: Information about rights held in and over the resource
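As a rough illustration of how these elements are used in practice, the sketch below serializes a few of them as simple Dublin Core XML with Python's standard library. The namespace URI is the one published for DCMES 1.1. The wrapper element name `record` and the helper `dc_record` are placeholders chosen for this example, and the values are taken from the Biblia polyglotta record that appears later in this block.

```python
import xml.etree.ElementTree as ET

# Namespace of the 15-element Dublin Core Metadata Element Set 1.1
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dc_record(fields):
    """Build a <record> element with one dc:* child per (element, value) pair."""
    record = ET.Element("record")  # wrapper name is an assumption of this sketch
    for element, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
        child.text = value
    return record


fields = [
    ("title", "Biblia polyglotta"),
    ("publisher", "Arn. Giull. de Brocario"),
    ("language", "http://id.loc.gov/vocabulary/iso639-2/mul"),
]
record = dc_record(fields)
dc_xml = ET.tostring(record, encoding="unicode")
print(dc_xml)
```

Every DCMES element is optional and repeatable, which is why the helper takes a list of pairs rather than a dictionary.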

Overview container formats

  • JSON
    • a string = { "name":"John" }
    • a number = { "age":30 }
    • an object (JSON object) = {"employee":{ "name":"John", "age":30, "city":"New York" }}
    • an array = {"employees":[ "John", "Anna", "Peter" ]}
    • a boolean = { "sale":true }
    • null = { "middlename":null }
  • JSON-LD
    • JSON for Linked Data
    • keywords
      • @context to provide additional mappings from JSON to an RDF model (map terms to IRIs)
      • @id to uniquely identify things
      • @type to set the data type of a node or typed value
      • @container to set the default container type for a term
      • "@container": "@set" defines a container as an unordered set
{
  "@context": {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": {
      "@id": "http://xmlns.com/foaf/0.1/workplaceHomepage",
      "@type": "@id"
    },
    "Person": "http://xmlns.com/foaf/0.1/Person"
  },
  "@id": "https://me.example.com",
  "@type": "Person",
  "name": "John Smith",
  "homepage": "https://www.example.com/"
}
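Because JSON-LD is plain JSON, the document above can be loaded with Python's `json` module, and its `@context` can be read like any other dictionary. The sketch below looks up the IRI behind each term. `term_to_iri` is a hypothetical helper that only handles the two context shapes used here (a plain string, or an object with `@id`); it is not the full JSON-LD context-processing algorithm.

```python
import json

doc = json.loads("""
{
  "@context": {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": {
      "@id": "http://xmlns.com/foaf/0.1/workplaceHomepage",
      "@type": "@id"
    },
    "Person": "http://xmlns.com/foaf/0.1/Person"
  },
  "@id": "https://me.example.com",
  "@type": "Person",
  "name": "John Smith",
  "homepage": "https://www.example.com/"
}
""")


def term_to_iri(context, term):
    """Return the IRI a term is mapped to, or None if the term is not defined."""
    definition = context.get(term)
    if isinstance(definition, dict):
        return definition.get("@id")
    return definition


context = doc["@context"]
for key in doc:
    if not key.startswith("@"):  # skip JSON-LD keywords such as @id and @type
        print(key, "->", term_to_iri(context, key))
```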
  • RDF (Resource Description Framework): W3C standard for modeling information for the semantic web
  • RDF Triples
    • describe everything as subject, predicate and object expression
      • subject: denotes the resource
      • predicate: the property or relationship being stated about the subject
      • object: what the predicate points to, either another resource or just a literal value
  • JSON-LD Processing Algorithms
    • JSON-LD Expanded
      • replaces terms with the URIs they expand to
      • necessary for further transformations
      • removes context
    • JSON-LD Compacted
      • applies a context to shorten IRIs to terms
      • makes it easier to read
    • JSON-LD Flattened
      • all properties of a node are collected in a single JSON object
      • a labeled directed graph
    • JSON-LD Framing
{
...
    "publisher": "Arn. Giull. de Brocario",
    "place_of_publication": "Compluti",
    "language": "http://id.loc.gov/vocabulary/iso639-2/mul",
    "@id": "https://open-na.hosted.exlibrisgroup.com/alma/43ACC_ONB/bibs/990028618530603338",
    "title": "Biblia polyglotta",
    "@context": "https://open-na.hosted.exlibrisgroup.com/alma/contexts/bib"
}
In [1]:
import requests
#fetch the JSON-LD representation of a bibliographic record
resp = requests.get("https://open-na.hosted.exlibrisgroup.com/alma/43ACC_NETWORK/bibs/990106901740203331")
resp.json()
Out[1]:
{'date': '9999',
 'note': 'Aus: (Sammelband von 63 Hochzeitsgedichten).',
 'identifier': [{'label': '(DE-599)OBVAC10480601'},
  {'label': '(Aleph)010690174ACC01'},
  {'label': '(AT-OBV)AC10480601'},
  {'label': 'AC10480601'}],
 '@type': 'Book',
 'place_of_publication': 's.l.',
 'language': 'http://id.loc.gov/vocabulary/iso639-2/ger',
 '@id': 'https://open-na.hosted.exlibrisgroup.com/alma/43ACC_NETWORK/bibs/990106901740203331',
 'title': 'Bey dem hochadelichen Helmrich- und Bassronischen Beylager, welches ... zu sonderbahren Ehren beyder Vermählten ...',
 '@context': 'https://open-na.hosted.exlibrisgroup.com/alma/contexts/bib'}
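To connect the JSON-LD examples with the RDF triple model described above, the following sketch turns a small JSON-LD node into (subject, predicate, object) tuples by hand. It reuses the Person example from this section, because the context behind the Alma records is a remote document that is not reproduced here. `to_triples` is an illustrative name and handles only flat nodes with string values; a real application would use a JSON-LD processor instead.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

node = {
    "@context": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "homepage": {
            "@id": "http://xmlns.com/foaf/0.1/workplaceHomepage",
            "@type": "@id",
        },
        "Person": "http://xmlns.com/foaf/0.1/Person",
    },
    "@id": "https://me.example.com",
    "@type": "Person",
    "name": "John Smith",
    "homepage": "https://www.example.com/",
}


def to_triples(node):
    """Yield (subject, predicate, object) tuples for a flat JSON-LD node."""
    context = node["@context"]
    subject = node["@id"]
    # @type becomes an rdf:type triple whose object is the expanded class IRI
    yield (subject, RDF_TYPE, context[node["@type"]])
    for term, value in node.items():
        if term.startswith("@"):
            continue  # keywords were handled above
        definition = context[term]
        predicate = definition["@id"] if isinstance(definition, dict) else definition
        yield (subject, predicate, value)


triples = list(to_triples(node))
for s, p, o in triples:
    print(s, p, o)
```

In this example the object of the `homepage` triple is another resource, because the context declares `"@type": "@id"` for that term, while the object of the `name` triple is a literal. The sketch does not mark that difference in its output.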

Overview protocols

In [2]:
import requests
from lxml import etree
#SRU searchRetrieve request: look up one record by barcode and return it as MARCXML
sru_url = ("https://obv-at-oenb.alma.exlibrisgroup.com/view/sru/43ACC_ONB"
           "?version=1.2&query=alma.barcode=%2BZ199052304"
           "&startRecord=0&maximumRecords=1"
           "&operation=searchRetrieve&recordSchema=marcxml")
cont = requests.get(sru_url).content
e = etree.XML(cont)
print(etree.tostring(e, encoding='unicode', pretty_print=True))
<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.2</version>
  <numberOfRecords>1</numberOfRecords>
  <records>
    <record>
      <recordSchema>marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nam a2200000 c 4500</leader>
          <controlfield tag="001">990030217420603338</controlfield>
          <controlfield tag="005">20180123084300.0</controlfield>
          <controlfield tag="007">cr#|||||||||||</controlfield>
          <controlfield tag="007">tu</controlfield>
          <controlfield tag="008">000101|1814####xx############|||#|#ger#u</controlfield>
          <controlfield tag="009">AC09865194</controlfield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">AC09865194</subfield>
          </datafield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">(Aleph)009871525ACC01</subfield>
          </datafield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">(DE-599)OBVAC09865194</subfield>
          </datafield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">(AT-OBV)AC09865194</subfield>
          </datafield>
          <datafield tag="035" ind1=" " ind2=" ">
            <subfield code="a">(EXLNZ-43ACC_NETWORK)990098715250203331</subfield>
          </datafield>
          <datafield tag="040" ind1=" " ind2=" ">
            <subfield code="a">ONB</subfield>
            <subfield code="b">ger</subfield>
            <subfield code="c">ONB-AK-RETRO</subfield>
            <subfield code="d">AT-OeNB</subfield>
            <subfield code="e">pi</subfield>
          </datafield>
          <datafield tag="041" ind1=" " ind2=" ">
            <subfield code="a">ger</subfield>
          </datafield>
          <datafield tag="044" ind1=" " ind2=" ">
            <subfield code="c">XA-DXDE</subfield>
          </datafield>
          <datafield tag="245" ind1="0" ind2="0">
            <subfield code="a">&lt;&lt;Die&gt;&gt; Flucht über den Rhein odar Das unverhoffte Wiedersehen</subfield>
            <subfield code="b">Ein erlustirend historisch-rührendes Familiengemälde mit Erscheinungen und vollstimmigen Chören von Baschkiren und Cosaken, und allen Batterien der Deutschen</subfield>
          </datafield>
          <datafield tag="264" ind1=" " ind2="1">
            <subfield code="a">[Meißen]</subfield>
            <subfield code="b">[Gödsche]</subfield>
            <subfield code="c">1814</subfield>
          </datafield>
          <datafield tag="300" ind1=" " ind2=" ">
            <subfield code="a">32 S.</subfield>
          </datafield>
          <datafield tag="689" ind1="0" ind2="0">
            <subfield code="a">Deutschland</subfield>
            <subfield code="D">g</subfield>
            <subfield code="0">(DE-588)4011882-4</subfield>
          </datafield>
          <datafield tag="689" ind1="0" ind2="1">
            <subfield code="a">Krieg</subfield>
            <subfield code="D">s</subfield>
            <subfield code="0">(DE-588)4033114-3</subfield>
          </datafield>
          <datafield tag="689" ind1="0" ind2="3">
            <subfield code="a">Belletristische Darstellung</subfield>
            <subfield code="A">f</subfield>
          </datafield>
          <datafield tag="689" ind1="0" ind2=" ">
            <subfield code="5">AT-OBV</subfield>
            <subfield code="5">ONB-AK</subfield>
          </datafield>
          <datafield tag="689" ind1="1" ind2="0">
            <subfield code="a">Drama</subfield>
            <subfield code="D">s</subfield>
            <subfield code="0">(DE-588)4012899-4</subfield>
          </datafield>
          <datafield tag="689" ind1="1" ind2="1">
            <subfield code="a">Deutsch</subfield>
            <subfield code="D">s</subfield>
            <subfield code="0">(DE-588)4113292-0</subfield>
          </datafield>
          <datafield tag="689" ind1="1" ind2=" ">
            <subfield code="5">AT-OBV</subfield>
            <subfield code="5">ONB-AK</subfield>
          </datafield>
          <datafield tag="710" ind1="2" ind2=" ">
            <subfield code="a">Goedsche, Friedrich Wilhelm</subfield>
            <subfield code="4">pbl</subfield>
          </datafield>
          <datafield tag="856" ind1="4" ind2=" ">
            <subfield code="u">http://data.onb.ac.at/imgk/AZ00308934SZ00220134SZ00628562</subfield>
            <subfield code="z">Zettel</subfield>
            <subfield code="o">Katalogkarte</subfield>
          </datafield>
          <datafield tag="856" ind1="4" ind2="0">
            <subfield code="m">V:AT-OBV;B:AT-OeNB</subfield>
            <subfield code="q">application/html</subfield>
            <subfield code="u">http://data.onb.ac.at/ABO/%2BZ182067107</subfield>
            <subfield code="x">ONB-ABO</subfield>
            <subfield code="3">Volltext</subfield>
            <subfield code="o">OBV-ONB-ABO</subfield>
          </datafield>
          <datafield tag="856" ind1="4" ind2="0">
            <subfield code="m">V:AT-OBV;B:AT-OeNB</subfield>
            <subfield code="q">application/html</subfield>
            <subfield code="u">http://data.onb.ac.at/ABO/%2BZ199052304</subfield>
            <subfield code="x">ONB-ABO</subfield>
            <subfield code="3">Volltext</subfield>
            <subfield code="o">OBV-ONB-ABO</subfield>
          </datafield>
          <datafield tag="974" ind1="0" ind2="s">
            <subfield code="V">029</subfield>
            <subfield code="a">LZ01187985</subfield>
          </datafield>
          <datafield tag="974" ind1="0" ind2="s">
            <subfield code="F">030</subfield>
            <subfield code="A">u|1uf||||||37</subfield>
          </datafield>
          <datafield tag="974" ind1="0" ind2="s">
            <subfield code="F">050</subfield>
            <subfield code="A">a|a|||||g|||||</subfield>
          </datafield>
          <datafield tag="974" ind1="0" ind2="s">
            <subfield code="F">051</subfield>
            <subfield code="A">m|||||||</subfield>
          </datafield>
          <datafield tag="980" ind1="0" ind2=" ">
            <subfield code="a">0</subfield>
            <subfield code="9">LOCAL</subfield>
          </datafield>
          <datafield tag="980" ind1="0" ind2=" ">
            <subfield code="a">ONB-AK-RETRO</subfield>
            <subfield code="9">LOCAL</subfield>
          </datafield>
          <datafield tag="982" ind1=" " ind2=" ">
            <subfield code="f">Drama</subfield>
            <subfield code="9">LOCAL</subfield>
          </datafield>
          <datafield tag="982" ind1=" " ind2=" ">
            <subfield code="f">Dramen / deutsche / 19. Jh.</subfield>
            <subfield code="9">LOCAL</subfield>
          </datafield>
          <datafield tag="AVA" ind1=" " ind2=" ">
            <subfield code="0">990030217420603338</subfield>
            <subfield code="8">22288570940003338</subfield>
            <subfield code="a">43ACC_ONB</subfield>
            <subfield code="b">ZALT</subfield>
            <subfield code="c">State Hall at Josefsplatz</subfield>
            <subfield code="d">80.J.58</subfield>
            <subfield code="e">available</subfield>
            <subfield code="f">1</subfield>
            <subfield code="g">0</subfield>
            <subfield code="i">ONB</subfield>
            <subfield code="j">PRUNK</subfield>
            <subfield code="p">1</subfield>
            <subfield code="q">Department of Manuscripts and Rare Books (ALT)</subfield>
          </datafield>
          <datafield tag="AVA" ind1=" " ind2=" ">
            <subfield code="0">990030217420603338</subfield>
            <subfield code="8">22288570920003338</subfield>
            <subfield code="a">43ACC_ONB</subfield>
            <subfield code="b">ZFID</subfield>
            <subfield code="c">Bildarchiv und Grafiksammlung</subfield>
            <subfield code="d">288765-B</subfield>
            <subfield code="e">available</subfield>
            <subfield code="f">1</subfield>
            <subfield code="g">0</subfield>
            <subfield code="i">ONB</subfield>
            <subfield code="j">MAG</subfield>
            <subfield code="p">2</subfield>
            <subfield code="q">Picture Archives and Graphics Department (FID)</subfield>
          </datafield>
        </record>
      </recordData>
      <recordIdentifier>990030217420603338</recordIdentifier>
      <recordPosition>0</recordPosition>
    </record>
  </records>
  <extraResponseData xmlns:xb="http://www.exlibris.com/repository/search/xmlbeans/">
    <xb:exact>true</xb:exact>
    <xb:responseDate>2019-05-02T16:19:32+0200</xb:responseDate>
  </extraResponseData>
</searchRetrieveResponse>
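Once a response like the one above has been retrieved, individual MARC fields can be picked out with namespace-aware paths. The sketch below is one possible way to do this, using the standard library's `xml.etree.ElementTree` on a hand-shortened copy of the response so that it runs without network access. The helper `subfields` is a name introduced here for illustration; the `find` and `findall` calls in lxml's `etree` accept the same kind of path and namespace mapping.

```python
import xml.etree.ElementTree as ET

# Namespaces as they appear in the response above
NS = {
    "srw": "http://www.loc.gov/zing/srw/",
    "marc": "http://www.loc.gov/MARC21/slim",
}

# Trimmed, hand-shortened copy of the response above (not a full record)
sample = """<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <numberOfRecords>1</numberOfRecords>
  <records>
    <record>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <controlfield tag="009">AC09865194</controlfield>
          <datafield tag="264" ind1=" " ind2="1">
            <subfield code="a">[Meißen]</subfield>
            <subfield code="c">1814</subfield>
          </datafield>
          <datafield tag="300" ind1=" " ind2=" ">
            <subfield code="a">32 S.</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [sf.text for sf in record.findall(path, NS)]


root = ET.fromstring(sample)
hits = int(root.findtext("srw:numberOfRecords", namespaces=NS))
marc_records = root.findall(".//srw:recordData/marc:record", NS)
for rec in marc_records:
    print(subfields(rec, "264", "a"), subfields(rec, "264", "c"))
```

The helper returns a list because MARC datafields and subfields are repeatable, as the several 035 and 856 fields in the full response show.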

  • OAI-PMH
    • OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is a protocol for harvesting metadata records from repositories
    • 6 verbs
      • GetRecord – Used to retrieve an individual metadata record.
      • Identify – Used to retrieve repository information (e.g. name, version).
      • ListIdentifiers – Used to retrieve only record headers, without the metadata.
      • ListMetadataFormats – Used to retrieve the available metadata formats.
      • ListRecords – Used to retrieve the actual metadata records.
      • ListSets – Used to retrieve the set structure of a repository.
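Each of the six verbs is sent as an ordinary HTTP request with a `verb` parameter plus verb-specific arguments. The sketch below builds such URLs for the ONB endpoint used in this block without sending them. `oai_url` is a helper invented for this example, the record identifier is a made-up placeholder, and whether a given set can actually be delivered as `oai_dc` by this server is an assumption (the protocol itself requires every repository to support `oai_dc`).

```python
from urllib.parse import urlencode

BASE_URL = "https://obv-at-oenb.alma.exlibrisgroup.com/view/oai/43ACC_ONB/request"


def oai_url(verb, **arguments):
    """Return the GET URL for an OAI-PMH request with the given verb."""
    return BASE_URL + "?" + urlencode({"verb": verb, **arguments})


# Requests that need no further arguments
identify_url = oai_url("Identify")
list_sets_url = oai_url("ListSets")

# Harvesting requests need a metadataPrefix; `set` restricts them to one set
list_records_url = oai_url("ListRecords", metadataPrefix="oai_dc", set="ABODC")

# GetRecord needs an identifier; this one is a made-up placeholder
get_record_url = oai_url(
    "GetRecord", identifier="oai:example.org:123", metadataPrefix="oai_dc"
)

for url in (identify_url, list_sets_url, list_records_url, get_record_url):
    print(url)
```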
In [3]:
from sickle import Sickle
sickle = Sickle('https://obv-at-oenb.alma.exlibrisgroup.com/view/oai/43ACC_ONB/request')
oai_sets = sickle.ListSets()
for oai_set in oai_sets:
    print('setSpec value for selective harvesting: ' + oai_set.setSpec)
    print('Name of the set (setName): ' + oai_set.setName + '\n')
setSpec value for selective harvesting: PAPYRUSDC
Name of the set (setName): Papyri records in DC simple

setSpec value for selective harvesting: FULLMARC
Name of the set (setName): Complete set of ONB records in MARC

setSpec value for selective harvesting: HANNAMARC
Name of the set (setName): HANNA records in MARC

setSpec value for selective harvesting: ESPERANTOMARC
Name of the set (setName): Esperanto records in MARC

setSpec value for selective harvesting: ESPERANTODC
Name of the set (setName): Esperanto Records in DC simple

setSpec value for selective harvesting: PAPYRUSMARC
Name of the set (setName): Papyri records in MARC

setSpec value for selective harvesting: HANNADC
Name of the set (setName): HANNA records in DC simple

setSpec value for selective harvesting: ABODC
Name of the set (setName): Austrian Books Online in DC simple

setSpec value for selective harvesting: ARIADNEDC
Name of the set (setName): Ariadne records in DC simple

setSpec value for selective harvesting: ARIADNEMARC
Name of the set (setName): Ariadne records in MARC

setSpec value for selective harvesting: MAPMARC
Name of the set (setName): Maps and Globes records in MARC

setSpec value for selective harvesting: FULLDC
Name of the set (setName): Complete set of ONB records in DC simple

setSpec value for selective harvesting: MAPDC
Name of the set (setName): Maps and Globes records in DC simple

setSpec value for selective harvesting: ABOMARC
Name of the set (setName): Austrian Books Online in MARC

setSpec value for selective harvesting: OAIBIBLIOA
Name of the set (setName): Austrian Bibliography A

setSpec value for selective harvesting: MUSHANDC
Name of the set (setName): Musikhandschriften in DC

setSpec value for selective harvesting: MUSHANMARC
Name of the set (setName): Music Manuscripts

setSpec value for selective harvesting: CERLMARC
Name of the set (setName): Old prints and manuscripts for CERL portal
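Sickle hides the XML that the repository returns. As a rough picture of what a ListSets response looks like underneath, the sketch below parses a hand-written sample with the standard library. The sample follows the OAI-PMH 2.0 response layout and reuses three of the set names printed above, but it is not a captured server response. Selecting the Dublin Core sets by their `DC` suffix relies only on the naming pattern visible in the output.

```python
import xml.etree.ElementTree as ET

OAI = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample shaped like a ListSets response; values from the output above
sample = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>PAPYRUSDC</setSpec>
      <setName>Papyri records in DC simple</setName>
    </set>
    <set>
      <setSpec>ABOMARC</setSpec>
      <setName>Austrian Books Online in MARC</setName>
    </set>
    <set>
      <setSpec>ABODC</setSpec>
      <setName>Austrian Books Online in DC simple</setName>
    </set>
  </ListSets>
</OAI-PMH>"""

root = ET.fromstring(sample)
sets = {
    s.findtext("oai:setSpec", namespaces=OAI): s.findtext("oai:setName", namespaces=OAI)
    for s in root.findall("oai:ListSets/oai:set", OAI)
}

# Keep only the Dublin Core sets, judged by the naming pattern seen in the output
dc_sets = sorted(spec for spec in sets if spec.endswith("DC"))
print(dc_sets)
```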

  • SPARQL
    • Query language for the semantic web
    • W3C Recommendation since 2008
    • SPARQL 1.1 W3C Recommendation since 2013
  • 4 Types of SPARQL Queries
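    • SELECT: return the values bound to the selected variables (the only form used here)
    • CONSTRUCT: build a new RDF graph from the matches
    • ASK: return a Boolean (true/false) telling whether the pattern matches
    • DESCRIBE: return an RDF description of a resource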
  • Declare prefix shortcuts (optional)
    • PREFIX foo: <...>
  • Query result clause
    • SELECT ...
  • Define the dataset (optional)
    • FROM <...>
  • Query pattern
    • WHERE { ... }
  • Query modifiers (optional):
    • GROUP BY ...
    • HAVING
    • ORDER BY
    • LIMIT
    • OFFSET
    • VALUES

Fun with SPARQL

special thanks to Matthias Schlögl

In [4]:
from rdflib import Graph
import pandas as pd
In [5]:
#authority file Goethe: https://d-nb.info/gnd/118540238
goethe_rdf = "http://d-nb.info/gnd/118540238/about/lds.rdf"
#authority file Kreisky: https://d-nb.info/gnd/118566512
kreisky_rdf = "https://d-nb.info/gnd/118566512/about/lds.rdf"
#subject heading 'Medizin': https://d-nb.info/gnd/4038243-6
medicine_rdf = "https://d-nb.info/gnd/4038243-6/about/lds.rdf"

#list all triples in authority file Goethe
g=Graph()
g.parse(goethe_rdf)
properties = g.query('''
   SELECT ?s ?p ?o 
   WHERE {
      ?s ?p ?o .
   }
''')
df_goethe =  pd.DataFrame(properties)
df_goethe
Out[5]:
0 1 2
0 Nfba28132f47348deb35e068e321cda07 http://d-nb.info/standards/elementset/gnd#fore... Iogann Vol'fgang
1 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/dnb#depr... http://d-nb.info/gnd/1032060956
2 N750cd89e37414c83a6d96634b9560c51 http://d-nb.info/standards/elementset/gnd#fore... Johanas Volfgangas
3 N98bee7cb5ea94557b6c7faaeabd3b355 http://d-nb.info/standards/elementset/gnd#fore... Wolfgang
4 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Goethe, Johann W. von
5 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/agrelon#... http://d-nb.info/gnd/118540246
6 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Nddc10440fa174cf5a779ba1d179ca5eb
7 N64c11db427fa4d32bc32c9f230a8b097 http://d-nb.info/standards/elementset/gnd#fore... J. W.
8 N5aa3f9a945194cfeb8bd73f5e3f50803 http://d-nb.info/standards/elementset/gnd#surname Goethe
9 Ne98739e792274a4d946e47da2570efc3 http://d-nb.info/standards/elementset/gnd#surname Gete
10 Nbd9ec60887f54cd6a83fa599b5e825d0 http://d-nb.info/standards/elementset/gnd#prefix von
11 N87c2119d4ca8445eaeb2a7a43097652f http://d-nb.info/standards/elementset/gnd#surname Goethe
12 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Gete, Yôhân Wôlfgang fôn
13 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Gūta
14 Ne8534290cbbc4400a2f80acf8026ccc0 http://d-nb.info/standards/elementset/gnd#surname Gete
15 Ndef0774077874c35ab747adfebc437b7 http://d-nb.info/standards/elementset/gnd#surname Göthe
16 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Yo han Bol peu gang pon Goe te
17 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Gede
18 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Goethe, Iohan Wolphgang
19 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N7cc2891306964755b9bdf8d72a0be010
20 Nbd9ec60887f54cd6a83fa599b5e825d0 http://d-nb.info/standards/elementset/gnd#surname Goethe
21 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N0ba3b1ff5d8840c2b0c39f267a33f961
22 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Nb86a22eab3a54c2bbae79f6490bf7965
23 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Gete, J. V.
24 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Goethe, J.W.
25 N6cc402baba7f4933a1c7c045eee6db86 http://d-nb.info/standards/elementset/gnd#prefix von
26 Nbbc98019cfaf4d468c0d241254425d58 http://d-nb.info/standards/elementset/gnd#surname Ǧūta
27 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#gndS... http://d-nb.info/standards/vocab/gnd/gnd-sc#15.1p
28 N8230d88be63b4cacb8b53461b2059296 http://d-nb.info/standards/elementset/gnd#pers... Goythe
29 N227251f017f1402eb80c36168c6e0fbd http://d-nb.info/standards/elementset/gnd#surname Goethe
... ... ... ...
677 N82ac34a3812043b887008f7a4349a801 http://d-nb.info/standards/elementset/gnd#fore... Ëhan Vol'fhanh
678 N48d55032a5fe47d49be99e21d1640724 http://d-nb.info/standards/elementset/gnd#fore... יוהן וולפגנג פון
679 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N49ea00d93bfb4da7a4d1dd3ced77d981
680 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N3696cdc169ec4f06bf57c62a09f20ff4
681 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N04e0413eb907433480aa5dc7bb884a22
682 N20404e7f2b184bc393a43b3b8b0b0dbd http://d-nb.info/standards/elementset/gnd#fore... Yūhān Fūlfġānġ fūn
683 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Goethe, Johan Wolfgang von
684 N4762d97eff1447eb8bc6d8582d4a9d8f http://d-nb.info/standards/elementset/gnd#surname Gete
685 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Göthe, J. W. von
686 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Nae059575f17e4ae9adb9581dd50837d6
687 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Höte, Iohann Volfqanq
688 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N99eb032a0a274352ae4bc3a7c5604b09
689 N88c18f57c143403db783c546e5572a23 http://d-nb.info/standards/elementset/gnd#fore... J. W.
690 N75618c0e7c024fba85d32d5300e88de1 http://d-nb.info/standards/elementset/gnd#surname Gyot'e
691 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Nfba28132f47348deb35e068e321cda07
692 N86494671df53426cbf10a85917d1d3c0 http://d-nb.info/standards/elementset/gnd#fore... Johan W.
693 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N9aa0135db91a48239019b7a693c48e0b
694 N6cc402baba7f4933a1c7c045eee6db86 http://d-nb.info/standards/elementset/gnd#surname Goethe
695 http://d-nb.info/gnd/118540238 http://www.w3.org/2002/07/owl#sameAs http://d-nb.info/gnd/1014927390
696 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#rela... http://d-nb.info/gnd/1085154025
697 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N092464b644ac4980bf81e491b12c19d2
698 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Goethe, Giov. L.
699 Nbc30a63ca0994cfc98f3d6cdf5a521e0 http://d-nb.info/standards/elementset/gnd#surname Gete
700 Nef12db51c4724ef0b5f370e73f25c37c http://d-nb.info/standards/elementset/gnd#fore... Jochann Volfgang
701 Nf2095e9da8df4496bebd3b0c33d39cb2 http://d-nb.info/standards/elementset/gnd#pers... Goet'e
702 Nf65c28c1faca4efd9c5cbc0daa3cfe5e http://d-nb.info/standards/elementset/gnd#prefix fūn
703 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Gūta, Yūhān Wulfgāng fūn
704 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... Gót
705 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#gndS... http://d-nb.info/standards/vocab/gnd/gnd-sc#12.2p
706 http://d-nb.info/gnd/118540238 http://d-nb.info/standards/elementset/gnd#vari... N1d607f295a79421bb9b8427a0e91a96e

707 rows × 3 columns

In [6]:
#list all triples in subject heading 'Medizin'
g=Graph()
g.parse(medicine_rdf)
properties = g.query('''
   SELECT ?s ?p ?o 
   WHERE {
      ?s ?p ?o .
   }
''')
df_medicine =  pd.DataFrame(properties)
df_medicine
Out[6]:
0 1 2
0 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#oldA... (DE-588c)4038243-6
1 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#rela... http://dewey.info/class/610/
2 http://d-nb.info/gnd/040382435/about http://purl.org/dc/terms/license http://creativecommons.org/publicdomain/zero/1.0/
3 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#vari... Heilkunst
4 http://d-nb.info/gnd/4038243-6 http://www.w3.org/2007/05/powder-s#describedby http://d-nb.info/gnd/040382435/about
5 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#pref... Medizin
6 http://d-nb.info/gnd/4038243-6 http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://d-nb.info/standards/elementset/gnd#Subj...
7 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#gndI... 4038243-6
8 http://d-nb.info/gnd/4038243-6 http://www.w3.org/2002/07/owl#sameAs http://www.wikidata.org/entity/Q11190
9 http://d-nb.info/gnd/040382435/about http://purl.org/dc/terms/modified 2017-01-25T16:58:26
10 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#vari... Medicine
11 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#gndS... http://d-nb.info/standards/vocab/gnd/gnd-sc#27.1a
12 http://d-nb.info/gnd/4038243-6 http://d-nb.info/standards/elementset/gnd#vari... Humanmedizin
13 http://d-nb.info/gnd/4038243-6 http://www.w3.org/2004/02/skos/core#exactMatch http://zbw.eu/stw/descriptor/15658-5
In [7]:
#list all distinct predicates in subject heading 'Medizin'
g=Graph()
g.parse(medicine_rdf)
properties = g.query('''
   SELECT DISTINCT ?p 
   #SELECT (COUNT(DISTINCT ?p) as ?cnt)
   WHERE {
      ?s ?p ?o .
   }
''')
df_medicine =  pd.DataFrame(properties)
df_medicine
Out[7]:
0
0 http://d-nb.info/standards/elementset/gnd#vari...
1 http://www.w3.org/1999/02/22-rdf-syntax-ns#type
2 http://d-nb.info/standards/elementset/gnd#gndI...
3 http://purl.org/dc/terms/license
4 http://www.w3.org/2002/07/owl#sameAs
5 http://d-nb.info/standards/elementset/gnd#oldA...
6 http://www.w3.org/2007/05/powder-s#describedby
7 http://d-nb.info/standards/elementset/gnd#rela...
8 http://d-nb.info/standards/elementset/gnd#gndS...
9 http://purl.org/dc/terms/modified
10 http://d-nb.info/standards/elementset/gnd#pref...
11 http://www.w3.org/2004/02/skos/core#exactMatch
In [8]:
#list all variantNameForTheSubjectHeading 'Medizin'
g=Graph()
g.parse(medicine_rdf)
properties = g.query('''
   PREFIX gndo: <http://d-nb.info/standards/elementset/gnd#>
   SELECT ?o 
   WHERE {
      ?s gndo:variantNameForTheSubjectHeading ?o .
      #?s gndo:preferredNameForTheSubjectHeading ?o .
   }
''')
df_medicine =  pd.DataFrame(properties)
df_medicine
Out[8]:
0
0 Medicine
1 Heilkunst
2 Humanmedizin
In [9]:
#count how often each object occurs in authority file Kreisky
g=Graph()
g.parse(kreisky_rdf)
properties = g.query('''
   SELECT ?o (COUNT(*) AS ?cnt) {
      ?s ?p ?o .
   } GROUP BY ?o ORDER BY DESC(?cnt)
''')
df_kreisky =  pd.DataFrame(properties)
df_kreisky
Out[9]:
0 1
0 http://d-nb.info/gnd/4066009-6 2
1 (DE-588c)4032993-8 1
2 Bundeskanzler 1970-1983 1
3 http://dbpedia.org/resource/Bruno_Kreisky 1
4 118566512 1
5 http://d-nb.info/standards/elementset/gnd#Diff... 1
6 2016-12-16T22:40:25 1
7 http://id.loc.gov/authorities/n50043948 1
8 http://www.filmportal.de/person/5B113A52F8F14A... 1
9 Politiker, Oesterreich 1
10 http://d-nb.info/standards/vocab/gnd/gnd-sc#16.5p 1
11 http://d-nb.info/standards/vocab/gnd/gender#male 1
12 http://www.wikidata.org/entity/Q44517 1
13 (DE-588a)118566512 1
14 https://de.wikipedia.org/wiki/Bruno_Kreisky 1
15 Kreisky, Bruno 1
16 Kreisky 1
17 http://creativecommons.org/publicdomain/zero/1.0/ 1
18 http://d-nb.info/standards/vocab/gnd/geographi... 1
19 http://d-nb.info/gnd/4046517-2 1
20 http://www.isni.org/0000000112608767 1
21 Bruno 1
22 1911-01-22 1
23 http://d-nb.info/gnd/118566512/about 1
24 1990-07-29 1
25 http://viaf.org/viaf/31998484 1
26 http://d-nb.info/gnd/2029382-3 1
27 N7c57750704364dbdbd89505b49c6bd9a 1
28 http://d-nb.info/gnd/121036073 1
In [10]:
#count how often each predicate occurs in authority file Kreisky
g=Graph()
g.parse(kreisky_rdf)
properties = g.query('''
   SELECT ?p (COUNT(*) AS ?cnt) {
      ?s ?p ?o .
   } GROUP BY ?p ORDER BY DESC(?cnt)
''')
df_kreisky =  pd.DataFrame(properties)
df_kreisky
Out[10]:
0 1
0 http://www.w3.org/2002/07/owl#sameAs 6
1 http://d-nb.info/standards/elementset/gnd#biog... 2
2 http://d-nb.info/standards/elementset/gnd#oldA... 2
3 http://d-nb.info/standards/elementset/gnd#gndS... 1
4 http://d-nb.info/standards/elementset/gnd#fore... 1
5 http://purl.org/dc/terms/license 1
6 http://d-nb.info/standards/elementset/gnd#prof... 1
7 http://www.w3.org/2007/05/powder-s#describedby 1
8 http://d-nb.info/standards/elementset/gnd#geog... 1
9 http://d-nb.info/standards/elementset/gnd#affi... 1
10 http://xmlns.com/foaf/0.1/page 1
11 http://d-nb.info/standards/elementset/gnd#pref... 1
12 http://d-nb.info/standards/elementset/gnd#fami... 1
13 http://d-nb.info/standards/elementset/gnd#date... 1
14 http://d-nb.info/standards/elementset/gnd#plac... 1
15 http://purl.org/dc/terms/modified 1
16 http://d-nb.info/standards/elementset/gnd#surname 1
17 http://d-nb.info/standards/elementset/gnd#plac... 1
18 http://d-nb.info/standards/elementset/gnd#gender 1
19 http://d-nb.info/standards/elementset/gnd#date... 1
20 http://d-nb.info/standards/elementset/gnd#pref... 1
21 http://d-nb.info/standards/elementset/gnd#gndI... 1
22 http://www.w3.org/1999/02/22-rdf-syntax-ns#type 1
In [11]:
#list all owl:sameAs links in authority file Kreisky
g=Graph()
g.parse(kreisky_rdf)
properties = g.query('''
   PREFIX owl: <http://www.w3.org/2002/07/owl#>
   SELECT ?o 
   WHERE {
      ?s owl:sameAs ?o .
   }
''')
df_kreisky =  pd.DataFrame(properties)
df_kreisky
Out[11]:
0
0 http://www.filmportal.de/person/5B113A52F8F14A...
1 http://www.isni.org/0000000112608767
2 http://www.wikidata.org/entity/Q44517
3 http://viaf.org/viaf/31998484
4 http://id.loc.gov/authorities/n50043948
5 http://dbpedia.org/resource/Bruno_Kreisky