Planned Languages
Info
Data Set created: December 2019
Data Set updated: 25 November 2024
Content
This data set contains:
◦ 2.818 IIIF
◦ Metadata for those objects as a .csv dump file
Data Provenance & Quality
The objects in this data set were digitized by the digitization department of the ONB (Österreichische Nationalbibliothek).
The metadata is based on the library catalogue and contains 2.818 records, one for each object held in the collection of the Department of Planned Languages at the ONB. For more information on the data points and their respective contents, check the Readme linked below.
The data set has not been cleaned, augmented or corrected. It does not represent the entirety of the collection at the Department of Planned Languages, but only an excerpt of it (see Historical Background below). When the data set was originally compiled in 2018, only objects that had already been both catalogued and digitized were selected.
The data set comprises 40 languages.
Historical Background
The Department of Planned Languages has existed under this name since 1990. However, in 1927 Hugo Steiner founded the International Esperanto Museum, which has been a part of the ONB (Österreichische Nationalbibliothek).
Maintenance
This data set may be updated at irregular intervals.
- 2.818 IIIF Manifests
- 310.217 Images
- 2.818 Documents

Detail from the cover of Harald Clegg: Esperanto. The Why and The What (London 1906) [ONB call number: 701561-A ESP MAG].
Data
Download and Access Options
Sample
- 5 selected documents from the collection with metadata records
- 296 .jpg files
- Readme explanation of properties used
- .zip archive (242 MB)
Metadata Records
- Bibliographic metadata for all documents
- Readme explanation of properties used
- .csv table (2.9 MB)
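As a minimal sketch of how the metadata table might be explored, the following Python snippet counts how often each value occurs in one column of a CSV file. The column names `identifier` and `language` and the comma delimiter are assumptions made for illustration; the actual properties are documented in the Readme and may differ.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count how often each value occurs in one column of a CSV file object."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Missing or empty cells are grouped under a placeholder key.
        value = (row.get(column) or "").strip() or "(empty)"
        counts[value] += 1
    return counts


# Tiny in-memory stand-in for the real dump; the column names are hypothetical.
sample = io.StringIO(
    "identifier,language\n"
    "doc1,epo\n"
    "doc2,epo\n"
    "doc3,ido\n"
    "doc4,\n"
)
language_counts = count_by_column(sample, "language")
for language, n in language_counts.most_common():
    print(language, n)
```

For the downloaded file, the same function could be called on a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points to the local copy of the .csv table.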
IIIF Collection
- IIIF collection with URLs to all 310.217 images
- 2.818 documents with metadata
- .json file format
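A IIIF collection is a JSON document that lists manifests, and each manifest in turn describes the images of one document. The sketch below collects the manifest URLs from a parsed collection. Which version of the IIIF Presentation API the collection follows is not stated here, so the function checks both the 2.x layout (`manifests` with `@id`) and the 3.0 layout (`items` with `id` and `type`). The example.org URLs are placeholders, not real addresses from this data set.

```python
import json


def manifest_urls(collection):
    """Return the manifest URLs listed in a parsed IIIF collection document."""
    urls = []
    # Presentation API 2.x lists manifests under "manifests" with "@id".
    for entry in collection.get("manifests", []):
        if "@id" in entry:
            urls.append(entry["@id"])
    # Presentation API 3.0 lists them under "items" with "id" and "type".
    for entry in collection.get("items", []):
        if entry.get("type") == "Manifest" and "id" in entry:
            urls.append(entry["id"])
    return urls


# Hypothetical miniature collections in both layouts.
sample_v2 = json.loads("""
{
  "@type": "sc:Collection",
  "manifests": [
    {"@id": "https://example.org/iiif/doc1/manifest", "@type": "sc:Manifest"},
    {"@id": "https://example.org/iiif/doc2/manifest", "@type": "sc:Manifest"}
  ]
}
""")
sample_v3 = {
    "type": "Collection",
    "items": [
        {"id": "https://example.org/iiif/doc3/manifest", "type": "Manifest"},
        {"id": "https://example.org/iiif/sub", "type": "Collection"},
    ],
}

print(manifest_urls(sample_v2))
print(manifest_urls(sample_v3))
```

Nested collections are skipped in this sketch; a fuller version would follow them recursively.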
OAI-PMH Data Set
- Metadata from the library catalogue
- OAI-PMH standard for data harvesting
- .xml file format (MARC21)
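Harvesting over OAI-PMH typically means sending repeated `ListRecords` requests and following the `resumptionToken` returned with each partial response. The sketch below only builds the request URLs and extracts the token from a response; it sends nothing over the network. The base URL, the metadata prefix `marc21` and the set name `demo` are assumptions for illustration; the real values would have to be taken from the endpoint itself (for example via `ListMetadataFormats` and `ListSets`).

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        if not metadata_prefix:
            raise ValueError("metadata_prefix is required for the first request")
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def next_token(response_xml):
    """Return the resumption token of a response, or None if the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


# Hypothetical endpoint and a shortened, made-up response.
first_url = list_records_url("https://example.org/oai",
                             metadata_prefix="marc21", set_spec="demo")
sample_response = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><resumptionToken>abc123</resumptionToken></ListRecords>"
    "</OAI-PMH>"
)
token = next_token(sample_response)
follow_up_url = list_records_url("https://example.org/oai", resumption_token=token)
print(first_url)
print(follow_up_url)
```

A harvesting loop would fetch each URL, store the MARC21 records it contains, and stop once `next_token` returns `None`.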
Use Cases
Areas of Application Related to the Data Set
Possible Uses
Since the contents of the data set are highly multilingual, the images could be used to train multilingual computer vision (CV) or optical character recognition (OCR) models.