Glossary

This table provides an overview of the technical terms and abbreviations used. It gives both the German and the English terms, together with their definitions, to ensure clear and consistent use of the terminology.

Term Definition

ABO

Austrian Books Online. Project (and associated collection) of the → ONB, in which it digitizes its historical book holdings in a Public Private Partnership with Google. See https://www.onb.ac.at/en/digital-offers/austrian-books-online

AC number

Austrian Central Catalog number. → OBV-wide identifier of the bibliographic metadata record after formal indexing; corresponds to an abstract description of all copies, so one AC number can correspond to several copies or barcodes

AI

Artificial Intelligence (AI) is a branch of computer science focused on creating systems and machines that simulate human intelligence, enabling them to perform tasks that typically require human cognition, such as understanding language, recognizing patterns, solving problems, and making decisions. AI encompasses various technologies, including machine learning, neural networks, and natural language processing, to create applications ranging from virtual assistants to autonomous robots

AKON

Ansichtskarten Online (picture postcards online). Picture postcard portal (and associated collection) of the → ONB; contains 75,000 digitized postcards from all parts of the world. See https://akon.onb.ac.at/

Alma

Name of the library software of the Ex Libris Group. The → ONB uses the software for its central library catalog. See https://exlibrisgroup.com/products/alma-library-services-platform/

ALTO (XML)

Analyzed Layout and Text Object. → XML standard for the precise description of documents. See https://www.loc.gov/standards/alto/

ANNO

Austrian Newspapers Online. Virtual newspaper reading room of the → ONB and associated collection, which include all digitized newspapers and periodicals. See https://anno.onb.ac.at/

API

Application Programming Interface. Part of a program that is made available by a software system to other programs in order to connect them to the system; APIs allow developers and users to create different complex functionalities more easily and enable the exchange of data between different systems; they are sometimes restricted, especially when offered by for-profit companies

Bibframe

Data model for bibliographic description; it is intended to reconcile the needs of those who record detailed bibliographic descriptions, the requirements of those who describe other cultural materials, and those who do not require such detailed descriptions. See also https://www.loc.gov/bibframe/

Blazegraph

Highly scalable graph database specifically designed for processing and querying → RDF data. It supports → SPARQL queries and offers powerful functions for handling large amounts of data in distributed environments. See https://blazegraph.com/

.bz2

File extension of bzip2, a free and open-source data compression program and associated file format; bzip2 only compresses individual files and is not a file archiver; for tasks such as processing multiple files, encryption and splitting archives, it relies on separate external utilities; particularly common in Unix and Linux environments
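
As an illustration, the single-stream compression described above can be sketched with the bz2 module from the Python standard library. The sample payload is made up for this example.

```python
import bz2

# Hypothetical sample payload; any bytes object works.
original = b"Austrian Newspapers Online\n" * 100

# bzip2 compresses one stream of bytes; it does not bundle several files.
compressed = bz2.compress(original)
restored = bz2.decompress(compressed)

print(len(original), len(compressed))
```

To bundle several files, the data is usually packed into an archive (for example tar) first and then compressed.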

Capture

Refers to the process of capturing data or information from various sources, such as images, texts or videos; this data is then digitized, stored and further processed in order to make it usable

CLARIAH-AT

Consortium of Austrian universities and research institutions that coordinates and promotes Austrian activities in the European → ESFRI research infrastructures → CLARIN and → DARIAH. See also https://www.clariah.at

CLARIN

Common Language Resources and Technology Infrastructure. European research infrastructure aimed at supporting the humanities and social sciences by providing access to language resources and tools; facilitates collaborative research by offering standardized data, tools, and services for linguistic and language-related studies. See also https://www.clarin.eu/

CQL

Contextual Query Language. Query language developed for accessing bibliographic and text-based databases; it combines human-readable syntax with powerful search capabilities to provide accurate and relevant search results; maintained by the Library of Congress. See https://www.loc.gov/standards/sru/cql/ and → SRU

Crawl

Expression that describes the automated process of systematically browsing and downloading website content, usually used for indexing and analysis

CSV

Comma Separated Values. File format that describes the structure of a text file with which simply structured data can be saved or exchanged; files in this format use the .csv extension, indicating the file type. See https://www.dnb.de/EN/Professionell/Metadatendienste/Exportformate/CSV/csv.html
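
A minimal sketch of reading such a file with the csv module from the Python standard library. The column names and rows are invented for this example, and the text is read from memory rather than from a .csv file.

```python
import csv
import io

# Hypothetical CSV content: a header row followed by two data rows.
sample = (
    "term,expansion\n"
    "OCR,Optical Character Recognition\n"
    "HTR,Handwritten Text Recognition\n"
)

# DictReader uses the first row as column names.
rows = list(csv.DictReader(io.StringIO(sample)))

for row in rows:
    print(row["term"], "=", row["expansion"])
```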

CV

Computer Vision is a field of artificial intelligence (AI) focused on enabling computers to interpret and understand visual information from the world, such as images or videos. By analyzing and processing visual data, computer vision systems can perform tasks like object detection, image classification, facial recognition, and scene reconstruction. It combines elements of machine learning, pattern recognition, and image processing to give machines the ability to "see" and make decisions based on visual inputs

DARIAH

Digital Research Infrastructure for the Arts and Humanities. European research infrastructure that supports digital scholarship in the arts and humanities; provides tools, services, and resources to enhance research, foster collaboration, and ensure the long-term preservation and accessibility of research data. See also https://www.dariah.eu/

Docker

Open source platform that enables software applications to be packaged, distributed and executed in isolated containers; these containers are independent, portable software packages and contain all the necessary components to run the application consistently on different environments. See https://www.docker.com/

EDM

Europeana Data Model. Standardized data model for cultural assets of different domains; is intended to go beyond domain-specific metadata standards and take into account the versatility of established community standards (such as → METS). See https://pro.europeana.eu/page/edm-documentation

ESFRI

European Strategy Forum on Research Infrastructures. Strategic body of the European Union that coordinates research infrastructures of European importance with the aim of promoting the development of and access to scientific infrastructure in order to sustainably strengthen European research and innovation

GeoNames

Freely usable geographical database. See https://www.geonames.org/

HTR

Handwritten Text Recognition (HTR) refers to the technology and process of automatically identifying and converting handwritten text into digital format. It uses artificial intelligence and machine learning models to recognize individual characters, words, or sentences in various handwriting styles. HTR is widely applied in document digitization, historical document preservation, and data entry automation to transform handwritten notes, forms, and archives into searchable, editable text

HTTP

Hypertext Transfer Protocol. Generally applicable technical standard that regulates how a website is transferred from a server to the browser of the requesting party or parties

IIIF

International Image Interoperability Framework. Standardizes the provision of images and audiovisual data from servers in different web environments. See https://iiif.io/
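
As a sketch of what this standardization looks like in practice, the IIIF Image API addresses an image through a fixed URL pattern: identifier, region, size, rotation, quality and format. The helper name, the base URL and the identifier below are hypothetical, and the defaults assume Image API 3.0 (where "max" requests the largest available size).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an Image API URL: {base}/{id}/{region}/{size}/{rotation}/{quality}.{format}."""
    # The identifier is percent-encoded so that slashes inside it
    # are not mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, used for illustration only.
print(iiif_image_url("https://example.org/iiif", "book/1"))
```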

JPEG

Joint Photographic Experts Group. Commonly used method of lossy compression for digital images, especially for images created by digital photography and associated file format; files in this format use the .jpeg or .jpg extensions and are commonly known by either abbreviation

JPEG2000

Joint Photographic Experts Group 2000. An advanced image compression standard that provides higher image quality and more flexibility than traditional → JPEG, using wavelet-based compression; supports features like lossless compression and progressive rendering, making it suitable for archival and professional imaging

JSON

JavaScript Object Notation. Text-based file format with the extension .json, used to store and exchange data; self-describing and easily machine- and human-readable. See https://www.w3schools.com/js/js_json_intro.asp and for a reference of the format https://datatracker.ietf.org/doc/html/rfc7159
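
A minimal sketch of storing and reading back data in this format with the json module from the Python standard library. The record and its field names are invented for this example.

```python
import json

# Hypothetical record; the keys are chosen for illustration only.
record = {"term": "IIIF", "see_also": ["https://iiif.io/"], "deprecated": False}

# Serialize to human-readable JSON text, then parse it back.
text = json.dumps(record, indent=2)
parsed = json.loads(text)

print(text)
```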

JSON-LD

JavaScript Object Notation for Linked Data. Format for representing structured data on the web that uses → JSON to enable linked data; files in this format have the file extension .jsonld and make it easier to share and link data between different systems by providing a human- and machine-readable method for embedding contextual information
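
The role of the embedded context can be sketched as follows: the "@context" maps a short key to a globally defined property. The document identifier is hypothetical, and the key expansion is a toy illustration only, not a conforming JSON-LD processor.

```python
import json

# Hypothetical JSON-LD document; "@context" maps the short key "title"
# to the Dublin Core Terms property for titles.
doc = {
    "@context": {"title": "http://purl.org/dc/terms/title"},
    "@id": "https://example.org/book/1",
    "title": "Example title",
}

# Toy expansion: replace each short key with the IRI from the context.
context = doc["@context"]
expanded = {context.get(key, key): value
            for key, value in doc.items() if key != "@context"}

print(json.dumps(expanded, indent=2))
```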

Jupyter Notebooks

Virtual notebooks for writing, demonstrating and executing → Python code examples; available in the file format .ipynb. See https://jupyter.org/try

LOD

Linked Open Data. Refers to data freely available on the Internet that is identified by a → URI and can therefore be retrieved directly via → HTTP; it is available under open licenses, which promotes the reuse and linking of information across different data sources, for example by means of → RDF

MEI

Music Encoding Initiative. An open-source framework for encoding and sharing music notation, metadata, and related information in a digital format; expressed as an → XML schema; enables researchers, musicians, and archivists to represent complex musical works with precision, supporting analysis, preservation, and interoperability

METS

Metadata Encoding and Transmission Standard. → XML standard for the structured storage and transfer of digital objects and their metadata, which is primarily used in libraries, archives and museums to manage digital collections and make them accessible in the long term

ML

Machine Learning (ML) is a branch of artificial intelligence (→ AI) focused on developing algorithms and statistical models that allow computers to learn from data and make decisions without being explicitly programmed for each specific task. ML is used in a wide range of applications, including recommendation systems, predictive analytics, speech recognition, and autonomous driving, where systems adapt and improve over time based on the data they process

MMS-ID

Metadata Management System Identifier. An internal identifier of the → OBV or the → ONB

OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting. Protocol for harvesting metadata from repositories; requests are sent as → HTTP requests and responses are returned as → XML, in which the metadata records are structured and presented
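
A minimal sketch of how such a request is formed: every OAI-PMH request is an HTTP request whose "verb" parameter names the operation. The base URL and the helper name are hypothetical; "ListRecords" and the metadata prefix "oai_dc" are defined by the protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used for illustration only.
BASE_URL = "https://example.org/oai"


def oai_request_url(verb, **arguments):
    """Build the URL of an OAI-PMH request for the given verb."""
    return f"{BASE_URL}?{urlencode({'verb': verb, **arguments})}"


# Ask for all records in unqualified Dublin Core.
url = oai_request_url("ListRecords", metadataPrefix="oai_dc")
print(url)
```

The response to such a request is an XML document that can be processed with any XML parser.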

OBV

Österreichischer Bibliothekenverbund (Austrian Library Network). Network of scientific and administrative libraries in Austria with 70 participants and over 90 institutions. See https://www.obvsg.at/bibliothekenverbund/grundlagen/

OCR

Optical Character Recognition. Electronic conversion of images with typewritten, handwritten or printed text into machine-encoded text, for example from a scanned document, a photo of a document or similar; research topic in the fields of → AI and → CV

ONB

Österreichische Nationalbibliothek (Austrian National Library). Austria's largest library and a cultural and research institution located in Vienna; founded in the 14th century, it holds millions of books, manuscripts, maps, photographs, and other historical documents, making it one of Europe's oldest and most comprehensive libraries; the ONB is dedicated to preserving Austria's cultural heritage and providing access to a wealth of historical and contemporary resources for research, education, and public engagement

ppi

pixels per inch. Unit of measurement for the pixel density of images; indicates the level of detail of the image
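
As a worked example, the pixel dimensions of a scan follow from the physical size multiplied by the ppi value. The function name and the sample sizes are chosen for illustration.

```python
def pixels_needed(width_in, height_in, ppi):
    """Return the (width, height) in pixels of a scan at the given ppi."""
    return round(width_in * ppi), round(height_in * ppi)


# A 4 x 6 inch postcard scanned at 300 ppi.
print(pixels_needed(4, 6, 300))
```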

Python

Name of a universal programming language that is characterized by good readability; a common file format for scripts in this language has the extension .py. See https://www.python.org/

RDA

Resource Description and Access. International cataloging standard; provides guidelines and best practices for creating consistent and accurate metadata that facilitates access to and discoverability of library and cultural heritage resources

RDF

Resource Description Framework. Approach on the Internet for formulating logical statements about arbitrary things; each statement follows the "triple" pattern: subject, predicate, object; also a data model that describes at a theoretical level how data is structured; enables Linked Open Data (→ LOD)
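
The triple pattern can be sketched with plain Python tuples. This is a toy model, not an RDF library; the "ex:" identifiers and the helper name are hypothetical, and the prefixed names stand in for full URIs.

```python
# Each statement is a (subject, predicate, object) tuple.
triples = {
    ("ex:book1", "dcterms:title", "Example title"),
    ("ex:book1", "dcterms:creator", "ex:author1"),
    ("ex:author1", "foaf:name", "Example Author"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the given parts; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# All statements about ex:book1.
for triple in match(s="ex:book1"):
    print(triple)
```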

Readme

Text document that provides basic information about a project, including its purpose, usage notes and installation instructions; serves as an introduction to help users understand and effectively use the software or code repository

Repo(sitory)

Central storage location where digital objects such as documents, data or software versions are stored, managed and made accessible; often used for archiving, sharing and long-term preservation of these objects

RESTful

Refers to systems or an → API that adhere to the principles of REST (Representational State Transfer), a software architectural style for designing scalable web services; a RESTful API uses standard → HTTP methods like GET, POST, PUT, and DELETE to perform operations on resources identified by URLs, making them simple, stateless, and interoperable

SACHA

Simple Access to Cultural Heritage Assets. Former name of the → IIIF interface of the → ONB. See https://iiif.onb.ac.at/

Seed

Metaphorical term of the Austrian Web Archive. What is harvested with a → Crawl must first be sown. Starting address for the crawler to begin a crawl (e.g. for a domain crawl, the start pages of all domains)

Semantic Web

Extension of the World Wide Web that enables computers to understand and process data based on its meaning; it uses standardized formats and ontologies to link information and thus enable more intelligent, context-based information processing

Sickle

Open source → Python library for harvesting metadata via → OAI-PMH; provides a lightweight client for sending OAI-PMH requests and iterating over the returned records. See https://pypi.org/project/Sickle/

SPARQL

SPARQL Protocol and → RDF Query Language. Query language for the → Semantic Web that can be used to search RDF data
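
A minimal sketch of a SPARQL query and of how it can be sent to an endpoint as an HTTP GET request with a "query" parameter, as the SPARQL protocol allows. The endpoint URL is hypothetical and the query is a generic example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint, used for illustration only.
ENDPOINT = "https://example.org/sparql"

# Select up to ten resources together with their Dublin Core titles.
query = """
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?book ?title
WHERE { ?book dcterms:title ?title . }
LIMIT 10
""".strip()

# The query text is percent-encoded into the "query" parameter.
url = f"{ENDPOINT}?{urlencode({'query': query})}"

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```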

SRU

Search and Retrieve via → URL. Standardized web service protocol that can be used to query databases on the Internet. The results can be provided in → XML. See https://www.loc.gov/standards/sru/index.html
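
As a sketch, an SRU search is an ordinary URL whose parameters carry the operation and a CQL query. The endpoint, the query and the chosen protocol version are assumptions made for this example; the parameter names come from the SRU specification.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint, used for illustration only.
SRU_ENDPOINT = "https://example.org/sru"

params = {
    "version": "1.2",
    "operation": "searchRetrieve",
    # CQL query: title contains "Wien" and date is later than 1900.
    "query": 'dc.title = "Wien" and dc.date > 1900',
    "maximumRecords": "10",
}

url = f"{SRU_ENDPOINT}?{urlencode(params)}"

# Decoding the URL again recovers the original CQL query.
decoded_query = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```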

Swagger

Framework for the description, creation and documentation of a → RESTful → API; enables developers to define API specifications in a standardized format that is both machine- and human-readable. See https://swagger.io/

TEI

Text Encoding Initiative. A framework for encoding textual data in digital form, using → XML-based guidelines tailored to humanities research; enables the detailed representation of texts, e.g. for digital editions, including structure, annotations, and metadata, e.g. for preservation, analysis, and sharing

TIFF

Tag(ged) Image File Format. Common file format with the file extension .tiff, used to store raster graphics and image information; suitable for storing high quality images

TXT

Text document with the file extension .txt that contains plain, unformatted text

URI

Uniform Resource Identifier. Character string used to uniquely identify a resource on the Internet, either by a location (→ URL) or by a name (URN, Uniform Resource Name). It consists of a scheme, such as http or ftp, followed by a path that describes the specific resource
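
The parts of a URI can be inspected with urllib.parse from the Python standard library. This sketch splits one URL taken from this glossary and one URN; the URN is an arbitrary example value.

```python
from urllib.parse import urlsplit

# A URL identifies a resource by its location.
url_parts = urlsplit("https://www.loc.gov/standards/sru/cql/")
print(url_parts.scheme, url_parts.netloc, url_parts.path)

# A URN identifies a resource by name; it has a scheme but no host.
urn_parts = urlsplit("urn:isbn:0451450523")
print(urn_parts.scheme, urn_parts.path)
```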

URL

Uniform Resource Locator. Unique identifier used to locate a resource on the Internet; also known as a web address; usually begins with http:// or https://

XML

eXtensible Markup Language. A text-based format used to store and transport structured data in a way that is both human-readable and machine-readable; allows users to define custom tags to describe the data, making it highly flexible for a wide range of applications such as web services, document storage, and data exchange
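
A minimal sketch of custom tags in use, parsed with xml.etree.ElementTree from the Python standard library. The element and attribute names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with self-defined tags and attributes.
xml_text = (
    "<glossary>"
    "<entry abbr='OCR'>Optical Character Recognition</entry>"
    "<entry abbr='HTR'>Handwritten Text Recognition</entry>"
    "</glossary>"
)

root = ET.fromstring(xml_text)

# Collect each entry's abbreviation and text content.
entries = {entry.get("abbr"): entry.text for entry in root.findall("entry")}
print(entries)
```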

Zip

Archive file format with the file extension .zip that supports lossless data compression; may contain one or more files or directories that may have been compressed
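
As an illustration, an archive with two entries can be written and read back with the zipfile module from the Python standard library. The file names and contents are made up, and the archive is kept in memory instead of being written to a .zip file.

```python
import io
import zipfile

buffer = io.BytesIO()

# Write two entries, one of them inside a directory, using DEFLATE compression.
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("readme.txt", "plain text")
    archive.writestr("data/terms.csv", "term\nOCR\n")

# Reopen the same bytes and list and read the contents.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    readme = archive.read("readme.txt").decode("utf-8")

print(names, readme)
```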