Esperanto Newspaper Excerpts

Info

Data Set created: April 2024

Data Set updated: 25 April 2024

Content

Images, text and metadata for a collection of historical newspaper excerpts (from the period 1898–1915), containing information about the Esperanto language and community.

Data Provenance and Quality

The data set contains high-quality images recently scanned at the Department of Digitization of the ONB (Österreichische Nationalbibliothek, Austria's largest library, located in Vienna), manually created metadata for each article (prepared ahead of the digitization process), and text files created in the CLARIAH-AT project Esperanto Newspaper Excerpts. CLARIAH-AT is a consortium of Austrian universities and research institutions. The generated text is of varying quality, as the data set contains a wide variety of scripts, layouts and languages.

Historical Background

Esperanto is a language designed to enable easy communication between people of different countries and cultures. It was launched in 1887 by Dr. Ludwik L. Zamenhof under the pseudonym “Doktoro Esperanto”, which literally means “Doctor Hoper”. The goal of Esperanto was to create an international auxiliary language that everyone on Earth could speak, facilitating global communication and understanding. With its regular grammar and structure, Esperanto has been used as a universal second language for international communication.

The “Hachette Collection” consists of 17,204 articles taken from newspapers published in many European countries in the period from 1898 until 1915. The articles are held at the Department of Planned Languages. They deal with Esperanto-themed events and persons, e.g., reports from the Esperanto World Congress, and as such offer a unique opportunity to study the history of the Esperanto movement in Europe in the early 20th century.

Maintenance

This data set may be updated at irregular intervals.

  • 17,204 Articles
  • 22,247 Images
  • 17,204 IIIF Manifests

Detail from an article (taken from the 13 August 1910 issue of the Washington Times) featuring an image of the founder of Esperanto, Ludwik L. Zamenhof

Reuse

Information on Rights and Reuse

Cite as

ONB Labs. “Data Set Esperanto Newspaper Excerpts.” ONB Labs. Apr 25, 2024. Accessed on Aug 21, 2025, https://labs.onb.ac.at/en/datasets/esperanto-excerpts/.

Preview

Documents in the Data Set

Data Set Items

Esperanto Newspaper Excerpts (152 entries)

Name | Entries
Esperanto Newspaper Excerpts Folder 1 | 91
Esperanto Newspaper Excerpts Folder 2 | 137
Esperanto Newspaper Excerpts Folder 3 | 80
Esperanto Newspaper Excerpts Folder 4 | 59
Esperanto Newspaper Excerpts Folder 5 | 31
Esperanto Newspaper Excerpts Folder 6 | 23
Esperanto Newspaper Excerpts Folder 7 | 58
Esperanto Newspaper Excerpts Folder 8 | 121
Esperanto Newspaper Excerpts Folder 9 | 129
Esperanto Newspaper Excerpts Folder 10 | 65
Esperanto Newspaper Excerpts Folder 11 | 133
Esperanto Newspaper Excerpts Folder 12 | 83
Esperanto Newspaper Excerpts Folder 13 | 39
Esperanto Newspaper Excerpts Folder 14 | 53
Esperanto Newspaper Excerpts Folder 15 | 101
Esperanto Newspaper Excerpts Folder 16 | 40
Esperanto Newspaper Excerpts Folder 17 | 118
Esperanto Newspaper Excerpts Folder 18 | 191
Esperanto Newspaper Excerpts Folder 19 | 30
Esperanto Newspaper Excerpts Folder 20 | 169
Esperanto Newspaper Excerpts Folder 21 | 149
Esperanto Newspaper Excerpts Folder 22 | 109
Esperanto Newspaper Excerpts Folder 23 | 142
Esperanto Newspaper Excerpts Folder 24 | 138
Esperanto Newspaper Excerpts Folder 25 | 172
Esperanto Newspaper Excerpts Folder 26 | 79
Esperanto Newspaper Excerpts Folder 27 | 152
Esperanto Newspaper Excerpts Folder 28 | 107
Esperanto Newspaper Excerpts Folder 29 | 137
Esperanto Newspaper Excerpts Folder 30 | 64
Esperanto Newspaper Excerpts Folder 31 | 117
Esperanto Newspaper Excerpts Folder 32 | 16
Esperanto Newspaper Excerpts Folder 33 | 76
Esperanto Newspaper Excerpts Folder 34 | 74
Esperanto Newspaper Excerpts Folder 35 | 76
Esperanto Newspaper Excerpts Folder 36 | 100
Esperanto Newspaper Excerpts Folder 37 | 101
Esperanto Newspaper Excerpts Folder 38 | 71
Esperanto Newspaper Excerpts Folder 39 | 77
Esperanto Newspaper Excerpts Folder 40 | 79
Esperanto Newspaper Excerpts Folder 41 | 108
Esperanto Newspaper Excerpts Folder 42 | 139
Esperanto Newspaper Excerpts Folder 43 | 120
Esperanto Newspaper Excerpts Folder 44 | 107
Esperanto Newspaper Excerpts Folder 45 | 173
Esperanto Newspaper Excerpts Folder 47 | 149
Esperanto Newspaper Excerpts Folder 48 | 76
Esperanto Newspaper Excerpts Folder 49 | 187
Esperanto Newspaper Excerpts Folder 50 | 177
Esperanto Newspaper Excerpts Folder 51 | 165
Esperanto Newspaper Excerpts Folder 52 | 213
Esperanto Newspaper Excerpts Folder 53 | 153
Esperanto Newspaper Excerpts Folder 54 | 153
Esperanto Newspaper Excerpts Folder 55 | 212
Esperanto Newspaper Excerpts Folder 56 | 224
Esperanto Newspaper Excerpts Folder 57 | 230
Esperanto Newspaper Excerpts Folder 58 | 182
Esperanto Newspaper Excerpts Folder 59 | 230
Esperanto Newspaper Excerpts Folder 60 | 162
Esperanto Newspaper Excerpts Folder 61 | 201
Esperanto Newspaper Excerpts Folder 62 | 234
Esperanto Newspaper Excerpts Folder 63 | 14
Esperanto Newspaper Excerpts Folder 64 | 210
Esperanto Newspaper Excerpts Folder 65 | 245
Esperanto Newspaper Excerpts Folder 66 | 88
Esperanto Newspaper Excerpts Folder 67 | 137
Esperanto Newspaper Excerpts Folder 68 | 161
Esperanto Newspaper Excerpts Folder 69 | 183
Esperanto Newspaper Excerpts Folder 70 | 189
Esperanto Newspaper Excerpts Folder 71 | 120
Esperanto Newspaper Excerpts Folder 72 | 150
Esperanto Newspaper Excerpts Folder 73 | 193
Esperanto Newspaper Excerpts Folder 74 | 99
Esperanto Newspaper Excerpts Folder 75 | 104
Esperanto Newspaper Excerpts Folder 76 | 152
Esperanto Newspaper Excerpts Folder 77 | 144
Esperanto Newspaper Excerpts Folder 78 | 106
Esperanto Newspaper Excerpts Folder 79 | 118
Esperanto Newspaper Excerpts Folder 80 | 117
Esperanto Newspaper Excerpts Folder 81 | 43
Esperanto Newspaper Excerpts Folder 82 | 105
Esperanto Newspaper Excerpts Folder 83 | 83
Esperanto Newspaper Excerpts Folder 84 | 98
Esperanto Newspaper Excerpts Folder 85 | 107
Esperanto Newspaper Excerpts Folder 86 | 72
Esperanto Newspaper Excerpts Folder 87 | 45
Esperanto Newspaper Excerpts Folder 88 | 113
Esperanto Newspaper Excerpts Folder 89 | 88
Esperanto Newspaper Excerpts Folder 90 | 119
Esperanto Newspaper Excerpts Folder 91 | 112
Esperanto Newspaper Excerpts Folder 92 | 96
Esperanto Newspaper Excerpts Folder 93 | 88
Esperanto Newspaper Excerpts Folder 94 | 120
Esperanto Newspaper Excerpts Folder 95 | 89
Esperanto Newspaper Excerpts Folder 96 | 113
Esperanto Newspaper Excerpts Folder 97 | 84
Esperanto Newspaper Excerpts Folder 98 | 123
Esperanto Newspaper Excerpts Folder 99 | 88
Esperanto Newspaper Excerpts Folder 100 | 113
Esperanto Newspaper Excerpts Folder 101 | 85
Esperanto Newspaper Excerpts Folder 102 | 119
Esperanto Newspaper Excerpts Folder 103 | 88
Esperanto Newspaper Excerpts Folder 104 | 80
Esperanto Newspaper Excerpts Folder 105 | 74
Esperanto Newspaper Excerpts Folder 106 | 90
Esperanto Newspaper Excerpts Folder 107 | 101
Esperanto Newspaper Excerpts Folder 108 | 118
Esperanto Newspaper Excerpts Folder 109 | 121
Esperanto Newspaper Excerpts Folder 110 | 83
Esperanto Newspaper Excerpts Folder 111 | 127
Esperanto Newspaper Excerpts Folder 112 | 119
Esperanto Newspaper Excerpts Folder 113 | 81
Esperanto Newspaper Excerpts Folder 114 | 104
Esperanto Newspaper Excerpts Folder 115 | 135
Esperanto Newspaper Excerpts Folder 116 | 169
Esperanto Newspaper Excerpts Folder 117 | 67
Esperanto Newspaper Excerpts Folder 118 | 25
Esperanto Newspaper Excerpts Folder 119 | 135
Esperanto Newspaper Excerpts Folder 120 | 168
Esperanto Newspaper Excerpts Folder 121 | 86
Esperanto Newspaper Excerpts Folder 122 | 88
Esperanto Newspaper Excerpts Folder 123 | 80
Esperanto Newspaper Excerpts Folder 124 | 20
Esperanto Newspaper Excerpts Folder 125 | 129
Esperanto Newspaper Excerpts Folder 126 | 166
Esperanto Newspaper Excerpts Folder 127 | 149
Esperanto Newspaper Excerpts Folder 128 | 165
Esperanto Newspaper Excerpts Folder 129 | 136
Esperanto Newspaper Excerpts Folder 130 | 37
Esperanto Newspaper Excerpts Folder 131 | 104
Esperanto Newspaper Excerpts Folder 132 | 24
Esperanto Newspaper Excerpts Folder 133 | 104
Esperanto Newspaper Excerpts Folder 134 | 96
Esperanto Newspaper Excerpts Folder 135 | 92
Esperanto Newspaper Excerpts Folder 136 | 15
Esperanto Newspaper Excerpts Folder 137 | 134
Esperanto Newspaper Excerpts Folder 138 | 96
Esperanto Newspaper Excerpts Folder 139 | 82
Esperanto Newspaper Excerpts Folder 140 | 133
Esperanto Newspaper Excerpts Folder 141 | 112
Esperanto Newspaper Excerpts Folder 142 | 97
Esperanto Newspaper Excerpts Folder 143 | 120
Esperanto Newspaper Excerpts Folder 144 | 73
Esperanto Newspaper Excerpts Folder 145 | 90
Esperanto Newspaper Excerpts Folder 146 | 63
Esperanto Newspaper Excerpts Folder 147 | 130
Esperanto Newspaper Excerpts Folder 148 | 56
Esperanto Newspaper Excerpts Folder 149 | 116
Esperanto Newspaper Excerpts Folder 150 | 80
Esperanto Newspaper Excerpts Folder 151 | 137
Esperanto Newspaper Excerpts Folder 152 | 123
Esperanto Newspaper Excerpts Folder 153 | 171

    Data

    Download and Access Options

    Sample

    • 15 selected articles from the collection with metadata records
    • 26 .jpg, .xml (ALTO), .txt files
    • Readme explanation of properties used
    • .zip archive (72.7 MB)

    Metadata Records

    • Bibliographic metadata for all articles
    • Readme explanation of properties used
    • .csv table (3.4 MB) and .xlsx file (1.1 MB)
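As an illustration of working with the metadata table, the following is a minimal sketch that tallies articles per year and language from a CSV file. The column names `date` and `languages` and the ISO-style date format are assumptions made for this example and are not taken from the data set's readme; the sample rows are invented placeholders, not real records.

```python
import csv
import io
from collections import Counter


def languages_per_year(csv_file, date_col="date", lang_col="languages"):
    """Count rows per (year, language) pair.

    `date_col` and `lang_col` are hypothetical column names; check the
    readme of the metadata records for the actual property names.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Assumes dates start with a four-digit year, e.g. "1910-08-13".
        year = (row.get(date_col) or "")[:4]
        lang = (row.get(lang_col) or "").strip()
        if year.isdigit() and lang:
            counts[(int(year), lang)] += 1
    return counts


# Invented placeholder rows standing in for the real .csv table.
sample = io.StringIO(
    "date,languages\n"
    "1905-08-07,fra\n"
    "1905-09-01,fra\n"
    "1910-08-13,eng\n"
    ",deu\n"
)
result = languages_per_year(sample)
```

Rows without a usable year or language are skipped rather than guessed, which keeps the tally conservative for records with incomplete metadata.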

    IIIF Collection

    • IIIF collection with URLs to all 22,247 images
    • 17,204 articles with metadata
    • .json file format
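A minimal sketch of how manifest URLs could be pulled out of a parsed IIIF collection is shown below. Which version of the IIIF Presentation API the collection file follows is not stated here, so the function accepts both the version 3 layout (`items` with `id` and `type`) and the version 2 layout (`manifests` with `@id`). The example URLs are placeholders, not real ONB addresses.

```python
def manifest_urls(collection: dict) -> list[str]:
    """Return manifest URLs from a parsed IIIF collection (v2 or v3 layout)."""
    urls = []
    # Presentation API 3: child resources are listed under "items".
    for item in collection.get("items", []):
        if item.get("type") == "Manifest" and "id" in item:
            urls.append(item["id"])
    # Presentation API 2: manifests are listed under "manifests".
    for item in collection.get("manifests", []):
        if "@id" in item:
            urls.append(item["@id"])
    return urls


# Placeholder collections; in practice the dict would come from json.load().
v3_sample = {
    "type": "Collection",
    "items": [
        {"id": "https://example.org/iiif/a/manifest", "type": "Manifest"},
        {"id": "https://example.org/iiif/sub", "type": "Collection"},
    ],
}
v2_sample = {
    "@type": "sc:Collection",
    "manifests": [{"@id": "https://example.org/iiif/b/manifest"}],
}
v3_urls = manifest_urls(v3_sample)
v2_urls = manifest_urls(v2_sample)
```

Nested collections are deliberately not followed in this sketch; if the file groups manifests by folder, a recursive walk would be needed.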

    Code

    • Repository with Python source code used in the project
    • Jupyter Notebooks
    • Readme file with requirements and installation instructions

    Use Cases

    Areas of Application Related to the Data Set

    Possible Uses

    Since the contents of the data set are highly multilingual, the images could be used to train a multilingual CV (computer vision) or OCR (optical character recognition) model. Furthermore, since the data set contains metadata for the text documents, it could be used to study the distribution of languages over time as an example, and thereby to gain insight into characteristics that might be present in the entire collection. The metadata could also be analyzed to get an overview of the places where the text documents mostly originate (via the data point 'countryCodes'). It might also be possible to visualize a network of people and/or publishers (via the data points 'persons' and/or 'publishers').
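The last two ideas can be sketched with the Python standard library alone. The example below counts country codes and builds weighted co-occurrence edges between persons named in the same article. The data points 'countryCodes' and 'persons' come from the description above; the assumption that multiple values in one cell are separated by ";" is hypothetical, and the sample rows are invented placeholders.

```python
from collections import Counter
from itertools import combinations


def split_values(cell: str, sep: str = ";") -> list[str]:
    """Split a multi-valued cell; the ';' separator is an assumption."""
    return [value.strip() for value in cell.split(sep) if value.strip()]


def country_counts(rows, sep=";"):
    """Count how often each country code appears across all rows."""
    counts = Counter()
    for row in rows:
        counts.update(split_values(row.get("countryCodes", ""), sep))
    return counts


def person_edges(rows, sep=";"):
    """Count how often two persons are named in the same article."""
    edges = Counter()
    for row in rows:
        names = sorted(set(split_values(row.get("persons", ""), sep)))
        # Each unordered pair of names in one article is one edge.
        edges.update(combinations(names, 2))
    return edges


# Invented placeholder rows, not real records from the data set.
sample_rows = [
    {"countryCodes": "AT", "persons": "Person A; Person B"},
    {"countryCodes": "FR; AT", "persons": "Person B; Person A; Person C"},
    {"countryCodes": "", "persons": "Person C"},
]
countries = country_counts(sample_rows)
edges = person_edges(sample_rows)
```

The resulting edge weights could be passed to a graph library for visualization; the same function would work for 'publishers' by reading that data point instead.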