[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/git/https%3A%2F%2Flabs.onb.ac.at%2Fgitlab%2Fgeorgp%2Fsacha-txt-downloader/HEAD?filepath=txtDownloader.ipynb)

## How to get the fulltext of a digitized book from Austrian Books Online ##

### Example: How to get the fulltext for a travelogue ###

To download the fulltext of a single travelogue (i.e. the fulltext of one digitized copy), we offer a [Jupyter-Notebook](https://labs.onb.ac.at/gitlab/georgp/sacha-txt-downloader/) named "txt-downloader" in the [*ONB Labs*](https://labs.onb.ac.at/en/) GitLab. You do not need to download any software or code to run this tool; it runs in the browser via MyBinder. Running the notebook produces a ZIP file in the output directory of the Binder application, which you can then download. The ZIP file contains one plain-text file per page plus one text file that combines the text of all pages. Please note that the combined text file contains neither structural metadata for the page sequence nor any markup (e.g. page breaks).
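
Once the ZIP file has been downloaded, its contents can be inspected locally. The following is a minimal Python sketch, not part of the notebook itself. The function names are hypothetical, and it assumes that the text files end in `.txt` and are UTF-8 encoded, neither of which is stated above.

```python
import zipfile


def list_text_files(zip_path):
    """Return the names of all .txt members in the archive, sorted by name."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist() if name.lower().endswith(".txt")
        )


def read_text_file(zip_path, member_name, encoding="utf-8"):
    """Read one text file from the archive and return it as a string.

    The default encoding (UTF-8) is an assumption; adjust it if the
    text does not decode correctly.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return archive.read(member_name).decode(encoding)
```

Calling `list_text_files()` with the path of your downloaded ZIP file returns the page files together with the combined file, and `read_text_file()` loads any one of them without unpacking the whole archive.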


+ First step: Click on [Jupyter-Notebook](https://labs.onb.ac.at/gitlab/georgp/sacha-txt-downloader/)


+ Second step: Click on **launch binder**. A Binder instance is launched in a separate browser window.

+ Third step: As soon as the Jupyter-Notebook opens, you may start the application by clicking on "Run All" in the "Cell" menu. Command blocks (cells) that have been executed show a counter in front; those not yet completed show an asterisk (`[*]`).


+ Fourth step: An input field will appear in the second cell, which is marked by an asterisk as not yet completed. Please enter the barcode in the form of the example given (without a dot or space at the end!) and press the Return key. Please read the following note on how to get the barcode!

>**Note:** Each digitized item is identified by a **barcode** (e.g. *Von der Alster zu den Pyramiden* +Z257607709). To download fulltext via the ONB Labs Jupyter-Notebook, you have to enter the barcode of the item whose fulltext you want to download. You can find the barcode in the item overview, which opens when you click on a single result of your search query in the *Quicksearch* catalog. Please copy the barcode!

+ Fifth step: Once you have pressed the Return key after entering the barcode, the download starts for each single txt-file. Below the next "In" block you will see one line for each downloaded txt-file. You can click on any of them to see the OCR text of that page. At the end you will get a link to a ZIP file "{barcode}.zip" in the "Out" cell.
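
The rule from the fourth step (no dot or space at the end of the barcode) can be checked before pasting. The following is a minimal Python sketch, not taken from the notebook. The function names are hypothetical, and the pattern `+Z` followed by digits is an assumption modeled on the single example +Z257607709; other barcode forms may exist.

```python
import re

# Assumed pattern: "+Z" followed by digits, modeled on the example +Z257607709.
BARCODE_PATTERN = re.compile(r"\+Z\d+")


def clean_barcode(raw):
    """Remove surrounding whitespace and any trailing dots or spaces."""
    return raw.strip().rstrip(". \t")


def looks_like_barcode(raw):
    """Return True if the cleaned input matches the assumed barcode pattern."""
    return BARCODE_PATTERN.fullmatch(clean_barcode(raw)) is not None
```

For example, `clean_barcode("+Z257607709. ")` yields `+Z257607709`, which can then be entered in the input field.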

>**Note**: Our travelogues are part of Austrian Books Online (ABO), a public-private partnership of the Austrian National Library with Google Books. You may only use digitized items and fulltexts from ABO if you accept the [rights statement](https://rightsstatements.org/page/NoC-NC/1.0/?language=en) for **non-commercial use only**. Please also note that you are allowed to use this notebook only for the download of individual barcodes (no bulk download allowed!).
For help or further information please contact our librarian at <martin.krickl@onb.ac.at>.