# Using fuzzy string matching on a historic book collection

## Abstract

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/git/https://labs.onb.ac.at/gitlab/bed/JDH_submission/main?filepath=Using-fuzzy-string-matching-on-Prince-Eugene's-historic-book-collection.ipynb)

The Bibliotheca Eugeniana was integrated into the Austrian National Library after the death of Prince Eugene of Savoy. The project “Bibliotheca Eugeniana Digital” uses digital humanities methods to make the collection accessible in a new way. Among other methods, we employed a string matching algorithm to link the historical inventory of the collection with bibliographical data from the Austrian National Library's catalogue; this linking process is described in detail in this article. The results of the project include a digital edition of the manuscript catalogue with links to the modern catalogue, additional metadata and a visual interface for exploration.

## Keywords

Historic Book Collection, Prince Eugene, Fuzzy String Matching, Handwritten Text Recognition, Digital Edition, Collection Visualization

## Installation instructions

This repo is derived from the [Journal of Digital History Author's Repository](https://github.com/C2DH/template_repo_JDH/tree/main) and is used to submit an article to that journal, including references and code examples.

To run the Jupyter notebook containing the article, either open the mybinder URL given at the top of this README, or install the project's requirements into your own Python environment and start a Jupyter server. We use the package manager `uv` (see the [`uv` documentation](https://docs.astral.sh/uv/) for more information), so after cloning the repo run the following commands in your terminal:

```bash
uv init
uv add -r requirements.txt
uv run jupyter lab
```

This will initialize the project, install its requirements and then start a Jupyter server.

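
If you want to confirm that the environment was set up as expected before opening the notebook, a small check along the following lines may help. This is only a sketch and not part of the repository: the script name `check_env.py` and the helper functions are hypothetical, the requirement strings are copied from `requirements.txt`, and it only reports the installed versions without comparing them against the minimums.

```python
# check_env.py (hypothetical helper, not shipped with this repo)
from importlib.metadata import PackageNotFoundError, version

# Minimum versions as listed in requirements.txt
REQUIREMENTS = [
    "jupyter>=1.1.1",
    "jupyterlab-citation-manager>=1.0.0",
    "matplotlib>=3.10.0",
    "pandas>=2.2.3",
    "thefuzz>=0.22.1",
    "tqdm>=4.67.1",
]


def parse_requirement(line):
    """Split a 'name>=minimum' line into (name, minimum)."""
    name, _, minimum = line.partition(">=")
    return name.strip(), minimum.strip()


def installed_versions(requirements):
    """Map each package name to (required minimum, installed version or None)."""
    report = {}
    for line in requirements:
        name, minimum = parse_requirement(line)
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is missing from the current environment
        report[name] = (minimum, found)
    return report


if __name__ == "__main__":
    for name, (minimum, found) in installed_versions(REQUIREMENTS).items():
        status = found if found is not None else "not installed"
        print(f"{name}: required >={minimum}, found {status}")
```

Assuming the file is saved in the project root, it could be run with `uv run python check_env.py`; any package reported as "not installed" suggests that the `uv add -r requirements.txt` step did not complete.
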
The required packages (see [`requirements.txt`](./requirements.txt)) are

```bash
jupyter>=1.1.1
jupyterlab-citation-manager>=1.0.0
matplotlib>=3.10.0
pandas>=2.2.3
thefuzz>=0.22.1
tqdm>=4.67.1
```

## License

Copyright © 2023 University of Luxembourg.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see .

We hope this repository and the provided example notebook are helpful for authors submitting articles to the Journal of Digital History. If you have any questions, feedback, or suggestions, please feel free to open an issue or contact us. Thank you for your contribution!