# Extract figures by IIIF manifest

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/git/https%3A%2F%2Flabs.onb.ac.at%2Fgitlab%2Fa.rabensteiner%2Fextract_figures_abo/HEAD?labpath=extract_figures.ipynb)

This repository provides a Jupyter notebook [extract_figures.ipynb](extract_figures.ipynb) that uses a YOLOv8 model to extract figures from a book, given the URL of its IIIF manifest.
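
As a rough illustration of the first step, the sketch below collects one image URL per page from a manifest. It assumes the manifest follows the IIIF Presentation API 2.x layout (`sequences` → `canvases` → `images` → `resource`), which may not match what the notebook actually does. The function names and the example URLs are hypothetical.

```python
import json
from urllib.request import urlopen


def load_manifest(manifest_url):
    """Download and parse a IIIF manifest (needs network access)."""
    with urlopen(manifest_url) as response:
        return json.load(response)


def page_image_urls(manifest):
    """Return one image URL per canvas, assuming a Presentation 2.x manifest."""
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            images = canvas.get("images", [])
            if images:  # skip canvases that carry no image
                urls.append(images[0]["resource"]["@id"])
    return urls


# Tiny hand-written manifest for illustration only (hypothetical URLs).
example_manifest = {
    "sequences": [
        {
            "canvases": [
                {"images": [{"resource": {"@id": "https://example.org/iiif/page1/full/full/0/default.jpg"}}]},
                {"images": []},
                {"images": [{"resource": {"@id": "https://example.org/iiif/page3/full/full/0/default.jpg"}}]},
            ]
        }
    ]
}

print(page_image_urls(example_manifest))
```

For a real book, `page_image_urls(load_manifest(url))` would return the page images to pass to the detection model.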

The model was trained on the following five books from the ABO corpus:
- http://data.onb.ac.at/ABO/%2BZ97792402
- http://data.onb.ac.at/ABO/%2BZ155502807
- http://data.onb.ac.at/ABO/%2BZ156318706
- http://data.onb.ac.at/ABO/%2BZ164403901
- http://data.onb.ac.at/ABO/%2BZ22101290X

Of the approximately 1700 pages in these books, 250 contain figures, which were annotated with bounding boxes using the image annotation web service [CVAT](https://www.cvat.ai/). Training was done locally with the nano version of YOLOv8. The resulting figure detection model is provided as [model_extract_figures.pt](model_extract_figures.pt).
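
A minimal sketch of how the trained weights might be applied to a single page image, assuming the `ultralytics` and `Pillow` packages are installed. The helper names and the confidence threshold of 0.5 are assumptions for illustration and are not taken from the notebook.

```python
def clamp_box(box, width, height):
    """Round an (x1, y1, x2, y2) box to integers and clip it to the image bounds."""
    x1, y1, x2, y2 = box
    left = max(0, min(width, int(round(x1))))
    top = max(0, min(height, int(round(y1))))
    right = max(0, min(width, int(round(x2))))
    bottom = max(0, min(height, int(round(y2))))
    return left, top, right, bottom


def extract_figures(image_path, model_path="model_extract_figures.pt", min_conf=0.5):
    """Detect figures on one page image and return them as cropped PIL images."""
    # Imported lazily so clamp_box can be used without these packages installed.
    from PIL import Image
    from ultralytics import YOLO

    model = YOLO(model_path)
    image = Image.open(image_path).convert("RGB")
    # min_conf is an assumed threshold; tune it for your material.
    result = model.predict(image, conf=min_conf, verbose=False)[0]

    crops = []
    for box in result.boxes.xyxy.tolist():
        crops.append(image.crop(clamp_box(box, image.width, image.height)))
    return crops
```

Calling `extract_figures("page.jpg")` on a downloaded page image (hypothetical file name) would return one cropped image per detected figure. Clipping the boxes first keeps the crops inside the page even when a predicted box slightly overshoots the border.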

![Suggested bounding boxes of figures by the trained model.](example.png)