In this block

  • Overview IIIF
  • Overview OCR formats
  • Example: Create IIIF collection from SPARQL query result
  • Example: Download pre-downsized images for machine learning
  • Example: Download OCR text

Overview IIIF

http://iiif.io/

What is IIIF?

  • International Image Interoperability Framework (http://iiif.io/ - well written, worth a read)
  • Standardised method of describing and delivering images over the web
  • Community that develops APIs and implements them in software

Image courtesy of https://github.com/IIIF/training, CC-BY 4.0

Why would I use this?

If you want to display images

  • If you want to use one of several nice image viewers (zoom, rotate, fullscreen out of the box)
  • If you want to include image data hosted elsewhere

If you want to process images

  • If you want structured access to potentially huge sets of images
  • If you want metadata included with the images
  • If you want to resize images before downloading
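Resizing before download works because the IIIF Image API encodes region, size, rotation, quality and format in the URL path. The following is a minimal sketch of building such a URL; the base URL `https://example.org/iiif` and the identifier `page-0001` are made-up placeholders, not a real ONB endpoint, and the helper `iiif_image_url` is hypothetical. The size keyword `max` assumes a server on Image API 2.1 or later; older servers may expect `full` instead.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API URL (hypothetical helper).

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    """
    # The identifier must be URL-encoded, including any slashes it contains.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Placeholder server and identifier (assumptions for illustration only).
# "512," asks the server for a width of 512 px with the height scaled to match.
url = iiif_image_url("https://example.org/iiif", "page-0001", size="512,")
print(url)
```

Requesting the downsized image from the server this way avoids downloading the full-resolution file and resizing it locally.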

How would I use this?

Pics or didn't happen!

Overview OCR formats

  • ALTO: 3 main elements
    • <Description>
      • metadata and general settings (e.g. measurement units) about the ALTO file
    • <Styles>
      • text and paragraph styles
    • <Layout>
      • content information
      • subdivided into <Page> elements
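The structure above can be illustrated with a small sketch that reads the text out of the <Layout> part of an ALTO file. The sample document is invented for illustration and is far smaller than a real ALTO file; the helper names `local_name` and `alto_lines` are hypothetical. Because the ALTO namespace differs between schema versions, the sketch matches elements by local name instead of assuming one particular version.

```python
import xml.etree.ElementTree as ET

# Invented minimal ALTO document (assumption: real files carry far more detail,
# such as coordinates and style references).
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Description><MeasurementUnit>pixel</MeasurementUnit></Description>
  <Styles/>
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="B1">
          <TextLine>
            <String CONTENT="Hello"/><SP/><String CONTENT="world"/>
          </TextLine>
          <TextLine>
            <String CONTENT="second"/><SP/><String CONTENT="line"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tags."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return one string per <TextLine>, joining its <String> CONTENT values."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) == "TextLine":
            words = [child.attrib.get("CONTENT", "")
                     for child in element
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return lines


print(alto_lines(SAMPLE_ALTO))
```

The words live in the CONTENT attribute of each <String> element, not in element text, which is why the sketch reads attributes.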

ALTO page element

  • hOCR
    • alternative to ALTO
    • based on XHTML
    • not used in the ONB Labs