AI-ASSISTED DIGITALISATION OF HISTORICAL DOCUMENTS

International archives of the photogrammetry, remote sensing and spatial information sciences., 2023, Vol.XLVIII-M-2-2023, p.557-562 [Peer Reviewed Journal]

2023. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. ISSN: 2194-9034; ISSN: 1682-1750; EISSN: 2194-9034; DOI: 10.5194/isprs-archives-XLVIII-M-2-2023-557-2023

  • Title:
    AI-ASSISTED DIGITALISATION OF HISTORICAL DOCUMENTS
  • Author: Ferro, S. ; Pelillo, M. ; Traviglia, A.
  • Subjects: Artificial intelligence ; Artificial neural networks ; Digital preservation ; Digitization ; Handwriting recognition ; Neural networks ; Recurrent neural networks ; Texts ; Transcription
  • Is Part Of: International archives of the photogrammetry, remote sensing and spatial information sciences., 2023, Vol.XLVIII-M-2-2023, p.557-562
  • Description: Preserving historical archival heritage involves not only physical measures to safeguard these valuable texts but also providing for their digital preservation. However, merely digitising manuscripts and codexes is not enough. A further step is needed: the digitalisation of their content, i.e. the verbatim transcription of scanned texts. This process enables the accurate preservation of their textual content, making it easier to search for information and conduct further analyses. With the help of artificial intelligence, particularly Deep Neural Networks (DNNs), automatic handwriting recognition can be performed. In this study, we employed a Convolutional Recurrent Neural Network (CRNN), an established type of DNN, to determine the minimum amount of labelled data required to automatically transcribe five different historical datasets that vary in language and time period. The results show that a Character Error Rate (CER) lower than 10% can be achieved with just a few hundred labelled text lines in almost all cases.
  • Publisher: Göttingen: Copernicus GmbH
  • Language: English
  • Identifier: ISSN: 2194-9034
    ISSN: 1682-1750
    EISSN: 2194-9034
    DOI: 10.5194/isprs-archives-XLVIII-M-2-2023-557-2023
  • Source: ProQuest Central
    DOAJ Directory of Open Access Journals
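
The Description above reports results as a Character Error Rate (CER). This record does not spell out how the authors computed it, so the following is only a minimal sketch under the common definition: the Levenshtein (edit) distance between the reference transcription and the recognised text, divided by the length of the reference. The function names and the sample strings are illustrative assumptions, not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Smallest number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                      # delete char_a
                curr[j - 1] + 1,                  # insert char_b
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER under the assumed definition: edit distance / reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical text line: one substituted and one missing character
# against a 20-character reference.
reference_line = "historical documents"
recognised_line = "histerical document"
cer = character_error_rate(reference_line, recognised_line)
print(f"CER = {cer:.1%}")
```

Under this definition, a CER below 10% means fewer than one character edit per ten reference characters on average. Evaluations over a whole dataset usually sum the edit distances and the reference lengths across all lines before dividing, rather than averaging per-line rates.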