Enabling Deep Document Image Analysis with Generative Models
ISBN: 9789180483049; ISBN: 9180483046; ISBN: 9789180483032; ISBN: 9180483038
Digital Resources/Online E-Resources
Title:
Enabling Deep Document Image Analysis with Generative Models
Author:
Nikolaidou, Konstantina
Subjects:
Machine Learning; Maskininlärning
Description:
Historical documents are a valuable source of cultural knowledge and can provide information about past events, societies, beliefs, and cultures. They can serve as an excellent source for research in various fields, including history, literature, linguistics, and anthropology. Their preservation and analysis pose significant challenges due to the unique characteristics of handwritten scripts, their variability, and document degradation. With the rise of the Deep Learning era, enormous amounts of annotated data are required to train large models that can efficiently perform tasks on unseen data. Digital libraries now provide high-quality digitized images for the analysis and processing of historical documents. However, collecting and annotating these data is expensive and requires considerable expertise from historians and humanities scholars. Hence, generating synthetic data to enhance the performance of Deep Learning frameworks is a common approach in Computer Vision and, specifically in this thesis, in Document Image Analysis and Recognition (DIAR). This thesis leverages generative models to facilitate DIAR tasks, focusing on historical and handwritten documents, by generating realistic synthetic images that resemble a real distribution and enhance the training of downstream DIAR tasks. The contributions of the thesis include a systematic literature review, a comparative evaluation, and a newly developed method for handwriting generation. First, a systematic literature review of existing historical document image datasets provides summarized information on 65 studies, focusing on different aspects, such as statistics, document type, language, visual, and annotation aspects. The study discusses promising resources for future research as well as limitations, namely the limited dataset size and absence of benchmarks, and the lack of standardization in terms of data format and evaluation scheme.
A subsequent contribution is the integration of generated data into a historical document font classification task. Semi-synthetic data are generated with DocCreator, an open-source software tool, using several of its document degradation augmentations. A conditional Generative Adversarial Network (GAN) is used to generate fully synthetic data conditioned on a specific sample. The data generated by the two methods are integrated as additional samples in the training of several Convolutional Neural Network classifiers, and the effect on performance is examined. The final contribution of the thesis introduces a new method for generating styled handwritten text images based on Denoising Diffusion Probabilistic Models (DDPM), a previously unexplored approach in DIAR. The method captures stylistic and content characteristics of a standard multi-writer handwriting dataset and achieves improved performance in enhancing writer identification and handwriting text recognition compared to GAN-based methods. The results demonstrate the potential of the generative method for enabling deep document image analysis and pave the way for further research. As a future direction, this work will aim to progress from generating word images to generating sentence and full document images by conditioning on the content, style, and layout of historical documents. Another future action will be to extend the proposed method to operate in a few-shot scheme for the writer style condition in order to generate unseen styles. Furthermore, future work will aim to leverage important features from pre-training with synthetic and real data in order to generalize to historical documents, which are a scarce resource, and to adjust the text encoding components to different languages and scripts. Finally, the ultimate goal of future work is to generate a massive synthetic historical document image database to fill the existing benchmark gap.
Creation Date:
2023
Language:
English
Identifier:
ISBN: 9789180483049
ISBN: 9180483046
ISBN: 9789180483032
ISBN: 9180483038
Source:
SWEPUB Freely available online