CArDIS: A Swedish Historical Handwritten Character and Word Dataset

Attribution 4.0 International © 2022 Author(s); ISSN: 2169-3536; EISSN: 2169-3536; DOI: 10.1109/ACCESS.2022.3175197

Full text available

Title:
CArDIS: A Swedish Historical Handwritten Character and Word Dataset
Author: Yavariabdi, Amir ; Kusetogullari, Huseyin ; Celik, Turgay ; Thummanapally, Shivani ; Rijwan, Sakib ; Hall, Johan
Subjects: Technology
Description: This paper introduces a new publicly available image-based Swedish historical handwritten character and word dataset named Character Arkiv Digital Sweden (CArDIS) (https://cardisdataset.github.io/CARDIS/). The samples in CArDIS are collected from 64,084 Swedish historical documents written by several anonymous priests between 1800 and 1900. The character dataset contains 116,000 Swedish alphabet images in RGB color space with 29 classes, whereas the word dataset contains 30,000 image samples of ten popular Swedish names as well as 1,000 region names in Sweden. To examine the performance of different machine learning classifiers on the CArDIS dataset, three experiments are conducted. In the first experiment, classifiers such as Support Vector Machine (SVM), Artificial Neural Network (ANN), k-Nearest Neighbor (k-NN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Random Forest (RF) are trained on existing character datasets, namely Extended Modified National Institute of Standards and Technology (EMNIST), IAM, and CVL, and tested on the CArDIS dataset. In the second and third experiments, the same classifiers as well as two pre-trained classifiers, VGG-16 and VGG-19, are trained and tested on the CArDIS character and word datasets. The experiments show that the machine learning methods trained on existing handwritten character datasets struggle to recognize characters efficiently on the CArDIS dataset, proving that characters in CArDIS contain unique features and characteristics. Moreover, in the last two experiments, the deep learning-based classifiers provide the best recognition rates.
Publisher: IEEE
Creation Date: 2022
Language: English
Identifier: ISSN: 2169-3536
EISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3175197
Source: Brage Consortium Repository
DOAJ Directory of Open Access Journals
