ARDIS: a Swedish historical handwritten digit dataset

  • Author(s): CHEDDAD, Abbas
    GRAHN, Håkan
    HALL, Johan
    KUSETOĞULLARI, Hüseyin
    YAVARIABDI, Amir
  • Publication Type: Article
  • Publication Date: 2019
  • Publisher: Neural Computing and Applications
  • URI: https://hdl.handle.net/20.500.12498/882

This paper introduces a new image-based dataset of historical handwritten digits named Arkiv Digital Sweden (ARDIS). The images in ARDIS are extracted from 15,000 Swedish church records written by different priests, with various handwriting styles, in the nineteenth and twentieth centuries. The dataset consists of three single-digit datasets and one digit string dataset. The digit string dataset includes 10,000 samples in red–green–blue (RGB) color space, whereas the other datasets contain 7600 single-digit images in different color spaces. An extensive analysis of machine learning methods on several digit datasets is carried out. Additionally, the correlation between ARDIS and the existing digit datasets Modified National Institute of Standards and Technology (MNIST) and US Postal Service (USPS) is investigated. Experimental results show that machine learning algorithms, including deep learning methods, achieve low recognition accuracy when trained on existing datasets and tested on ARDIS. The highest accuracies are obtained by a convolutional neural network, which reaches 58.80% when trained on MNIST and 35.44% when trained on USPS, in both cases tested on ARDIS. These results show that machine learning methods trained on existing datasets have difficulty recognizing the digits in ARDIS, indicating that the dataset has unique characteristics. The dataset is publicly available to the research community to further advance handwritten digit recognition algorithms.
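
The cross-dataset protocol described in the abstract (train on one digit dataset, test on another) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' method: a nearest-centroid classifier stands in for the convolutional neural network, the data are synthetic placeholders rather than MNIST, USPS, or ARDIS images, and every function name (`fit_centroids`, `predict`, `accuracy`, `make_toy_digits`) is hypothetical.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Return one mean feature vector per class (nearest-centroid model)."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(centroids, X):
    """Assign each row of X to the class whose centroid is closest."""
    sq_dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)

def accuracy(centroids, X, y):
    """Fraction of rows of X whose predicted class matches y."""
    return float((predict(centroids, X) == y).mean())

def make_toy_digits(rng, prototypes, n_per_class, noise):
    """Synthetic stand-in for a digit dataset: noisy copies of class prototypes."""
    n_classes, dim = prototypes.shape
    y = np.repeat(np.arange(n_classes), n_per_class)
    X = prototypes[y] + noise * rng.standard_normal((y.size, dim))
    return X, y

rng = np.random.default_rng(0)
n_classes, dim = 10, 64

# Assumption: the "source" domain plays the role of an existing dataset,
# and the "target" domain plays the role of a dataset with a different
# writing style (same classes, shifted class prototypes).
source_protos = rng.standard_normal((n_classes, dim))
target_protos = source_protos + 3.0 * rng.standard_normal((n_classes, dim))

X_src_train, y_src_train = make_toy_digits(rng, source_protos, 100, 0.5)
X_src_test, y_src_test = make_toy_digits(rng, source_protos, 50, 0.5)
X_tgt_test, y_tgt_test = make_toy_digits(rng, target_protos, 50, 0.5)

# Train on the source domain only, then evaluate on both domains.
centroids = fit_centroids(X_src_train, y_src_train, n_classes)
in_domain = accuracy(centroids, X_src_test, y_src_test)
cross_domain = accuracy(centroids, X_tgt_test, y_tgt_test)

print(f"in-domain accuracy:    {in_domain:.3f}")
print(f"cross-domain accuracy: {cross_domain:.3f}")
```

The comparison of interest is the gap between the two printed numbers: the model never sees target-domain samples during training, which mirrors the train-on-existing, test-on-ARDIS setting. The values produced here come from synthetic data and say nothing about the accuracies reported in the paper.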

Detailed View
ISSN (dc.identifier.issn): 1433-3058
Publisher (dc.publisher): Neural Computing and Applications
Title (dc.title): ARDIS: a Swedish historical handwritten digit dataset
Publication Date (dc.date.issued): 2019
Record Entry Date (dc.date.accessioned): 2019-07-09T12:56:58Z
Open Access Date (dc.date.available): 2019-07-09T12:56:58Z
Publication Language (dc.language.iso): eng
Publication Type (dc.type): Article
Author (dc.contributor.author): CHEDDAD, Abbas
Author (dc.contributor.author): GRAHN, Håkan
Author (dc.contributor.author): HALL, Johan
Author (dc.contributor.author): KUSETOĞULLARI, Hüseyin
Author (dc.contributor.author): YAVARIABDI, Amir
URI (dc.identifier.uri): https://hdl.handle.net/20.500.12498/882
Citation Index (dc.source.database): WoS
Citation Index (dc.source.database): Scopus
All resources on this site are licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.