Robust Arabic and Pashto Text Detection in Camera-Captured Documents Using Deep Learning Techniques

IEEE access, 2023, Vol.11, p.135788-135796 [Peer Reviewed Journal]

Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 ;ISSN: 2169-3536 ;EISSN: 2169-3536 ;DOI: 10.1109/ACCESS.2023.3336404 ;CODEN: IAECCG

  • Title:
    Robust Arabic and Pashto Text Detection in Camera-Captured Documents Using Deep Learning Techniques
  • Author: Khan, Nisar ; Ahmad, Riaz ; Ullah, Khalil ; Muhammad, Siraj ; Hussain, Ibrar ; Khan, Ahmad ; Ghadi, Yazeed Yasin ; Mohamed, Heba G.
  • Subjects: Annotations ; Arabic ; Arabic language ; Cameras ; Classification algorithms ; CNN ; dataset ; Deep learning ; deep learning models ; Document image analysis ; Documents ; Electronic mail ; Image analysis ; Image quality ; Information retrieval ; Layout ; Layouts ; Machine learning ; Pashto ; Text analysis ; text detection
  • Is Part Of: IEEE access, 2023, Vol.11, p.135788-135796
  • Description: In Document Image Analysis (DIA), the primary objective is to transform image data into a format that machines can readily interpret. Within a DIA-based system, layout analysis plays a crucial pre-processing role in identifying and extracting precise, error-free text segments. However, document images in the Pashto language have not been explored so far. Pashto text detection in camera-captured documents is challenging due to variations in image quality and lighting conditions, complex backgrounds, the unavailability of labeled documents, cursiveness, shape-context dependency, multiple scripts per image, and language-specific layouts. This research examines the case of Pashto and Arabic text and contributes in two respects. First, it introduces a real dataset of 1080 images of Pashto documents captured by a handheld camera. Second, it examines deep learning based classifiers that perform layout analysis and detect Pashto and Arabic text in each document. For layout classification, we used the deep learning models Single-Shot Detector (SSD), YOLOv5, and YOLOv7. Baseline results, obtained with 30% of the images held out as a test set, are a mean average precision (mAP) of 84.51% for SSD, 88.50% for YOLOv5, and 91.30% for YOLOv7. The proposed methods have the potential to contribute to applications such as document analysis, information retrieval, and translation for Pashto and Arabic language users.
  • Publisher: Piscataway: IEEE
  • Language: English
  • Identifier: ISSN: 2169-3536
    EISSN: 2169-3536
    DOI: 10.1109/ACCESS.2023.3336404
    CODEN: IAECCG
  • Source: IEEE Xplore Open Access Journals
    DOAJ Directory of Open Access Journals
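
The evaluation setup in the description (1080 images, 30% held out for testing, detectors scored by mAP) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `split_dataset` and `iou`, the random seed, and the `(x1, y1, x2, y2)` box format are assumptions made for illustration. The sketch shows only a random 70/30 split and the intersection-over-union (IoU) overlap measure that mAP calculations commonly build on; the record does not say how the authors split the data or which IoU threshold they used.

```python
import random


def split_dataset(image_ids, test_fraction=0.30, seed=0):
    """Shuffle image IDs and hold out `test_fraction` of them as a test set.

    Hypothetical helper: the seed and the random-shuffle strategy are
    assumptions, not details taken from the paper.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]  # (train, test)


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# 1080 images, as stated in the description; the IDs are placeholders.
train_ids, test_ids = split_dataset(range(1080))
print(len(train_ids), len(test_ids))

# Overlap between a hypothetical predicted text box and a ground-truth box.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

With a 30% hold-out, 324 of the 1080 images form the test set and 756 remain for training. A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, and mAP averages the resulting precision over recall levels and classes.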