Document Analysis And Classification Based On Passing Window

Journal of advances in computer engineering and technology, 2020-02, Vol.6 (1), p.39-46 [Peer Reviewed Journal]

ISSN: 2423-4192; EISSN: 2423-4206

  • Title:
    Document Analysis And Classification Based On Passing Window
  • Author: ZAHER BAMASOOD
  • Subjects: data mining ; document image analysis ; feature extraction ; information retrieval ; segmentation
  • Is Part Of: Journal of advances in computer engineering and technology, 2020-02, Vol.6 (1), p.39-46
  • Description: In this paper we present a document analysis and classification system that segments and classifies the contents of Arabic document images. The system comprises preprocessing, document segmentation, feature extraction, and document classification. In preprocessing, a document image is enhanced by removing noise, binarizing it, and detecting and correcting image skew. In document segmentation, an algorithm is proposed to segment a document image into homogeneous regions. In document classification, a neural network classifier (a multilayer perceptron trained with backpropagation) labels each region as text or non-text, based on a number of features computed in the feature extraction stage. These features are collected from the work of other researchers. Experiments were conducted on 398 document images selected randomly from the Printed Arabic Text Database (PATDB), which was drawn from various printed forms: advertisements, book chapters, magazines, newspapers, letters, and reports. The proposed segmentation algorithm yielded a ratio of only 0.814% between the overlapping areas of merged zones and the total size of zones, and a ratio of 1.938% between missed areas and the total size of zones. The features that show the best accuracy individually are Background Vertical Run Length (RL) Mean and Standard Deviation of Foreground.
  • Publisher: Science and Research Branch, Islamic Azad University
  • Language: English
  • Identifier: ISSN: 2423-4192
    EISSN: 2423-4206
  • Source: DOAJ Directory of Open Access Journals
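
The description names two features, Background Vertical Run Length (RL) Mean and Standard Deviation of Foreground, without defining them. The sketch below is one plausible reading, not the paper's implementation. It assumes a binarized region stored as a list of rows in which 1 marks a foreground (ink) pixel and 0 marks background. It further assumes that the run-length feature is the mean length of vertical runs of background pixels, and that the foreground feature is the population standard deviation of per-row foreground pixel counts. The function names are hypothetical.

```python
from statistics import mean, pstdev


def vertical_runs(region, value):
    """Collect the lengths of vertical (column-wise) runs of `value`."""
    runs = []
    if not region:
        return runs
    height, width = len(region), len(region[0])
    for x in range(width):
        length = 0
        for y in range(height):
            if region[y][x] == value:
                length += 1
            elif length:
                # The run ended at this pixel; record it and reset.
                runs.append(length)
                length = 0
        if length:
            # The run reached the bottom edge of the region.
            runs.append(length)
    return runs


def background_vertical_rl_mean(region):
    """Assumed definition: mean length of vertical background (0) runs."""
    runs = vertical_runs(region, 0)
    return mean(runs) if runs else 0.0


def foreground_std(region):
    """Assumed definition: population std of per-row foreground counts."""
    counts = [sum(row) for row in region]
    return pstdev(counts) if counts else 0.0


# Toy 4x3 region, invented purely for illustration.
region = [
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
]

features = [background_vertical_rl_mean(region), foreground_std(region)]
print(features)
```

A feature vector of this kind, computed for each segmented region, is what a text/non-text classifier such as the multilayer perceptron mentioned in the description would take as input.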
