Document indexing framework for retrieval of degraded document images

R. Garg; E. Hassan; Santanu Chaudhury

doi:10.1109/ICDAR.2015.7333966

Published by IEEE Computer Society

2015

Volume: 2015-November

Pages: 1261 - 1265

Abstract

With the availability of large collections of document images in Indian languages, image-based retrieval has gained popularity. The performance of such systems is affected by the presence of degraded and noisy images. Moreover, optical character recognition (OCR) systems for Indian scripts are not yet robust, leading to noisy OCR'ed text. An information retrieval system designed using inputs from both modalities (image features and OCR-based recognition data) will lead to better retrieval performance than one using either modality alone. In this paper we present an indexing methodology that uses multiple kernel learning to combine features from different modalities by joint optimization of search time and accuracy. The proposed methodology is evaluated on document images in Bangla and Devanagari scripts. © 2015 IEEE.

Topics: Visual Word (64%), Search engine indexing (57%), Optical character recognition (57%), Modality (human–computer interaction) (52%), and Multiple kernel learning (51%)

About the journal

Journal	Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Publisher	IEEE Computer Society
ISSN	1520-5363

Authors (1)

Santanu Chaudhury
- Department of Computer Science & Engineering
