Trainable script identification strategies for Indian languages
Published in IEEE Computer Society
Pages: 661 - 664
Identification of the script in an image of a document page is of primary importance for a system processing multi-lingual documents. In this paper, three trainable classification schemes are proposed for the identification of Indian scripts. The first scheme is based upon a frequency domain representation of the horizontal profile of the textual blocks. The other two schemes use connected components extracted from the textual region. We propose a novel Gabor filter-based feature extraction scheme for the connected components. We have also found that the frequency distribution of the width-to-height ratio of the connected components can be used for script recognition. Experiments show that the Gabor filter-based scheme provides the most reliable performance; the other two techniques, however, are computationally more efficient. © 1999 IEEE.
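To make the two computationally cheaper feature types concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a text block is given as a binary image (a list of rows, with 1 for ink), and it computes DFT magnitudes of the horizontal profile and a normalized histogram of connected-component width-to-height ratios. The function names, the 4-connectivity choice, and the histogram bin edges are all illustrative assumptions; the abstract does not specify them. The Gabor filter-based features and the trainable classifier are not shown.

```python
import cmath
from collections import deque


def horizontal_profile(img):
    """Number of ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in img]


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform of a 1-D signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def connected_component_boxes(img):
    """Bounding boxes (top, left, bottom, right) of 4-connected ink regions.

    4-connectivity is an assumption made for this sketch.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if not img[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def aspect_ratio_histogram(boxes, edges=(0.5, 1.0, 2.0)):
    """Normalized histogram of width/height ratios.

    The bin edges are hypothetical, chosen only for illustration.
    """
    counts = [0] * (len(edges) + 1)
    for top, left, bottom, right in boxes:
        ratio = (right - left + 1) / (bottom - top + 1)
        counts[sum(ratio >= e for e in edges)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]
```

In a usage scenario, the DFT magnitudes of `horizontal_profile(block)` or the output of `aspect_ratio_histogram(connected_component_boxes(block))` would serve as the feature vector passed to whatever trainable classifier is chosen.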
About the journal
Journal: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Publisher: IEEE Computer Society