Top-down and bottom-up cues for scene text recognition

Anand Mishra; K. Alahari; C.V. Jawahar

doi:10.1109/CVPR.2012.6247990


Published in: 2012


Pages: 2687 - 2694

Abstract

Scene text recognition has gained significant attention from the computer vision community in recent years. Recognizing such text is a challenging problem, even more so than the recognition of scanned documents. In this work, we focus on the problem of recognizing text extracted from street images. We present a framework that exploits both bottom-up and top-down cues. The bottom-up cues are derived from individual character detections from the image. We build a Conditional Random Field model on these detections to jointly model the strength of the detections and the interactions between them. We impose top-down cues obtained from a lexicon-based prior, i.e. language statistics, on the model. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We show significant improvements in accuracies on two challenging public datasets, namely Street View Text (over 15%) and ICDAR 2003 (nearly 10%). © 2012 IEEE.
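The energy-minimization idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the detections are treated as a simple left-to-right chain, the lexicon prior is a smoothed bigram cost, the function names (`bigram_costs`, `min_energy_word`) and the toy unary costs are hypothetical, and the minimization uses plain dynamic programming, which may differ from the inference method the authors used.

```python
import math


def bigram_costs(lexicon, alphabet, smoothing=1.0):
    """Top-down cue (assumed form): pairwise cost is the negative log of a
    smoothed bigram probability estimated from the lexicon words."""
    counts = {(a, b): smoothing for a in alphabet for b in alphabet}
    for word in lexicon:
        for a, b in zip(word, word[1:]):
            if (a, b) in counts:
                counts[(a, b)] += 1.0
    total = sum(counts.values())
    return {pair: -math.log(c / total) for pair, c in counts.items()}


def min_energy_word(unary, pairwise, weight=1.0):
    """Minimize sum of unary costs plus weighted pairwise costs over a chain.

    unary: list of dicts (label -> cost), one per character detection,
           ordered left to right; lower cost means a stronger detection.
    Returns (best_word, minimum_energy).
    """
    labels = list(unary[0])
    best = dict(unary[0])  # best energy of any labelling ending in each label
    back = []              # back-pointers, one dict per later detection
    for node in unary[1:]:
        new_best, pointers = {}, {}
        for cur in labels:
            prev = min(labels, key=lambda p: best[p] + weight * pairwise[(p, cur)])
            new_best[cur] = best[prev] + weight * pairwise[(prev, cur)] + node[cur]
            pointers[cur] = prev
        back.append(pointers)
        best = new_best
    last = min(labels, key=lambda label: best[label])
    energy = best[last]
    word = [last]
    for pointers in reversed(back):
        word.append(pointers[word[-1]])
    return "".join(reversed(word)), energy


# Toy data (hypothetical): the middle detection slightly prefers 'o' over 'a'.
alphabet = "acot"
lexicon = ["cat", "act", "tact"]
unary = [
    {"a": 2.0, "c": 0.2, "o": 2.0, "t": 2.0},
    {"a": 1.0, "c": 2.0, "o": 0.9, "t": 2.0},
    {"a": 2.0, "c": 2.0, "o": 2.0, "t": 0.2},
]
pairwise = bigram_costs(lexicon, alphabet)

bottom_up_only, _ = min_energy_word(unary, pairwise, weight=0.0)
with_prior, _ = min_energy_word(unary, pairwise, weight=1.0)
print(bottom_up_only, with_prior)
```

In this toy setting the bottom-up costs alone select "cot", while adding the lexicon-derived pairwise term shifts the minimum-energy labelling to "cat", which is the kind of correction the top-down cue is meant to provide.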

Topics: Noisy text analytics (59%), Intelligent character recognition (54%), Conditional random field (53%), Three-dimensional face recognition (53%), and Intelligent word recognition (52%)


About the journal

Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN: 1063-6919

Authors (1)

Anand Mishra
- Department of Computer Science & Engineering
