With the rapid increase of multimedia data, textual content in images has become an important source of information for applications such as navigation, image search and retrieval, image understanding, captioning, and machine translation. Scene text localization is the first step towards such applications, and most current methods focus on generating a small set of high-precision detections rather than a large set of detections covering all text patches. In this work we propose a novel hybrid framework for text localization that uses character-level recognition recursively in a feedback mechanism to refine text patches and reduce false positives. We use the popular MSER algorithm at multiple scales as the initial region proposal step, followed by several filtering stages applied recursively to improve precision while maximizing recall. We aim for high recall rather than higher precision because several robust word recognition systems are already available. These systems are mature enough to produce highly accurate results when given as many candidate regions as possible, rather than a small set of highly precise text patches that misses other text regions. The main contribution of this paper is the use of a character recognizer within a novel feedback mechanism to recursively search for text regions in the neighborhood of previously detected text patches. Using three publicly available benchmark datasets (ICDAR2011, MSRA TD-500 and OSTD), we demonstrate the efficacy of the proposed framework for text localization. © 2016 IEEE.
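The feedback idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the neighborhood definition, the acceptance threshold, or the recognizer, so the left/right window neighborhood, the `threshold` value, the `refine` and `neighbors` helpers, and the `toy_char_score` stand-in for a character recognizer are all assumptions made for illustration. The search is written iteratively with a queue rather than with explicit recursion, and the seed boxes stand in for region proposals such as those an MSER stage might produce.

```python
from collections import deque
from typing import Callable, Iterable, List, Set, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def neighbors(box: Box, bounds: Tuple[int, int]) -> List[Box]:
    """Same-sized windows directly left and right of `box` (assumed neighborhood)."""
    x, y, w, h = box
    img_w, _img_h = bounds
    out = []
    for nx in (x - w, x + w):
        # Keep only windows that lie fully inside the image horizontally.
        if nx >= 0 and nx + w <= img_w:
            out.append((nx, y, w, h))
    return out


def refine(
    seeds: Iterable[Box],
    char_score: Callable[[Box], float],
    bounds: Tuple[int, int],
    threshold: float = 0.5,
) -> Set[Box]:
    """Accept boxes the recognizer scores as characters, then search around them."""
    accepted: Set[Box] = set()
    visited: Set[Box] = set()
    queue = deque(seeds)
    while queue:
        box = queue.popleft()
        if box in visited:
            continue
        visited.add(box)
        if char_score(box) < threshold:
            continue  # treated as a false positive and dropped
        accepted.add(box)
        # Feedback step: an accepted patch triggers a search of its neighborhood.
        queue.extend(n for n in neighbors(box, bounds) if n not in visited)
    return accepted


def toy_char_score(box: Box) -> float:
    """Hypothetical recognizer: text occupies x in [20, 60) on the row y == 10."""
    x, y, _w, _h = box
    return 1.0 if y == 10 and 20 <= x < 60 else 0.0


seeds = [(30, 10, 10, 10), (80, 50, 10, 10)]  # one true patch, one false positive
result = refine(seeds, toy_char_score, bounds=(100, 100))
print(sorted(result))
# [(20, 10, 10, 10), (30, 10, 10, 10), (40, 10, 10, 10), (50, 10, 10, 10)]
```

In this toy run the single true seed grows to cover the whole text line, which raises recall, while the spurious seed is rejected by the recognizer score, which removes a false positive.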