OCR-VQA: Visual question answering by reading text in images

Anand Mishra; S. Shekhar; A.K. Singh; A. Chakraborty

doi:10.1109/ICDAR.2019.00156



Published in IEEE Computer Society

2019


Pages: 947 - 952

Abstract

The problem of answering questions about an image is popularly known as visual question answering (VQA for short). It is a well-established problem in computer vision. However, none of the current VQA methods utilize the text often present in the image. These 'texts in images' provide additional useful cues and facilitate better understanding of the visual content. In this paper, we introduce a novel task of visual question answering by reading text in images, i.e., by optical character recognition or OCR. We refer to this problem as OCR-VQA. To facilitate a systematic way of studying this new problem, we introduce a large-scale dataset, namely OCR-VQA-200K. This dataset comprises 207,572 images of book covers and contains more than 1 million question-answer pairs about these images. We judiciously combine well-established techniques from the OCR and VQA domains to present a novel baseline for OCR-VQA-200K. The experimental results and rigorous analysis demonstrate various challenges present in this dataset, leaving ample scope for future research. We are optimistic that this new task, along with the compiled dataset, will open up many exciting research avenues for both the document image analysis and the VQA communities. © 2019 IEEE.

Topics: Optical character recognition (54%) and Question answering (52%)


About the journal

Journal	Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Publisher	IEEE Computer Society
ISSN	1520-5363

Authors (1)

Anand Mishra
- Department of Computer Science & Engineering
