Header menu link for other important links
X
Table extraction from document images using fixed point model
A. Bansal, , S. Dutta Roy
Published in Association for Computing Machinery
2014
Volume: 14
   
Abstract
The paper presents a novel learning-based framework to identify tables from scanned document images. The approach is designed as a structured labeling problem, which learns the layout of the document and labels its various entities as table header, table trailer, table cell and non-table region. We develop features which encode the foreground block characteristics and the contextual information. These features are provided to a fixed point model which learns the inter-relationship between the blocks. The fixed point model attains a contraction mapping and provides a unique label to each block. We compare the results with Condition Random Fields(CRFs). Unlike CRFs, the fixed point model captures the context information in terms of the neighbourhood layout more efficiently. Experiments on the images picked from UW-III (University of Washington) dataset, UNLV dataset and our dataset consisting of document images with multi-column page layout, show the applicability of our algorithm in layout analysis and table detection. Copyright 2014 ACM.
About the journal
JournalData powered by TypesetACM International Conference Proceeding Series
PublisherData powered by TypesetAssociation for Computing Machinery