TY - JOUR
T1 - Text box proposals for handwritten word spotting from documents
AU - Ghosh, Suman
AU - Valveny, Ernest
PY - 2018/6/1
Y1 - 2018/6/1
N2 - © 2018, Springer-Verlag GmbH Germany, part of Springer Nature. In this article, we propose a new approach to segmentation-free word spotting that combines three contributions. Firstly, inspired by the success of bounding box proposal algorithms in object recognition, we propose a scheme to generate a set of word-independent text box proposals. To do so, we generate a set of atomic bounding boxes using simple connected component analysis; these are then combined under a set of spatial constraints to produce the final set of text box proposals. Secondly, an attribute representation based on the Pyramidal Histogram of Characters (PHOC) is encoded in an integral image and used to efficiently evaluate text box proposals for retrieval. Thirdly, we propose an indexing scheme for fast retrieval based on character n-grams. To generate the index, a similar attribute space based on a Pyramidal Histogram of Character N-grams (PHON) is used. All attribute models are learned using linear SVMs over the Fisher Vector representation of the word images, along with the PHOC or PHON labels of the corresponding words. We evaluate the proposed approach on both the query-by-string and query-by-example tasks on standard single- and multi-writer data sets, reporting state-of-the-art results.
AB - © 2018, Springer-Verlag GmbH Germany, part of Springer Nature. In this article, we propose a new approach to segmentation-free word spotting that combines three contributions. Firstly, inspired by the success of bounding box proposal algorithms in object recognition, we propose a scheme to generate a set of word-independent text box proposals. To do so, we generate a set of atomic bounding boxes using simple connected component analysis; these are then combined under a set of spatial constraints to produce the final set of text box proposals. Secondly, an attribute representation based on the Pyramidal Histogram of Characters (PHOC) is encoded in an integral image and used to efficiently evaluate text box proposals for retrieval. Thirdly, we propose an indexing scheme for fast retrieval based on character n-grams. To generate the index, a similar attribute space based on a Pyramidal Histogram of Character N-grams (PHON) is used. All attribute models are learned using linear SVMs over the Fisher Vector representation of the word images, along with the PHOC or PHON labels of the corresponding words. We evaluate the proposed approach on both the query-by-string and query-by-example tasks on standard single- and multi-writer data sets, reporting state-of-the-art results.
KW - Bounding box proposals
KW - Pyramidal Histogram of Characters
KW - Segmentation-free
KW - Word attributes
KW - Word spotting
U2 - 10.1007/s10032-018-0300-7
DO - 10.1007/s10032-018-0300-7
M3 - Article
SN - 1433-2833
VL - 21
SP - 91
EP - 108
JO - International Journal on Document Analysis and Recognition
JF - International Journal on Document Analysis and Recognition
IS - 1-2
ER -