TY - JOUR
T1 - A fast hierarchical method for multi-script and arbitrary oriented scene text extraction
AU - Gomez, Lluis
AU - Karatzas, Dimosthenis
N1 - Funding Information:
This project was supported by the Spanish project TIN2011-24631, the fellowship RYC-2009-05031, and the Catalan government scholarship 2013FI1126.
Publisher Copyright:
© 2016, Springer-Verlag Berlin Heidelberg.
PY - 2016/12/1
Y1 - 2016/12/1
AB - Typography and layout lead to the hierarchical organization of text into words, text lines, and paragraphs. This inherent structure is a key property of text in any script and language, which has nonetheless been minimally leveraged by existing scene text detection methods. This paper addresses the problem of text segmentation in natural scenes from a hierarchical perspective. Contrary to existing methods, we make explicit use of text structure, aiming directly at the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such a hierarchy, introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based on perceptual organization. Results obtained over four standard datasets, covering text in variable orientations and different languages, demonstrate that our algorithm, while being trained on a single mixed dataset, outperforms state-of-the-art methods in unconstrained scenarios.
KW - Detection
KW - Hierarchical grouping
KW - Perceptual organization
KW - Scene text
KW - Segmentation
UR - http://www.scopus.com/inward/record.url?scp=84991363489&partnerID=8YFLogxK
U2 - 10.1007/s10032-016-0274-2
DO - 10.1007/s10032-016-0274-2
M3 - Article
AN - SCOPUS:84991363489
SN - 1433-2833
VL - 19
SP - 335
EP - 349
JO - International Journal on Document Analysis and Recognition
JF - International Journal on Document Analysis and Recognition
IS - 4
ER -