TY - CHAP
T1 - Graphics recognition techniques
AU - Lladós, Josep
AU - Rusiñol, Marçal
PY - 2014/1/1
Y1 - 2014/1/1
N2 - © Springer-Verlag London 2014. All rights are reserved. This chapter describes the most relevant approaches for the analysis of graphical documents. The graphics recognition pipeline can be splitted into three tasks. The low level or lexical task extracts the basic units composing the document. The syntactic level is focused on the structure, i.e., how graphical entities are constructed, and involves the location and classification of the symbols present in the document. The third level is a functional or semantic level, i.e., it models what the graphical symbols do and what they mean in the context where they appear. This chapter covers the lexical level, while the next two chapters are devoted to the syntactic and semantic level, respectively. The main problems reviewed in this chapter are raster-to-vector conversion (vectorization algorithms) and the separation of text and graphics components. The research and industrial communities have provided standard methods achieving reasonable performance levels. Hence, graphics recognition techniques can be considered to be in a mature state from a scientific point of view. Additionally this chapter provides insights on some related problems, namely, the extraction and recognition of dimensions in engineering drawings, and the recognition of hatched and tiled patterns. Both problems are usually associated, even integrated, in the vectorization process.
AB - © Springer-Verlag London 2014. All rights are reserved. This chapter describes the most relevant approaches for the analysis of graphical documents. The graphics recognition pipeline can be splitted into three tasks. The low level or lexical task extracts the basic units composing the document. The syntactic level is focused on the structure, i.e., how graphical entities are constructed, and involves the location and classification of the symbols present in the document. The third level is a functional or semantic level, i.e., it models what the graphical symbols do and what they mean in the context where they appear. This chapter covers the lexical level, while the next two chapters are devoted to the syntactic and semantic level, respectively. The main problems reviewed in this chapter are raster-to-vector conversion (vectorization algorithms) and the separation of text and graphics components. The research and industrial communities have provided standard methods achieving reasonable performance levels. Hence, graphics recognition techniques can be considered to be in a mature state from a scientific point of view. Additionally this chapter provides insights on some related problems, namely, the extraction and recognition of dimensions in engineering drawings, and the recognition of hatched and tiled patterns. Both problems are usually associated, even integrated, in the vectorization process.
KW - Dimension recognition
KW - Graphic-rich documents
KW - Graphics recognition
KW - Polygonal approximation
KW - Raster-to-vector conversion
KW - Text-graphics separation
KW - Texture-based primitive extraction
U2 - 10.1007/978-0-85729-859-1_18
DO - 10.1007/978-0-85729-859-1_18
M3 - Chapter
SN - 9780857298591
SN - 9780857298584
SP - 489
EP - 521
BT - Handbook of Document Image Processing and Recognition
ER -