A graph-based approach for segmenting touching lines in historical handwritten documents

David Fernández-Mota, Josep Lladós, Alicia Fornés

Research output: Contribution to journal › Article › Research › peer-review

25 Citations (Scopus)

Abstract

Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data. © 2014 Springer-Verlag Berlin Heidelberg.
Original language: English
Pages (from-to): 293-312
Journal: International Journal on Document Analysis and Recognition
Volume: 17
Issue number: 3
Publication status: Published - 1 Jan 2014

Keywords

  • Document image processing
  • Handwritten documents
  • Historical document analysis
  • Text line segmentation
