On the influence of word representations for handwritten word spotting in historical documents

Josep Lladós, Marçal Rusiñol, Alicia Fornés, David Fernández, Anjan Dutta

Research output: Contribution to journal, Article, Research, peer-reviewed

49 Citations (Scopus)

Abstract

Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models: sequence alignment using DTW as a baseline reference, a bag-of-visual-words approach as a statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach where words are represented by graphs. The four approaches have been tested on two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally perform better. However, it cannot be neglected that large descriptors are difficult to implement in a retrieval scenario where word spotting requires indexing millions of word images. © 2012 World Scientific Publishing Company.
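To make the DTW baseline named in the abstract concrete, here is a minimal sketch of query-by-example ranking with textbook dynamic time warping. The abstract does not say which column features, local distance, or warping constraints the paper uses, so everything here is an assumption: words are taken to be sequences of per-column feature vectors, the local cost is Euclidean, and the function names `dtw_distance` and `rank_by_dtw` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Unconstrained DTW cost between two sequences of feature vectors.

    Assumption: each element of `a` and `b` is a tuple of floats describing
    one image column; the paper's actual features are not given here.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def rank_by_dtw(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    return sorted(range(len(candidates)),
                  key=lambda k: dtw_distance(query, candidates[k]))
```

Each comparison costs time proportional to the product of the two sequence lengths and needs the full sequences at query time, which is one reason such alignment is typically treated as a baseline rather than an indexable descriptor.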
Original language: English
Article number: 1263002
Journal: International Journal of Pattern Recognition and Artificial Intelligence
Volume: 26
Issue number: 5
DOI
Status: Published - 1 Aug 2012
