TY - JOUR
T1 - The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition
AU - Romero, Verónica
AU - Fornés, Alicia
AU - Serrano, Nicolás
AU - Sánchez, Joan Andreu
AU - Toselli, Alejandro H.
AU - Frinken, Volkmar
AU - Vidal, Enrique
AU - Lladós, Josep
PY - 2013/6/1
Y1 - 2013/6/1
N2 - Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers that devote a lot of time to manually transcribe them. This paper presents a new database, compiled from a marriage license books collection, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium until the beginning of the 20th century. In addition, a study is presented about the capability of state-of-the-art handwritten text recognition systems, when applied to the presented database. Baseline results are reported for reference in future studies. © 2012 Elsevier Ltd.
KW - BLSTM
KW - Handwritten text recognition
KW - Hidden Markov models
KW - Marriage register books
KW - Neural networks
U2 - 10.1016/j.patcog.2012.11.024
DO - 10.1016/j.patcog.2012.11.024
M3 - Article
SN - 0031-3203
VL - 46
SP - 1658
EP - 1669
JO - Pattern Recognition
JF - Pattern Recognition
IS - 6
ER -