Abstract
We present a retrieval system for multipage administrative document images that draws on both textual and visual representations of document pages. Each page is represented by textual or visual information within a bag-of-words framework. We evaluate several fusion strategies that extend a single-page retrieval system to multipage document retrieval. Results are reported on a large dataset of document images sampled from a banking workflow.
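To make the idea of building multipage retrieval on top of a single-page system concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's method: the abstract does not say which fusion strategies, weighting scheme, or similarity measure were used, so the term-frequency vectors, the cosine similarity, and the max and mean fusion rules are assumptions. The names `bow_vector`, `cosine`, `document_score`, and `mean` are hypothetical.

```python
from collections import Counter
import math


def bow_vector(tokens):
    """Term-frequency bag-of-words for one page.

    Tokens may be words (textual) or visual-word ids (visual); assumption.
    """
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(values):
    """Average fusion rule (one of two illustrative rules)."""
    return sum(values) / len(values)


def document_score(query_pages, doc_pages, fuse=max):
    """Fuse single-page similarities into one document-level score.

    For each query page, keep its best-matching page in the document,
    then combine those per-page scores with `fuse` (max or mean here).
    """
    if not query_pages or not doc_pages:
        return 0.0
    best = [max(cosine(q, d) for d in doc_pages) for q in query_pages]
    return fuse(best)


# Toy data: a one-page query against two small documents.
query = [bow_vector("loan application form".split())]
docs = {
    "A": [
        bow_vector("loan application form".split()),
        bow_vector("identity card copy".split()),
    ],
    "B": [bow_vector("account statement".split())],
}

# Rank documents by their fused score, best first.
ranked = sorted(docs, key=lambda k: document_score(query, docs[k]), reverse=True)
```

Passing `fuse=mean` instead of the default `max` swaps the fusion rule without touching the page-level matcher, which is the sense in which document retrieval sits on top of single-page retrieval in this sketch.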
Original language | English |
---|---|
Title of host publication | ICPR 2012 - 21st International Conference on Pattern Recognition |
Pages | 521-524 |
Number of pages | 4 |
Publication status | Published - 2012 |
Publication series
Name | Proceedings - International Conference on Pattern Recognition |
---|---|
ISSN (Print) | 1051-4651 |