TY - CHAP
T1 - Out-of-Vocabulary Challenge Report
AU - Garcia-Bordils, Sergi
AU - Mafla, Andrés
AU - Biten, Ali Furkan
AU - Nuriel, Oren
AU - Aberdam, Aviad
AU - Mazor, Shai
AU - Litman, Ron
AU - Karatzas, Dimosthenis
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - This paper presents the final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of scene text instances unseen at training time. The competition compiles a collection of public scene text datasets comprising 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks: end-to-end and cropped scene text recognition. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be essential to explore in order to develop scene text models that achieve more robust and generalized predictions.
AB - This paper presents the final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of scene text instances unseen at training time. The competition compiles a collection of public scene text datasets comprising 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks: end-to-end and cropped scene text recognition. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be essential to explore in order to develop scene text models that achieve more robust and generalized predictions.
UR - http://www.scopus.com/inward/record.url?scp=85151051573&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/7e24d82d-fc16-3b3c-ae78-b899c7e2f4e9/
U2 - 10.1007/978-3-031-25069-9_24
DO - 10.1007/978-3-031-25069-9_24
M3 - Chapter
AN - SCOPUS:85151051573
SN - 9783031250682
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 359
EP - 375
BT - Computer Vision – ECCV 2022 Workshops, Proceedings
A2 - Karlinsky, Leonid
A2 - Michaeli, Tomer
A2 - Nishino, Ko
PB - Springer Science and Business Media Deutschland GmbH
ER -