Out-of-Vocabulary Challenge Report

Sergi Garcia-Bordils, Andrés Mafla, Ali Furkan Biten*, Oren Nuriel, Aviad Aberdam, Shai Mazor, Ron Litman, Dimosthenis Karatzas

*Corresponding author for this work

Research output: Chapter in BookChapterResearchpeer-review

2 Citations (Scopus)

Abstract

This paper presents final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of unseen scene text instances at training time. The competition compiles a collection of public scene text datasets comprising of 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks, end-to-end and cropped scene text recognition respectively. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential area to be explored in order to develop scene text models that achieve more robust and generalized predictions.

Original languageEnglish
Title of host publicationComputer Vision – ECCV 2022 Workshops, Proceedings
EditorsLeonid Karlinsky, Tomer Michaeli, Ko Nishino
PublisherSpringer Science and Business Media Deutschland GmbH
Pages359-375
Number of pages17
ISBN (Print)9783031250682
DOIs
Publication statusPublished - 2023

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume13804 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Fingerprint

Dive into the research topics of 'Out-of-Vocabulary Challenge Report'. Together they form a unique fingerprint.

Cite this