Description
Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, the scenarios covered by past benchmarks are limited, and their evaluation protocols usually focus on submodules of the structured text extraction pipeline. To address these problems, we set up two tracks for the Structured text extraction from Visually-Rich Document images (SVRD) competition:
Track 1: HUST-CELL aims to evaluate the end-to-end performance of Complex Entity Linking and Labeling.
Track 2: Baidu-FEST focuses on evaluating the end-to-end performance and generalization of Few-shot Structured Text extraction.
Compared to current document benchmarks, our two-track competition benchmark greatly enriches the scenarios and contains more than 50 types of visually-rich document images (mainly from real enterprise applications). In addition, our task settings not only include complex end-to-end entity linking and labeling (Track 1), but also provide zero-shot and few-shot tracks to objectively evaluate the performance and generalization of the submitted schemes. We believe that our competition will attract many researchers in the fields of CV and NLP and bring new thoughts to the field of Document AI. There are four main tasks in this competition, which are detailed in the Tasks tab.
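To illustrate what an end-to-end, entity-level evaluation looks like (as opposed to scoring OCR or detection submodules in isolation), the sketch below computes a simple entity-level F1 over predicted (category, text) pairs. This is a minimal illustration only; the matching rule and field representation are assumptions for clarity, not the official SVRD scoring protocol.

```python
# Minimal sketch of an entity-level F1 metric for structured text extraction.
# NOTE: the exact-match rule over (category, text) pairs is an assumption made
# for illustration; it is not the official SVRD evaluation protocol.
from typing import List, Tuple

Entity = Tuple[str, str]  # (entity category, extracted text)

def entity_f1(predictions: List[Entity], ground_truth: List[Entity]) -> float:
    """Exact-match F1 over (category, text) pairs."""
    remaining = list(ground_truth)
    true_positives = 0
    for pred in predictions:
        if pred in remaining:       # counts only if category and text both match
            true_positives += 1
            remaining.remove(pred)  # each ground-truth entity may be matched once
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: two of three ground-truth entities are recovered exactly.
pred = [("company", "ACME Ltd."), ("date", "2023-01-10"), ("total", "42.00")]
gold = [("company", "ACME Ltd."), ("date", "2023-01-10"), ("total", "42.50")]
print(f"entity F1 = {entity_f1(pred, gold):.3f}")  # 0.667
```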
Date made available: 30 Dec 2022
Date of data production: 10 Jan 2023