Workshop on Document Visual Question Answering (DocVQA 2021)
https://docvqa.org/static/workshop_2021.html
Visual Question Answering (VQA) has become a key task in the vision and language field. It has also become clear recently that many questions of common interest cannot be answered unless the written information in the image can be read and understood in the context provided by the visual information.
Document Visual Question Answering (DocVQA) aims to bring VQA to the Document Image Analysis field, which focuses on understanding written communication in images. DocVQA is proposed as a generic paradigm for purpose-driven document analysis and recognition, where natural language questions drive the information extraction and document understanding processes. This challenges current practice in Document Image Analysis and Recognition, where research has historically focused on generic bottom-up information extraction tasks (character recognition, table extraction, word spotting), largely disconnected from the purpose for which the extracted information is ultimately used.
Invited Speakers
Amanpreet Singh (Facebook AI Research) – “Towards models that can read and reason about scene text”
Brian Price (Adobe Research Labs) – “Understanding Data Visualizations via Question Answering”
Yijuan Lu (Microsoft Azure AI) – “Scene Text-Aware Pre-training for Text-VQA and Text-Caption”
Challenge Session
Winners of the 2021 edition of the DocVQA challenge will present their winning submissions at the workshop.
Participation
The DocVQA 2021 workshop will take place as a virtual event in the afternoon session of the International Conference on Document Analysis and Recognition (ICDAR) on September 6, 2021.
Organizers
Minesh Mathew, IIIT Hyderabad, India
Ruben Perez, Computer Vision Centre, Spain
Dimosthenis Karatzas, Computer Vision Centre, Spain
C.V. Jawahar, IIIT Hyderabad, India
R. Manmatha, Amazon, USA