Call for participation

Dear potential participants of AVSD-DSTC10, 

We held the Audio Visual Scene-aware Dialogue (AVSD) Challenge at DSTC7 and DSTC8 to test the efficacy of answer-generation models based on audio-visual features.

We are holding another AVSD challenge at DSTC10 to tackle the following issue: 

Goal: Answer generation with Evidence timing

We have updated the baseline system for DSTC10; it is now available on GitHub. 

https://github.com/dialogtekgeek/AVSD-DSTC10_Official 

** Registration is required. **  

The registration link is here. Information on gaining access to the baseline is here. 

The baseline system is based on an Audio-Visual Transformer for dialogue response sentence generation. 

    Output: 

  • Answer generation considering dialogue context 
  • Evidence timing detection based on attention weights 

    Evaluation: 

  • Validation data (1,787 dialogues) is evaluated using a single ground-truth answer and evidence timing
  • Sentence similarity: BLEU, METEOR, CIDEr 
  • Timing overlap: Intersection over Union (IoU)  
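For illustration, here is a minimal sketch of the timing-overlap IoU for a predicted and a ground-truth evidence interval, assuming intervals are given as (start, end) pairs in seconds. This is only an example of the metric's definition, not the official evaluation script.

```python
def timing_iou(pred, gt):
    """Intersection over Union of two time intervals, each an (start, end) pair in seconds."""
    # Overlap length (clamped at zero when the intervals are disjoint)
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union length = sum of lengths minus the overlap counted twice
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

# Example: predicted evidence at 10-20 s vs. ground truth at 15-25 s
# overlap = 5 s, union = 15 s, so IoU = 1/3
print(timing_iou((10.0, 20.0), (15.0, 25.0)))
```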

    Official evaluation: 

  • Likert-scale ratings by 5 human judges 
  • Similarity compared with single and multiple ground truths 

     Additional Data:  

  • Evidence timing for Training data (7,659) will be provided soon. 

Note:  

DSTC10 does NOT allow the use of manual descriptions at the test stage.

Challengers may use the manual descriptions only during the training stage.


If you have any questions, please let us know. (Contact information is in the GitHub repository.)

AVSD@DSTC10 Task Organizers: 

Ankit Shah, Shijie Geng, Peng Gao, Anoop Cherian, Tim K. Marks and Chiori Hori
