This page accompanies the SIGIR 2022 paper
"Conversational Question Answering on Heterogeneous Sources".
GitHub link to CONVINSE code
Conversational question answering (ConvQA) tackles sequential information needs where contexts in follow-up questions are left implicit. Current ConvQA systems operate over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This paper addresses the novel issue of jointly tapping into all of these, thereby boosting answer coverage and confidence. We present CONVINSE, an end-to-end pipeline for ConvQA over heterogeneous sources, operating in three stages: i) learning an explicit structured representation of an incoming question and its conversational context, ii) harnessing this frame-like representation to uniformly capture relevant evidence from KB, text, and tables, and iii) running a fusion-in-decoder model to generate the answer. We construct and release the first benchmark, ConvMix, for ConvQA over heterogeneous sources, comprising 3000 real-user conversations with 16000 questions, along with entity annotations, completed question utterances, and question paraphrases. Experiments demonstrate the viability and advantages of our method, compared to state-of-the-art baselines.
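To make the three-stage data flow concrete, here is a minimal toy sketch in Python. It is not the CONVINSE code: every name in it (Frame, Evidence, build_frame, verbalize_triple, verbalize_row, retrieve, generate_answer) is hypothetical, and each stage is replaced by a simple stand-in. The frame is built by a fixed rule instead of being learned, retrieval uses plain token overlap, and the answer step just returns the top-ranked evidence where the paper uses a fusion-in-decoder model. The only point illustrated is that KB facts, text, and table rows can be turned into one common textual form and ranked against the same structured representation.

```python
import re
from dataclasses import dataclass


@dataclass
class Frame:
    """Toy structured representation: conversational context plus current question."""
    context: str
    question: str

    def tokens(self) -> set:
        # Lowercased word tokens of context and question combined.
        return set(re.findall(r"\w+", f"{self.context} {self.question}".lower()))


@dataclass
class Evidence:
    source: str  # "kb", "text", or "table"
    text: str    # verbalized form shared by all sources


def build_frame(history, question):
    # Stage (i) stand-in: use the first turn as context (the paper learns this step).
    return Frame(context=history[0] if history else "", question=question)


def verbalize_triple(subject, predicate, obj):
    # Turn a KB fact into plain text so it can be ranked like any other evidence.
    return Evidence("kb", f"{subject} {predicate} {obj}")


def verbalize_row(row):
    # Turn a table row (column -> value) into plain text.
    return Evidence("table", ", ".join(f"{k} is {v}" for k, v in row.items()))


def retrieve(frame, evidences, k=2):
    # Stage (ii) stand-in: rank all evidence by token overlap with the frame.
    def overlap(ev):
        return len(frame.tokens() & set(re.findall(r"\w+", ev.text.lower())))
    return sorted(evidences, key=overlap, reverse=True)[:k]


def generate_answer(frame, evidences):
    # Stage (iii) placeholder: return the top-ranked evidence text.
    # The paper uses a fusion-in-decoder model here; this is not that model.
    return evidences[0].text if evidences else ""


# Toy evidence pool mixing the three source types.
pool = [
    verbalize_triple("Dune", "director", "Denis Villeneuve"),
    Evidence("text", "Hans Zimmer composed the music for Dune."),
    verbalize_row({"title": "Dune", "year": 2021}),
]
frame = build_frame(["Who directed Dune?"], "Who composed the music?")
top = retrieve(frame, pool)
answer = generate_answer(frame, top)
print(answer)
```

In this toy run the follow-up question never mentions the film; the frame carries it over from the first turn, which is why evidence about the film can still be matched.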
For feedback and clarifications, please contact: Philipp Christmann
(pchristm AT mpi HYPHEN inf DOT mpg DOT de), Rishiraj Saha Roy
(rishiraj AT mpi HYPHEN inf DOT mpg DOT de) or Gerhard Weikum
(weikum AT mpi HYPHEN inf DOT mpg DOT de).
To learn more about our group, please visit https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/question-answering/