They require identifying a simple fact in a single web document and presenting it as the answer; as a result, existing QA systems can’t offer rich explanations the way people do.
In testing, we found that our new BidAF model is stronger than a baseline based on term frequency-inverse document frequency (TF-IDF).
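As a rough illustration of what a TF-IDF baseline does, the sketch below ranks candidate passages by their TF-IDF similarity to the question using scikit-learn. The passages, scores, and names here are illustrative placeholders, not ELI5 data or our actual retrieval code.

```python
# Minimal sketch of a TF-IDF baseline: rank web passages by their
# cosine similarity to the question and keep the best matches.
# The passages below are illustrative placeholders, not ELI5 data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

question = "How do jellyfish function without a brain?"
passages = [
    "Jellyfish have a nerve net that coordinates swimming and feeding.",
    "The brain is the control center of the central nervous system.",
    "Some jellyfish species are bioluminescent and glow in dark water.",
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([question] + passages)

# Similarity of each passage to the question (row 0 of the matrix).
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()
for score, passage in sorted(zip(scores, passages), reverse=True):
    print(f"{score:.3f}  {passage}")
```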
For abstractive modeling, we use a sequence-to-sequence (seq2seq) approach that synthesizes information from various web sources into a paragraph-length answer.
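The sketch below shows the general shape of such a seq2seq model in PyTorch: an encoder reads the question together with retrieved web text, and a decoder generates the answer token by token. The dimensions, vocabulary size, and class names are illustrative assumptions; the actual abstractive models are much larger and trained on the ELI5 data.

```python
# A minimal seq2seq sketch: encode the question plus supporting web
# text, then decode a paragraph-length answer. Sizes are illustrative.
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_tokens, tgt_tokens):
        # Encode the question and retrieved passages into a summary state.
        _, state = self.encoder(self.embed(src_tokens))
        # Decode the answer conditioned on that state.
        dec_out, _ = self.decoder(self.embed(tgt_tokens), state)
        return self.out(dec_out)  # logits over the vocabulary per position

model = Seq2Seq()
src = torch.randint(0, 10000, (2, 50))   # question + retrieved web text
tgt = torch.randint(0, 10000, (2, 30))   # answer tokens
print(model(src, tgt).shape)             # (2, 30, 10000)
```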
Our data set goes further by requiring machines to elaborate with in-depth answers to open-ended questions, such as “How do jellyfish function without a brain?”
Furthermore, our data set provides researchers with hundreds of thousands of examples to advance AI models that can synthesize information from multiple sources and provide explanations to complex questions across a wide range of topics.
This approach forces our model to read multiple sources in order to develop a complete explanation for each question.
QA models for ELI5 mimic what many people do when they’re asked a question: if they don’t know the answer, they’ll likely search the web to learn about the topic, read a few of the results, and then provide the answer. We do this by tackling multiple tasks during training and then applying the resulting model to the standard QA task of reading the question and documents and writing the answer. It turns out that this multitask seq2seq approach outperforms standard language modeling and seq2seq techniques.

To make progress in long-form question answering, researchers need a large, diverse data set of complex how- and why-type questions with paragraph-length answers. Our new long-form QA data set challenges existing algorithms because it requires processing many web documents comprising hundreds of thousands of words, identifying the relevant information in those documents, and writing a long-form response to an often open-ended question.

Standard seq2seq models receive a training signal only from predicting the answer, whereas a language model approach would be trained to predict the question, web source, and answer. To improve performance, we train seq2seq models with multitasking to combine the benefits of language modeling with seq2seq, as sketched in the example below.

Previous work has proposed data sets with some of these components, but not all of them together. We’ve created a large-scale, high-quality data set, together with supporting web documents, as well as two pretrained models. The ELI5 data set and the accompanying baseline models help us make progress toward this goal. We provide both extractive and abstractive models that produce on-topic answers, and we demonstrate the ability of our models to read through relatively large quantities of noisy web information.
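To make the multitask idea concrete, here is a hedged sketch of how the two training signals can be combined: a seq2seq-style loss computed only on answer tokens, plus a language-modeling loss computed over the web source, question, and answer together. The toy model, sequence layout, and mixing weight are illustrative assumptions, not our exact training recipe.

```python
# Sketch of combining the seq2seq and language-modeling signals.
# TinyLM is a toy next-token predictor standing in for the real model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        hidden, _ = self.rnn(self.embed(tokens))
        return self.out(hidden)  # next-token logits at every position

def multitask_loss(model, source, question, answer, lm_weight=0.5):
    # One long sequence: web source, then question, then answer.
    full = torch.cat([source, question, answer], dim=1)
    logits = model(full[:, :-1])
    targets = full[:, 1:]

    # Seq2seq-style signal: score only the answer positions.
    answer_start = source.size(1) + question.size(1) - 1
    seq2seq_loss = F.cross_entropy(
        logits[:, answer_start:].reshape(-1, logits.size(-1)),
        targets[:, answer_start:].reshape(-1),
    )

    # Language-model signal: score every position, so the model is also
    # trained to predict the web source and the question.
    lm_loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
    )
    return seq2seq_loss + lm_weight * lm_loss

model = TinyLM()
source = torch.randint(0, 10000, (2, 60))
question = torch.randint(0, 10000, (2, 12))
answer = torch.randint(0, 10000, (2, 40))
print(multitask_loss(model, source, question, answer))
```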