elBERto: Self-supervised Commonsense Learning for Question Answering

2 minute read

Introduction

What is the name of the elBERTo paper?

elBERto: Self-supervised Commonsense Learning for Question Answering
elBERTo stands for Selfsupervised Bidirectional Encoder Representation Learning of Commonsense
What are the main contributions of the elBERTo paper?
- 5 self supervised tasks to encourage models to perform commonsense reasoning with context
- novel Contrastive Relation Learning (CRL) task to help QA solvers to
- Jigsaw Puzzle (JP) task
- SOTA performance on commonsense QA tasks
What are the claimed advantages of the elBERTo model?
- in domain training corpus
- no extra training time
- improved commonsense question answering
Approaches to commonsense QA fall into two main categories
- Retrieving evidence paths from external knowledge base
- Knowledge Graph propagation based approaches

Method

Context in commonsense multiple choice question answering can consist of complex relations
How are commonsense multiple choice question answering tasks formulated?
- context
- question
- answer choices
- Want to optimize likelihood of ground truth answer under model
elBERTo paper proposes a large number of SSL tasks
elBERTo Contrastive Relation Learning tasks asks a model to distinguish between original and flipped contexts
- flip words in context to antonyms
The elBERTo jigsaw puzzle task aims to predict the correct order out of k ordered sequential contexts
- aims is to prompt model to learn sequential relationships
elBERTo motivation for using Binary Sentence Order Prediction is that contexts for commonsense question answering exhibit causal relationships between consecutive sentences
elBERTo uses Tagme to extract entities for masked entity modeling

Results

What are some examples of commonsense QA datasets?
- WIQA: What if questions
- CosmosQA: human daily life scenarios
- ReClor: from graduate admission exams
elBERTo results: adding pretraining tasks leads to improvement
elBERTo predictions examples:

Conclusions

This paper presents a number of pretraining tasks that improve performance on Commonsense QA tasks. These question answering tasks are very interesting because they require more compositional reasoning compared to other NLP tasks and are less susceptible to shortcut learning. Overall this is a good paper and the pretraining tasks seem well motivated. I think in general it’s a good domain of research to explore how pretraining tasks can be used to encode specific inductive biases into smaller models like the BERT models used in this dataset. I feel this approach will be complementary to the multitask pretraining and finetuning going into larger models like t0 and FLAN. For practical use cases it will often be better to use that smaller model with a specialized pretraining task rather than a larger model even thought the larger model might have good zero shot or few shot performance.

Reference

@article{zhan2022elberto,
  title={elBERto: Self-supervised Commonsense Learning for Question Answering},
  author={Zhan, Xunlin and Li, Yuan and Dong, Xiao and Liang, Xiaodan and Hu, Zhiting and Carin, Lawrence},
  journal={arXiv preprint arXiv:2203.09424},
  year={2022}
}

Twitter Facebook LinkedIn

Ethan Kim

elBERto: Self-supervised Commonsense Learning for Question Answering

Introduction

Method

Results

Conclusions

Reference

You May Also Enjoy

ITERATED DECOMPOSITION: IMPROVING SCIENCE Q&A

Decoder Inference Optimization

1 Year of a Challenging Big-Bench Task

Scattered or Connected? An Optimized Parameter-efficient