elBERto: Self-supervised Commonsense Learning for Question Answering

2 minute read

Introduction

  • What is the name of the elBERto paper?

    elBERto: Self-supervised Commonsense Learning for Question Answering

  • elBERto stands for Self-supervised Bidirectional Encoder Representation Learning of Commonsense

  • What are the main contributions of the elBERto paper?
    • five self-supervised tasks that encourage models to perform commonsense reasoning with context
    • a novel Contrastive Relation Learning (CRL) task to help QA solvers distinguish original contexts from contexts whose relations have been flipped
    • a Jigsaw Puzzle (JP) task that prompts the model to learn sequential relationships
    • SOTA performance on commonsense QA tasks
  • What are the claimed advantages of the elBERto model?
    • requires only the in-domain training corpus
    • no extra training time
    • improved commonsense question answering
  • Approaches to commonsense QA fall into two main categories:
    • retrieving evidence paths from an external knowledge base
    • knowledge graph propagation based approaches

Method

  • Context in commonsense multiple choice question answering can consist of complex relations

  • How are commonsense multiple choice question answering tasks formulated?
    • context
    • question
    • answer choices
    • want to maximize the likelihood of the ground-truth answer under the model (formalized in the sketch below)

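One way to write this objective down (my own notation; the paper may formalize it differently): an encoder scores each (context, question, candidate) triple, a softmax over the candidates gives the answer distribution, and training maximizes the likelihood of the ground-truth answer.

```latex
% Multiple-choice QA objective (notation mine, not taken verbatim from the paper):
% f_\theta scores each (context c, question q, candidate a_i) triple; a softmax
% over the n candidates gives the answer distribution; training maximizes the
% likelihood of the ground-truth answer a^*.
\[
P_\theta(a_i \mid c, q) = \frac{\exp f_\theta(c, q, a_i)}{\sum_{j=1}^{n} \exp f_\theta(c, q, a_j)},
\qquad
\mathcal{L}_{\mathrm{QA}} = -\log P_\theta\!\left(a^{*} \mid c, q\right)
\]
```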

  • The elBERto paper proposes five self-supervised (SSL) pretraining tasks

  • The elBERto Contrastive Relation Learning (CRL) task asks the model to distinguish between original and flipped contexts
    • words in the context are flipped to their antonyms (see the sketch below)

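A rough sketch of how such contrastive pairs could be built (my own illustration, not the authors' code; the ANTONYMS table and function names are made up for the example):

```python
# Sketch of Contrastive Relation Learning (CRL) data construction -- my own
# illustration, not the authors' code. Words in the context are flipped to
# antonyms to build a corrupted copy, and the model is trained to classify
# whether it sees the original or the flipped context.

# Hypothetical antonym table; a real implementation might pull antonyms from a
# lexical resource such as WordNet.
ANTONYMS = {"increase": "decrease", "more": "less", "rises": "falls", "hot": "cold"}


def flip_relations(context: str) -> str:
    """Return a copy of the context with flippable words replaced by antonyms."""
    flipped = []
    for token in context.split():
        core = token.strip(".,!?")
        if core.lower() in ANTONYMS:
            flipped.append(token.replace(core, ANTONYMS[core.lower()]))
        else:
            flipped.append(token)
    return " ".join(flipped)


def make_crl_examples(context: str):
    """Yield (text, label) pairs: label 1 = original context, label 0 = flipped."""
    yield context, 1
    yield flip_relations(context), 0


if __name__ == "__main__":
    ctx = "If rainfall rises, the water level of the river will increase."
    for text, label in make_crl_examples(ctx):
        print(label, text)
```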

  • The elBERto jigsaw puzzle task asks the model to predict the correct order among k orderings of sequential contexts
    • the aim is to prompt the model to learn sequential relationships (a construction sketch follows below)

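A data-construction sketch based on my reading of the task description (the paper's exact formulation, the number of candidates k, and the segment granularity may differ):

```python
# Sketch of Jigsaw Puzzle (JP) example construction -- my reading of the task,
# not the authors' code. The context is split into segments, the segments are
# shuffled, and the model must pick, out of k candidate orderings, the one that
# restores the original order.
import itertools
import random


def make_jigsaw_example(sentences: list[str], k: int = 4):
    """Return (shuffled_text, candidate_orders, label).

    candidate_orders holds k permutations over the shuffled segment positions;
    label is the index of the permutation that restores the original order.
    """
    n = len(sentences)
    shuffled = list(range(n))
    random.shuffle(shuffled)
    shuffled_text = " ".join(sentences[i] for i in shuffled)

    # The correct "unshuffle" permutation: reading the shuffled segments in this
    # order reproduces the original context.
    correct = tuple(shuffled.index(i) for i in range(n))
    distractors = [p for p in itertools.permutations(range(n)) if p != correct]
    candidates = random.sample(distractors, k - 1) + [correct]
    random.shuffle(candidates)
    return shuffled_text, candidates, candidates.index(correct)


if __name__ == "__main__":
    ctx = ["The alarm rang.", "She woke up.", "She made coffee."]
    text, candidates, label = make_jigsaw_example(ctx)
    print(text)
    print(candidates, "correct index:", label)
```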

  • elBERto's motivation for using Binary Sentence Order Prediction (BSOP) is that contexts for commonsense question answering exhibit causal relationships between consecutive sentences (see the sketch below)

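A minimal sketch of how BSOP training pairs could be constructed (my own illustration; the sentence splitting and sampling details are assumptions):

```python
# Sketch of Binary Sentence Order Prediction (BSOP) example construction -- my
# own illustration, not the authors' code. Two consecutive sentences from the
# context are either kept in order or swapped, and the model predicts which.
import random


def make_bsop_example(sentences: list[str], i: int):
    """Return ((sent_a, sent_b), label): label 1 = original order, 0 = swapped."""
    a, b = sentences[i], sentences[i + 1]
    if random.random() < 0.5:
        return (a, b), 1   # original (often causal) order preserved
    return (b, a), 0       # swapped, so the original order is broken


if __name__ == "__main__":
    ctx = ["The dam gates were opened.", "Water flowed into the valley."]
    pair, label = make_bsop_example(ctx, 0)
    print(label, pair)
```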

  • elBERto uses Tagme to extract entities for masked entity modeling (a masking sketch follows below)
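
A small sketch of entity masking over pre-extracted spans (my own illustration; the spans are hard-coded here because I don't want to assume a particular Tagme client API):

```python
# Sketch of masked entity modeling over pre-extracted entity spans -- my own
# illustration, not the authors' code. In elBERto the spans would come from an
# entity linker (Tagme); here they are passed in directly to avoid assuming a
# particular client API.
MASK = "[MASK]"


def mask_entities(text: str, entity_spans: list[tuple[int, int]]):
    """Replace each (start, end) character span with [MASK].

    Returns the masked text plus the entity surface forms, which become the
    prediction targets for the masked entity modeling objective.
    """
    masked, targets, prev = [], [], 0
    for start, end in sorted(entity_spans):
        masked.append(text[prev:start])
        masked.append(MASK)
        targets.append(text[start:end])
        prev = end
    masked.append(text[prev:])
    return "".join(masked), targets


if __name__ == "__main__":
    ctx = "Marie Curie won the Nobel Prize in Physics."
    spans = [(0, 11), (20, 31)]  # character spans of "Marie Curie" and "Nobel Prize"
    masked_text, targets = mask_entities(ctx, spans)
    print(masked_text)
    print(targets)
```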

Results

  • What are some examples of commonsense QA datasets?
    • WIQA: What if questions
    • CosmosQA: human daily life scenarios
    • ReClor: from graduate admission exams
  • elBERto results: adding the pretraining tasks leads to improvements

  • The paper also includes qualitative examples of elBERto's predictions

Conclusions

This paper presents a number of pretraining tasks that improve performance on commonsense QA tasks. These question answering tasks are very interesting because they require more compositional reasoning than other NLP tasks and are less susceptible to shortcut learning. Overall this is a good paper, and the pretraining tasks seem well motivated. I think it is generally a good domain of research to explore how pretraining tasks can be used to encode specific inductive biases into smaller models like the BERT models used in this paper. I feel this approach will be complementary to the multitask pretraining and finetuning going into larger models like T0 and FLAN. For practical use cases it will often be better to use a smaller model with a specialized pretraining task rather than a larger model, even though the larger model might have good zero-shot or few-shot performance.

Reference

@article{zhan2022elberto,
  title={elBERto: Self-supervised Commonsense Learning for Question Answering},
  author={Zhan, Xunlin and Li, Yuan and Dong, Xiao and Liang, Xiaodan and Hu, Zhiting and Carin, Lawrence},
  journal={arXiv preprint arXiv:2203.09424},
  year={2022}
}
