Mar 18 Research Wrapup
Quick-hit thoughts on trending papers:
- PERT: Pre-training BERT with Permuted Language Model
- alternative to MLM pretraining task
- the task is to recover the original order of shuffled tokens (see the sketch after this item)
- note: despite the name, this is NOT the same as the permuted language modeling objective from XLNet
- good results on NLU tasks
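To make the objective concrete, here is a minimal sketch of example construction as I read it: shuffle a fraction of token positions, and label each shuffled slot with the original position of the token now sitting there. The function name and label convention are mine, not the paper's.

```python
import random

def make_permuted_example(tokens, permute_ratio=0.15, rng=None):
    """Sketch of a PERT-style example: shuffle a fraction of positions;
    the target at each shuffled slot is the ORIGINAL position of the
    token now sitting there, so the model learns to recover the order.
    (Simplified reading of the paper; label details may differ.)"""
    rng = rng or random.Random(0)
    n = len(tokens)                         # assumes n >= 2
    k = max(2, int(n * permute_ratio))
    positions = sorted(rng.sample(range(n), k))
    shuffled = positions[:]
    while shuffled == positions:            # ensure a real permutation
        rng.shuffle(shuffled)
    permuted = list(tokens)
    labels = [-100] * n                     # -100 = ignore index in PyTorch losses
    for dst, src in zip(positions, shuffled):
        permuted[dst] = tokens[src]
        labels[dst] = src                   # where this token originally lived
    return permuted, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
permuted, labels = make_permuted_example(tokens, permute_ratio=0.4)
```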
- FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing
- releases a multilingual fairness benchmark
- releases four pretrained language models
- Efficient Language Modeling with Sparse all-MLP
- Facebook's sparse all-MLP architecture gives training-efficiency benefits over Transformers (sparse-routing sketch below)
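I haven't dug into the paper's exact sMLP block (which also routes along the feature dimension), so the sketch below shows only the general mechanism it builds on: switch-style top-1 routing over expert MLPs. Treat it as the generic sparse-MoE idea, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1SparseMLP(nn.Module):
    """Generic top-1 sparse MoE layer: each token is routed to a single
    expert MLP, so per-token compute stays constant as experts are added.
    This is the general idea behind sMLP, not the paper's exact block."""
    def __init__(self, d_model=256, d_hidden=1024, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                    # x: (batch, seq, d_model)
        flat = x.reshape(-1, x.size(-1))     # route each token independently
        probs = F.softmax(self.gate(flat), dim=-1)
        top_p, top_i = probs.max(dim=-1)     # top-1 expert per token
        out = torch.zeros_like(flat)
        for e, expert in enumerate(self.experts):
            mask = top_i == e
            if mask.any():                   # only routed tokens hit expert e
                out[mask] = top_p[mask, None] * expert(flat[mask])
        return out.reshape_as(x)
```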
- Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
- a simple method: average the weights of models fine-tuned with different hyperparameters (uniform-soup sketch below)
- no increase in inference cost
- experiments are mostly on vision and multimodal models
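The uniform soup is simple enough to sketch: element-wise average the state dicts of several checkpoints fine-tuned from the same pretrained initialization. The checkpoint filenames below are hypothetical, and the paper's greedy soup (add a checkpoint only if held-out accuracy improves) is left out.

```python
import torch

def uniform_soup(checkpoint_paths):
    """Uniform model soup: average the weights of several models fine-tuned
    from the same init. The result is one ordinary model, so inference cost
    is unchanged. Assumes every state-dict entry is a tensor."""
    soup = None
    for path in checkpoint_paths:
        state = torch.load(path, map_location="cpu")
        if soup is None:
            soup = {k: v.clone().float() for k, v in state.items()}
        else:
            for k in soup:
                soup[k] += state[k].float()
    return {k: v / len(checkpoint_paths) for k, v in soup.items()}

# Hypothetical checkpoints from runs with different hyperparameters:
# model.load_state_dict(uniform_soup(["ft_lr1e-5.pt", "ft_lr3e-5.pt", "ft_wd0.1.pt"]))
```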
- Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models
- a comprehensive survey of parameter-efficient methods
- also provides some theoretical backing
- this is a research area I am very interested in
- Deep Continuous Prompt for Contrastive Learning of Sentence Embeddings
- I cover this paper in detail in a separate post
- a very interesting parameter-efficient method, though the paper has a number of issues
- Does Corpus Quality Really Matter for Low-Resource Languages?
- an empirical study showing that NLU performance for low-resource languages is constrained by the quantity rather than the quality of text corpora
- GrIPS: Gradient-free, Edit-based Instruction Search for Prompting Large Language Models
- I cover this paper in more detail in a separate post
- they apply a simple gradient-free, edit-based search (not RL) to improve instructional prompts (sketch below)
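Here is a minimal sketch of the search loop as I understand it: propose edits to the instruction, score each candidate on a small dev set, and greedily keep the best. `score` is a hypothetical callable mapping an instruction string to dev-set accuracy, and I've collapsed the paper's phrase-level delete/swap/paraphrase/add operations into word-level delete and swap.

```python
import random

def grips_style_search(instruction, score, n_iters=10, n_candidates=4, rng=None):
    """Sketch of a GrIPS-style gradient-free search: propose edits to the
    instruction, keep whichever candidate scores best on a small dev set.
    Real GrIPS edits at the phrase level (with paraphrase and add moves);
    this simplification uses only word-level delete and swap."""
    rng = rng or random.Random(0)
    best, best_score = instruction, score(instruction)
    for _ in range(n_iters):
        for _ in range(n_candidates):
            words = best.split()
            if len(words) < 3:
                break
            cand = words[:]
            if rng.random() < 0.5:                     # delete a random word
                del cand[rng.randrange(len(cand))]
            else:                                      # swap two random words
                i, j = rng.sample(range(len(cand)), 2)
                cand[i], cand[j] = cand[j], cand[i]
            cand = " ".join(cand)
            s = score(cand)
            if s > best_score:                         # greedy hill climbing
                best, best_score = cand, s
    return best, best_score
```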