DINO - Emerging Properties in Self-Supervised Vision Transformers May 24, 2021 Unsupervised visual feature learning using knowledge distillation and transformers
DistilBERT - A distilled version of BERT October 2, 2019 Knowledge distillation compresses BERT into a smaller, faster model that retains almost all of its performance