DINO - Emerging Properties in Self-Supervised Vision Transformers May 24, 2021 • 9 min read Unsupervised visual feature learning using knowledge distillation and transformers
DistilBERT - A distilled version of BERT October 2, 2019 • 4 min read Knowledge distillation compresses BERT: smaller and faster, while retaining almost all of its performance