LoRA - Low-Rank Adaptation of Large Language Models
June 17, 2021
Fine-tuning large language models via trainable rank decomposition matrices
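The summary above names LoRA's core idea: instead of updating a pretrained weight matrix directly, train a low-rank decomposition added on top of the frozen weight. A minimal NumPy sketch of that update follows; the dimensions, variable names, and `lora_forward` helper are illustrative, not from the paper's released code.

```python
import numpy as np

def lora_forward(x, W0, A, B, alpha=16, r=4):
    """Forward pass with a LoRA update: the frozen pretrained weight W0
    is augmented by the low-rank product B @ A, scaled by alpha / r.
    Only A and B are trainable; W0 stays fixed."""
    return x @ (W0 + (alpha / r) * (B @ A)).T

d_out, d_in, r = 8, 16, 4
rng = np.random.default_rng(0)
W0 = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable, small Gaussian init
B = np.zeros((d_out, r))                 # trainable, zero init

x = rng.normal(size=(1, d_in))
# With B initialized to zero, the adapted layer exactly matches
# the pretrained layer at the start of fine-tuning.
assert np.allclose(lora_forward(x, W0, A, B), x @ W0.T)
```

The trainable parameter count is r·(d_in + d_out) rather than d_in·d_out, which is where the adaptation's efficiency comes from when r is much smaller than the layer dimensions.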