LoRA: Low-Rank Adaptation of Large Language Models
Fine-tuning large language models via trainable rank decomposition matrices
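As a quick illustration of the idea, here is a minimal PyTorch sketch of a LoRA-style linear layer (the class name `LoRALinear` and the `rank`/`alpha` parameters are illustrative, not taken from the paper's released code): the pretrained weight is frozen, and only the low-rank factors A and B are trained, so the output becomes Wx + (alpha/r)·BAx.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of a LoRA-style linear layer: frozen base weight plus a
    trainable low-rank update (illustrative, not the official code)."""

    def __init__(self, in_features: int, out_features: int,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Pretrained weight: kept frozen during fine-tuning.
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False
        # Trainable rank decomposition: delta_W = B @ A,
        # with A of shape (r, in) and B of shape (out, r).
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        # B starts at zero, so the layer initially matches the frozen model.
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + (alpha / r) * B A x
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


# Usage: only the low-rank factors receive gradients.
layer = LoRALinear(768, 768, rank=8)
y = layer(torch.randn(2, 768))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))
```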
Posts tagged with #paper-notes
Unsupervised visual feature learning using knowledge distillation and transformers
Contrastive learning for unified vision-language representations in a shared embedding space
Google shows how treating image patches as tokens can revolutionize computer vision
Knowledge distillation compresses BERT: smaller, faster, and retaining almost all of its performance
Unlocking the true potential of BERT through rigorous optimization and strategic training choices
Pre-training bidirectional representations by jointly conditioning on both left and right context
Semi-supervised learning through generative pre-training on unlabeled text and task-specific fine-tuning
Introducing channel attention to improve performance on image classification tasks
Demystifying the Transformer architecture: explaining the Encoder, Decoder, and Attention mechanisms block by block, with a PyTorch implementation
Efficient convolutional neural networks for mobile vision applications
Connecting each layer to every other layer to maximize information flow and efficiency