LoRA: Low-Rank Adaptation of Large Language Models
Fine-tuning large language models via trainable rank decomposition matrices
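As a quick illustration of the idea, here is a minimal PyTorch sketch of a LoRA-style linear layer (the class name `LoRALinear` and the `rank`/`alpha` parameters are illustrative, not taken from the paper's released code): the pretrained weight is frozen, and only the low-rank factors A and B are trained, so the output becomes Wx + (alpha/r)·BAx.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of a LoRA-style linear layer: frozen base weight plus a
    trainable low-rank update (illustrative, not the official code)."""

    def __init__(self, in_features: int, out_features: int,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Pretrained weight: kept frozen during fine-tuning.
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False
        # Trainable rank decomposition: delta_W = B @ A,
        # with A of shape (r, in) and B of shape (out, r).
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        # B starts at zero, so the layer initially matches the frozen model.
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + (alpha / r) * B A x
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


# Usage: only the low-rank factors receive gradients.
layer = LoRALinear(768, 768, rank=8)
y = layer(torch.randn(2, 768))
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))
```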
Posts tagged with #paper-notes
Unsupervised visual feature learning using knowledge distillation and transformers
Contrastive learning for unified vision-language representations in a shared embedding space
Google shows how treating image patches as tokens can revolutionize computer vision
Knowledge distillation compresses BERT: smaller, faster, and retaining almost all of its performance
Unlocking the true potential of BERT through rigorous optimization and strategic training choices
Pre-training bidirectional representations by jointly conditioning on both left and right context
Semi-supervised learning through generative pre-training on unlabeled text and task-specific fine-tuning
Introducing channel attention to improve performance on image classification tasks
Demystifying the Transformer architecture: explaining the Encoder, Decoder, and Attention mechanisms block by block, with a PyTorch implementation
Efficient convolutional neural networks for mobile vision applications
Connecting each layer to every other layer to maximize information flow and efficiency