Speeding up Attention Layers
September 11, 2024
Multi-head, Multi-Query, and Grouped-Query Attention layers clearly explained, and how caching works in attention layers.
Exploring AI and machine learning — one paper, tool, and technique at a time.
May 1, 2024
TF-IDF and BM25 are two of the most widely used algorithms in Information Retrieval. In this post we explain how they work.
December 21, 2023
Dive into the intuition behind Precision, Recall, and F1 Score. Understand how these metrics balance quality and quantity, from binary problems to object detection.
October 10, 2022
Configure local file serving and JSON imports to handle thousands of files in seconds: a production-ready approach.
June 17, 2021
Fine-tuning large language models via trainable rank decomposition matrices
May 24, 2021
Unsupervised visual feature learning using knowledge distillation and transformers
February 26, 2021
Contrastive learning for unified vision-language representations in a shared embedding space
October 22, 2020
Google shows how treating image patches as tokens can revolutionize computer vision
October 2, 2019
Knowledge distillation compresses BERT: smaller and faster, while retaining almost all of its performance
July 26, 2019
Unlocking the true potential of BERT through rigorous optimization and strategic training choices
October 11, 2018
Pre-training bidirectional representations by jointly conditioning on both left and right context
October 10, 2018
Why do architectures use 3x3 filters? It is because of something called Receptive Fields.
June 11, 2018
Semi-supervised learning through generative pre-training on unlabeled text and task-specific fine-tuning
September 17, 2017
Introducing channel attention to improve the performance of image classification tasks
June 12, 2017
Demystifying the Transformer architecture, explaining the Encoder, Decoder, and Attention mechanisms block by block with PyTorch implementation
April 17, 2017
Efficient convolutional neural networks for mobile vision applications
August 25, 2016
Connecting each layer to every other layer to maximize information flow and efficiency
August 25, 2016
Same results as standard convolutions at a fraction of the computational cost. Explore the tricks behind MobileNet and efficient CNNs.