Vision Transformer - An Image is Worth 16x16 Words

October 22, 2020

Google researchers show that treating 16x16 image patches as tokens lets a standard Transformer rival state-of-the-art convolutional networks on image classification