Vision Transformer - An Image is Worth 16x16 Words

October 22, 2020 · 6 min read

Google shows how treating image patches as tokens lets a standard Transformer rival convolutional networks on image classification