DINO - Emerging Properties in Self-Supervised Vision Transformers
Unsupervised visual feature learning using knowledge distillation and transformers
Posts tagged with #computer-vision
Contrastive learning for unified vision-language representations in a shared embedding space
Google shows how treating image patches as tokens can revolutionize computer vision
Why do CNN architectures favor 3x3 filters? The answer lies in receptive fields.
Introducing channel attention to improve image classification performance
Efficient convolutional neural networks for mobile vision applications
Comparable accuracy to standard convolutions at a fraction of the computational cost. Explore the tricks behind MobileNet and efficient CNNs.
Connecting each layer to every other layer to maximize information flow and efficiency