-
Embedding Space Interpolation Beyond Mini-Batch, Beyond Pairs and Beyond Examples
Authors:
Shashanka Venkataramanan,
Ewa Kijak,
Laurent Amsaleg,
Yannis Avrithis
Abstract:
Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Its extensions mostly focus on the definition of interpolation and the space (input or feature) where it takes place, while the augmentation process itself is less studied. In most methods, the number of generated examples is limited to the mini-batch size and the number of examples being interpolated is limited to two (pairs), in the input space.
We make progress in this direction by introducing MultiMix, which generates an arbitrarily large number of interpolated examples beyond the mini-batch size and interpolates the entire mini-batch in the embedding space. Effectively, we sample on the entire convex hull of the mini-batch rather than along linear segments between pairs of examples.
On sequence data, we further extend to Dense MultiMix. We densely interpolate features and target labels at each spatial location and also apply the loss densely. To mitigate the lack of dense labels, we inherit labels from examples and weight interpolation factors by attention as a measure of confidence.
Overall, we increase the number of loss terms per mini-batch by orders of magnitude at little additional cost. This is only possible because of interpolating in the embedding space. We empirically show that our solutions yield significant improvement over state-of-the-art mixup methods on four different benchmarks, despite interpolation being only linear. By analyzing the embedding space, we show that the classes are more tightly clustered and uniformly spread over the embedding space, thereby explaining the improved behavior.
Submitted 9 November, 2023;
originally announced November 2023.
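
The core idea in the abstract above, sampling on the convex hull of a mini-batch of embeddings rather than along segments between pairs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the symmetric Dirichlet prior over interpolation weights, and the parameter `alpha` are assumptions made for illustration only.

```python
import random


def sample_simplex(m, alpha, rng):
    """Draw one weight vector on the (m-1)-simplex.

    Assumption: a symmetric Dirichlet(alpha), obtained by normalizing
    independent Gamma(alpha, 1) draws. The abstract does not specify
    the sampling distribution.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(m)]
    total = sum(draws)
    return [d / total for d in draws]


def multimix_sketch(embeddings, labels, n, alpha=1.0, seed=0):
    """Return n convex combinations of m embeddings and their one-hot labels.

    embeddings: list of m vectors (lists of floats), one per example.
    labels:     list of m one-hot vectors.
    n:          number of mixed examples to generate; may exceed m.
    """
    rng = random.Random(seed)
    m = len(embeddings)
    dim = len(embeddings[0])
    num_classes = len(labels[0])
    mixed_z, mixed_y = [], []
    for _ in range(n):
        lam = sample_simplex(m, alpha, rng)  # one weight per example in the batch
        z = [sum(lam[i] * embeddings[i][k] for i in range(m)) for k in range(dim)]
        y = [sum(lam[i] * labels[i][c] for i in range(m)) for c in range(num_classes)]
        mixed_z.append(z)
        mixed_y.append(y)
    return mixed_z, mixed_y


if __name__ == "__main__":
    emb = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    lab = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    zs, ys = multimix_sketch(emb, lab, n=5)
    for z, y in zip(zs, ys):
        print(z, y)
```

Because every weight vector is non-negative and sums to one, each mixed embedding lies inside the convex hull of the mini-batch and each mixed label is a valid probability vector. Mixing happens on embeddings rather than inputs, so generating many more mixed examples than the batch size only adds cheap weighted sums.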
-
Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video
Authors:
Shashanka Venkataramanan,
Mamshad Nayeem Rizve,
João Carreira,
Yuki M. Asano,
Yannis Avrithis
Abstract:
Self-supervised learning has unlocked the potential of scaling up pretraining to billions of images, since annotation is unnecessary. But are we making the best use of data? How much more economical can we be? In this work, we attempt to answer this question by making two contributions. First, we investigate first-person videos and introduce a "Walking Tours" dataset. These videos are high-resolution, hours-long, captured in a single uninterrupted take, depicting a large number of objects and actions with natural scene transitions. They are unlabeled and uncurated, thus realistic for self-supervision and comparable with human learning.
Second, we introduce a novel self-supervised image pretraining method tailored for learning from continuous videos. Existing methods typically adapt image-based pretraining approaches to incorporate more frames. Instead, we advocate a "tracking to learn to recognize" approach. Our method, called DoRA, leads to attention maps that Discover and tRAck objects over time in an end-to-end manner, using transformer cross-attention. We derive multiple views from the tracks and use them in a classical self-supervised distillation loss. Using our novel approach, a single Walking Tours video remarkably becomes a strong competitor to ImageNet for several image and video downstream tasks.
Submitted 23 May, 2024; v1 submitted 12 October, 2023;
originally announced October 2023.
-
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Authors:
Shashanka Venkataramanan,
Amir Ghodrati,
Yuki M. Asano,
Fatih Porikli,
Amirhossein Habibian
Abstract:
This work aims to improve the efficiency of vision transformers (ViT). While ViTs use computationally expensive self-attention operations in every layer, we identify that these operations are highly correlated across layers -- a key redundancy that causes unnecessary computations. Based on this observation, we propose SkipAt, a method to reuse self-attention computation from preceding layers to approximate attention at one or more subsequent layers. To ensure that reusing self-attention blocks across layers does not degrade the performance, we introduce a simple parametric function, which outperforms the baseline transformer's performance while running computationally faster. We show the effectiveness of our method in image classification and self-supervised learning on ImageNet-1K, semantic segmentation on ADE20K, image denoising on SIDD, and video denoising on DAVIS. We achieve improved throughput at the same-or-higher accuracy levels in all these tasks.
Submitted 17 January, 2023; v1 submitted 5 January, 2023;
originally announced January 2023.
-
Teach me how to Interpolate a Myriad of Embeddings
Authors:
Shashanka Venkataramanan,
Ewa Kijak,
Laurent Amsaleg,
Yannis Avrithis
Abstract:
Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Yet, its extensions focus on the definition of interpolation and the space where it takes place, while the augmentation itself is less studied: For a mini-batch of size $m$, most methods interpolate between $m$ pairs with a single scalar interpolation factor $λ$.
In this work, we make progress in this direction by introducing MultiMix, which interpolates an arbitrary number $n$ of tuples, each of length $m$, with one vector $λ$ per tuple. On sequence data, we further extend to dense interpolation and loss computation over all spatial positions. Overall, we increase the number of tuples per mini-batch by orders of magnitude at little additional cost. This is possible by interpolating at the very last layer before the classifier. Finally, to address inconsistencies due to linear target interpolation, we introduce a self-distillation approach to generate and interpolate synthetic targets.
We empirically show that our contributions result in significant improvement over state-of-the-art mixup methods on four benchmarks. By analyzing the embedding space, we observe that the classes are more tightly clustered and uniformly spread over the embedding space, thereby explaining the improved behavior.
Submitted 29 June, 2022;
originally announced June 2022.
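
For contrast, the baseline that the abstract above describes, interpolating $m$ pairs in a mini-batch of size $m$ with a single scalar factor $λ$, might look like the following sketch. The function name, the Beta(`alpha`, `alpha`) prior on $λ$, and pairing by a random permutation are assumptions chosen for illustration; the abstract does not state them.

```python
import random


def pairwise_mixup(inputs, labels, alpha=1.0, seed=0):
    """Mix each example with a randomly chosen partner using one scalar lambda.

    inputs: list of m vectors (lists of floats).
    labels: list of m one-hot vectors.
    Returns (mixed_inputs, mixed_labels, lam).
    """
    rng = random.Random(seed)
    m = len(inputs)
    # Assumption: one scalar interpolation factor shared by the whole batch,
    # drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)
    # Assumption: partners come from a random permutation of the batch.
    perm = list(range(m))
    rng.shuffle(perm)
    mixed_x = [
        [lam * a + (1.0 - lam) * b for a, b in zip(inputs[i], inputs[perm[i]])]
        for i in range(m)
    ]
    mixed_y = [
        [lam * a + (1.0 - lam) * b for a, b in zip(labels[i], labels[perm[i]])]
        for i in range(m)
    ]
    return mixed_x, mixed_y, lam


if __name__ == "__main__":
    x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    y = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    mixed_x, mixed_y, lam = pairwise_mixup(x, y)
    print(lam, mixed_x, mixed_y)
```

This baseline yields exactly $m$ mixed examples, each on a line segment between two originals. A method that draws $n$ weight vectors of length $m$ is not bound by either limit.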
-
It Takes Two to Tango: Mixup for Deep Metric Learning
Authors:
Shashanka Venkataramanan,
Bill Psomas,
Ewa Kijak,
Laurent Amsaleg,
Konstantinos Karantzalos,
Yannis Avrithis
Abstract:
Metric learning involves learning a discriminative representation such that embeddings of similar classes are encouraged to be close, while embeddings of dissimilar classes are pushed far apart. State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies. On the one hand, metric learning losses consider two or more examples at a time. On the other hand, modern data augmentation methods for classification consider two or more examples at a time. The combination of the two ideas is under-studied.
In this work, we aim to bridge this gap and improve representations using mixup, which is a powerful data augmentation approach interpolating two or more examples and corresponding target labels at a time. This task is challenging because, unlike classification, the loss functions used in metric learning are not additive over examples, so the idea of interpolating target labels is not straightforward. To the best of our knowledge, we are the first to investigate mixing both examples and target labels for deep metric learning. We develop a generalized formulation that encompasses existing metric learning loss functions and modify it to accommodate mixup, introducing Metric Mix, or Metrix. We also introduce a new metric, utilization, to demonstrate that by mixing examples during training, we are exploring areas of the embedding space beyond the training classes, thereby improving representations. To validate the effect of improved representations, we show that mixing inputs, intermediate representations or embeddings along with target labels significantly outperforms state-of-the-art metric learning methods on four benchmark deep metric learning datasets.
Submitted 28 February, 2022; v1 submitted 9 June, 2021;
originally announced June 2021.
-
AlignMixup: Improving Representations By Interpolating Aligned Features
Authors:
Shashanka Venkataramanan,
Ewa Kijak,
Laurent Amsaleg,
Yannis Avrithis
Abstract:
Mixup is a powerful data augmentation method that interpolates between two or more examples in the input or feature space and between the corresponding target labels. Many recent mixup methods focus on cutting and pasting two or more objects into one image, which is more about efficient processing than interpolation. However, how to best interpolate images is not well defined. In this sense, mixup has been connected to autoencoders, because often autoencoders "interpolate well", for instance generating an image that continuously deforms into another.
In this work, we revisit mixup from the interpolation perspective and introduce AlignMix, where we geometrically align two images in the feature space. The correspondences allow us to interpolate between two sets of features, while keeping the locations of one set. Interestingly, this gives rise to a situation where mixup retains mostly the geometry or pose of one image and the texture of the other, connecting it to style transfer. More than that, we show that an autoencoder can still improve representation learning under mixup, without the classifier ever seeing decoded images. AlignMix outperforms state-of-the-art mixup methods on five different benchmarks.
Submitted 25 March, 2022; v1 submitted 29 March, 2021;
originally announced March 2021.
-
Attention Guided Anomaly Localization in Images
Authors:
Shashanka Venkataramanan,
Kuan-Chuan Peng,
Rajat Vikram Singh,
Abhijit Mahalanobis
Abstract:
Anomaly localization is an important problem in computer vision which involves localizing anomalous regions within images with applications in industrial inspection, surveillance, and medical imaging. This task is challenging due to the small sample size and pixel coverage of the anomaly in real-world scenarios. Most prior works need to use anomalous training images to compute a class-specific threshold to localize anomalies. Without the need of anomalous training images, we propose Convolutional Adversarial Variational autoencoder with Guided Attention (CAVGA), which localizes the anomaly with a convolutional latent variable to preserve the spatial information. In the unsupervised setting, we propose an attention expansion loss where we encourage CAVGA to focus on all normal regions in the image. Furthermore, in the weakly-supervised setting we propose a complementary guided attention loss, where we encourage the attention map to focus on all normal regions while minimizing the attention map corresponding to anomalous regions in the image. CAVGA outperforms the state-of-the-art (SOTA) anomaly localization methods on MVTec Anomaly Detection (MVTAD), modified ShanghaiTech Campus (mSTC) and Large-scale Attention based Glaucoma (LAG) datasets in the unsupervised setting and when using only 2% anomalous images in the weakly-supervised setting. CAVGA also outperforms SOTA anomaly detection methods on the MNIST, CIFAR-10, Fashion-MNIST, MVTAD, mSTC and LAG datasets.
Submitted 16 July, 2020; v1 submitted 19 November, 2019;
originally announced November 2019.
-
Recent advances in MXenes: from fundamentals to applications
Authors:
Mohammad Khazaei,
Avanish Mishra,
Natarajan S. Venkataramanan,
Abhishek K. Singh,
Seiji Yunoki
Abstract:
The family of MAX phases and their derivative MXenes are continuously growing in terms of both crystalline and composition varieties. In the last couple of years, several breakthroughs have been achieved that boosted the synthesis of novel MAX phases with ordered double transition metals and, consequently, the synthesis of novel MXenes with a higher chemical diversity and structural complexity, rarely seen in other families of two-dimensional (2D) materials. Considering the various elemental composition possibilities, surface functional tunability, various magnetic orders, and large spin-orbit coupling, MXenes can truly be considered as multifunctional materials that can be used to realize highly correlated phenomena. In addition, owing to their large surface area, hydrophilicity, adsorption ability, and high surface reactivity, MXenes have attracted attention for many applications, e.g., catalysts, ion batteries, gas storage media, and sensors. Given the fast progress of MXene-based science and technology, it is timely to update our current knowledge on various properties and possible applications. Since many theoretical predictions remain to be experimentally proven, here we mainly emphasize the physics and chemistry that can be observed in MXenes and discuss how these properties can be tuned or used for different applications.
Submitted 18 January, 2019;
originally announced January 2019.
-
Density functional theory study on the dihydrogen bond cooperativity in the growth behavior of dimethyl sulfoxide clusters
Authors:
Natarajan Sathiyamoorthy Venkataramanan,
Ambigapathy Suvitha,
Yoshiyuki Kawazoe
Abstract:
We have carried out a density functional theory study on the structures of DMSO clusters and analysed their structure and stability using molecular electrostatic potential and the quantum theory of atoms-in-molecules (QTAIM). The ground-state geometries of the DMSO clusters prefer an ouroboros shape. Pairwise interaction energy calculations show an interaction between the methyl groups of adjacent DMSO molecules, while a destabilization is created by the methyl groups that are away from each other. Molecular electrostatic potential analysis shows the existence of a hole on the odd-numbered clusters, which helps in their highly directional growth. QTAIM analysis shows the existence of two types of intermolecular hydrogen bonds: S=O···H-C hydrogen bonds and methyl C-H···H-C dihydrogen bonds. The computed electron density and Laplacian values were all positive for the intermolecular bonds, supporting the existence of noncovalent interactions. The computed ellipticities of the dihydrogen bonds have values > 2, which confirms electron delocalization, mainly due to the hydrogen-hydrogen interactions of methyl groups. A plot of total hydrogen bonding energy vs the observed total local electron density shows linearity with a correlation coefficient near unity, which indicates the cooperative effects of intermolecular dihydrogen H···H bonds.
Submitted 10 October, 2017;
originally announced October 2017.
-
Functionalized Nanofullerenes for Hydrogen Storage: A Theoretical Perspective
Authors:
N. S. Venkataramanan,
A. Suvitha,
H. Mizuseki,
Y. Kawazoe
Abstract:
The increase in threats from global warming due to the consumption of fossil fuels requires our planet to adopt new strategies to harness the inexhaustible sources of energy. Hydrogen is an energy carrier which holds tremendous promise as a new renewable and clean energy option. Hydrogen is a convenient, safe, versatile fuel source that can be easily converted to a desired form of energy without releasing harmful emissions. However, no material has yet been found to satisfy the desired goals, and hence there is a hunt for new materials that can store hydrogen reversibly at ambient conditions. In this chapter, we discuss and compare various nanofullerene materials proposed theoretically as storage media for hydrogen. Doping with transition elements leads to clustering, which reduces the gravimetric density of hydrogen, while alkali and alkaline-earth metals doped on nanocage materials, such as carborides, boronitride, and boron cages, are stabilized by charge transfer from the dopant to the nanocage. Further, the alkali or alkaline-earth elements exist in a charged state, which is found to be responsible for the higher uptake of hydrogen through dipole-dipole and charge-induced dipole interactions. The binding energies of hydrogen on these systems were found to be in the range of 0.1 eV to 0.2 eV, which is ideal for practical applications in a reversible system.
Submitted 22 June, 2011;
originally announced June 2011.
-
DFT Perspective of Hydrogen Storage on Porous Materials
Authors:
N. S. Venkataramanan,
Y. Kawazoe
Abstract:
In this chapter, the physisorption of hydrogen molecules in porous materials as possible hydrogen storage systems has been reviewed. Owing to the weak interaction between H2 molecules and the adsorbent, high storage capacities are typically reached only at cryogenic temperature. Different classes of porous materials possessing different structures and compositions have been designed for hydrogen storage applications using computational methods, especially with the aid of DFT methods. The adsorption energies for hydrogen in different porous materials have been increased by doping with lightweight alkali and alkaline-earth metals. Ab initio molecular dynamics simulations have been carried out to assess the stability of the newly functionalized materials. GCMC methods have been employed to determine the gravimetric and volumetric uptake percentages of the newly functionalized materials. Therefore, the combined approach provides a better understanding for designing new materials that operate near room temperature for reversible hydrogen storage applications.
Submitted 4 February, 2011;
originally announced February 2011.
-
Chemical engineering of adamantane by lithium functionalization: A first-principles density functional theory study
Authors:
Ahmad Ranjbar,
Mohammad Khazaei,
Natarajan Sathiyamoorthy Venkataramanan,
Hoonkyung Lee,
Yoshiyuki Kawazoe
Abstract:
Using first-principles density functional theory, we investigated the hydrogen storage capacity of Li-functionalized adamantane. We showed that if one of the acidic hydrogen atoms of adamantane is replaced by Li/Li+, the resulting complex is activated and ready to adsorb hydrogen molecules at a high gravimetric weight percent of around 7.0%. Due to polarization of hydrogen molecules under the induced electric field generated by positively charged Li/Li+, they are adsorbed on ADM.Li/Li+ complexes with an average binding energy of about -0.15 eV/H2, desirable for hydrogen storage applications. We also examined the possibility of replacing a larger number of acidic hydrogen atoms of adamantane by Li/Li+ and the possibility of aggregation of the formed complexes in experiments. The stabilities of the proposed structures were investigated by calculating vibrational spectra and performing MD simulations.
Submitted 31 January, 2011;
originally announced January 2011.
-
First-principles study of hydrogen storage over Ni and Rh doped BN sheets
Authors:
Natarajan Sathiyamoorthy Venkataramanan,
Mohammad Khazaei,
Ryoji Sahara,
Hiroshi Mizuseki,
Yoshiyuki Kawazoe
Abstract:
Absorption of hydrogen molecules on nickel- and rhodium-doped hexagonal boron nitride (BN) sheets is investigated using a first-principles method. The most stable site for the Ni atom is on top of a nitrogen atom, while the Rh atom prefers a hollow site over the hexagonal BN sheet. The first hydrogen molecule is absorbed dissociatively over the Rh atom, and molecularly on the Ni-doped BN sheet. Both Ni and Rh atoms are capable of absorbing up to three hydrogen molecules chemically, and the distance from the metal atom to the BN sheet increases with the number of hydrogen molecules. Finally, our calculations offer an explanation for the nature of bonding between the metal atom and the hydrogen molecules, which is due to the hybridization of the metal d orbital with the hydrogen s orbital. These results can be useful for understanding the nature of the interaction between the doped metal and the BN sheet, and their interaction with the hydrogen molecules.
Submitted 10 December, 2008;
originally announced December 2008.