Search | arXiv e-print repository

Embedding Space Interpolation Beyond Mini-Batch, Beyond Pairs and Beyond Examples

Authors: Shashanka Venkataramanan, Ewa Kijak, Laurent Amsaleg, Yannis Avrithis

Abstract: Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Its extensions mostly focus on the definition of interpolation and the space (input or feature) where it takes place, while the augmentation process itself is less studied. In most methods, the number of generated examples is limited to the mini-batch size and the number of examples being interpolated is limited to two (pairs), in the input space. We make progress in this direction by introducing MultiMix, which generates an arbitrarily large number of interpolated examples beyond the mini-batch size and interpolates the entire mini-batch in the embedding space. Effectively, we sample on the entire convex hull of the mini-batch rather than along linear segments between pairs of examples. On sequence data, we further extend to Dense MultiMix. We densely interpolate features and target labels at each spatial location and also apply the loss densely. To mitigate the lack of dense labels, we inherit labels from examples and weight interpolation factors by attention as a measure of confidence. Overall, we increase the number of loss terms per mini-batch by orders of magnitude at little additional cost. This is only possible because of interpolating in the embedding space. We empirically show that our solutions yield significant improvement over state-of-the-art mixup methods on four different benchmarks, despite interpolation being only linear. By analyzing the embedding space, we show that the classes are more tightly clustered and uniformly spread over the embedding space, thereby explaining the improved behavior.

Submitted 9 November, 2023; originally announced November 2023.

Comments: Accepted to NeurIPS 2023. arXiv admin note: substantial text overlap with arXiv:2206.14868
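
The idea of sampling on the convex hull of a mini-batch, as described in the abstract above, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sample_simplex` and `multi_interpolate`, the choice of a symmetric Dirichlet distribution for the interpolation weights, and the `alpha` parameter are all assumptions made for illustration. The sketch draws `n` weight vectors over the `m` examples of a mini-batch and applies each one to both the embeddings and the one-hot targets, so `n` can exceed `m`.

```python
import random


def sample_simplex(m, alpha, rng):
    """Draw one weight vector over m items from a symmetric Dirichlet(alpha).

    Assumption: the abstract does not name the sampling distribution;
    Dirichlet is used here only as a convenient way to get convex weights.
    """
    gammas = [rng.gammavariate(alpha, 1.0) for _ in range(m)]
    total = sum(gammas)
    return [g / total for g in gammas]


def multi_interpolate(embeddings, targets, n, alpha=1.0, seed=0):
    """Generate n convex combinations of a whole mini-batch.

    embeddings: list of m embedding vectors (lists of floats)
    targets:    list of m one-hot target vectors (lists of floats)
    Returns (mixed_embeddings, mixed_targets), each of length n.
    """
    rng = random.Random(seed)
    m = len(embeddings)
    dim = len(embeddings[0])
    num_classes = len(targets[0])
    mixed_z, mixed_y = [], []
    for _ in range(n):
        lam = sample_simplex(m, alpha, rng)  # one weight vector per generated example
        # The same weights interpolate the embeddings and the targets.
        z = [sum(w * e[d] for w, e in zip(lam, embeddings)) for d in range(dim)]
        y = [sum(w * t[c] for w, t in zip(lam, targets)) for c in range(num_classes)]
        mixed_z.append(z)
        mixed_y.append(y)
    return mixed_z, mixed_y


if __name__ == "__main__":
    # Toy mini-batch of m = 3 two-dimensional embeddings with one-hot labels.
    Z = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    Y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    mixed_z, mixed_y = multi_interpolate(Z, Y, n=10)
    print(len(mixed_z), "interpolated examples from a mini-batch of", len(Z))
```

Because every weight vector is non-negative and sums to one, each generated embedding lies inside the convex hull of the mini-batch and each mixed target remains a valid probability vector.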

arXiv:2310.08584 [pdf, other]

Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video

Authors: Shashanka Venkataramanan, Mamshad Nayeem Rizve, João Carreira, Yuki M. Asano, Yannis Avrithis

Abstract: Self-supervised learning has unlocked the potential of scaling up pretraining to billions of images, since annotation is unnecessary. But are we making the best use of data? How much more economical can we be? In this work, we attempt to answer this question by making two contributions. First, we investigate first-person videos and introduce a "Walking Tours" dataset. These videos are high-resolution, hours-long, captured in a single uninterrupted take, depicting a large number of objects and actions with natural scene transitions. They are unlabeled and uncurated, thus realistic for self-supervision and comparable with human learning. Second, we introduce a novel self-supervised image pretraining method tailored for learning from continuous videos. Existing methods typically adapt image-based pretraining approaches to incorporate more frames. Instead, we advocate a "tracking to learn to recognize" approach. Our method, called DoRA, leads to attention maps that Discover and tRAck objects over time in an end-to-end manner, using transformer cross-attention. We derive multiple views from the tracks and use them in a classical self-supervised distillation loss. Using our novel approach, a single Walking Tours video remarkably becomes a strong competitor to ImageNet for several image and video downstream tasks.

Submitted 23 May, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

Comments: Accepted to ICLR 2024 (Best paper honorable mention). Project Page: https://shashankvkt.github.io/dora

arXiv:2301.02240 [pdf, other]

Skip-Attention: Improving Vision Transformers by Paying Less Attention

Authors: Shashanka Venkataramanan, Amir Ghodrati, Yuki M. Asano, Fatih Porikli, Amirhossein Habibian

Abstract: This work aims to improve the efficiency of vision transformers (ViT). While ViTs use computationally expensive self-attention operations in every layer, we identify that these operations are highly correlated across layers -- a key redundancy that causes unnecessary computations. Based on this observation, we propose SkipAt, a method to reuse self-attention computation from preceding layers to approximate attention at one or more subsequent layers. To ensure that reusing self-attention blocks across layers does not degrade the performance, we introduce a simple parametric function, which outperforms the baseline transformer's performance while running computationally faster. We show the effectiveness of our method in image classification and self-supervised learning on ImageNet-1K, semantic segmentation on ADE20K, image denoising on SIDD, and video denoising on DAVIS. We achieve improved throughput at the same-or-higher accuracy levels in all these tasks.

Submitted 17 January, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

arXiv:2206.14868 [pdf, other]

Teach me how to Interpolate a Myriad of Embeddings

Authors: Shashanka Venkataramanan, Ewa Kijak, Laurent Amsaleg, Yannis Avrithis

Abstract: Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Yet, its extensions focus on the definition of interpolation and the space where it takes place, while the augmentation itself is less studied: For a mini-batch of size $m$, most methods interpolate between $m$ pairs with a single scalar interpolation factor $λ$. In this work, we make progress in this direction by introducing MultiMix, which interpolates an arbitrary number $n$ of tuples, each of length $m$, with one vector $λ$ per tuple. On sequence data, we further extend to dense interpolation and loss computation over all spatial positions. Overall, we increase the number of tuples per mini-batch by orders of magnitude at little additional cost. This is possible by interpolating at the very last layer before the classifier. Finally, to address inconsistencies due to linear target interpolation, we introduce a self-distillation approach to generate and interpolate synthetic targets. We empirically show that our contributions result in significant improvement over state-of-the-art mixup methods on four benchmarks. By analyzing the embedding space, we observe that the classes are more tightly clustered and uniformly spread over the embedding space, thereby explaining the improved behavior.

Submitted 29 June, 2022; originally announced June 2022.

arXiv:2106.04990 [pdf, other]

It Takes Two to Tango: Mixup for Deep Metric Learning

Authors: Shashanka Venkataramanan, Bill Psomas, Ewa Kijak, Laurent Amsaleg, Konstantinos Karantzalos, Yannis Avrithis

Abstract: Metric learning involves learning a discriminative representation such that embeddings of similar classes are encouraged to be close, while embeddings of dissimilar classes are pushed far apart. State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies. On the one hand, metric learning losses consider two or more examples at a time. On the other hand, modern data augmentation methods for classification consider two or more examples at a time. The combination of the two ideas is under-studied. In this work, we aim to bridge this gap and improve representations using mixup, which is a powerful data augmentation approach interpolating two or more examples and corresponding target labels at a time. This task is challenging because, unlike classification, the loss functions used in metric learning are not additive over examples, so the idea of interpolating target labels is not straightforward. To the best of our knowledge, we are the first to investigate mixing both examples and target labels for deep metric learning. We develop a generalized formulation that encompasses existing metric learning loss functions and modify it to accommodate mixup, introducing Metric Mix, or Metrix. We also introduce a new metric, utilization, to demonstrate that by mixing examples during training, we are exploring areas of the embedding space beyond the training classes, thereby improving representations. To validate the effect of improved representations, we show that mixing inputs, intermediate representations or embeddings along with target labels significantly outperforms state-of-the-art metric learning methods on four benchmark deep metric learning datasets.

Submitted 28 February, 2022; v1 submitted 9 June, 2021; originally announced June 2021.

Comments: Accepted to ICLR 2022

arXiv:2103.15375 [pdf, other]

AlignMixup: Improving Representations By Interpolating Aligned Features

Authors: Shashanka Venkataramanan, Ewa Kijak, Laurent Amsaleg, Yannis Avrithis

Abstract: Mixup is a powerful data augmentation method that interpolates between two or more examples in the input or feature space and between the corresponding target labels. Many recent mixup methods focus on cutting and pasting two or more objects into one image, which is more about efficient processing than interpolation. However, how to best interpolate images is not well defined. In this sense, mixup has been connected to autoencoders, because often autoencoders "interpolate well", for instance generating an image that continuously deforms into another. In this work, we revisit mixup from the interpolation perspective and introduce AlignMix, where we geometrically align two images in the feature space. The correspondences allow us to interpolate between two sets of features, while keeping the locations of one set. Interestingly, this gives rise to a situation where mixup retains mostly the geometry or pose of one image and the texture of the other, connecting it to style transfer. More than that, we show that an autoencoder can still improve representation learning under mixup, without the classifier ever seeing decoded images. AlignMix outperforms state-of-the-art mixup methods on five different benchmarks.

Submitted 25 March, 2022; v1 submitted 29 March, 2021; originally announced March 2021.

Comments: Accepted to CVPR 2022

arXiv:1911.08616 [pdf, other]

Attention Guided Anomaly Localization in Images

Authors: Shashanka Venkataramanan, Kuan-Chuan Peng, Rajat Vikram Singh, Abhijit Mahalanobis

Abstract: Anomaly localization is an important problem in computer vision which involves localizing anomalous regions within images with applications in industrial inspection, surveillance, and medical imaging. This task is challenging due to the small sample size and pixel coverage of the anomaly in real-world scenarios. Most prior works need to use anomalous training images to compute a class-specific threshold to localize anomalies. Without the need of anomalous training images, we propose Convolutional Adversarial Variational autoencoder with Guided Attention (CAVGA), which localizes the anomaly with a convolutional latent variable to preserve the spatial information. In the unsupervised setting, we propose an attention expansion loss where we encourage CAVGA to focus on all normal regions in the image. Furthermore, in the weakly-supervised setting we propose a complementary guided attention loss, where we encourage the attention map to focus on all normal regions while minimizing the attention map corresponding to anomalous regions in the image. CAVGA outperforms the state-of-the-art (SOTA) anomaly localization methods on MVTec Anomaly Detection (MVTAD), modified ShanghaiTech Campus (mSTC) and Large-scale Attention based Glaucoma (LAG) datasets in the unsupervised setting and when using only 2% anomalous images in the weakly-supervised setting. CAVGA also outperforms SOTA anomaly detection methods on the MNIST, CIFAR-10, Fashion-MNIST, MVTAD, mSTC and LAG datasets.

Submitted 16 July, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

Comments: Accepted to ECCV 2020

arXiv:1901.06163 [pdf, other]

doi 10.1016/j.cossms.2019.01.002

Recent advances in MXenes: from fundamentals to applications

Authors: Mohammad Khazaei, Avanish Mishra, Natarajan S. Venkataramanan, Abhishek K. Singh, Seiji Yunoki

Abstract: The family of MAX phases and their derivative MXenes are continuously growing in terms of both crystalline and composition varieties. In the last couple of years, several breakthroughs have been achieved that boosted the synthesis of novel MAX phases with ordered double transition metals and, consequently, the synthesis of novel MXenes with a higher chemical diversity and structural complexity, rarely seen in other families of two-dimensional (2D) materials. Considering the various elemental composition possibilities, surface functional tunability, various magnetic orders, and large spin-orbit coupling, MXenes can truly be considered as multifunctional materials that can be used to realize highly correlated phenomena. In addition, owing to their large surface area, hydrophilicity, adsorption ability, and high surface reactivity, MXenes have attracted attention for many applications, e.g., catalysts, ion batteries, gas storage media, and sensors. Given the fast progress of MXene-based science and technology, it is timely to update our current knowledge on various properties and possible applications. Since many theoretical predictions remain to be experimentally proven, here we mainly emphasize the physics and chemistry that can be observed in MXenes and discuss how these properties can be tuned or used for different applications.

Submitted 18 January, 2019; originally announced January 2019.

Report number: Volume 23, Issue 3, 2019, Pages 164-178

Journal ref: Current Opinion in Solid State & Materials Science, 2019

arXiv:1710.04085 [pdf]

Density functional theory study on the dihydrogen bond cooperativity in the growth behavior of dimethyl sulfoxide clusters

Authors: Natarajan Sathiyamoorthy Venkataramanan, Ambigapathy Suvitha, Yoshiyuki Kawazoe

Abstract: We have carried out a density functional theory study on the structures of DMSO clusters and analysed their structure and stability using molecular electrostatic potential and quantum theory of atoms-in-molecules (QTAIM). The ground-state geometries of the DMSO clusters prefer an ouroboros shape. Pairwise interaction energy calculations show an interaction between methyl groups of adjacent DMSO molecules, while a destabilization is created by the methyl groups which are away from each other. Molecular electrostatic potential analysis shows the existence of a hole on the odd-numbered clusters, which helps in their highly directional growth. QTAIM analysis shows the existence of two intermolecular hydrogen bonds, of type SOC hydrogen bonds and methyl CHC dihydrogen bonds. The computed and Laplacian values were all positive for the intermolecular bonds, supporting the existence of noncovalent interactions. The computed ellipticities for the dihydrogen bonds have values > 2, which confirms electron delocalization, mainly due to the hydrogen-hydrogen interactions of methyl groups. A plot of total hydrogen bonding energy vs the observed total local electron density shows linearity with a correlation coefficient of near unity, which indicates the cooperative effects of intermolecular dihydrogen HH bonds.

Submitted 10 October, 2017; originally announced October 2017.

arXiv:1106.4524 [pdf]

Functionalized Nanofullerenes for Hydrogen Storage: A Theoretical Perspective

Authors: N. S. Venkataramanan, A. Suvitha, H. Mizuseki, Y. Kawazoe

Abstract: The increase in threats from global warming due to the consumption of fossil fuels requires our planet to adopt new strategies to harness the inexhaustible sources of energy. Hydrogen is an energy carrier which holds tremendous promise as a new renewable and clean energy option. Hydrogen is a convenient, safe, versatile fuel source that can be easily converted to a desired form of energy without releasing harmful emissions. However, no material has been found to satisfy the desired goals, and hence there is a hunt for new materials that can store hydrogen reversibly at ambient conditions. In this chapter, we discuss and compare various nanofullerene materials proposed theoretically as storage media for hydrogen. Doping of transition elements leads to clustering, which reduces the gravimetric density of hydrogen, while alkali and alkaline-earth metals doped on nanocage materials, such as carborides, boronitride, and boron cages, were stabilized by the charge transfer from the dopant to the nanocage. Further, the alkali or alkaline-earth elements exist with a charge, which is found to be responsible for the higher uptake of hydrogen, through dipole-dipole and charge-induced dipole interactions. The binding energies of hydrogen on these systems were found to be in the range of 0.1 eV to 0.2 eV, which is ideal for practical applications in a reversible system.

Submitted 22 June, 2011; originally announced June 2011.

arXiv:1102.0849 [pdf]

DFT Perspective of Hydrogen Storage on Porous Materials

Authors: N. S. Venkataramanan, Y. Kawazoe

Abstract: In this chapter, the physisorption of hydrogen molecules in porous materials as possible hydrogen storage systems has been reviewed. Owing to the weak interaction between H2 molecules and the adsorbent, high storage capacities are typically reached only at cryogenic temperature. Different classes of porous materials possessing different structure and composition have been designed for hydrogen storage applications using computational methods and especially with the aid of DFT methods. The adsorption energies for hydrogen in different porous materials have been increased by doping with lightweight alkali and alkaline-earth metals. Ab initio molecular dynamics has been carried out to assess the stability of the newly functionalized materials. GCMC methods have been employed to determine the gravimetric and volumetric uptake percentage of the newly functionalized materials. Therefore, the combined approach provides a better understanding for designing new materials that operate at near room temperature for reversible hydrogen storage applications.

Submitted 4 February, 2011; originally announced February 2011.

Comments: 10 figures, 7 tables

arXiv:1101.5882 [pdf, ps, other]

doi 10.1103/PhysRevB.83.115401

Chemical engineering of adamantane by lithium functionalization: A first-principles density functional theory study

Authors: Ahmad Ranjbar, Mohammad Khazaei, Natarajan Sathiyamoorthy Venkataramanan, Hoonkyung Lee, Yoshiyuki Kawazoe

Abstract: Using first-principles density functional theory, we investigated the hydrogen storage capacity of Li-functionalized adamantane. We showed that if one of the acidic hydrogen atoms of adamantane is replaced by Li/Li+, the resulting complex is activated and ready to adsorb hydrogen molecules at a high gravimetric weight percent of around 7.0%. Due to polarization of hydrogen molecules under the induced electric field generated by positively charged Li/Li+, they are adsorbed on ADM.Li/Li+ complexes with an average binding energy of ~ -0.15 eV/H2, desirable for hydrogen storage applications. We also examined the possibility of the replacement of a larger number of acidic hydrogen atoms of adamantane by Li/Li+ and the possibility of aggregations of formed complexes in experiments. The stabilities of the proposed structures were investigated by calculating vibrational spectra and doing MD simulations.

Submitted 31 January, 2011; originally announced January 2011.

Comments: 8 pages, 6 figures, 2 tables, accepted for publication in Physical Review B

arXiv:0812.2070 [pdf, other]

doi 10.1016/j.chemphys.2009.04.001

First-principles study of hydrogen storage over Ni and Rh doped BN sheets

Authors: Natarajan Sathiyamoorthy Venkataramanan, Mohammad Khazaei, Ryoji Sahara, Hiroshi Mizuseki, Yoshiyuki Kawazoe

Abstract: Absorption of hydrogen molecules on Nickel and Rhodium doped hexagonal boron nitride (BN) sheet is investigated by using the first-principles method. The most stable site for the Ni atom was the on-top site of the nitrogen atom, while the Rh atom prefers a hollow site over the hexagonal BN sheet. The first hydrogen molecule was absorbed dissociatively over the Rh atom, and molecularly on the Ni doped BN sheet. Both Ni and Rh atoms are capable of absorbing up to three hydrogen molecules chemically, and the metal atom to BN sheet distance increases with the increase in the number of hydrogen molecules. Finally, our calculations offer an explanation for the nature of bonding between the metal atom and the hydrogen molecules, which is due to the hybridization of the metal d orbital with the hydrogen s orbital. These calculation results can be useful to understand the nature of interaction between the doped metal and the BN sheet, and their interaction with the hydrogen molecules.

Submitted 10 December, 2008; originally announced December 2008.

Journal ref: Chemical Physics 359 (2009) 173 - 178

Showing 1–13 of 13 results for author: Venkataramanan, S