-
ATOM: Attention Mixer for Efficient Dataset Distillation
Authors:
Samir Khaki,
Ahmad Sajedi,
Kai Wang,
Lucy Z. Liu,
Yuri A. Lawryshyn,
Konstantinos N. Plataniotis
Abstract:
Recent works in dataset distillation seek to minimize training expenses by generating a condensed synthetic dataset that encapsulates the information present in a larger real dataset. These approaches ultimately aim to attain test accuracy levels akin to those achieved by models trained on the entirety of the original dataset. Previous studies in feature and distribution matching have achieved significant results without incurring the costs of bi-level optimization in the distillation process. Despite their convincing efficiency, many of these methods suffer from marginal downstream performance improvements, limited distillation of contextual information, and subpar cross-architecture generalization. To address these challenges in dataset distillation, we propose the ATtentiOn Mixer (ATOM) module to efficiently distill large datasets using a mixture of channel and spatial-wise attention in the feature matching process. Spatial-wise attention helps guide the learning process based on consistent localization of classes in their respective images, allowing for distillation from a broader receptive field. Meanwhile, channel-wise attention captures the contextual information associated with the class itself, thus making the synthetic image more informative for training. By integrating both types of attention, our ATOM module demonstrates superior performance across various computer vision datasets, including CIFAR10/100 and TinyImagenet. Notably, our method significantly improves performance in scenarios with a low number of images per class, thereby enhancing its potential. Furthermore, we maintain the improvement in cross-architectures and applications such as neural architecture search.
Submitted 2 May, 2024;
originally announced May 2024.
-
Symmetry-breaking-dependent electronic structures and strain regulation in ReSeS monolayer
Authors:
Texture Lin,
J. W. Ma,
H. C. Deng,
L. Z. Liu
Abstract:
Electronic devices for information storages and processes can be further optimized by introducing the degree of freedom of anisotropy, which is strongly dependent of their structural symmetry. Herein, a ReSeS monolayer with asymmetrical double-faces are proposed to disclose the anisotropic electronic structure. Meanwhile infrared fingerprint based on the lattice vibration is also adopted to demonstrate the symmetry-breaking-dependent structural transformation. First-principles calculations demonstrate that the geometry deformation will induce the reconstruction of electronic structure. Ulteriorly, both the dynamic properties of carrier and spectroscopic response can be regulated by external strain and displays anisotropic behaviors. Our idea provides threads for designing new regulable optoelectronic devices.
Submitted 2 March, 2024;
originally announced March 2024.
-
OpenAgents: An Open Platform for Language Agents in the Wild
Authors:
Tianbao Xie,
Fan Zhou,
Zhoujun Cheng,
Peng Shi,
Luoxuan Weng,
Yitao Liu,
Toh Jing Hua,
Junning Zhao,
Qian Liu,
Che Liu,
Leo Z. Liu,
Yiheng Xu,
Hongjin Su,
Dongchan Shin,
Caiming Xiong,
Tao Yu
Abstract:
Language agents show potential in being capable of utilizing natural language for varied and intricate tasks in diverse environments, particularly when built upon large language models (LLMs). Current language agent frameworks aim to facilitate the construction of proof-of-concept language agents while neglecting the non-expert user access to agents and paying little attention to application-level designs. We present OpenAgents, an open platform for using and hosting language agents in the wild of everyday life. OpenAgents includes three agents: (1) Data Agent for data analysis with Python/SQL and data tools; (2) Plugins Agent with 200+ daily API tools; (3) Web Agent for autonomous web browsing. OpenAgents enables general users to interact with agent functionalities through a web user interface optimized for swift responses and common failures while offering developers and researchers a seamless deployment experience on local setups, providing a foundation for crafting innovative language agents and facilitating real-world evaluations. We elucidate the challenges and opportunities, aspiring to set a foundation for future research and development of real-world language agents.
Submitted 16 October, 2023;
originally announced October 2023.
-
DataDAM: Efficient Dataset Distillation with Attention Matching
Authors:
Ahmad Sajedi,
Samir Khaki,
Ehsan Amjadian,
Lucy Z. Liu,
Yuri A. Lawryshyn,
Konstantinos N. Plataniotis
Abstract:
Researchers have long tried to minimize training costs in deep learning while maintaining strong generalization across diverse datasets. Emerging research on dataset distillation aims to reduce training costs by creating a small synthetic set that contains the information of a larger real dataset and ultimately achieves test accuracy equivalent to a model trained on the whole dataset. Unfortunately, the synthetic data generated by previous methods are not guaranteed to distribute and discriminate as well as the original training data, and they incur significant computational costs. Despite promising results, there still exists a significant performance gap between models trained on condensed synthetic sets and those trained on the whole dataset. In this paper, we address these challenges using efficient Dataset Distillation with Attention Matching (DataDAM), achieving state-of-the-art performance while reducing training costs. Specifically, we learn synthetic images by matching the spatial attention maps of real and synthetic data generated by different layers within a family of randomly initialized neural networks. Our method outperforms the prior methods on several datasets, including CIFAR10/100, TinyImageNet, ImageNet-1K, and subsets of ImageNet-1K across most of the settings, and achieves improvements of up to 6.5% and 4.1% on CIFAR100 and ImageNet-1K, respectively. We also show that our high-quality distilled images have practical benefits for downstream applications, such as continual learning and neural architecture search.
Submitted 31 October, 2023; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Learning to translate by learning to communicate
Authors:
C. M. Downey,
Xuhui Zhou,
Leo Z. Liu,
Shane Steinert-Threlkeld
Abstract:
We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems, especially for low-resource languages. It has been argued that the current dominant paradigm in NLP of pre-training on text-only corpora will not yield robust natural language understanding systems, and the need for grounded, goal-oriented, and interactive language learning has been highlighted. In our approach, we embed a multilingual model (mBART, Liu et al., 2020) into an EC image-reference game, in which the model is incentivized to use multilingual generations to accomplish a vision-grounded task. The hypothesis is that this will align multiple languages to a shared task space. We present two variants of EC Fine-Tuning (Steinert-Threlkeld et al., 2022), one of which outperforms a backtranslation-only baseline in all four languages investigated, including the low-resource language Nepali.
Submitted 19 October, 2023; v1 submitted 14 July, 2022;
originally announced July 2022.
-
Probing Across Time: What Does RoBERTa Know and When?
Authors:
Leo Z. Liu,
Yizhong Wang,
Jungo Kasai,
Hannaneh Hajishirzi,
Noah A. Smith
Abstract:
Models of language trained on very large corpora have been demonstrated useful for NLP. As fixed artifacts, they have become the object of intense study, with many researchers "probing" the extent to which linguistic abstractions, factual and commonsense knowledge, and reasoning abilities they acquire and readily demonstrate. Building on this line of work, we consider a new question: for types of knowledge a language model learns, when during (pre)training are they acquired? We plot probing performance across iterations, using RoBERTa as a case study. Among our findings: linguistic knowledge is acquired fast, stably, and robustly across domains. Facts and commonsense are slower and more domain-sensitive. Reasoning abilities are, in general, not stably acquired. As new datasets, pretraining protocols, and probes emerge, we believe that probing-across-time analyses can help researchers understand the complex, intermingled learning that these models undergo and guide us toward more efficient approaches that accomplish necessary learning faster.
Submitted 20 September, 2021; v1 submitted 16 April, 2021;
originally announced April 2021.
-
Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets
Authors:
Chuanrong Li,
Lin Shengshuo,
Leo Z. Liu,
Xinyi Wu,
Xuhui Zhou,
Shane Steinert-Threlkeld
Abstract:
Although large-scale pretrained language models, such as BERT and RoBERTa, have achieved superhuman performance on in-distribution test sets, their performance suffers on out-of-distribution test sets (e.g., on contrast sets). Building contrast sets often requires human-expert annotation, which is expensive and hard to create on a large scale. In this work, we propose a Linguistically-Informed Transformation (LIT) method to automatically generate contrast sets, which enables practitioners to explore linguistic phenomena of interests as well as compose different phenomena. Experimenting with our method on SNLI and MNLI shows that current pretrained language models, although being claimed to contain sufficient linguistic knowledge, struggle on our automatically generated contrast sets. Furthermore, we improve models' performance on the contrast sets by applying LIT to augment the training data, without affecting performance on the original data.
Submitted 12 November, 2020; v1 submitted 16 October, 2020;
originally announced October 2020.
-
Electronic Structure and Optical Properties of Monolayer $ReS_2$ with Defect Controlled by Strain Engineering
Authors:
Y. M. Min,
L. Z. Liu
Abstract:
By using first-principles calculations, we investigated the monolayer $ReS_2$ with vacancies under strain engineering, specifically focusing on its energy of formation, band gap, electron density of states, effective mass and optical properties. The calculated results disclose that S4 defect is more likely to form than other kinds of vacancies. Asymmetric deformation induced by strain makes its band structure transformation from direct band gap to indirect band gap. The analysis of the partial density of states indicates that the Re-d, Re-p and S-d orbitals are the major components of the defect states, being different from $MoS_2$, the defect states locate both above and below the Fermi level. Moreover, the effective mass was sensitive and anisotropic under the external strain. The reflection spectrum can be greatly tuned by the external strains, which indicates that the ReS2 monolayer has promising applications in nanoscale strain sensor and conductance-switch FETs.
Submitted 25 November, 2016;
originally announced November 2016.
-
Regulation of oxygen vacancy types on SnO2 (110) surface by external strain
Authors:
Z. H. Zhou,
Y. M. Min,
X. X. Liu,
J. Q. Ding,
L. Z. Liu
Abstract:
In tin dioxide nanostructures, oxygen vacancies (OVs) play an important role in their optical properties and thus regulation of both OV concentration and type via external strain is crucial to exploration of more applications. First-principle calculations of SnO2 (110) surface disclose that asymmetric deformations induced by external strain not only lead to its intrinsic surface elastic changes, but also result in different OV formation energy. In the absence of external strain, the energetically favorable oxygen vacancies (EFOV) appear in the bridging site of second layer. When -3.5% external strain is applied along y direction, the EFOV moves into plane site. This can be ascribed that the compressed deformation gives rise to redistribution of electronic wave function near OVs, therefore, formation of newly bond structures. Our results suggest that different type OVs in SnO2 surface can be controlled by strain engineering.
Submitted 15 November, 2016;
originally announced November 2016.
-
Anisotropic Raman Scattering and Mobility in Monolayer 1Td-ReS2 Controlled by Strain Engineering
Authors:
Z. H. Zhou,
B. C. Wei,
Y. M. Min,
L. Z. Liu
Abstract:
Regulation of electronic structure and mobility cut-on rate in two-dimensional transition metal dichalcogenides (TMDs) has attracted much attention because of its potential in electronic device design. The anisotropic Raman scattering and mobility cut-on rate of monolayer unique distorted-1T(1Td) ReS2 with external strain are determined theoretically based on the density function theory. The angle-dependent Raman spectrum of Ag-like, Eg-like and Cp models are used to discriminate and analysis structural anisotropy; the strain is exploited to adjust the structural symmetry and electronic structure of ReS2 so as to enhance mobility cut-on rate to almost 6 times of the original value. Our results suggest the use of the strain engineering in high-quality semiconductor switch device.
Submitted 15 November, 2016;
originally announced November 2016.
-
Quasi-phase-matching of high-order-harmonic generation using polarization beating in optical waveguides
Authors:
Lewis Z. Liu,
Kevin O'Keeffe,
Simon M. Hooker
Abstract:
A scheme for quasi-phase-matching high-harmonic generation is proposed in which polarization beating within a hollow core birefringent waveguide modulates the generation of harmonics. The evolution of the polarization of a laser pulse propagating in a birefringent waveguide is calculated and is shown to periodically modulate the harmonic generation process. The optimum conditions for achieving quasi-phase-matching using this scheme are explored and the growth of the harmonic intensity as a function of experimental parameters is investigated.
Submitted 16 April, 2013;
originally announced April 2013.
-
Optical Rotation Quasi-Phase-Matching for Circularly Polarized High Harmonic Generation
Authors:
Lewis Z. Liu,
Kevin O'Keeffe,
Simon M. Hooker
Abstract:
The first scheme for quasi-phase-matching high harmonic generation of circularly polarized radiation is proposed: optical rotation quasi-phase-matching (ORQPM). In ORQPM propagation of the driving radiation in a system exhibiting circular birefringence causes its plane of polarization to rotate; by appropriately matching the period of rotation to the coherence length it is possible to avoid destructive interference of the generated radiation. It is shown that ORQPM is approximately 5 times more efficient than conventional QPM, and half as efficient as true phase-matching.
Submitted 18 March, 2013;
originally announced March 2013.
-
Quasi-phase-matching of high-order-harmonic generation using multimode polarization beating
Authors:
Lewis Z. Liu,
Kevin O'Keeffe,
Simon M. Hooker
Abstract:
The generalization of quasi-phase-matching using polarization beating and of multimode quasi-phase-matching (MMQPM) for the generation of high-order harmonics is explored, and a method for achieving polarization beating is proposed. If two (and in principle more) modes of a waveguide are excited, modulation of the intensity, phase, and/or polarization of the guided radiation will be achieved. By appropriately matching the period of this modulation to the coherence length, quasi-phase-matching of high-order-harmonic radiation generated by the guided wave can occur. We show that it is possible to achieve efficiencies with multimode quasi-phase-matching greater than the ideal square wave modulation. We present a Fourier treatment of QPM and use this to show that phase modulation, rather than amplitude modulation, plays the dominant role in the case of MMQPM. The experimental parameters and optimal conditions for this scheme are explored.
Submitted 21 February, 2013;
originally announced February 2013.