Search | arXiv e-print repository

ATOM: Attention Mixer for Efficient Dataset Distillation

Authors: Samir Khaki, Ahmad Sajedi, Kai Wang, Lucy Z. Liu, Yuri A. Lawryshyn, Konstantinos N. Plataniotis

Abstract: Recent works in dataset distillation seek to minimize training expenses by generating a condensed synthetic dataset that encapsulates the information present in a larger real dataset. These approaches ultimately aim to attain test accuracy levels akin to those achieved by models trained on the entirety of the original dataset. Previous studies in feature and distribution matching have achieved significant results without incurring the costs of bi-level optimization in the distillation process. Despite their convincing efficiency, many of these methods suffer from marginal downstream performance improvements, limited distillation of contextual information, and subpar cross-architecture generalization. To address these challenges in dataset distillation, we propose the ATtentiOn Mixer (ATOM) module to efficiently distill large datasets using a mixture of channel and spatial-wise attention in the feature matching process. Spatial-wise attention helps guide the learning process based on consistent localization of classes in their respective images, allowing for distillation from a broader receptive field. Meanwhile, channel-wise attention captures the contextual information associated with the class itself, thus making the synthetic image more informative for training. By integrating both types of attention, our ATOM module demonstrates superior performance across various computer vision datasets, including CIFAR10/100 and TinyImagenet. Notably, our method significantly improves performance in scenarios with a low number of images per class, thereby enhancing its potential. Furthermore, we maintain the improvement in cross-architectures and applications such as neural architecture search.

Submitted 2 May, 2024; originally announced May 2024.

Comments: Accepted for an oral presentation in CVPR-DD 2024

arXiv:2403.01120 [pdf]

Symmetry-breaking-dependent electronic structures and strain regulation in ReSeS monolayer

Authors: Texture Lin, J. W. Ma, H. C. Deng, L. Z. Liu

Abstract: Electronic devices for information storage and processing can be further optimized by introducing the degree of freedom of anisotropy, which is strongly dependent on their structural symmetry. Herein, a ReSeS monolayer with asymmetrical double faces is proposed to disclose the anisotropic electronic structure. Meanwhile, an infrared fingerprint based on the lattice vibration is also adopted to demonstrate the symmetry-breaking-dependent structural transformation. First-principles calculations demonstrate that the geometry deformation will induce the reconstruction of the electronic structure. Furthermore, both the dynamic properties of carriers and the spectroscopic response can be regulated by external strain and display anisotropic behaviors. Our idea provides threads for designing new regulable optoelectronic devices.

Submitted 2 March, 2024; originally announced March 2024.

arXiv:2310.10634 [pdf, other]

OpenAgents: An Open Platform for Language Agents in the Wild

Authors: Tianbao Xie, Fan Zhou, Zhoujun Cheng, Peng Shi, Luoxuan Weng, Yitao Liu, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Leo Z. Liu, Yiheng Xu, Hongjin Su, Dongchan Shin, Caiming Xiong, Tao Yu

Abstract: Language agents show potential in being capable of utilizing natural language for varied and intricate tasks in diverse environments, particularly when built upon large language models (LLMs). Current language agent frameworks aim to facilitate the construction of proof-of-concept language agents while neglecting the non-expert user access to agents and paying little attention to application-level designs. We present OpenAgents, an open platform for using and hosting language agents in the wild of everyday life. OpenAgents includes three agents: (1) Data Agent for data analysis with Python/SQL and data tools; (2) Plugins Agent with 200+ daily API tools; (3) Web Agent for autonomous web browsing. OpenAgents enables general users to interact with agent functionalities through a web user interface optimized for swift responses and common failures while offering developers and researchers a seamless deployment experience on local setups, providing a foundation for crafting innovative language agents and facilitating real-world evaluations. We elucidate the challenges and opportunities, aspiring to set a foundation for future research and development of real-world language agents.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 34 pages, 8 figures

arXiv:2310.00093 [pdf, other]

DataDAM: Efficient Dataset Distillation with Attention Matching

Authors: Ahmad Sajedi, Samir Khaki, Ehsan Amjadian, Lucy Z. Liu, Yuri A. Lawryshyn, Konstantinos N. Plataniotis

Abstract: Researchers have long tried to minimize training costs in deep learning while maintaining strong generalization across diverse datasets. Emerging research on dataset distillation aims to reduce training costs by creating a small synthetic set that contains the information of a larger real dataset and ultimately achieves test accuracy equivalent to a model trained on the whole dataset. Unfortunately, the synthetic data generated by previous methods are not guaranteed to distribute and discriminate as well as the original training data, and they incur significant computational costs. Despite promising results, there still exists a significant performance gap between models trained on condensed synthetic sets and those trained on the whole dataset. In this paper, we address these challenges using efficient Dataset Distillation with Attention Matching (DataDAM), achieving state-of-the-art performance while reducing training costs. Specifically, we learn synthetic images by matching the spatial attention maps of real and synthetic data generated by different layers within a family of randomly initialized neural networks. Our method outperforms the prior methods on several datasets, including CIFAR10/100, TinyImageNet, ImageNet-1K, and subsets of ImageNet-1K across most of the settings, and achieves improvements of up to 6.5% and 4.1% on CIFAR100 and ImageNet-1K, respectively. We also show that our high-quality distilled images have practical benefits for downstream applications, such as continual learning and neural architecture search.

Submitted 31 October, 2023; v1 submitted 29 September, 2023; originally announced October 2023.

Comments: Accepted at the International Conference on Computer Vision (ICCV) 2023

Journal ref: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2023, pp. 17097-17107

arXiv:2207.07025 [pdf, other]

Learning to translate by learning to communicate

Authors: C. M. Downey, Xuhui Zhou, Leo Z. Liu, Shane Steinert-Threlkeld

Abstract: We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems, especially for low-resource languages. It has been argued that the current dominant paradigm in NLP of pre-training on text-only corpora will not yield robust natural language understanding systems, and the need for grounded, goal-oriented, and interactive language learning has been highlighted. In our approach, we embed a multilingual model (mBART, Liu et al., 2020) into an EC image-reference game, in which the model is incentivized to use multilingual generations to accomplish a vision-grounded task. The hypothesis is that this will align multiple languages to a shared task space. We present two variants of EC Fine-Tuning (Steinert-Threlkeld et al., 2022), one of which outperforms a backtranslation-only baseline in all four languages investigated, including the low-resource language Nepali.

Submitted 19 October, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

Comments: Camera-ready for 3rd Multilingual Representation Learning Workshop (MRL 2023)

arXiv:2104.07885 [pdf, other]

Probing Across Time: What Does RoBERTa Know and When?

Authors: Leo Z. Liu, Yizhong Wang, Jungo Kasai, Hannaneh Hajishirzi, Noah A. Smith

Abstract: Models of language trained on very large corpora have been demonstrated useful for NLP. As fixed artifacts, they have become the object of intense study, with many researchers "probing" the extent to which linguistic abstractions, factual and commonsense knowledge, and reasoning abilities they acquire and readily demonstrate. Building on this line of work, we consider a new question: for types of knowledge a language model learns, when during (pre)training are they acquired? We plot probing performance across iterations, using RoBERTa as a case study. Among our findings: linguistic knowledge is acquired fast, stably, and robustly across domains. Facts and commonsense are slower and more domain-sensitive. Reasoning abilities are, in general, not stably acquired. As new datasets, pretraining protocols, and probes emerge, we believe that probing-across-time analyses can help researchers understand the complex, intermingled learning that these models undergo and guide us toward more efficient approaches that accomplish necessary learning faster.

Submitted 20 September, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

Comments: Accepted to Findings of EMNLP 2021

arXiv:2010.08580 [pdf, other]

Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets

Authors: Chuanrong Li, Lin Shengshuo, Leo Z. Liu, Xinyi Wu, Xuhui Zhou, Shane Steinert-Threlkeld

Abstract: Although large-scale pretrained language models, such as BERT and RoBERTa, have achieved superhuman performance on in-distribution test sets, their performance suffers on out-of-distribution test sets (e.g., on contrast sets). Building contrast sets often requires human-expert annotation, which is expensive and hard to create on a large scale. In this work, we propose a Linguistically-Informed Transformation (LIT) method to automatically generate contrast sets, which enables practitioners to explore linguistic phenomena of interests as well as compose different phenomena. Experimenting with our method on SNLI and MNLI shows that current pretrained language models, although being claimed to contain sufficient linguistic knowledge, struggle on our automatically generated contrast sets. Furthermore, we improve models' performance on the contrast sets by applying LIT to augment the training data, without affecting performance on the original data.

Submitted 12 November, 2020; v1 submitted 16 October, 2020; originally announced October 2020.

Comments: Appears at EMNLP BlackboxNLP Workshop 2020

arXiv:1611.08509 [pdf]

Electronic Structure and Optical Properties of Monolayer $ReS_2$ with Defect Controlled by Strain Engineering

Authors: Y. M. Min, L. Z. Liu

Abstract: By using first-principles calculations, we investigated the monolayer $ReS_2$ with vacancies under strain engineering, specifically focusing on its energy of formation, band gap, electron density of states, effective mass and optical properties. The calculated results disclose that the S4 defect is more likely to form than other kinds of vacancies. Asymmetric deformation induced by strain transforms its band structure from a direct band gap to an indirect band gap. The analysis of the partial density of states indicates that the Re-d, Re-p and S-d orbitals are the major components of the defect states, being different from $MoS_2$, the defect states locate both above and below the Fermi level. Moreover, the effective mass is sensitive and anisotropic under the external strain. The reflection spectrum can be greatly tuned by the external strains, which indicates that the $ReS_2$ monolayer has promising applications in nanoscale strain sensors and conductance-switch FETs.

Submitted 25 November, 2016; originally announced November 2016.

Comments: 24 pages with 12 figs and 1 table

arXiv:1611.04892 [pdf]

Regulation of oxygen vacancy types on SnO2 (110) surface by external strain

Authors: Z. H. Zhou, Y. M. Min, X. X. Liu, J. Q. Ding, L. Z. Liu

Abstract: In tin dioxide nanostructures, oxygen vacancies (OVs) play an important role in their optical properties, and thus regulation of both OV concentration and type via external strain is crucial to the exploration of more applications. First-principles calculations of the SnO2 (110) surface disclose that asymmetric deformations induced by external strain not only lead to changes in its intrinsic surface elasticity, but also result in different OV formation energies. In the absence of external strain, the energetically favorable oxygen vacancies (EFOV) appear in the bridging site of the second layer. When -3.5% external strain is applied along the y direction, the EFOV moves into the plane site. This can be ascribed to the compressive deformation, which gives rise to a redistribution of the electronic wave function near the OVs and therefore to the formation of new bond structures. Our results suggest that different types of OVs in the SnO2 surface can be controlled by strain engineering.

Submitted 15 November, 2016; originally announced November 2016.

arXiv:1611.04882 [pdf]

doi 10.1016/j.apsusc.2017.01.305

Anisotropic Raman Scattering and Mobility in Monolayer 1Td-ReS2 Controlled by Strain Engineering

Authors: Z. H. Zhou, B. C. Wei, Y. M. Min, L. Z. Liu

Abstract: Regulation of electronic structure and mobility cut-on rate in two-dimensional transition metal dichalcogenides (TMDs) has attracted much attention because of its potential in electronic device design. The anisotropic Raman scattering and mobility cut-on rate of the uniquely distorted-1T (1Td) ReS2 monolayer under external strain are determined theoretically based on density functional theory. The angle-dependent Raman spectrum of the Ag-like, Eg-like and Cp modes is used to discriminate and analyze structural anisotropy; the strain is exploited to adjust the structural symmetry and electronic structure of ReS2 so as to enhance the mobility cut-on rate to almost 6 times the original value. Our results suggest the use of strain engineering in high-quality semiconductor switch devices.

Submitted 15 November, 2016; originally announced November 2016.

Comments: 16 pages with 3 figs

arXiv:1304.4432 [pdf, ps, other]

doi 10.1103/PhysRevA.85.053823

Quasi-phase-matching of high-order-harmonic generation using polarization beating in optical waveguides

Authors: Lewis Z. Liu, Kevin O'Keeffe, Simon M. Hooker

Abstract: A scheme for quasi-phase-matching high-harmonic generation is proposed in which polarization beating within a hollow core birefringent waveguide modulates the generation of harmonics. The evolution of the polarization of a laser pulse propagating in a birefringent waveguide is calculated and is shown to periodically modulate the harmonic generation process. The optimum conditions for achieving quasi-phase-matching using this scheme are explored and the growth of the harmonic intensity as a function of experimental parameters is investigated.

Submitted 16 April, 2013; originally announced April 2013.

Journal ref: Phys. Rev. A 85, 053823 (2012)

arXiv:1303.4214 [pdf, ps, other]

doi 10.1364/OL.37.002415

Optical Rotation Quasi-Phase-Matching for Circularly Polarized High Harmonic Generation

Authors: Lewis Z. Liu, Kevin O'Keeffe, Simon M. Hooker

Abstract: The first scheme for quasi-phase-matching high harmonic generation of circularly polarized radiation is proposed: optical rotation quasi-phase-matching (ORQPM). In ORQPM propagation of the driving radiation in a system exhibiting circular birefringence causes its plane of polarization to rotate; by appropriately matching the period of rotation to the coherence length it is possible to avoid destructive interference of the generated radiation. It is shown that ORQPM is approximately 5 times more efficient than conventional QPM, and half as efficient as true phase-matching.

Submitted 18 March, 2013; originally announced March 2013.

Journal ref: Optics Letters, Vol. 37, Issue 12, pp. 2415-2417 (2012)

arXiv:1302.5272 [pdf, ps, other]

doi 10.1103/PhysRevA.87.023810

Quasi-phase-matching of high-order-harmonic generation using multimode polarization beating

Authors: Lewis Z. Liu, Kevin O'Keeffe, Simon M. Hooker

Abstract: The generalization of quasi-phase-matching using polarization beating and of multimode quasi-phase-matching (MMQPM) for the generation of high-order harmonics is explored, and a method for achieving polarization beating is proposed. If two (and in principle more) modes of a waveguide are excited, modulation of the intensity, phase, and/or polarization of the guided radiation will be achieved. By appropriately matching the period of this modulation to the coherence length, quasi-phase-matching of high-order-harmonic radiation generated by the guided wave can occur. We show that it is possible to achieve efficiencies with multimode quasi-phase-matching greater than the ideal square wave modulation. We present a Fourier treatment of QPM and use this to show that phase modulation, rather than amplitude modulation, plays the dominant role in the case of MMQPM. The experimental parameters and optimal conditions for this scheme are explored.

Submitted 21 February, 2013; originally announced February 2013.

Journal ref: Phys. Rev. A 87, 023810 (2013)

Showing 1–13 of 13 results for author: Liu, L Z