-
Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs
Authors:
Sukmin Yun,
Haokun Lin,
Rusiru Thushara,
Mohammad Qazim Bhat,
Yongxin Wang,
Zutao Jiang,
Mingkai Deng,
**hong Wang,
Tianhua Tao,
Junbo Li,
Haonan Li,
Preslav Nakov,
Timothy Baldwin,
Zhengzhong Liu,
Eric P. Xing,
Xiaodan Liang,
Zhiqiang Shen
Abstract:
Multimodal large language models (MLLMs) have shown impressive success across modalities such as image, video, and audio in a variety of understanding and generation tasks. However, current MLLMs are surprisingly poor at understanding webpage screenshots and generating their corresponding HTML code. To address this problem, we propose Web2Code, a benchmark consisting of a new large-scale webpage-to-code dataset for instruction tuning and an evaluation framework for the webpage understanding and HTML code translation abilities of MLLMs. For dataset construction, we leverage pretrained LLMs to enhance existing webpage-to-code datasets as well as generate a diverse pool of new webpages rendered into images. Specifically, the inputs are webpage images and instructions, while the responses are the webpage's HTML code. We further include diverse natural language QA pairs about the webpage content in the responses to enable a more comprehensive understanding of the web content. To evaluate model performance in these tasks, we develop an evaluation framework for testing MLLMs' abilities in webpage understanding and web-to-code generation. Extensive experiments show that our proposed dataset is beneficial not only to our proposed tasks but also in the general visual domain, while previous datasets result in worse performance. We hope our work will contribute to the development of general MLLMs suitable for web-based content generation and task automation. Our data and code will be available at https://github.com/MBZUAI-LLM/web2code.
Submitted 28 June, 2024;
originally announced June 2024.
-
Universal scaling of quantum state transport in one-dimensional topological chain under nonadiabatic dynamics
Authors:
Lingzi Huang,
Menghua Deng,
Chen Sun,
Fuxiang Li
Abstract:
When a system is driven across a continuous phase transition, the density of topological defects demonstrates a power-law scaling behavior versus the quenching rate, as predicted by the Kibble-Zurek mechanism. In this study, we generalize this idea and address the scaling of quantum state transport in a one-dimensional topological system subject to a linear drive through its topological quantum phase transition point. We illustrate the power-law dependencies of the quantum state's transport distance, width, and peak magnitude on the driving velocity. Crucially, the power-law exponents are distinct for the edge state and bulk state. Our results offer a novel perspective on quantum state transfer and enrich the field of Kibble-Zurek behaviors and nonadiabatic quantum dynamics.
Submitted 25 June, 2024;
originally announced June 2024.
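A power-law dependence such as the one described in this abstract is commonly checked by fitting a straight line in log-log space, where the slope is the exponent. The following is a minimal sketch of that generic procedure, not the authors' analysis; the function name `fit_power_law`, the variable names, and the synthetic data (exponent 0.5, prefactor 3.0) are illustrative assumptions and do not come from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line log(y) = log(a) + b * log(x).
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic placeholder data only: these numbers are NOT results from the paper.
velocities = [0.01, 0.02, 0.05, 0.1, 0.2]
distances = [3.0 * v ** 0.5 for v in velocities]
prefactor, exponent = fit_power_law(velocities, distances)
```

With real data, distinct fitted exponents for two families of states (for example edge and bulk) would show up as different slopes on the same log-log plot.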
-
Autoregressive Image Generation without Vector Quantization
Authors:
Tianhong Li,
Yonglong Tian,
He Li,
Mingyang Deng,
Kaiming He
Abstract:
Conventional wisdom holds that autoregressive models for image generation are typically accompanied by vector-quantized tokens. We observe that while a discrete-valued space can facilitate representing a categorical distribution, it is not a necessity for autoregressive modeling. In this work, we propose to model the per-token probability distribution using a diffusion procedure, which allows us to apply autoregressive models in a continuous-valued space. Rather than using categorical cross-entropy loss, we define a Diffusion Loss function to model the per-token probability. This approach eliminates the need for discrete-valued tokenizers. We evaluate its effectiveness across a wide range of cases, including standard autoregressive models and generalized masked autoregressive (MAR) variants. By removing vector quantization, our image generator achieves strong results while enjoying the speed advantage of sequence modeling. We hope this work will motivate the use of autoregressive generation in other continuous-valued domains and applications.
Submitted 17 June, 2024;
originally announced June 2024.
-
Simulating a Chern Insulator with C = $\pm$2 on Synthetic Floquet Lattice
Authors:
Lingxiao Lei,
Weichen Wang,
Guangyao Huang,
Shun Hu,
Xi Cao,
Xinfang Zhang,
Mingtang Deng,
**xing Chen
Abstract:
The synthetic Floquet lattice, generated by multiple strong drives with mutually incommensurate frequencies, provides a powerful platform for the quantum simulation of topological phenomena. In this study, we propose a 4-band tight-binding model of the Chern insulator with a Chern number C = $\pm$2 by coupling two layers of the half-BHZ lattice and subsequently mapping it onto the Floquet lattice to simulate its topological properties. To determine the Chern number of our Floquet-version model, we extend the energy pumping method proposed by Martin et al. [Phys. Rev. X 7, 041008 (2017)] and the topological oscillation method introduced by Boyers et al. [Phys. Rev. Lett. 125, 160505 (2020)], followed by numerical simulations for both methodologies. The simulation results demonstrate the successful extraction of the Chern number using either of these methods, providing an excellent prediction of the phase diagram that closely aligns with the theoretical one derived from the original bilayer half-BHZ model. Finally, we briefly discuss a potential experimental implementation for our model. Our work demonstrates significant potential for simulating complex topological matter using quantum computing platforms, thereby paving the way for constructing a more universal simulator for non-interacting topological quantum states and advancing our understanding of these intriguing phenomena.
Submitted 19 May, 2024;
originally announced May 2024.
-
Eulerian-Lagrangian Fluid Simulation on Particle Flow Maps
Authors:
Junwei Zhou,
Duowen Chen,
Molin Deng,
Yitong Deng,
Yuchen Sun,
Sinan Wang,
Shiying Xiong,
Bo Zhu
Abstract:
We propose a novel Particle Flow Map (PFM) method to enable accurate long-range advection for incompressible fluid simulation. The foundation of our method is the observation that a particle trajectory generated in a forward simulation naturally embodies a perfect flow map. Centered on this concept, we have developed an Eulerian-Lagrangian framework comprising four essential components: Lagrangian particles for a natural and precise representation of bidirectional flow maps; a dual-scale map representation to accommodate the mapping of various flow quantities; a particle-to-grid interpolation scheme for accurate quantity transfer from particles to grid nodes; and a hybrid impulse-based solver to enforce incompressibility on the grid. The efficacy of PFM has been demonstrated through various simulation scenarios, highlighting the evolution of complex vortical structures and the details of turbulent flows. Notably, compared to NFM, PFM reduces computing time by up to 49 times and memory consumption by up to 41%, while enhancing vorticity preservation as evidenced in various tests like leapfrog, vortex tube, and turbulent flow.
Submitted 15 May, 2024;
originally announced May 2024.
-
Scheme for braiding Majorana zero modes in vortices using an STT-matrix
Authors:
Guangyao Huang,
Xinfang Zhang,
Xiaofeng Yi,
Jibang Fu,
Weichen Wang,
Mingtang Deng
Abstract:
Recently conducted experiments on two-dimensional topological superconductors have revealed various indications of Majorana zero modes (MZMs). However, progress in the manipulation of MZM braiding has been limited, impeding the realization of topological quantum computing. In this study, we propose a potential braiding scheme based on a spintronic device matrix. This scheme involves utilizing a matrix composed of spin-transfer torque devices (STT-matrix) alongside a two-dimensional topological superconductor material. By programming the ON/OFF states of the spintronic devices within the STT-matrix, it becomes possible to manipulate vortices hosting MZMs in the two-dimensional topological superconductor. To further investigate this concept, we construct a time-dependent Ginzburg-Landau model and perform numerical simulations to analyze vortex-driving dynamics, MZM braiding processes, and MZM fusion phenomena. Our findings demonstrate that this system exhibits high versatility and flexibility in manipulating vortices. With advancements in spintronic device technology, our proposed scheme offers a feasible and practical method for operating MZMs within vortices present in topological superconductors.
Submitted 30 April, 2024; v1 submitted 29 April, 2024;
originally announced April 2024.
-
Miscibility of Binary Bose-Einstein Condensates with $p$-wave Interaction
Authors:
Min Deng,
Ming Xue,
**ghan Pang,
Hui Luo,
Zhiguo Wang,
**bin Li,
Dayou Yang
Abstract:
We investigate the ground-state phase diagram of a binary mixture of Bose-Einstein condensates (BECs) with competing interspecies $s$- and $p$-wave interactions. Exploiting a pseudopotential model for the $l=1$ partial wave, we derive an extended Gross-Pitaevskii (GP) equation for the BEC mixture that incorporates both $s$- and $p$-wave interactions. Based on it, we study the miscible-immiscible transition of a binary BEC mixture in the presence of interspecies $p$-wave interaction, by combining numerical solution of the GP equation and Gaussian variational analysis. Our study uncovers a dual effect of positive interspecies $p$-wave interaction, which can either enhance or reduce miscibility and can be precisely controlled by adjusting relevant experimental parameters. By completely characterizing the miscibility phase diagram, we establish a promising avenue towards experimental control of the miscibility of binary BEC mixtures via high partial-wave interactions.
Submitted 14 April, 2024;
originally announced April 2024.
-
Electrically tunable, rapid spin-orbit torque induced modulation of colossal magnetoresistance in Mn$_3$Si$_2$Te$_6$ nanoflakes
Authors:
Cheng Tan,
Mingxun Deng,
Yuanjun Yang,
Linlin An,
Weifeng Ge,
Sultan Albarakati,
Majid Panahandeh-Fard,
James Partridge,
Dimitrie Culcer,
Bin Lei,
Tao Wu,
Xiangde Zhu,
Mingliang Tian,
Xianhui Chen,
Rui-Qiang Wang,
Lan Wang
Abstract:
As a quasi-layered ferrimagnetic material, Mn$_3$Si$_2$Te$_6$ nanoflakes exhibit magnetoresistance behaviour that is fundamentally different from their bulk crystal counterparts. They offer three key properties crucial for spintronics. Firstly, a response at least $10^6$ times faster than that exhibited by bulk crystals has been observed in current-controlled resistance and magnetoresistance. Secondly, only an ultra-low current density is required for resistance modulation (~ 5 A/cm$^2$). Thirdly, electrically gate-tunable magnetoresistance has been realized. Theoretical calculations reveal that the unique magnetoresistance behaviour in the Mn$_3$Si$_2$Te$_6$ nanoflakes arises from a magnetic field induced band gap shift across the Fermi level. The rapid current induced resistance variation is attributed to spin-orbit torque, an intrinsically ultra-fast process (~nanoseconds). This study suggests promising avenues for spintronic applications. In addition, it highlights Mn$_3$Si$_2$Te$_6$ nanoflakes as a suitable platform for investigating the intriguing physics underlying chiral orbital moments, magnetic field induced band variation and spin torque.
Submitted 25 March, 2024;
originally announced March 2024.
-
Modulation of chiral anomaly and bilinear magnetoconductivity in Weyl semimetals by impurity-resonance states
Authors:
Mei-Wei Hu,
Zhuo-Yan Fang,
Hou-Jian Duan,
Mou Yang,
Ming-Xun Deng,
Rui-Qiang Wang
Abstract:
The phenomenon of nonlinear transport has attracted tremendous interest within the condensed matter community. We present a theoretical framework for nonlinear transport based on the nonequilibrium retarded Green's function, and examine the impact of disorder on nonlinear magnetotransport in Weyl semimetals (WSMs). It is demonstrated that bilinear magnetoconductivity can be induced in disordered WSMs by several mechanisms, including impurity-induced tilting of the Weyl cones, Lorentz-force-induced normal orbital magnetic moment, and chiral anomaly arising from the Berry-curvature-induced anomalous orbital magnetic moment. Additionally, we observe that the localization of Weyl fermions by impurity scattering will lead to resonant dips in both the chiral chemical potential and magnetoconductivity when the Fermi energy approaches the impurity resonance states. Our findings offer a theoretical proposition for modulating nonreciprocal transport in topological semimetals.
Submitted 27 February, 2024;
originally announced February 2024.
-
The origin of High-velocity stars considering the impact of the Large Magellanic Cloud
Authors:
Jiwei Liao,
Cuihua Du,
Mingji Deng,
Dashuang Ye,
Hefan Li,
Yang Huang,
Jianrong Shi,
Jun Ma
Abstract:
Utilizing astrometric parameters sourced from \textit{Gaia} Data Release 3 and radial velocities obtained from various spectroscopic surveys, we identify 519 high-velocity stars (HiVels) with a total velocity in the Galactocentric rest frame greater than 70\% of their local escape velocity under the {\tt\string Gala} {\tt\string MilkyWayPotential}. Our analysis reveals that the majority of these HiVels are metal-poor late-type giants, and we show 9 HiVels that are unbound candidates to the Galaxy with escape probabilities of 50\%. To investigate the origins of these HiVels, we classify them into four categories and consider the impact of the Large Magellanic Cloud (LMC) potential on their backward-integration trajectories. Specifically, we find that one of the HiVels can be traced back to the Galactic Center, and three HiVels may originate from the Sagittarius dwarf spheroidal galaxy (Sgr dSph). Furthermore, some HiVels appear to be ejected from the Galactic disk, while others formed within the Milky Way or have an extragalactic origin. Given that the LMC has a significant impact on the orbits of Sgr dSph, we examine the reported HiVels that originate from the Sgr dSph, with a few of them passing within the half-light radius of the Sgr dSph.
Submitted 3 January, 2024;
originally announced January 2024.
-
RKKY signals characterizing the topological phase transitions in Floquet Dirac semimetals
Authors:
Hou-Jian Duan,
Shi-Ming Cai,
Xing Wei,
Yong-Chi Chen,
Yong-Jia Wu,
Ming-Xun Deng,
Ruiqiang Wang,
Mou Yang
Abstract:
Recently, the Floquet ${\rm Na_3Bi}$-type material has been proposed as an ideal platform for realizing various phases, i.e., the spin-degenerate Dirac semimetal (DSM) can be turned into the Weyl semimetal (WSM), and even into the Weyl half-metal (WHM). Instead of the conventional electrical methods, we use the RKKY interaction to characterize the topological phase transitions in this paper. It is found that detecting the Ising term $J_I$ is feasible for distinguishing the phase transition of DSM/WSM, since the emergence of $J_I$ is induced by the broken spin degeneracy. For the case with impurities deposited on the $z$ axis (the line connecting the Weyl points), the Heisenberg term $J_H$ coexists with $J_I$ in the WSM, while $J_H$ is filtered out and only $J_I$ survives in the WHM. This magnetic filtering effect is a reflection of the fully spin-polarized property (one spin band is in the WSM phase while the other is gapped) of the WHM, and it can act as a signal to capture the phase transition of WSM/WHM. This signal cannot be disturbed unless the direction of the impurities greatly deviates from the $z$ axis. Interestingly, as the impurities are moved into the $x$-$y$ plane, there arises another signal (a dip structure for $J_H$ at the phase boundary), which can also identify the phase transition of WSM/WHM. Furthermore, we have verified that all magnetic signals are robust to the term that breaks the electron-hole symmetry. Besides characterizing the phase transitions, our results also suggest that the Floquet DSMs are powerful platforms for controlling the magnetic interaction.
Submitted 4 January, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
-
Randomised benchmarking for characterizing and forecasting correlated processes
Authors:
Xinfang Zhang,
Zhihao Wu,
Gregory A. L. White,
Zhongcheng Xiang,
Shun Hu,
Zhihui Peng,
Yong Liu,
Dongning Zheng,
Xiang Fu,
Anqi Huang,
Dario Poletti,
Kavan Modi,
Junjie Wu,
Mingtang Deng,
Chu Guo
Abstract:
The development of fault-tolerant quantum processors relies on the ability to control noise. A particularly insidious form of noise is temporally correlated or non-Markovian noise. By combining randomized benchmarking with supervised machine learning algorithms, we develop a method to learn the details of temporally correlated noise. In particular, we can learn the time-independent evolution operator of system plus bath and this leads to (i) the ability to characterize the degree of non-Markovianity of the dynamics and (ii) the ability to predict the dynamics of the system even beyond the times we have used to train our model. We exemplify this by implementing our method on a superconducting quantum processor. Our experimental results show a drastic change between the Markovian and non-Markovian regimes for the learning accuracies.
Submitted 10 December, 2023;
originally announced December 2023.
-
Measuring Feature Sparsity in Language Models
Authors:
Mingyang Deng,
Lucas Tao,
Joe Benton
Abstract:
Recent works have proposed that activations in language models can be modelled as sparse linear combinations of vectors corresponding to features of input text. Under this assumption, these works aimed to reconstruct feature directions using sparse coding. We develop metrics to assess the success of these sparse coding techniques and test the validity of the linearity and sparsity assumptions. We show our metrics can predict the level of sparsity on synthetic sparse linear activations, and can distinguish between sparse linear data and several other distributions. We use our metrics to measure levels of sparsity in several language models. We find evidence that language model activations can be accurately modelled by sparse linear combinations of features, significantly more so than control datasets. We also show that model activations appear to be sparsest in the first and final layers.
Submitted 13 October, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Experimental quantum natural gradient optimization in photonics
Authors:
Yizhi Wang,
Shichuan Xue,
Yaxuan Wang,
Jiangfang Ding,
Weixu Shi,
Dongyang Wang,
Yong Liu,
Yingwen Liu,
Xiang Fu,
Guangyao Huang,
Anqi Huang,
Mingtang Deng,
Junjie Wu
Abstract:
Variational quantum algorithms (VQAs), combining the advantages of parameterized quantum circuits and classical optimizers, promise practical quantum applications in the Noisy Intermediate-Scale Quantum era. The performance of VQAs heavily depends on the optimization method. Compared with gradient-free and ordinary gradient descent methods, the quantum natural gradient (QNG), which mirrors the geometric structure of the parameter space, can achieve faster convergence and avoid local minima more easily, thereby reducing the cost of circuit executions. We utilized a fully programmable photonic chip to experimentally estimate the QNG in photonics for the first time. We obtained the dissociation curve of the He-H$^+$ cation and achieved chemical accuracy, verifying the outperformance of QNG optimization on a photonic device. Our work opens up a vista of utilizing QNG in photonics to implement practical near-term quantum applications.
Submitted 11 October, 2023;
originally announced October 2023.
-
Quantum generative adversarial learning in photonics
Authors:
Yizhi Wang,
Shichuan Xue,
Yaxuan Wang,
Yong Liu,
Jiangfang Ding,
Weixu Shi,
Dongyang Wang,
Yingwen Liu,
Xiang Fu,
Guangyao Huang,
Anqi Huang,
Mingtang Deng,
Junjie Wu
Abstract:
Quantum Generative Adversarial Networks (QGANs), an intersection of quantum computing and machine learning, have attracted widespread attention due to their potential advantages over classical analogs. However, in the current era of Noisy Intermediate-Scale Quantum (NISQ) computing, it is essential to investigate whether QGANs can perform learning tasks on near-term quantum devices usually affected by noise and even defects. In this Letter, using a programmable silicon quantum photonic chip, we experimentally demonstrate the QGAN model in photonics for the first time, and investigate the effects of noise and defects on its performance. Our results show that QGANs can generate high-quality quantum data with a fidelity higher than 90\%, even under conditions where up to half of the generator's phase shifters are damaged, or all of the generator and discriminator's phase shifters are subjected to phase noise up to 0.04$π$. Our work sheds light on the feasibility of implementing QGANs on NISQ-era quantum hardware.
Submitted 1 October, 2023;
originally announced October 2023.
-
Seeing Is Not Always Believing: Invisible Collision Attack and Defence on Pre-Trained Models
Authors:
Minghang Deng,
Zhong Zhang,
Junming Shao
Abstract:
Large-scale pre-trained models (PTMs) such as BERT and GPT have achieved great success in diverse fields. The typical paradigm is to pre-train a big deep learning model on large-scale data sets, and then fine-tune the model on small task-specific data sets for downstream tasks. Although PTMs have rapidly progressed with wide real-world applications, they also pose significant risks of potential attacks. Existing backdoor attacks or data poisoning methods often rest on the assumption that the attacker invades the computers of victims or accesses the target data, which is challenging in real-world scenarios. In this paper, we propose a novel framework for an invisible attack on PTMs with enhanced MD5 collision. The key idea is to generate two equal-size models with the same MD5 checksum by leveraging the MD5 chosen-prefix collision. Afterwards, the two "same" models will be deployed on public websites to induce victims to download the poisoned model. Unlike conventional attacks on deep learning models, this new attack is flexible, covert, and model-independent. Additionally, we propose a simple defensive strategy for recognizing the MD5 chosen-prefix collision and provide a theoretical justification for its feasibility. We extensively validate the effectiveness and stealthiness of our proposed attack and defensive method on different models and data sets.
Submitted 7 May, 2024; v1 submitted 24 September, 2023;
originally announced September 2023.
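Because the attack described in this abstract relies on two files sharing one MD5 checksum, a common generic precaution is to compare a download against a published SHA-256 digest as well, since no practical SHA-256 collisions are known. The sketch below illustrates only that generic precaution; it is not the collision-recognition defence proposed in the paper, and the helper names `stream_digests` and `matches_published` and the stand-in byte string are hypothetical.

```python
import hashlib
import io


def stream_digests(stream, chunk_size=1 << 20):
    """Return (md5_hex, sha256_hex) for a binary stream, read in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        md5.update(chunk)
        sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def matches_published(stream, published_sha256):
    """Accept a download only if its SHA-256 equals the published value."""
    return stream_digests(stream)[1] == published_sha256.lower()


# Stand-in bytes for a model file (hypothetical, for illustration only).
weights = b"example model weights"
published = hashlib.sha256(weights).hexdigest()
ok = matches_published(io.BytesIO(weights), published)
tampered = matches_published(io.BytesIO(weights + b"!"), published)
```

For a file on disk, the same functions can be called with a handle opened in binary mode, for example `open(path, "rb")`.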
-
Quantum Hall effect in topological Dirac semimetals modulated by the Lifshitz transition of the Fermi arc surface states
Authors:
Tao-Rui Qin,
Zhuo-Hua Chen,
Tian-Xing Liu,
Fu-Yang Chen,
Hou-Jian Duan,
Ming-Xun Deng,
Rui-Qiang Wang
Abstract:
We investigate the magnetotransport of topological Dirac semimetals (DSMs) by taking into account the Lifshitz transition of the Fermi arc surface states. We demonstrate that a bulk momentum-dependent gap term, which is usually neglected in studies of the bulk energy-band topology, can cause the Lifshitz transition by developing an additional Dirac cone for the surface to prevent the Fermi arcs from connecting the bulk Dirac points. As a result, the Weyl orbits can be turned off by the surface Dirac cone without destroying the bulk Dirac points. In response to the surface Lifshitz transition, the Weyl-orbit mechanism for the 3D quantum Hall effect (QHE) in topological DSMs will break down. The resulting quantized Hall plateaus can be thickness-dependent, similar to the Weyl-orbit mechanism, but their widths and quantized values become irregular. Accordingly, we propose that apart from the bulk Weyl nodes and Fermi arcs, the surface Lifshitz transition is also crucial for realizing stable Weyl orbits and 3D QHE in real materials.
Submitted 15 September, 2023;
originally announced September 2023.
-
Research on Joint Representation Learning Methods for Entity Neighborhood Information and Description Information
Authors:
Le Xiao,
Xin Shan,
Yuhua Wang,
Miaolei Deng
Abstract:
To address the issue of poor embedding performance in the knowledge graph of a programming design course, a joint representation learning model that combines entity neighborhood information and description information is proposed. Firstly, a graph attention network is employed to obtain the features of entity neighboring nodes, incorporating relationship features to enrich the structural information. Next, the BERT-WWM model is utilized in conjunction with attention mechanisms to obtain the representation of entity description information. Finally, the final entity vector representation is obtained by combining the vector representations of entity neighborhood information and description information. Experimental results demonstrate that the proposed model achieves favorable performance on the knowledge graph dataset of the programming design course, outperforming other baseline models.
Submitted 14 September, 2023;
originally announced September 2023.
-
Uniform sets with few progressions via colorings
Authors:
Mingyang Deng,
Jonathan Tidor,
Yufei Zhao
Abstract:
Ruzsa asked whether there exist Fourier-uniform subsets of $\mathbb Z/N\mathbb Z$ with density $α$ and 4-term arithmetic progression (4-AP) density at most $α^C$, for arbitrarily large $C$. Gowers constructed Fourier-uniform sets with density $α$ and 4-AP density at most $α^{4+c}$ for some small constant $c>0$. We show that an affirmative answer to Ruzsa's question would follow from the existence of an $N^{o(1)}$-coloring of $[N]$ without symmetrically colored 4-APs. For a broad and natural class of constructions of Fourier-uniform subsets of $\mathbb Z/N\mathbb Z$, we show that Ruzsa's question is equivalent to our arithmetic Ramsey question.
We prove analogous results for all even-length APs. For each odd $k\geq 5$, we show that there exist $U^{k-2}$-uniform subsets of $\mathbb Z/N\mathbb Z$ with density $α$ and $k$-AP density at most $α^{c_k \log(1/α)}$. We also prove generalizations to arbitrary one-dimensional patterns.
Submitted 13 July, 2023;
originally announced July 2023.
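To make the quantity in the abstract above concrete, the following is a minimal brute-force sketch of the 4-AP density of a subset of $\mathbb Z/N\mathbb Z$. The function name `four_ap_density` and the normalization by $N^2$ (which counts trivial progressions with common difference 0) are assumptions made for illustration; the code only evaluates the definition and is not a construction from the paper.

```python
def four_ap_density(subset, n):
    """Fraction of pairs (x, d) in (Z/nZ)^2 with x, x+d, x+2d, x+3d all in subset.

    Brute force in O(n^2) time; progressions with d = 0 are included.
    """
    members = {a % n for a in subset}
    count = sum(
        1
        for x in range(n)
        for d in range(n)
        if all((x + i * d) % n in members for i in range(4))
    )
    return count / (n * n)


if __name__ == "__main__":
    # The full group contains every progression, so its 4-AP density is 1.
    print(four_ap_density(range(7), 7))
```

For a random set of density $α$ one expects a 4-AP density of about $α^4$, which is the baseline against which the bounds $α^{4+c}$ and $α^C$ are measured.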
-
Restart Sampling for Improving Generative Processes
Authors:
Yilun Xu,
Mingyang Deng,
Xiang Cheng,
Yonglong Tian,
Ziming Liu,
Tommi Jaakkola
Abstract:
Generative processes that involve solving differential equations, such as diffusion models, frequently necessitate balancing speed and quality. ODE-based samplers are fast but plateau in performance, while SDE-based samplers deliver higher sample quality at the cost of increased sampling time. We attribute this difference to sampling errors: ODE samplers involve smaller discretization errors, while stochasticity in SDEs contracts accumulated errors. Based on these findings, we propose a novel sampling algorithm called Restart in order to better balance discretization errors and contraction. The sampling method alternates between adding substantial noise in additional forward steps and strictly following a backward ODE. Empirically, the Restart sampler surpasses previous SDE and ODE samplers in both speed and accuracy. Restart not only outperforms the previous best SDE results, but also accelerates the sampling speed by 10-fold / 2-fold on CIFAR-10 / ImageNet $64 \times 64$. In addition, it attains significantly better sample quality than ODE samplers within comparable sampling times. Moreover, Restart better balances text-image alignment/visual quality versus diversity than previous samplers in the large-scale text-to-image Stable Diffusion model pre-trained on LAION $512 \times 512$. Code is available at https://github.com/Newbeeer/diffusion_restart_sampling
Submitted 1 November, 2023; v1 submitted 26 June, 2023;
originally announced June 2023.
-
TGNN: A Joint Semi-supervised Framework for Graph-level Classification
Authors:
Wei Ju,
Xiao Luo,
Meng Qu,
Yifan Wang,
Chong Chen,
Minghua Deng,
Xian-Sheng Hua,
Ming Zhang
Abstract:
This paper studies semi-supervised graph classification, a crucial task with a wide range of applications in social network analysis and bioinformatics. Recent works typically adopt graph neural networks to learn graph-level representations for classification, failing to explicitly leverage features derived from graph topology (e.g., paths). Moreover, when labeled data is scarce, these methods are far from satisfactory due to their insufficient topology exploration of unlabeled data. We address the challenge by proposing a novel semi-supervised framework called Twin Graph Neural Network (TGNN). To explore graph structural information from complementary views, our TGNN has a message passing module and a graph kernel module. To fully utilize unlabeled data, for each module, we calculate the similarity of each unlabeled graph to other labeled graphs in the memory bank and our consistency loss encourages consistency between two similarity distributions in different embedding spaces. The two twin modules collaborate with each other by exchanging instance similarity knowledge to fully explore the structure information of both labeled and unlabeled data. We evaluate our TGNN on various public datasets and show that it achieves strong performance.
Submitted 23 April, 2023;
originally announced April 2023.
-
Growth of Sobolev norms for 2D cubic nonlinear Schrödinger equation with partial harmonic potential
Authors:
Mingming Deng,
Xiaoyan Su,
Jiqiang Zheng
Abstract:
In this paper, we study the $2$D cubic nonlinear Schrödinger equation (NLS) with a partial harmonic potential. First, we prove local well-posedness in Bourgain spaces by establishing a key bilinear estimate associated with the partial harmonic oscillator. Then, we give a polynomial bound on the Sobolev norms of the solutions using the method of Planchon, Tzvetkov, and Visciglia.
Submitted 6 April, 2023;
originally announced April 2023.
-
GSQAS: Graph Self-supervised Quantum Architecture Search
Authors:
Zhimin He,
Maijie Deng,
Shenggen Zheng,
Lvzhou Li,
Haozhen Situ
Abstract:
Quantum Architecture Search (QAS) is a promising approach to designing quantum circuits for variational quantum algorithms (VQAs). However, existing QAS algorithms require evaluating a large number of quantum circuits during the search process, which makes them computationally demanding and limits their applications to large-scale quantum circuits. Recently, predictor-based QAS has been proposed to alleviate this problem by directly estimating the performances of circuits according to their structures with a predictor trained on a set of labeled quantum circuits. However, the predictor is trained by purely supervised learning, which suffers from poor generalization ability when labeled training circuits are scarce. It is very time-consuming to obtain a large number of labeled quantum circuits because the gate parameters of quantum circuits need to be optimized until convergence to obtain their ground-truth performances. To overcome these limitations, we propose GSQAS, a graph self-supervised QAS, which trains a predictor based on self-supervised learning. Specifically, we first pre-train a graph encoder on a large number of unlabeled quantum circuits using a well-designed pretext task in order to generate meaningful representations of circuits. Then the downstream predictor is trained on a small number of quantum circuits' representations and their labels. Once the encoder is trained, it can be applied to different downstream tasks. In order to better encode the spatial topology information and avoid the huge dimension of feature vectors for large-scale quantum circuits, we design a scheme to encode quantum circuits as graphs. Simulation results on searching circuit structures for variational quantum eigensolver and quantum state classification show that GSQAS outperforms the state-of-the-art predictor-based QAS, achieving better performance with fewer labeled circuits.
Submitted 22 March, 2023;
originally announced March 2023.
-
Digital Privacy Under Attack: Challenges and Enablers
Authors:
Baobao Song,
Mengyue Deng,
Shiva Raj Pokhrel,
Qiujun Lan,
Robin Doss,
Gang Li
Abstract:
Users have renewed interest in protecting their private data in the digital space. When they don't believe that their privacy is sufficiently covered by one platform, they will readily switch to another. Such an increasing level of privacy awareness has made privacy preservation an essential research topic. Nevertheless, new privacy attacks are emerging day by day. Therefore, a holistic survey that compares the discovered techniques for attacks on privacy preservation and their mitigation schemes is essential in the literature. We develop a study to fill this gap by assessing the resilience of privacy-preserving methods to various attacks and conducting a comprehensive review of countermeasures from a broader perspective. First, we introduce the fundamental concepts and critical components of privacy attacks. Second, we comprehensively cover major privacy attacks targeted at anonymous data, statistical aggregate data, and privacy-preserving models. We also summarize popular countermeasures to mitigate these attacks. Finally, some promising future research directions and related issues in the privacy community are envisaged. We believe this survey will shed some light on privacy research and encourage researchers to fully understand the resilience of different existing privacy-preserving approaches.
Submitted 18 February, 2023;
originally announced February 2023.
-
On the growth of high Sobolev norms of the cubic nonlinear Schrödinger equation on $\mathbb{R}\times \mathbb{T}$
Authors:
Mingming Deng,
Kailong Yang
Abstract:
We consider the cubic nonlinear Schrödinger equation on product manifolds $\mathbb{R}\times \mathbb{T}$. In this paper, we obtain polynomial bounds on the growth in time of high Sobolev norms of the solutions. The main ingredient of the proof is to establish an iteration bound, which is based on the idea used by Bourgain in \cite{B1}.
Submitted 1 February, 2023;
originally announced February 2023.
-
Approximating Knapsack and Partition via Dense Subset Sums
Authors:
Mingyang Deng,
Ce Jin,
Xiao Mao
Abstract:
Knapsack and Partition are two important additive problems whose fine-grained complexities in the $(1-\varepsilon)$-approximation setting are not yet settled. In this work, we make progress on both problems by giving improved algorithms.
- Knapsack can be $(1 - \varepsilon)$-approximated in $\tilde O(n + (1/\varepsilon) ^ {2.2} )$ time, improving the previous $\tilde O(n + (1/\varepsilon) ^ {2.25} )$ by Jin (ICALP'19). There is a known conditional lower bound of $(n+1/\varepsilon)^{2-o(1)}$ based on the $(\min,+)$-convolution hypothesis.
- Partition can be $(1 - \varepsilon)$-approximated in $\tilde O(n + (1/\varepsilon) ^ {1.25} )$ time, improving the previous $\tilde O(n + (1/\varepsilon) ^ {1.5} )$ by Bringmann and Nakos (SODA'21). There is a known conditional lower bound of $(1/\varepsilon)^{1-o(1)}$ based on the Strong Exponential Time Hypothesis.
Both of our new algorithms apply the additive combinatorial results on dense subset sums by Galil and Margalit (SICOMP'91) and Bringmann and Wellnitz (SODA'21). Such techniques have not been explored in the context of Knapsack prior to our work. In addition, we design several new methods to speed up the divide-and-conquer steps which naturally arise in solving additive problems.
Submitted 23 January, 2023;
originally announced January 2023.
-
Indirect magnetic signals mediated by a single surface band in Weyl semimetals
Authors:
Hou-Jian Duan,
Yong-Jia Wu,
Ming-Xun Deng,
Ruiqiang Wang,
Mou Yang
Abstract:
Recently, abundant transport phenomena characterizing the surface states of Weyl semimetals (WSMs) have been reported. To generate these phenomena, electrons have to complete a closed intersurface orbit. Because impurities are unavoidable in real materials, this orbit would be destroyed by impurity scattering, which limits the detection of the surface states in WSMs. Here, we investigate the RKKY interaction between magnetic impurities, solely mediated by a single surface band, in semi-infinite WSMs. It is found that peculiar oscillations and slowly decaying laws of the RKKY interaction can act as signals to capture the dispersive nature of the surface states of WSMs. The underlying physics is attributed to two effects: the band-edge effect and the bending effect of the surface band, which can control the RKKY interaction individually or compete with each other to produce more complex magnetic behaviors. In addition, the band-edge effect together with the finite Fermi energy would result in another interesting oscillation with battering pattern. All the results are significantly different from those in the previous literature, where surface states have to couple with bulk states (or other surface states of different spins) to generate a nonzero magnetic interaction. Compared to the previous models of surface states, the model here is more practical and is helpful for a deeper understanding of the surface magnetic properties of WSMs.
Submitted 14 December, 2022; v1 submitted 22 November, 2022;
originally announced November 2022.
-
Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning
Authors:
Hua Wei,
Jingxiao Chen,
Xiyang Ji,
Hongyang Qin,
Minwen Deng,
Siqin Li,
Liang Wang,
Weinan Zhang,
Yong Yu,
Lin Liu,
Lanxiao Huang,
Deheng Ye,
Qiang Fu,
Wei Yang
Abstract:
This paper introduces Honor of Kings Arena, a reinforcement learning (RL) environment based on Honor of Kings, one of the world's most popular games at present. Compared to other environments studied in most previous work, ours presents new generalization challenges for competitive reinforcement learning. It is a multi-agent problem with one agent competing against its opponent, and it requires generalization ability because it has diverse targets to control and diverse opponents to compete with. We describe the observation, action, and reward specifications for the Honor of Kings domain and provide an open-source Python-based interface for communicating with the game engine. We provide twenty target heroes with a variety of tasks in Honor of Kings Arena and present initial baseline results for RL-based methods with feasible computing resources. Finally, we showcase the generalization challenges imposed by Honor of Kings Arena and possible remedies to the challenges. All of the software, including the environment class, is publicly available at https://github.com/tencent-ailab/hok_env . The documentation is available at https://aiarena.tencent.com/hok/doc/ .
Submitted 18 October, 2022; v1 submitted 18 September, 2022;
originally announced September 2022.
-
Large-scale full-programmable quantum walk and its applications
Authors:
Yizhi Wang,
Yingwen Liu,
Junwei Zhan,
Shichuan Xue,
Yuzhen Zheng,
Ru Zeng,
Zhihao Wu,
Zihao Wang,
Qilin Zheng,
Dongyang Wang,
Weixu Shi,
Xiang Fu,
Ping Xu,
Yang Wang,
Yong Liu,
Jiangfang Ding,
Guangyao Huang,
Chunlin Yu,
Anqi Huang,
Xiaogang Qiang,
Mingtang Deng,
Weixia Xu,
Kai Lu,
Xuejun Yang,
Junjie Wu
Abstract:
With photonics, the quantum computational advantage has been demonstrated on the task of boson sampling. Next, developing quantum-enhanced approaches for practical problems becomes one of the top priorities for photonic systems. Quantum walks are powerful kernels for developing new and useful quantum algorithms. Here we realize large-scale quantum walks using a fully programmable photonic quantum computing system. The system integrates a silicon quantum photonic chip, enabling the simulation of quantum walk dynamics on graphs with up to 400 vertices and possessing full programmability over quantum walk parameters, including the particle property, initial state, graph structure, and evolution time. In the 400-dimensional Hilbert space, the average fidelity of random entangled quantum states after the whole on-chip circuit evolution reaches as high as 94.29$\pm$1.28$\%$. With the system, we demonstrated exponentially faster hitting and quadratically faster mixing performance of quantum walks over classical random walks, achieving more than two orders of magnitude of enhancement in the experimental hitting efficiency and almost half of the reduction in the experimental evolution time for mixing. We utilize the system to implement a series of quantum applications, including measuring the centrality of scale-free networks, searching targets on Erdős-Rényi networks, distinguishing non-isomorphic graph pairs, and simulating the topological phase of higher-order topological insulators. Our work shows one feasible path for quantum photonics to address applications of practical interests in the near future.
Submitted 28 August, 2022;
originally announced August 2022.
-
Unsupervised domain adaptation semantic segmentation of high-resolution remote sensing imagery with invariant domain-level prototype memory
Authors:
Jingru Zhu,
Ya Guo,
Geng Sun,
Libo Yang,
Min Deng,
Jie Chen
Abstract:
Semantic segmentation is a key technique involved in automatic interpretation of high-resolution remote sensing (HRS) imagery and has drawn much attention in the remote sensing community. Deep convolutional neural networks (DCNNs) have been successfully applied to the HRS imagery semantic segmentation task due to their hierarchical representation ability. However, the heavy dependency on a large amount of training data with dense annotation and the sensitivity to variation in the data distribution severely restrict the potential application of DCNNs for the semantic segmentation of HRS imagery. This study proposes a novel unsupervised domain adaptation semantic segmentation network (MemoryAdaptNet) for the semantic segmentation of HRS imagery. MemoryAdaptNet constructs an output space adversarial learning scheme to bridge the domain distribution discrepancy between source domain and target domain and to narrow the influence of domain shift. Specifically, we embed an invariant feature memory module to store invariant domain-level context information because the features obtained from adversarial learning only tend to represent the variant feature of current limited inputs. This module is integrated by a category attention-driven invariant domain-level context aggregation module to current pseudo invariant feature for further augmenting the pixel representations. An entropy-based pseudo label filtering strategy is used to update the memory module with high-confidence pseudo invariant features of current target images. Extensive experiments under three cross-domain tasks indicate that our proposed MemoryAdaptNet is remarkably superior to the state-of-the-art methods.
Submitted 14 February, 2023; v1 submitted 16 August, 2022;
originally announced August 2022.
-
Characterization of the John A. Galt telescope for radio holography with CHIME
Authors:
Alex Reda,
Tristan Pinsonneault-Marotte,
Meiling Deng,
Mandana Amiri,
Kevin Bandura,
Arnab Chakraborty,
Simon Foreman,
Mark Halpern,
Alex S. Hill,
Carolin Höfer,
Joseph Kania,
T. L. Landecker,
Joshua MacEachern,
Kiyoshi Masui,
Juan Mena-Parra,
Nikola Milutinovic,
Laura Newburgh,
Anna Ordog,
Sourabh Paul,
J. Richard Shaw,
Seth R. Siegel,
Rick Smegal,
Haochen Wang,
Dallas Wulf
Abstract:
The Canadian Hydrogen Intensity Mapping Experiment (CHIME) will measure the 21 cm emission of astrophysical neutral hydrogen to probe large scale structure at redshifts z=0.8-2.5. However, detecting the 21 cm signal beneath substantially brighter foregrounds remains a key challenge. Due to the high dynamic range between 21 cm and foreground emission, an exquisite calibration of instrument systematics, notably the telescope beam, is required to successfully filter out the foregrounds. One technique being used to achieve a high fidelity measurement of the CHIME beam is radio holography, wherein signals from each of CHIME's analog inputs are correlated with the signal from a co-located reference antenna, the 26 m John A. Galt telescope, as the 26 m Galt telescope tracks a bright point source transiting over CHIME. In this work we present an analysis of several of the Galt telescope's properties. We employ driftscan measurements of several bright sources, along with background estimates derived from the 408 MHz Haslam map, to estimate the Galt system temperature. To determine the Galt telescope's beam shape, we perform and analyze a raster scan of the bright radio source Cassiopeia A. Finally, we use early holographic measurements to measure the Galt telescope's geometry with respect to CHIME for the holographic analysis of the CHIME and Galt interferometric data set.
Submitted 30 September, 2022; v1 submitted 28 July, 2022;
originally announced July 2022.
-
Antenna characterization for the HIRAX experiment
Authors:
Emily R. Kuhn,
Benjamin R. B. Saliwanchik,
Kevin Bandura,
Michele Bianco,
H. Cynthia Chiang,
Devin Crichton,
Meiling Deng,
Sindhu Gaddam,
Kit Gerodias,
Austin Gumba,
Maile Harris,
Kavilan Moodley,
V. Mugundhan,
Laura Newburgh,
Jeffrey Peterson,
Elizabeth Pieters,
Anna R. Polish,
Alexandre Refregier,
Ajith Sampath,
Mario G. Santos,
Onkabetse Sengate,
Jonathan Sievers,
Ema Smith,
Will Tyndall,
Anthony Walters
, et al. (2 additional authors not shown)
Abstract:
The Hydrogen Intensity and Real-time Analysis eXperiment (HIRAX) aims to improve constraints on the dark energy equation of state through measurements of large-scale structure at high redshift ($0.8<z<2.5$), while serving as a state-of-the-art fast radio burst detector. Bright galactic foregrounds contaminate the 400--800~MHz HIRAX frequency band, so meeting the science goals will require precise instrument characterization. In this paper we describe characterization of the HIRAX antenna, focusing on measurements of the antenna beam and antenna noise temperature.
Beam measurements of the current HIRAX antenna design were performed in an anechoic chamber and compared to simulations. We report measurement techniques and results, which find a broad and symmetric antenna beam for $ν<650$ MHz, and elevated cross-polarization levels and beam asymmetries for $ν>700$ MHz. Noise temperature measurements of the HIRAX feeds were performed in a custom apparatus built at Yale. In this system, identical loads, one cryogenic and the other at room temperature, are used to take a differential (Y-factor) measurement from which the noise of the system is inferred. Several measurement sets have been conducted using the system, involving CHIME feeds as well as four of the HIRAX active feeds. These measurements give the first noise temperature measurements of the HIRAX feed, revealing a $\sim$60 K noise temperature (relative to a 30 K target) with 40 K peak-to-peak frequency-dependent features, and provide the first demonstration of feed repeatability. Both findings inform current and future feed designs.
Submitted 25 July, 2022;
originally announced July 2022.
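The differential (Y-factor) measurement mentioned in the abstract above can be illustrated with the standard textbook relation: if the measured power is proportional to the load temperature plus the receiver noise temperature, then with $Y = P_{hot}/P_{cold}$ the noise temperature is $T_n = (T_{hot} - Y\,T_{cold})/(Y - 1)$. The sketch below is a generic illustration of that relation only; the function name and the load temperatures in the example are hypothetical and are not values or code from the HIRAX measurement.

```python
def y_factor_noise_temperature(p_hot, p_cold, t_hot, t_cold):
    """Receiver noise temperature (K) from a two-load Y-factor measurement.

    Assumes measured power is proportional to (load temperature + noise
    temperature), so Y = p_hot / p_cold = (t_hot + T_n) / (t_cold + T_n).
    """
    if p_cold <= 0 or p_hot <= p_cold:
        raise ValueError("expected p_hot > p_cold > 0")
    y = p_hot / p_cold
    return (t_hot - y * t_cold) / (y - 1.0)


if __name__ == "__main__":
    # Hypothetical numbers: a 300 K room-temperature load and a 77 K cold load.
    print(y_factor_noise_temperature(p_hot=2.0, p_cold=1.0, t_hot=300.0, t_cold=77.0))
```

Because the result depends only on the power ratio, an unknown overall gain cancels, which is why the differential measurement is convenient.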
-
Study on SiPM performance at low temperatures between $-60^{\circ}$C and $-20^{\circ}$C
Authors:
C. Zhong,
F. J. Luo,
B. Zheng,
X. D. Wang,
M. Y. Bu,
J. Zou,
M. N. Deng
Abstract:
Radon is the main background source in dark matter and neutrino experiments. Measuring the radon concentration ($\rm mBq/m^3$) with a liquid scintillation detector that uses silicon photomultiplier (SiPM) arrays at low temperatures is a highly sensitive method. The SiPM performance characteristics are closely related to the lower detection limit of the detector. In this study, we built an automatic and accurate low-temperature measurement system to study the single photoelectron (SPE) spectrum, SPE resolution, optical crosstalk, and after-pulse of the SiPM at different temperatures. As a result, we obtained the variation trend of the SiPM parameters with temperature and determined the optimal SiPM working conditions, which can improve the detector's sensitivity.
Submitted 26 October, 2022; v1 submitted 13 July, 2022;
originally announced July 2022.
-
RLPrompt: Optimizing Discrete Text Prompts with Reinforcement Learning
Authors:
Mingkai Deng,
Jianyu Wang,
Cheng-Ping Hsieh,
Yihan Wang,
Han Guo,
Tianmin Shu,
Meng Song,
Eric P. Xing,
Zhiting Hu
Abstract:
Prompting has shown impressive success in enabling large pretrained language models (LMs) to perform diverse NLP tasks, especially when only a few downstream data are available. Automatically finding the optimal prompt for each task, however, is challenging. Most existing work resorts to tuning soft prompts (e.g., embeddings), which fall short in interpretability, reusability across LMs, and applicability when gradients are not accessible. Discrete prompts, on the other hand, are difficult to optimize, and are often created by "enumeration (e.g., paraphrasing)-then-selection" heuristics that do not explore the prompt space systematically. This paper proposes RLPrompt, an efficient discrete prompt optimization approach with reinforcement learning (RL). RLPrompt formulates a parameter-efficient policy network that generates the desired discrete prompt after training with reward. To overcome the complexity and stochasticity of reward signals from the large LM environment, we incorporate effective reward stabilization that substantially enhances the training efficiency. RLPrompt is flexibly applicable to different types of LMs, such as masked (e.g., BERT) and left-to-right models (e.g., GPTs), for both classification and generation tasks. Experiments on few-shot classification and unsupervised text style transfer show superior performance over a wide range of existing finetuning or prompting methods. Interestingly, the resulting optimized prompts are often ungrammatical gibberish text; and surprisingly, those gibberish prompts are transferable between different LMs and retain significant performance, indicating that LM prompting may not follow human language patterns.
Submitted 22 October, 2022; v1 submitted 25 May, 2022;
originally announced May 2022.
-
On Problems Related to Unbounded SubsetSum: A Unified Combinatorial Approach
Authors:
Mingyang Deng,
Xiao Mao,
Ziqian Zhong
Abstract:
Unbounded SubsetSum is a classical textbook problem: given integers $w_1,w_2,\cdots,w_n\in [1,u],~c,u$, we need to determine whether there exist $m_1,m_2,\cdots,m_n\in \mathbb{N}$ satisfying $c=\sum_{i=1}^n w_im_i$. In its all-target version, $t\in \mathbb{Z}_+$ is given and the answer for all integers $c\in[0,t]$ is required. In this paper, we study three generalizations of this simple problem: All-Target Unbounded Knapsack, All-Target CoinChange and Residue Table. By new combinatorial insights into the structures of solutions, we present a novel two-phase approach for such problems. As a result, we present the first near-linear algorithms for CoinChange and Residue Table, which run in $\tilde{O}(u+t)$ and $\tilde{O}(u)$ time deterministically. We also show that if we can compute $(\min,+)$ convolution for $n$-length arrays in $T(n)$ time, then All-Target Unbounded Knapsack can be solved in $\tilde{O}(T(u)+t)$ time, thus establishing sub-quadratic equivalence between All-Target Unbounded Knapsack and $(\min,+)$ convolution.
Submitted 27 February, 2022;
originally announced February 2022.
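As an illustration of the all-target Unbounded SubsetSum problem stated in the abstract above, the following is a minimal sketch of the textbook dynamic program. It is only a baseline that runs in $O(nt)$ time; it is not the near-linear two-phase approach the paper proposes. The function name `all_target_unbounded_subset_sum` is a hypothetical choice made for this example.

```python
from typing import List


def all_target_unbounded_subset_sum(weights: List[int], t: int) -> List[bool]:
    """Textbook O(n*t) baseline (NOT the paper's near-linear algorithm).

    reachable[c] is True iff c can be written as sum(w_i * m_i)
    with non-negative integers m_i, for every target c in [0, t].
    """
    reachable = [False] * (t + 1)
    reachable[0] = True  # the empty combination (all m_i = 0) sums to 0
    for c in range(1, t + 1):
        # c is reachable if removing one copy of some weight leaves a reachable sum
        reachable[c] = any(w <= c and reachable[c - w] for w in weights)
    return reachable


if __name__ == "__main__":
    table = all_target_unbounded_subset_sum([3, 5], 10)
    print([c for c, ok in enumerate(table) if ok])
```

With weights 3 and 5, the targets 1, 2, 4 and 7 are the only unreachable values up to 10.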
-
Single-frame label-free cell tomography at speed of more than 10,000 volumes per second
Authors:
Baoliang Ge,
Yan** He,
Mo Deng,
Md Habibur Rahman,
Yi** Wang,
Ziling Wu,
Chung Hong N. Wong,
Michael K. Chan,
Yi-** Ho,
Liting Duan,
Zahid Yaqoob,
Peter T. C. So,
George Barbastathis,
Renjie Zhou
Abstract:
Three-dimensional (3D) image cytometers may significantly improve the cell analysis accuracy to facilitate biological discoveries and clinical diagnosis, but their development is curbed by the low imaging throughput. Here we report SIngle-frame LAbel-free Cell Tomography (SILACT) with diffraction-limited resolution and unprecedented imaging speed of over 10,000 volumes/second. SILACT is built on a unique interferometric microscope with angle-multiplexing illumination and a pre-trained physics-incorporating Deep Neural Network for efficient 3D Refractive Index (RI) reconstruction, from which 3D morphological and biophysical parameters of cells are extracted. With microfluidics and a high-speed camera, SILACT is capable of imaging over 20,000 cells/second and distinguishing different cell species during rapid measurements of large cell quantities, as well as visualizing shear-induced 3D transient deformation of red blood cells on a sub-millisecond scale.
Submitted 7 February, 2022;
originally announced February 2022.
-
Detection of Cosmological 21 cm Emission with the Canadian Hydrogen Intensity Mapping Experiment
Authors:
CHIME Collaboration,
Mandana Amiri,
Kevin Bandura,
Tianyue Chen,
Meiling Deng,
Matt Dobbs,
Mateus Fandino,
Simon Foreman,
Mark Halpern,
Alex S. Hill,
Gary Hinshaw,
Carolin Höfer,
Joseph Kania,
T. L. Landecker,
Joshua MacEachern,
Kiyoshi Masui,
Juan Mena-Parra,
Nikola Milutinovic,
Arash Mirhosseini,
Laura Newburgh,
Anna Ordog,
Ue-Li Pen,
Tristan Pinsonneault-Marotte,
Ava Polzin,
Alex Reda
, et al. (8 additional authors not shown)
Abstract:
We present a detection of 21-cm emission from large-scale structure (LSS) between redshift 0.78 and 1.43 made with the Canadian Hydrogen Intensity Mapping Experiment (CHIME). Radio observations acquired over 102 nights are used to construct maps which are foreground filtered and stacked on the angular and spectral locations of luminous red galaxies (LRG), emission line galaxies (ELG), and quasars (QSO) from the eBOSS clustering catalogs. We find decisive evidence for a detection when stacking on all three tracers of LSS, with the logarithm of the Bayes Factor equal to 18.9 (LRG), 10.8 (ELG), and 56.3 (QSO). An alternative frequentist interpretation, based on the likelihood-ratio test, yields a detection significance of $7.1\sigma$ (LRG), $5.7\sigma$ (ELG), and $11.1\sigma$ (QSO). These are the first 21-cm intensity mapping measurements made with an interferometer. We constrain the effective clustering amplitude of neutral hydrogen (HI), defined as $\mathcal{A}_\mathrm{HI}\equiv 10^{3}\,\Omega_\mathrm{HI}\left(b_\mathrm{HI}+\langle\,f\mu^{2}\rangle\right)$, where $\Omega_\mathrm{HI}$ is the cosmic abundance of HI, $b_\mathrm{HI}$ is the linear bias of HI, and $\langle\,f\mu^{2}\rangle=0.552$ encodes the effect of redshift-space distortions at linear order. We find $\mathcal{A}_\mathrm{HI}=1.51^{+3.60}_{-0.97}$ for LRGs $(z=0.84)$, $\mathcal{A}_\mathrm{HI}=6.76^{+9.04}_{-3.79}$ for ELGs $(z=0.96)$, and $\mathcal{A}_\mathrm{HI}=1.68^{+1.10}_{-0.67}$ for QSOs $(z=1.20)$, with constraints limited by modeling uncertainties at nonlinear scales. We are also sensitive to bias in the spectroscopic redshifts of each tracer, and find a non-zero bias $\Delta v = -66 \pm 20\,\mathrm{km/s}$ for the QSOs. We split the QSO catalog into three redshift bins and have a decisive detection in each, with the upper bin at $z=1.30$ producing the highest redshift 21-cm intensity mapping measurement thus far.
Submitted 2 February, 2022;
originally announced February 2022.
-
Using the Sun to Measure the Primary Beam Response of the Canadian Hydrogen Intensity Mapping Experiment
Authors:
CHIME Collaboration,
Mandana Amiri,
Kevin Bandura,
Anja Boskovic,
Jean-François Cliche,
Meiling Deng,
Matt Dobbs,
Mateus Fandino,
Simon Foreman,
Mark Halpern,
Alex S. Hill,
Gary Hinshaw,
Carolin Höfer,
Joseph Kania,
T. L. Landecker,
Joshua MacEachern,
Kiyoshi Masui,
Juan Mena-Parra,
Laura Newburgh,
Anna Ordog,
Tristan Pinsonneault-Marotte,
Ava Polzin,
Alex Reda,
J. Richard Shaw,
Seth R. Siegel
, et al. (5 additional authors not shown)
Abstract:
We present a beam pattern measurement of the Canadian Hydrogen Intensity Mapping Experiment (CHIME) made using the Sun as a calibration source. As CHIME is a pure drift scan instrument, we rely on the seasonal North-South motion of the Sun to probe the beam at different elevations. This semiannual range in elevation, combined with the radio brightness of the Sun, enables a beam measurement which spans ~7,200 square degrees on the sky without the need to move the telescope. We take advantage of observations made near solar minimum to minimize the impact of solar variability, which is observed to be <10% in intensity over the observation period. The resulting data set is highly complementary to other CHIME beam measurements -- both in terms of angular coverage and systematics -- and plays an important role in the ongoing program to characterize the CHIME primary beam.
Submitted 3 May, 2022; v1 submitted 27 January, 2022;
originally announced January 2022.
-
An Overview of CHIME, the Canadian Hydrogen Intensity Mapping Experiment
Authors:
The CHIME Collaboration,
Mandana Amiri,
Kevin Bandura,
Anja Boskovic,
Tianyue Chen,
Jean-François Cliche,
Meiling Deng,
Nolan Denman,
Matt Dobbs,
Mateus Fandino,
Simon Foreman,
Mark Halpern,
David Hanna,
Alex S. Hill,
Gary Hinshaw,
Carolin Höfer,
Joseph Kania,
Peter Klages,
T. L. Landecker,
Joshua MacEachern,
Kiyoshi Masui,
Juan Mena-Parra,
Nikola Milutinovic,
Arash Mirhosseini,
Laura Newburgh
, et al. (18 additional authors not shown)
Abstract:
The Canadian Hydrogen Intensity Mapping Experiment (CHIME) is a drift scan radio telescope operating across the 400-800 MHz band. CHIME is located at the Dominion Radio Astrophysical Observatory near Penticton, BC, Canada. The instrument is designed to map neutral hydrogen over the redshift range 0.8 to 2.5 to constrain the expansion history of the Universe. This goal drives the design features of the instrument. CHIME consists of four parallel cylindrical reflectors, oriented north-south, each 100 m $\times$ 20 m and outfitted with a 256-element dual-polarization linear feed array. CHIME observes a two-degree-wide stripe covering the entire meridian at any given moment, observing 3/4 of the sky every day due to Earth's rotation. An FX correlator utilizes FPGAs and GPUs to digitize and correlate the signals, with different correlation products generated for cosmological, fast radio burst, pulsar, VLBI, and 21 cm absorber backends. For the cosmology backend, the $N_\mathrm{feed}^2$ correlation matrix is formed for 1024 frequency channels across the band every 31 ms. A data receiver system applies calibration and flagging and, for our primary cosmological data product, stacks redundant baselines and integrates for 10 s. We present an overview of the instrument and its performance metrics based on the first three years of science data, and we describe the current progress in characterizing CHIME's primary beam response. We also present maps of the sky derived from CHIME data; we are using versions of these maps for a cosmological stacking analysis as well as for investigation of Galactic foregrounds.
Submitted 23 May, 2022; v1 submitted 19 January, 2022;
originally announced January 2022.
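The instrument figures quoted in the CHIME overview abstract above can be turned into a small back-of-the-envelope calculation. The sketch below is illustrative only: it assumes that each of the two polarizations of every feed is a separate correlator input, and that $N_\mathrm{feed}$ in the abstract counts those inputs. Neither assumption is stated in the abstract, and the function names are hypothetical.

```python
# Figures taken from the abstract.
CYLINDERS = 4               # four parallel cylindrical reflectors
FEEDS_PER_CYLINDER = 256    # 256-element linear feed array per cylinder
BAND_MHZ = (400.0, 800.0)   # operating band
CHANNELS = 1024             # frequency channels in the cosmology backend

# ASSUMPTION (not stated in the abstract): one input per polarization.
POLARIZATIONS = 2


def signal_inputs() -> int:
    """Number of correlator inputs under the assumption above."""
    return CYLINDERS * FEEDS_PER_CYLINDER * POLARIZATIONS


def unique_products(n_inputs: int) -> int:
    """Unique pairs (including autocorrelations) in an n x n Hermitian matrix."""
    return n_inputs * (n_inputs + 1) // 2


def channel_width_khz() -> float:
    """Width of one frequency channel if the band is divided evenly."""
    low, high = BAND_MHZ
    return (high - low) * 1000.0 / CHANNELS


if __name__ == "__main__":
    n = signal_inputs()
    print(n, unique_products(n), channel_width_khz())
```

Under these assumptions the correlation matrix has a little over two million unique entries per frequency channel, which gives a sense of why the abstract describes stacking redundant baselines and integrating before the data are stored.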
-
Learning by Active Forgetting for Neural Networks
Authors:
Jian Peng,
Xian Sun,
Min Deng,
Chao Tao,
Bo Tang,
Wenbo Li,
Guohua Wu,
Qing Zhu,
Yu Liu,
Tao Lin,
Haifeng Li
Abstract:
Remembering and forgetting mechanisms are two sides of the same coin in a human learning-memory system. Inspired by human brain memory mechanisms, modern machine learning systems have been working to endow machines with lifelong learning capability through better remembering, while treating forgetting as the antagonist to overcome. Nevertheless, this idea might see only half of the picture. Recently, a growing number of researchers have argued that the brain is born to forget, i.e., forgetting is a natural and active process for abstract, rich, and flexible representations. This paper presents a learning model based on an active forgetting mechanism for artificial neural networks. The active forgetting mechanism (AFM) is introduced to a neural network via a "plug-and-play" forgetting layer (P&PF), consisting of groups of inhibitory neurons with an Internal Regulation Strategy (IRS) to adjust their own extinction rate via a lateral inhibition mechanism and an External Regulation Strategy (ERS) to adjust the extinction rate of excitatory neurons via an inhibition mechanism. Experimental studies have shown that the P&PF offers surprising benefits: self-adaptive structure, strong generalization, long-term learning and memory, and robustness to data and parameter perturbation. This work sheds light on the importance of forgetting in the learning process and offers new perspectives for understanding the underlying mechanisms of neural networks.
Submitted 21 November, 2021;
originally announced November 2021.
-
Non-linear antidamping spin-orbit torque originating from intra-band transport on the warped surface of a topological insulator
Authors:
Yong-Long Zhou,
Hou-Jian Duan,
Yong-jia Wu,
Ming-Xun Deng,
Lan Wang,
Dimitrie Culcer,
Rui-Qiang Wang
Abstract:
Motivated by recent experiments observing a large antidamping spin-orbit torque (SOT) on the surface of a three-dimensional topological insulator, we investigate the origin of the current-induced SOT beyond linear-response theory. We find that a strong antidamping SOT arises from intraband transitions in non-linear response, and does not require interband transitions as is the case in linear transport mechanisms. The joint effect of warping and an in-plane magnetization generates a non-linear antidamping SOT which can exceed the intrinsic one by several orders of magnitude, depending on the warping parameter and the position of the Fermi energy, and exhibits a complex dependence on the azimuthal angle of the magnetization. This nonlinear SOT provides an alternative explanation of the observed giant SOT in recent experiments.
Submitted 5 November, 2021;
originally announced November 2021.
-
One-Bit ADCs/DACs based MIMO Radar: Performance Analysis and Joint Design
Authors:
Minglong Deng,
Ziyang Cheng,
Linlong Wu,
Bhavani Shankar,
Zishu He
Abstract:
Extremely low-resolution (e.g. one-bit) analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) can substantially reduce hardware cost and power consumption for MIMO radar, especially with large-scale antenna arrays. In this paper, we focus on the detection performance analysis and joint design for MIMO radar with one-bit ADCs and DACs. Specifically, under the assumption of low signal-to-noise ratio (SNR) and interference-to-noise ratio (INR), we derive expressions for the probability of detection ($\mathcal{P}_d$) and probability of false alarm ($\mathcal{P}_f$) for one-bit MIMO radar, and also the theoretical performance gap to infinite-bit MIMO radars for the noise-only case. We further find that for a fixed $\mathcal{P}_f$, $\mathcal{P}_d$ depends on the defined quantized signal-to-interference-plus-noise ratio (QSINR), which is a function of the transmit waveform and receive filter. Thus, an optimization problem arises naturally to maximize the QSINR by jointly designing the waveform and filter. For the formulated problem, we propose an alternatinG wavefoRm and filtEr dEsign for QSINR maximizaTion (GREET). At each iteration of GREET, the receive filter is updated via the minimum variance distortionless response (MVDR) method, and the one-bit waveform is optimized based on the alternating direction method of multipliers (ADMM) algorithm, where closed-form solutions are obtained for both the primary and slack variables. Numerical simulations are consistent with the theoretical performance analysis and demonstrate the effectiveness of the proposed design algorithm.
Submitted 24 December, 2021; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation
Authors:
Mingkai Deng,
Bowen Tan,
Zhengzhong Liu,
Eric P. Xing,
Zhiting Hu
Abstract:
Natural language generation (NLG) spans a broad range of tasks, each of which serves specific objectives and calls for different properties of the generated text. This complexity makes automatic evaluation of NLG particularly challenging. Previous work has typically focused on a single task and developed individual evaluation metrics based on specific intuitions. In this paper, we propose a unifying perspective that facilitates the design of metrics for a wide range of language generation tasks and quality aspects. Based on the nature of information change from input to output, we classify NLG tasks into compression (e.g., summarization), transduction (e.g., text rewriting), and creation (e.g., dialog). The information alignment, or overlap, between input, context, and output text plays a common central role in characterizing the generation. Using the uniform concept of information alignment, we develop a family of interpretable metrics for various NLG tasks and aspects, often without the need for gold reference data. To operationalize the metrics, we train self-supervised models to approximate information alignment as a prediction task. Experiments show that the uniformly designed metrics achieve stronger or comparable correlations with human judgement compared to state-of-the-art metrics across diverse tasks, including text summarization, style transfer, and knowledge-grounded dialog. With information alignment as the intermediate representation, we deliver a composable library for easy NLG evaluation and future metric design.
Submitted 21 January, 2022; v1 submitted 13 September, 2021;
originally announced September 2021.
-
ARGO: Modeling Heterogeneity in E-commerce Recommendation
Authors:
Daqing Wu,
Xiao Luo,
Zeyu Ma,
Chong Chen,
Minghua Deng,
**wen Ma
Abstract:
Nowadays, E-commerce is increasingly integrated into our daily lives. Meanwhile, the shopping process has also changed incrementally from one behavior (purchase) to multiple behaviors (such as view, carting and purchase). Therefore, utilizing interaction data from auxiliary behaviors has drawn a lot of attention in E-commerce recommender systems. However, all existing models ignore two kinds of intrinsic heterogeneity which are helpful for capturing the difference of user preferences and the difference of item attributes. First (intra-heterogeneity), each user has multiple distinct social identities, and these different identities can result in quite different interaction preferences. Second (inter-heterogeneity), each item can transfer an item-specific percentage of score from low-level behavior to high-level behavior for the gradual relationship among multiple behaviors. Thus, the lack of consideration of these heterogeneities damages recommendation ranking performance. To model the above heterogeneities, we propose a novel method named intra- and inter-heterogeneity recommendation model (ARGO). Specifically, we embed each user into multiple vectors representing the user's identities, and the maximum of identity scores indicates the interaction preference. Besides, we regard the item-specific transition percentage as a trainable transition probability between different behaviors. Extensive experiments on two real-world datasets show that ARGO performs much better than the state-of-the-art in multi-behavior scenarios.
Submitted 14 September, 2021; v1 submitted 13 September, 2021;
originally announced September 2021.
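The multi-identity scoring idea described in the ARGO abstract above (several vectors per user, with the maximum identity score taken as the interaction preference) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a plain dot product as the per-identity score, which the abstract does not specify, and the name `preference_score` is hypothetical.

```python
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product; ASSUMED here as the per-identity scoring function."""
    return sum(x * y for x, y in zip(a, b))


def preference_score(
    identity_vectors: Sequence[Sequence[float]],
    item_vector: Sequence[float],
) -> float:
    """Score an item against every identity vector and keep the maximum.

    The abstract only states that the maximum of the identity scores
    indicates the interaction preference; everything else is illustrative.
    """
    return max(dot(identity, item_vector) for identity in identity_vectors)


if __name__ == "__main__":
    user_identities = [[1.0, 0.0], [0.0, 1.0]]  # two toy identities
    item = [0.2, 0.9]
    print(preference_score(user_identities, item))
```

Taking the maximum means a single well-matched identity is enough for an item to rank highly, even when the user's other identities are indifferent to it.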
-
Statistical computation methods for microbiome compositional data network inference
Authors:
Liang Chen,
Qiuyan He,
Hui Wan,
Shun He,
Minghua Deng
Abstract:
Microbes can affect processes from food production to human health. Such microbes are not isolated, but rather interact with each other and establish connections with their living environments. Understanding these interactions is essential to an understanding of the organization and complex interplay of microbial communities, as well as the structure and dynamics of various ecosystems. A common and essential approach toward this objective involves the inference of microbiome interaction networks. Although network inference methods in other fields have been studied before, applying these methods to estimate microbiome associations based on compositional data will not yield valid results. On the one hand, features of microbiome data such as compositionality, sparsity and high dimensionality challenge data normalization and the design of computational methods. On the other hand, several issues such as microbial community heterogeneity, external environmental interference and biological concerns also make network inference more difficult. In this paper, we provide a comprehensive review of emerging microbiome interaction network inference methods. According to various assumptions and research targets, estimated networks are divided into four main categories: correlation networks, conditional correlation networks, mixture networks and differential networks. Their scope of application, advantages and limitations are presented in this review. Since real microbial interactions can be complex and dynamic, no unifying method has captured all the aspects of interest to date. In addition, we discuss the challenges currently confronting the study of microbial associations, as well as future prospects. Finally, we highlight that research in microbial network inference requires the joint advancement of statistical computation methods and experimental techniques.
Submitted 5 September, 2021;
originally announced September 2021.
-
Circuit complexity in Proca theory
Authors:
Kun Meng,
Meihua Deng,
Jiaqiang Zhao,
Lianzhen Cao
Abstract:
In this paper, we study circuit complexity in Proca theory with Nielsen's approach and the Fubini-Study (FS) metric approach. We place the fields on a lattice to obtain a regularized theory, and obtain the ground state by adopting proper coordinates. We calculate the complexities of the ground and thermofield double (TFD) states with Nielsen's approach; the complexity of the TFD state is found to grow like a logarithmic function. We quantize the Proca fields and give the approximate ground state and TFD state by acting with unitary circuit operators on the associated reference states. The circuit lengths are calculated with the FS metric, and the minimal lengths are given according to the associated geometric spaces. The complexity of the TFD state is found to grow linearly with time.
Submitted 1 December, 2021; v1 submitted 16 August, 2021;
originally announced August 2021.
-
Neighborhood Consensus Contrastive Learning for Backward-Compatible Representation
Authors:
Shengsen Wu,
Liang Chen,
Yihang Lou,
Yan Bai,
Tao Bai,
Minghua Deng,
Lingyu Duan
Abstract:
In object re-identification (ReID), the development of deep learning techniques often involves model updates and deployment. It is impractical to re-embed and re-index the database with the system suspended when deploying new models. Therefore, backward-compatible representation is proposed to enable "new" features to be compared with "old" features directly, which means that the database remains active when there are both "new" and "old" features in it. Thus we can scroll-refresh the database, or even leave the database untouched, when updating.
The existing backward-compatible methods either require a strong overlap between old and new training data or simply impose constraints at the instance level. Thus they have difficulty handling complicated cluster structures and are limited in eliminating the impact of outliers in old embeddings, resulting in a risk of damaging the discriminative capability of new features. In this work, we propose a Neighborhood Consensus Contrastive Learning (NCCL) method. With no assumptions about the new training data, we estimate the sub-cluster structures of old embeddings. A new embedding is constrained with multiple old embeddings in both embedding space and discrimination space at the sub-class level. The effect of outliers is diminished, as the multiple samples serve as "mean teachers". Besides, we also propose a scheme to filter out old embeddings with low credibility, further improving the compatibility robustness. Our method ensures backward compatibility without impairing the accuracy of the new model, and it can even improve the new model's accuracy in most scenarios.
Submitted 8 March, 2023; v1 submitted 7 August, 2021;
originally announced August 2021.
-
Performance Prediction of InP/GaAsSb Double Heterojunction Bipolar Transistors for THz applications
Authors:
Xin Wen,
Akshay Arabhavi,
Wei Quan,
Olivier Ostinelli,
Chhandak Mukherjee,
Marina Deng,
Sébastien Frégonèse,
Thomas Zimmer,
Cristell Maneux,
Colombo R. Bolognesi,
Mathieu Luisier
Abstract:
The intrinsic performance of "type-II" InP/GaAsSb double heterojunction bipolar transistors (DHBTs) towards and beyond THz is predicted and analyzed based on a multi-scale technology computer aided design (TCAD) modeling platform calibrated against experimental measurements. Two-dimensional hydrodynamic simulations are combined with 1-D full-band, atomistic quantum transport calculations to shed light on future DHBT generations whose dimensions are decreased step-by-step, starting from the current device configuration. Simulations predict that a peak transit frequency $f_{T,peak}$ of around 1.6 THz could be reached in aggressively scaled type-II DHBTs with a total thickness of 256 nm and an emitter width $W_E$ of 37.5 nm. The corresponding breakdown voltage $BV_{CEO}$ is estimated to be 2.2 V. The investigations are put in perspective with two DHBT performance limiting factors, self-heating and breakdown characteristics.
Submitted 18 July, 2021;
originally announced July 2021.
-
Sub-second periodicity in a fast radio burst
Authors:
The CHIME/FRB Collaboration,
Bridget C. Andersen,
Kevin Bandura,
Mohit Bhardwaj,
P. J. Boyle,
Charanjot Brar,
Daniela Breitman,
Tomas Cassanelli,
Shami Chatterjee,
Pragya Chawla,
Jean-François Cliche,
Davor Cubranic,
Alice P. Curtin,
Meiling Deng,
Matt Dobbs,
Fengqiu Adam Dong,
Emmanuel Fonseca,
B. M. Gaensler,
Utkarsh Giri,
Deborah C. Good,
Alex S. Hill,
Alexander Josephy,
J. F. Kaczmarek,
Zarif Kader,
Joseph Kania
, et al. (37 additional authors not shown)
Abstract:
Fast radio bursts (FRBs) are millisecond-duration flashes of radio waves that are visible at distances of billions of light-years. The nature of their progenitors and their emission mechanism remain open astrophysical questions. Here we report the detection of the multi-component FRB 20191221A and the identification of a periodic separation of 216.8(1) ms between its components with a significance of 6.5 sigma. The long (~3 s) duration and nine or more components forming the pulse profile make this source an outlier in the FRB population. Such short periodicity provides strong evidence for a neutron-star origin of the event. Moreover, our detection favours emission arising from the neutron-star magnetosphere, as opposed to emission regions located further away from the star, as predicted by some models.
Submitted 12 July, 2022; v1 submitted 18 July, 2021;
originally announced July 2021.
-
Theory and simulation of electrokinetic fluctuations in electrolyte solutions at the mesoscale
Authors:
Mingge Deng,
Faisal Tushar,
Luis Bravo,
Anindya Ghoshal,
George Karniadakis,
Zhen Li
Abstract:
Electrolyte solutions play an important role in energy storage devices, whose performance highly relies on the electrokinetic processes at sub-micron scales. Although fluctuations and stochastic features become more critical at small scales, the long-range Coulomb interactions pose a particular challenge for both theoretical analysis and simulation of fluid systems with fluctuating hydrodynamic and electrostatic interactions. Here, we present a theoretical framework based on the Landau-Lifshitz theory to derive closed-form expressions of fluctuation correlations in electrolyte solutions, indicating significantly different decorrelation processes of ionic concentration fluctuations from hydrodynamic fluctuations, which provides insights for understanding transport phenomena of coupled fluctuating hydrodynamics and electrokinetics. Furthermore, we simulate fluctuating electrokinetic systems using both molecular dynamics (MD) with explicit ions and mesoscopic charged dissipative particle dynamics (cDPD) with semi-implicit ions, from which we identify that the spatial probability density functions of local charge density follow Gamma distribution at sub-nanometer scale (i.e., 0.3 nm) and converge to Gaussian distribution above nanometer scales (i.e., 1.55 nm), indicating the existence of a lower limit of length scale for mesoscale models using Gaussian fluctuations. The temporal correlation functions of both hydrodynamic and electrokinetic fluctuations are computed from all-atom MD and mesoscale cDPD simulations, showing a good agreement with the theoretical predictions based on the linearized fluctuating hydrodynamics theory.
Submitted 12 July, 2021;
originally announced July 2021.