-
GenDet: Towards Good Generalizations for AI-Generated Image Detection
Authors:
Mingjian Zhu,
Hanting Chen,
Mouxiao Huang,
Wei Li,
Hailin Hu,
Jie Hu,
Yunhe Wang
Abstract:
The misuse of AI imagery can have harmful societal effects, prompting the creation of detectors to combat issues like the spread of fake news. Existing methods can effectively detect images generated by seen generators, but it is challenging to detect those generated by unseen generators. They do not concentrate on amplifying the output discrepancy when detectors process real versus fake images. This results in a close output distribution of real and fake samples, increasing classification difficulty in detecting unseen generators. This paper addresses the unseen-generator detection problem by considering this task from the perspective of anomaly detection and proposes an adversarial teacher-student discrepancy-aware framework. Our method encourages smaller output discrepancies between the student and the teacher models for real images while aiming for larger discrepancies for fake images. We employ adversarial learning to train a feature augmenter, which promotes smaller discrepancies between teacher and student networks when the inputs are fake images. Our method has achieved state-of-the-art on public benchmarks, and the visualization results show that a large output discrepancy is maintained when faced with various types of generators.
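The discrepancy objective described in this abstract can be illustrated with a minimal sketch. It is not the authors' loss: the mean-squared discrepancy, the hinge form, and the `margin` parameter are all assumptions chosen for illustration, and the adversarially trained feature augmenter is omitted entirely.

```python
def discrepancy(teacher_out, student_out):
    """Mean squared difference between two equal-length output vectors."""
    return sum((t - s) ** 2 for t, s in zip(teacher_out, student_out)) / len(teacher_out)


def discrepancy_loss(teacher_out, student_out, is_real, margin=1.0):
    """Hypothetical loss: pull outputs together on real images, push apart on fakes.

    `margin` is an illustrative hyperparameter, not a value from the paper.
    """
    d = discrepancy(teacher_out, student_out)
    if is_real:
        # Real image: any teacher-student discrepancy is penalized.
        return d
    # Fake image: penalize only while the discrepancy is still below the margin.
    return max(0.0, margin - d)
```

At test time, under the same assumptions, an image whose teacher-student discrepancy exceeds a chosen threshold would be flagged as generated, which is the anomaly-detection view the abstract refers to.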
Submitted 12 December, 2023;
originally announced December 2023.
-
BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics
Authors:
Wenqian Zhang,
Molin Huang,
Yuxuan Zhou,
Juze Zhang,
Jingyi Yu,
Jingya Wang,
Lan Xu
Abstract:
The recently emerging text-to-motion advances have inspired numerous attempts at convenient and interactive human motion generation. Yet, existing methods are largely limited to generating body motions only without considering the rich two-hand motions, let alone handling various conditions like body dynamics or texts. To break the data bottleneck, we propose BOTH57M, a novel multi-modal dataset for two-hand motion generation. Our dataset includes accurate motion tracking for the human body and hands and provides pairwise finger-level hand annotations and body descriptions. We further provide a strong baseline method, BOTH2Hands, for the novel task: generating vivid two-hand motions from both implicit body dynamics and explicit text prompts. We first warm up two parallel body-to-hand and text-to-hand diffusion models and then utilize the cross-attention transformer for motion blending. Extensive experiments and cross-validations demonstrate the effectiveness of our approach and dataset for generating convincing two-hand motions from the hybrid body-and-textual conditions. Our dataset and code will be disseminated to the community for future research.
Submitted 10 April, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
The FAST all sky HI survey (FASHI): The first release of catalog
Authors:
Chuan-Peng Zhang,
M. Zhu,
P. Jiang,
C. Cheng,
J. Wang,
J. Wang,
J. -L. Xu,
X. -L. Liu,
N. -P. Yu,
L. Qian,
H. Yu,
M. Ai,
Y. Jing,
C. Xu,
Z. Liu,
X. Guan,
C. Sun,
Q. Yang,
M. Huang,
Q. Hao,
FAST Collaboration
Abstract:
The FAST All Sky HI survey (FASHI) was designed to cover the entire sky observable by the Five-hundred-meter Aperture Spherical radio Telescope (FAST), spanning approximately 22000 square degrees of declination between -14 deg and +66 deg, and in the frequency range of 1050-1450 MHz, with the expectation of eventually detecting more than 100000 HI sources. Between August 2020 and June 2023, FASHI had covered more than 7600 square degrees, which is approximately 35% of the total sky observable by FAST. It has a median detection sensitivity of around 0.76 mJy/beam and a spectral line velocity resolution of ~6.4 km/s at a frequency of ~1.4 GHz. As of now, a total of 41741 extragalactic HI sources have been detected in the frequency range 1305.5-1419.5 MHz, corresponding to a redshift limit of z<0.09. By cross-matching FASHI sources with the Siena Galaxy Atlas (SGA) and the Sloan Digital Sky Survey (SDSS) catalogs, we found that 16972 (40.7%) sources have spectroscopic redshifts and 10975 (26.3%) sources have only photometric redshifts. Most of the remaining 13794 (33.0%) HI sources are located in the direction of the Galactic plane, making their optical counterparts difficult to identify due to high extinction or high contamination of Galactic stellar sources. Based on current survey results, the FASHI survey is an unprecedented blind extragalactic HI survey. It has higher spectral and spatial resolution and broader coverage than the Arecibo Legacy Fast ALFA Survey (ALFALFA). When completed, FASHI will provide the largest extragalactic HI catalog and an objective view of HI content and large-scale structure in the local universe.
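The redshift limit and the cross-match percentages quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes the standard HI 21 cm rest frequency of about 1420.406 MHz, which is not stated in the abstract; the variable and function names are illustrative.

```python
# Rest frequency of the neutral-hydrogen (HI) 21 cm line in MHz
# (standard physical constant; assumed here, not quoted in the abstract).
HI_REST_MHZ = 1420.405751768


def redshift_from_observed(freq_mhz: float) -> float:
    """Redshift at which the HI line is observed at freq_mhz."""
    return HI_REST_MHZ / freq_mhz - 1.0


# Lower edge of the frequency range quoted in the abstract.
z_max = redshift_from_observed(1305.5)

# Source counts quoted in the abstract.
total = 41741
counts = {"spectroscopic": 16972, "photometric_only": 10975, "remaining": 13794}
fractions = {k: round(100.0 * v / total, 1) for k, v in counts.items()}

print(f"z_max = {z_max:.3f}")
print(fractions)
```

The lower band edge gives a redshift just under 0.09, and the three counts add up to the quoted total of 41741 sources.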
Submitted 10 December, 2023;
originally announced December 2023.
-
Towards the Inferrence of Structural Similarity of Combinatorial Landscapes
Authors:
Mingyu Huang,
Ke Li
Abstract:
One of the most common problem-solving heuristics is by analogy. For a given problem, a solver can be viewed as a strategic walk on its fitness landscape. Thus if a solver works for one problem instance, we expect it will also be effective for other instances whose fitness landscapes essentially share structural similarities with each other. However, due to the black-box nature of combinatorial optimization, it is far from trivial to infer such similarity in real-world scenarios. To bridge this gap, by using local optima network as a proxy of fitness landscapes, this paper proposed to leverage graph data mining techniques to conduct qualitative and quantitative analyses to explore the latent topological structural information embedded in those landscapes. By conducting large-scale empirical experiments on three classic combinatorial optimization problems, we gain concrete evidence to support the existence of structural similarity between landscapes of the same classes within neighboring dimensions. We also interrogated the relationship between landscapes of different problem classes.
Submitted 5 December, 2023;
originally announced December 2023.
-
Efficient LDPC Decoding using Physical Computation
Authors:
Uday Kumar Reddy Vengalam,
Andrew Hahn,
Yongchao Liu,
Anshujit Sharma,
Hui Wu,
Michael Huang
Abstract:
Due to 5G deployment, there is significant interest in LDPC decoding. While much research is devoted to efficient hardwiring of algorithms based on Belief Propagation (BP), it has been shown that LDPC decoding can be formulated as a combinatorial optimization problem, which could benefit from significant acceleration by physical computation mechanisms such as Ising machines. This approach has so far resulted in poor performance. This paper shows that the reason is not a fundamental limitation but suboptimal hardware and formulation. A co-designed Ising machine-based system can improve speed by 3 orders of magnitude. As a result, a physical computation approach can outperform hardwiring state-of-the-art algorithms. In this paper, we show such an augmented Ising machine that is 4.4$\times$ more energy efficient than the state of the art in the literature.
Submitted 20 September, 2023;
originally announced December 2023.
-
Non-saturation intensity dependence of anisotropic third-order optical nonlinearity approaching the damage threshold in ZnSe and GaP
Authors:
Jianpeng Ye,
Min Huang
Abstract:
The intensity dependence of anisotropic third-order optical nonlinearity approaching the damage threshold in ZnSe and GaP crystals is studied by femtosecond laser pump-probe measurements, which can greatly reduce the laser-matter interaction length and thus enable probing of the orientation-dependent characteristics of nonlinear optical phenomena in the near-damage-threshold intensity regime without significant photon depletion. In the measured transient 3D map, the typical third-order nonlinear optical signals of two-beam coupling (TBC) and two-photon absorption (TPA) can be clearly identified, both of which exhibit a pronounced orientation-dependent periodic modulation corresponding to a specific lattice symmetry. Interestingly, further fixed-delay-time measurements focusing on TBC and TPA confirm that the modulation amplitude of the orientation-dependent curves always increases as the pump intensity increases towards the damage threshold, which has not been observed in previous studies. Such a definite upward trend of orientation-dependent third-order nonlinear optical effects in the near-damage-threshold regime indicates that, as long as the laser-matter interaction length is small enough, the third-order nonlinear optical phenomena can remain in a non-saturation physical regime up to the damage threshold, and thus exhibit a significant crystallographic dependence similar to that of laser-induced damage in similar intensity ranges.
Submitted 4 December, 2023;
originally announced December 2023.
-
Metastability and anharmonicity enhance defect-assisted nonradiative recombination in low-symmetry semiconductors
Authors:
Menglin Huang,
Shanshan Wang,
Shiyou Chen
Abstract:
Strong nonradiative recombination has been observed in quasi-one-dimensional antimony selenide, which runs counter to the simple intuition that high defect tolerance exists in semiconductors with an antibonding state in the valence band and a bonding state in the conduction band. Here we reveal that such defect intolerance actually stems from the richness of structural metastability and vibrational anharmonicity owing to the low-symmetry atomic structure. Taking the deep defect V$_{\rm Se}$ as a benchmark, we show that the defect in its ground-state configuration alone does not act as a recombination center. Instead, we identify three different configurations with different formation energies; such richness of metastability offers a higher probability of accomplishing a rapid recombination cycle. Another contributing factor is the anharmonicity in the potential energy surfaces caused by the large atomic relaxation, which elevates the total capture coefficient by 2-3 orders of magnitude compared with the harmonic approximation. Therefore, the unique properties of both crystals and phonons in the quasi-one-dimensional system enhance the nonradiative recombination, making the traditional intuition of defect tolerance invalid. These results highlight the importance of correctly identifying metastable defects and phonon anharmonicity in the nonradiative recombination in low-symmetry semiconductors.
Submitted 4 December, 2023;
originally announced December 2023.
-
AlignBench: Benchmarking Chinese Alignment of Large Language Models
Authors:
Xiao Liu,
Xuanyu Lei,
Shengyuan Wang,
Yue Huang,
Zhuoer Feng,
Bosi Wen,
Jiale Cheng,
Pei Ke,
Yifan Xu,
Weng Lam Tam,
Xiaohan Zhang,
Lichao Sun,
Hongning Wang,
Jing Zhang,
Minlie Huang,
Yuxiao Dong,
Jie Tang
Abstract:
Alignment has become a critical step for instruction-tuned Large Language Models (LLMs) to become helpful assistants. However, effective evaluation of alignment for emerging Chinese LLMs is still significantly lacking, calling for real-scenario grounded, open-ended, challenging and automatic evaluations tailored for alignment. To fill in this gap, we introduce AlignBench, a comprehensive multi-dimensional benchmark for evaluating LLMs' alignment in Chinese. Equipped with a human-in-the-loop data curation pipeline, our benchmark employs a rule-calibrated multi-dimensional LLM-as-Judge with Chain-of-Thought to generate explanations and final ratings as evaluations, ensuring high reliability and interpretability. Furthermore, we report AlignBench evaluated by CritiqueLLM, a dedicated Chinese evaluator LLM that recovers 95% of GPT-4's evaluation ability. We will provide public APIs for evaluating AlignBench with CritiqueLLM to facilitate the evaluation of LLMs' Chinese alignment. All evaluation codes, data, and LLM generations are available at \url{https://github.com/THUDM/AlignBench}.
Submitted 5 December, 2023; v1 submitted 30 November, 2023;
originally announced November 2023.
-
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation
Authors:
Pei Ke,
Bosi Wen,
Zhuoer Feng,
Xiao Liu,
Xuanyu Lei,
Jiale Cheng,
Shengyuan Wang,
Aohan Zeng,
Yuxiao Dong,
Hongning Wang,
Jie Tang,
Minlie Huang
Abstract:
Since the natural language processing (NLP) community started to make large language models (LLMs) act as a critic to evaluate the quality of generated texts, most of the existing works train a critique generation model on the evaluation data labeled by GPT-4's direct prompting. We observe that these models lack the ability to generate informative critiques in both pointwise grading and pairwise comparison especially without references. As a result, their generated critiques cannot provide fine-grained distinguishability on generated texts, causing unsatisfactory evaluation performance. In this paper, we propose a simple yet effective method called Eval-Instruct, which can first acquire pointwise grading critiques with pseudo references and then revise these critiques via multi-path prompting to obtain informative evaluation data in different tasks and settings, including pointwise grading and pairwise comparison with / without references. After fine-tuning on these data, the resulting model CritiqueLLM is empirically shown to outperform ChatGPT and all the open-source baselines and even achieve comparable evaluation performance to GPT-4 in system-level correlations of pointwise grading. We also demonstrate that our generated critiques can act as scalable feedback to further improve the generation quality of strong LLMs like ChatGPT.
Submitted 26 June, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Unveiling the Implicit Toxicity in Large Language Models
Authors:
Jiaxin Wen,
Pei Ke,
Hao Sun,
Zhexin Zhang,
Chengfei Li,
Jinfeng Bai,
Minlie Huang
Abstract:
The open-endedness of large language models (LLMs) combined with their impressive capabilities may lead to new safety issues when being exploited for malicious use. While recent studies primarily focus on probing toxic outputs that can be easily detected with existing toxicity classifiers, we show that LLMs can generate diverse implicit toxic outputs that are exceptionally difficult to detect via simply zero-shot prompting. Moreover, we propose a reinforcement learning (RL) based attacking method to further induce the implicit toxicity in LLMs. Specifically, we optimize the language model with a reward that prefers implicit toxic outputs to explicit toxic and non-toxic ones. Experiments on five widely-adopted toxicity classifiers demonstrate that the attack success rate can be significantly improved through RL fine-tuning. For instance, the RL-finetuned LLaMA-13B model achieves an attack success rate of 90.04% on BAD and 62.85% on Davinci003. Our findings suggest that LLMs pose a significant threat in generating undetectable implicit toxic outputs. We further show that fine-tuning toxicity classifiers on the annotated examples from our attacking method can effectively enhance their ability to detect LLM-generated implicit toxic language. The code is publicly available at https://github.com/thu-coai/Implicit-Toxicity.
Submitted 29 November, 2023;
originally announced November 2023.
-
CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models
Authors:
Jinfeng Zhou,
Zhuang Chen,
Dazhen Wan,
Bosi Wen,
Yi Song,
Jifan Yu,
Yongkang Huang,
Libiao Peng,
Jiaming Yang,
Xiyao Xiao,
Sahand Sabour,
Xiaohan Zhang,
Wenjing Hou,
Yijia Zhang,
Yuxiao Dong,
Jie Tang,
Minlie Huang
Abstract:
In this paper, we present CharacterGLM, a series of models built upon ChatGLM, with model sizes ranging from 6B to 66B parameters. Our CharacterGLM is designed for generating Character-based Dialogues (CharacterDial), which aims to equip a conversational AI system with character customization for satisfying people's inherent social desires and emotional needs. On top of CharacterGLM, we can customize various AI characters or social agents by configuring their attributes (identities, interests, viewpoints, experiences, achievements, social relationships, etc.) and behaviors (linguistic features, emotional expressions, interaction patterns, etc.). Our model outperforms most mainstream closed-source large language models, including the GPT series, especially in terms of consistency, human-likeness, and engagement according to manual evaluations. We will release our 6B version of CharacterGLM and a subset of training data to facilitate further research development in the direction of character-based dialogue generation.
Submitted 28 November, 2023;
originally announced November 2023.
-
Strong Interference HVSR Data Processing and Denoising: HVSR Curve Reconstruction Method based on UPEMD
Authors:
Bingxuan Song,
Fuxing Han,
Yubei Chen,
Linjun Wu,
Mengting Huang,
Yanjie Pan
Abstract:
Urban areas pose a challenge for the application of the H/V method due to a high degree of artificial noise. The existing methods fall short in reducing the noise of strong interference data. To solve this issue, a new approach called the HVSR curve reconstruction method is introduced in this paper. The method employs the UPEMD technique to analyze the data component, and the extracted signal is evaluated based on the correlation coefficient between the IMFs and the original micro-motion data, trend extraction of micro-motion data, and secondary extraction. This signal is then utilized to retrieve information about the layers, and the effectiveness of the proposed method is demonstrated.
Submitted 24 November, 2023;
originally announced November 2023.
-
On the Hyperparameter Loss Landscapes of Machine Learning Models: An Exploratory Study
Authors:
Mingyu Huang,
Ke Li
Abstract:
Previous efforts on hyperparameter optimization (HPO) of machine learning (ML) models predominately focus on algorithmic advances, yet little is known about the topography of the underlying hyperparameter (HP) loss landscape, which plays a fundamental role in governing the search process of HPO. While several works have conducted fitness landscape analysis (FLA) on various ML systems, they are limited to properties of isolated landscape without interrogating the potential structural similarities among them. The exploration of such similarities can provide a novel perspective for understanding the mechanism behind modern HPO methods, but has been missing, possibly due to the expensive cost of large-scale landscape construction, and the lack of effective analysis methods. In this paper, we mapped 1,500 HP loss landscapes of 6 representative ML models on 63 datasets across different fidelity levels, with 11M+ configurations. By conducting exploratory analysis on these landscapes with fine-grained visualizations and dedicated FLA metrics, we observed a similar landscape topography across a wide range of models, datasets, and fidelities, and shed light on several central topics in HPO.
Submitted 24 May, 2024; v1 submitted 23 November, 2023;
originally announced November 2023.
-
Local control of a single nitrogen-vacancy center by nanoscale engineered magnetic domain wall motions
Authors:
Nathan J. McLaughlin,
Senlei Li,
Jeffrey A. Brock,
Shu Zhang,
Hanyi Lu,
Mengqi Huang,
Yuxuan Xiao,
Jingcheng Zhou,
Yaroslav Tserkovnyak,
Eric E. Fullerton,
Hailong Wang,
Chunhui Rita Du
Abstract:
Effective control and readout of qubits form the technical foundation of next-generation, transformative quantum information sciences and technologies. The nitrogen-vacancy (NV) center, an intrinsic three-level spin system, is naturally relevant in this context due to its excellent quantum coherence, high fidelity of operations, and remarkable functionality over a broad range of experimental conditions. It is an active contender for the development and implementation of cutting-edge quantum technologies. Here, we report magnetic domain wall motion driven local control and measurements of NV spin properties. By engineering the local magnetic field environment of an NV center via nanoscale reconfigurable domain wall motions, we show that NV photoluminescence, spin level energies, and coherence time can be reliably controlled and correlated to the magneto-transport response of a magnetic device. Our results highlight the electrically tunable dipole interaction between NV centers and nanoscale magnetic structures, providing an attractive platform to realize interactive information transfer between spin qubits and non-volatile magnetic memory in hybrid quantum spintronic systems.
Submitted 20 November, 2023;
originally announced November 2023.
-
Prominent Josephson tunneling between twisted single copper oxide planes of Bi$_2$Sr$_{2-x}$La$_x$CuO$_{6+y}$
Authors:
Heng Wang,
Yuying Zhu,
Zhonghua Bai,
Zechao Wang,
Shuxu Hu,
Hong-Yi Xie,
Xiaopeng Hu,
Jian Cui,
Miaoling Huang,
Jianhao Chen,
Ying Ding,
Lin Zhao,
Xinyan Li,
Qinghua Zhang,
Lin Gu,
X. J. Zhou,
Jing Zhu,
Ding Zhang,
Qi-Kun Xue
Abstract:
Josephson tunneling in twisted cuprate junctions provides a litmus test for the pairing symmetry, which is fundamental for understanding the microscopic mechanism of high temperature superconductivity. This issue is rekindled by experimental advances in van der Waals stacking and the proposal of an emergent d+id-wave. So far, all experiments have been carried out on Bi$_2$Sr$_2$CaCu$_2$O$_{8+x}$ (Bi-2212) with double CuO$_2$ planes but show controversial results. Here, we investigate junctions made of Bi$_2$Sr$_{2-x}$La$_x$CuO$_{6+y}$ (Bi-2201) with single CuO$_2$ planes. Our on-site cold stacking technique ensures uncompromised crystalline quality and stoichiometry at the interface. Junctions with carefully calibrated twist angles around 45° show strong Josephson tunneling and conventional temperature dependence. Furthermore, we observe standard Fraunhofer diffraction patterns and integer Fiske steps in a junction with a twist angle of 45.0$\pm$0.2°. Together, these results pose strong constraints on the d or d+id-wave pairing and suggest an indispensable isotropic pairing component.
Submitted 20 November, 2023;
originally announced November 2023.
-
LightEMU: Hardware Assisted Fuzzing of Trusted Applications
Authors:
Haoqi Shan,
Sravani Nissankararao,
Yujia Liu,
Moyao Huang,
Shuo Wang,
Yier Jin,
Dean Sullivan
Abstract:
Trusted Execution Environments (TEEs) are deployed in many CPU designs because of the confidentiality and integrity guarantees they provide. ARM TrustZone is a TEE extensively deployed on smart phones, IoT devices, and notebooks. Specifically, TrustZone is used to separate code execution and data into two worlds, normal world and secure world. However, this separation inherently prevents traditional fuzzing approaches which rely upon coverage-guided feedback and existing fuzzing research is, therefore, extremely limited. In this paper, we present a native and generic method to perform efficient and scalable feedback-driven fuzzing on Trusted Applications (TAs) using ARM CoreSight. We propose LightEMU, a novel fuzzing framework that allows us to fuzz TAs by decoupling them from relied TEE. We argue that LightEMU is a promising first-stage approach for rapidly discovering TA vulnerabilities prior to investing effort in whole system TEE evaluation precisely because the majority of publicly disclosed TrustZone bugs reside in the TA code itself. We implement LightEMU and adapt it to Teegris, Trusty, OP-TEE and QSEE and evaluate 8 real-world TAs while triggering 3 unique crashes and achieving x10 time speedup when fuzzing TAs using the state-of-the-art TrustZone fuzzing framework.
Submitted 15 November, 2023;
originally announced November 2023.
-
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Authors:
Zhexin Zhang,
Junxiao Yang,
Pei Ke,
Fei Mi,
Hongning Wang,
Minlie Huang
Abstract:
While significant attention has been dedicated to exploiting weaknesses in LLMs through jailbreaking attacks, there remains a paucity of effort in defending against these attacks. We point out a pivotal factor contributing to the success of jailbreaks: the intrinsic conflict between the goals of being helpful and ensuring safety. Accordingly, we propose to integrate goal prioritization at both training and inference stages to counteract. Implementing goal prioritization during inference substantially diminishes the Attack Success Rate (ASR) of jailbreaking from 66.4% to 3.6% for ChatGPT. And integrating goal prioritization into model training reduces the ASR from 71.0% to 6.6% for Llama2-13B. Remarkably, even in scenarios where no jailbreaking samples are included during training, our approach slashes the ASR by half. Additionally, our findings reveal that while stronger LLMs face greater safety risks, they also possess a greater capacity to be steered towards defending against such attacks, both because of their stronger ability in instruction following. Our work thus contributes to the comprehension of jailbreaking attacks and defenses, and sheds light on the relationship between LLMs' capability and safety. Our code is available at \url{https://github.com/thu-coai/JailbreakDefense_GoalPriority}.
△ Less
Submitted 12 June, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
HeLM: Highlighted Evidence augmented Language Model for Enhanced Table-to-Text Generation
Authors:
Junyi Bian,
Xiaolei Qin,
Wuhe Zou,
Mengzuo Huang,
Congyi Luo,
Ke Zhang,
Weidong Zhang
Abstract:
Large models have demonstrated significant progress across various domains, particularly in tasks related to text generation. In the domain of Table to Text, many Large Language Model (LLM)-based methods currently resort to modifying prompts to invoke public APIs, incurring potential costs and information leaks. With the advent of open-source large models, fine-tuning LLMs has become feasible. In this study, we conducted parameter-efficient fine-tuning on the LLaMA2 model. Distinguishing itself from previous fine-tuning-based table-to-text methods, our approach involves injecting reasoning information into the input by emphasizing table-specific row data. Our model consists of two modules: 1) a table reasoner that identifies relevant row evidence, and 2) a table summarizer that generates sentences based on the highlighted table. To facilitate this, we propose a search strategy to construct reasoning labels for training the table reasoner. On both the FetaQA and QTSumm datasets, our approach achieved state-of-the-art results. Additionally, we observed that highlighting input tables significantly enhances the model's performance and provides valuable interpretability.
Submitted 27 April, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Schur indices for $\mathcal{N}=4$ super-Yang-Mills with more general gauge groups
Authors:
Bao-ning Du,
Min-xin Huang,
Xin Wang
Abstract:
We study the unflavored Schur indices in the $\mathcal{N}=4$ super-Yang-Mills theory for the $B_n,C_n,D_n, G_2$ gauge groups. We explore two methods, namely the character expansion method and the Fermi gas method, to efficiently compute the $q$-series expansion of the Schur indices to some high orders. Using the available data and the modular properties, we are able to fix the exact formulas for the general gauge groups up to some high ranks and discover some interesting new features. We also identify some empirical modular anomaly equations, but unlike the case of $A_n$ groups, they are quite complicated and not sufficiently useful to fix exact formulas for gauge groups of arbitrary rank.
Submitted 15 November, 2023;
originally announced November 2023.
-
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
Authors:
Jiale Cheng,
Xiao Liu,
Kehan Zheng,
Pei Ke,
Hongning Wang,
Yuxiao Dong,
Jie Tang,
Minlie Huang
Abstract:
Large language models (LLMs) have shown impressive success in various applications. However, these models are often not well aligned with human intents, which calls for additional treatments on them; that is, the alignment problem. To make LLMs better follow user instructions, existing alignment methods primarily focus on further training them. However, the extra training of LLMs is usually expensive in terms of GPU computing; even worse, some LLMs are not accessible for user-demanded training, such as GPTs. In this work, we take a different perspective -- Black-Box Prompt Optimization (BPO) -- to perform alignments. The idea is to optimize user prompts to suit LLMs' input understanding, so as to best realize users' intents without updating LLMs' parameters. BPO leverages human preferences to optimize prompts, thus making it superior to LLM (e.g., ChatGPT) as a prompt engineer. Moreover, BPO is model-agnostic, and the empirical results demonstrate that the BPO-aligned ChatGPT yields a 22% increase in the win rate against its original version and 10% for GPT-4. Notably, the BPO-aligned LLMs can outperform the same models aligned by PPO and DPO, and it also brings additional performance gains when combining BPO with PPO or DPO. Code and datasets are released at https://github.com/thu-coai/BPO.
Submitted 21 June, 2024; v1 submitted 7 November, 2023;
originally announced November 2023.
-
Dimensionality crossover to 2D vestigial nematicity from 3D zigzag antiferromagnetism in an XY-type honeycomb van der Waals magnet
Authors:
Zeliang Sun,
Gaihua Ye,
Mengqi Huang,
Chengkang Zhou,
Nan Huang,
Qiuyang Li,
Zhipeng Ye,
Cynthia Nnokwe,
Hui Deng,
David Mandrus,
Zi Yang Meng,
Kai Sun,
Chunhui Du,
Rui He,
Liuyan Zhao
Abstract:
Fluctuations and disorder effects are substantially enhanced in reduced dimensionalities. While they are mostly considered the foe of long-range orders, fluctuations and disorders can also stimulate the emergence of novel phases of matter, for example, vestigial orders. Taking 2D magnetism as a platform, existing efforts have focused on maintaining 2D long-range magnetic orders by suppressing fluctuations, whereas the other side, exploiting fluctuations for realizing new 2D magnetic phases, remains uncharted territory. Here, using a combination of NV spin relaxometry, optical spectroscopy, and Monte Carlo simulations, we report, in an XY-type honeycomb magnet NiPS3, the phase transition from the zigzag AFM order in 3D bulk to a new Z3 vestigial Potts-nematicity in 2D few layers. Spin fluctuations are shown to be significantly enhanced over the GHz-THz range as the layer number of NiPS3 reduces, using the NV spin relaxometry and the optical Raman quasi-elastic scattering. As a result, the Raman signatures of the zigzag AFM for bulk NiPS3, a zone-folded phonon at ~30cm-1 from the broken translational symmetry (PBTS) and a degeneracy lift of two phonons at ~180cm-1 for the broken 3-fold rotational symmetry (PBRS), evolve into the disappearance of PBTS and the survival of PBRS in few-layer NiPS3, with a critical thickness of ~10nm. The optical linear dichroism microscopy images all three nematic domain states in a single few-layer NiPS3 flake. Large-scale Monte Carlo simulations for a bilayer NiPS3 model confirm the absence of long-range zigzag AFM order but the formation of the Z3 vestigial Potts-nematic phase, corroborating the experimental finding. Our results demonstrate the positivity of strong fluctuations in creating new phases of matter after destroying more conventional ones, and offer an unprecedented pathway for developing novel 2D phases.
Submitted 6 November, 2023;
originally announced November 2023.
-
Last fall degree of semi-local polynomial systems
Authors:
Ming-Deh A. Huang
Abstract:
We study the last fall degrees of {\em semi-local} polynomial systems, and the computational complexity of solving such systems for closed-point and rational-point solutions, where the systems are defined over a finite field. A semi-local polynomial system specifies an algebraic set which is the image of a global linear transformation of a direct product of local affine algebraic sets. As a special but interesting case, polynomial systems that arise from Weil restriction of algebraic sets in an affine space of low dimension are semi-local. Such systems have received considerable attention due to their application in cryptography. Our main results bound the last fall degree of a semi-local polynomial system in terms of the number of closed point solutions, and yield an efficient algorithm for finding all rational-point solutions when the prime characteristic of the finite field and the number of rational solutions are small. Our results on solving semi-local systems imply an improvement on a previously known polynomial-time attack on the HFE (Hidden Field Equations) cryptosystems. The attacks implied in our results extend to public key encryption functions which are based on semi-local systems where either the number of closed point solutions is small, or the characteristic of the field is small. It remains plausible to construct public key cryptosystems based on semi-local systems over a finite field of large prime characteristic with exponential number of closed point solutions. Such a method is presented in the paper, followed by further cryptanalysis involving the isomorphism of polynomials (IP) problem, as well as a concrete public key encryption scheme which is secure against all the attacks discussed in this paper.
Submitted 5 November, 2023;
originally announced November 2023.
-
From Plate to Production: Artificial Intelligence in Modern Consumer-Driven Food Systems
Authors:
Weiqing Min,
Pengfei Zhou,
Leyi Xu,
Tao Liu,
Tianhao Li,
Mingyu Huang,
Ying Jin,
Yifan Yi,
Min Wen,
Shuqiang Jiang,
Ramesh Jain
Abstract:
Global food systems confront the urgent challenge of supplying sustainable, nutritious diets in the face of escalating demands. The advent of Artificial Intelligence (AI) is bringing in a personal choice revolution, wherein AI-driven individual decisions transform food systems from dinner tables, to the farms, and back to our plates. In this context, AI algorithms refine personal dietary choices, subsequently shaping agricultural outputs, and promoting an optimized feedback loop from consumption to cultivation. Initially, we delve into AI tools and techniques spanning the food supply chain, and subsequently assess how AI subfields, encompassing machine learning, computer vision, and speech recognition, are harnessed within the AI-enabled Food System (AIFS) framework, which increasingly leverages Internet of Things, multimodal sensors and real-time data exchange. We spotlight the AIFS framework, emphasizing its fusion of AI with technologies such as digitalization, big data analytics, biotechnology, and IoT extensively used in modern food systems in every component. This paradigm shifts the conventional "farm to fork" narrative to a cyclical "consumer-driven farm to fork" model for better achieving sustainable, nutritious diets. This paper explores AI's promise and the intrinsic challenges it poses within the food domain. By championing stringent AI governance, uniform data architectures, and cross-disciplinary partnerships, we argue that AI, when synergized with consumer-centric strategies, holds the potential to steer food systems toward a sustainable trajectory. We furnish a comprehensive survey of the state-of-the-art in diverse facets of food systems, subsequently pinpointing gaps and advocating for the judicious and efficacious deployment of emergent AI methodologies.
Submitted 4 November, 2023;
originally announced November 2023.
-
Robust Identity Perceptual Watermark Against Deepfake Face Swapping
Authors:
Tianyi Wang,
Mengxiao Huang,
Harry Cheng,
Bin Ma,
Yinglong Wang
Abstract:
Notwithstanding offering convenience and entertainment to society, Deepfake face swapping has caused critical privacy issues with the rapid development of deep generative models. Due to imperceptible artifacts in high-quality synthetic images, passive detection models against face swapping in recent years usually suffer performance damping regarding the generalizability issue. Therefore, several studies have attempted to proactively protect the original images against malicious manipulations by inserting invisible signals in advance. However, the existing proactive defense approaches demonstrate unsatisfactory results with respect to visual quality, detection accuracy, and source tracing ability. In this study, to fill the research gap, we propose the first robust identity perceptual watermarking framework that concurrently performs detection and source tracing against Deepfake face swapping proactively. We assign identity semantics regarding the image contents to the watermarks and devise an unpredictable and nonreversible chaotic encryption system to ensure watermark confidentiality. The watermarks are encoded and recovered by jointly training an encoder-decoder framework along with adversarial image manipulations. Falsification and source tracing are accomplished by verifying the consistency between the content-matched identity perceptual watermark and the recovered robust watermark from the image. Extensive experiments demonstrate state-of-the-art detection performance on Deepfake face swapping under both cross-dataset and cross-manipulation settings.
Submitted 15 March, 2024; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Stochastic Smoothed Gradient Descent Ascent for Federated Minimax Optimization
Authors:
Wei Shen,
Minhui Huang,
Jiawei Zhang,
Cong Shen
Abstract:
In recent years, federated minimax optimization has attracted growing interest due to its extensive applications in various machine learning tasks. While Smoothed Alternative Gradient Descent Ascent (Smoothed-AGDA) has proved its success in centralized nonconvex minimax optimization, how and whether smoothing technique could be helpful in federated setting remains unexplored. In this paper, we propose a new algorithm termed Federated Stochastic Smoothed Gradient Descent Ascent (FESS-GDA), which utilizes the smoothing technique for federated minimax optimization. We prove that FESS-GDA can be uniformly used to solve several classes of federated minimax problems and prove new or better analytical convergence results for these settings. We showcase the practical efficiency of FESS-GDA in practical federated learning tasks of training generative adversarial networks (GANs) and fair classification.
Submitted 18 April, 2024; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Towards Omni-supervised Referring Expression Segmentation
Authors:
Minglang Huang,
Yiyi Zhou,
Gen Luo,
Guannan Jiang,
Weilin Zhuang,
Xiaoshuai Sun
Abstract:
Referring Expression Segmentation (RES) is an emerging task in computer vision, which segments the target instances in images based on text descriptions. However, its development is plagued by expensive segmentation labels. To address this issue, we propose a new learning task for RES called Omni-supervised Referring Expression Segmentation (Omni-RES), which aims to make full use of unlabeled, fully labeled and weakly labeled data, e.g., referring points or grounding boxes, for efficient RES training. To accomplish this task, we also propose a novel yet strong baseline method for Omni-RES based on the recently popular teacher-student learning, where the weak labels are not directly transformed into supervision signals but used as a yardstick to select and refine high-quality pseudo-masks for teacher-student learning. To validate the proposed Omni-RES method, we apply it to a set of state-of-the-art RES models and conduct extensive experiments on a range of RES datasets. The experimental results show the clear merits of Omni-RES over the fully-supervised and semi-supervised training schemes. For instance, with only 10% fully labeled data, Omni-RES can help the base model achieve 100% fully supervised performance, and it also outperforms the semi-supervised alternative by a large margin, e.g., +14.93% on RefCOCO and +14.95% on RefCOCO+, respectively. More importantly, Omni-RES also enables the use of large-scale vision-language data such as Visual Genome to facilitate low-cost RES training, and achieves new SOTA performance of RES, e.g., 80.66 on RefCOCO.
Submitted 27 November, 2023; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Prompt-based Logical Semantics Enhancement for Implicit Discourse Relation Recognition
Authors:
Chenxu Wang,
Ping Jian,
Mu Huang
Abstract:
Implicit Discourse Relation Recognition (IDRR), which infers discourse relations without the help of explicit connectives, is still a crucial and challenging task for discourse parsing. Recent works tend to exploit the hierarchical structure information from the annotated senses, which demonstrate enhanced discourse relation representations can be obtained by integrating sense hierarchy. Nevertheless, the performance and robustness for IDRR are significantly constrained by the availability of annotated data. Fortunately, there is a wealth of unannotated utterances with explicit connectives, that can be utilized to acquire enriched discourse relation features. In light of such motivation, we propose a Prompt-based Logical Semantics Enhancement (PLSE) method for IDRR. Essentially, our method seamlessly injects knowledge relevant to discourse relation into pre-trained language models through prompt-based connective prediction. Furthermore, considering the prompt-based connective prediction exhibits local dependencies due to the deficiency of masked language model (MLM) in capturing global semantics, we design a novel self-supervised learning objective based on mutual information maximization to derive enhanced representations of logical semantics for IDRR. Experimental results on PDTB 2.0 and CoNLL16 datasets demonstrate that our method achieves outstanding and consistent performance against the current state-of-the-art models.
Submitted 1 November, 2023;
originally announced November 2023.
-
Evolution of topological charge through chiral anomaly transport
Authors:
Zilin Yuan,
Anping Huang,
Wen-Hao Zhou,
Guo-Liang Ma,
Mei Huang
Abstract:
Built upon the state-of-the-art model, a multiphase transport (AMPT), we develop a new module of chiral anomaly transport (CAT), which can trace the evolution of the initial topological charge of gauge field created through sphaleron transition at finite temperature and external magnetic field in heavy ion collisions. The eventual experimental signals of the chiral magnetic effect (CME) can be measured. The CAT explicitly shows the generation and evolution of the charge separation, and the signals of CME through the CAT are quantitatively in agreement with the experimental measurements in Au+Au collisions at $\sqrt{s}=200 {\rm GeV}$, and the centrality dependence of the CME fraction follows that of the fireball temperature.
Submitted 31 October, 2023;
originally announced October 2023.
-
Riesz type theorems for $κ$-pluriharmonic mappings, invariant harmonic quasiregular mappings and harmonic quasiregular mappings
Authors:
Shaolin Chen,
Manzi Huang
Abstract:
The main purpose of this paper is to develop some methods to improve and generalize the main results in a recent paper by Liu and Zhu (Adv. Math., 2023, i.e., \cite{L-Z}). The paper consists of two parts. In the first part, we discuss the Riesz type theorem in the setting of $n$-dimensional complex spaces for all $n\geq 1$. In this part, we first introduce the family of $κ$-pluriharmonic mappings of the $n$-dimensional complex unit ball. Then we establish two Riesz type theorems for these mappings, which are the $n$-dimensional versions of Theorems 1.1 and 1.2 in \cite{L-Z}, respectively. Furthermore, even when $n=1$, our first result shows that the assumption of the real parts of the mappings not being negative (or being negative) in \cite[Theorem 1.1]{L-Z} is redundant; and our second result illustrates that the assumption of "quasiconformality" on the mappings in \cite[Theorem 1.2]{L-Z} can be replaced by the weaker one of "quasiregularity". In the second part, we investigate the Riesz type theorem in the setting of $n$-dimensional real spaces for all $n\geq 2$. In this part, first, we prove a Riesz type theorem for invariant harmonic quasiregular mappings of the unit $n$-dimensional real ball. Our result indicates that $(i)$ the range of the parameter $p$ discussed in \cite[Theorem 1.3]{L-Z} can be changed from $(1,2)$ to $(1,\infty)$; $(ii)$ the assumption of the first coordinate functions of the mappings being non-zero in \cite[Theorem 1.3]{L-Z} is redundant. In this way, we complete the discussions carried out in \cite[Theorems 1.3 and 1.4]{L-Z}. Second, we obtain a Riesz type theorem for harmonic $K$-quasiregular mappings of the unit $n$-dimensional real ball. Our result demonstrates that the range of the parameter $p$ discussed in \cite[Theorem 2.1]{K-2023} can be changed from $(1,2)$ to $(1,\infty)$.
Submitted 29 October, 2023; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Language Models Hallucinate, but May Excel at Fact Verification
Authors:
Jian Guan,
Jesse Dodge,
David Wadden,
Minlie Huang,
Hao Peng
Abstract:
Recent progress in natural language processing (NLP) owes much to remarkable advances in large language models (LLMs). Nevertheless, LLMs frequently "hallucinate," resulting in non-factual outputs. Our carefully-designed human evaluation substantiates the serious hallucination issue, revealing that even GPT-3.5 produces factual outputs less than 25% of the time. This underscores the importance of fact verifiers in order to measure and incentivize progress. Our systematic investigation affirms that LLMs can be repurposed as effective fact verifiers with strong correlations with human judgments. Surprisingly, FLAN-T5-11B, the least factual generator in our study, performs the best as a fact verifier, even outperforming more capable LLMs like GPT-3.5 and ChatGPT. Delving deeper, we analyze the reliance of these LLMs on high-quality evidence, as well as their deficiencies in robustness and generalization ability. Our study presents insights for developing trustworthy generation models.
Submitted 20 March, 2024; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Reforming Physics Exams Using Openly Accessible Large Isomorphic Problem Banks created with the assistance of Generative AI: an Explorative Study
Authors:
Zhongzhou Chen,
Emily Frederick,
Colleen Cui,
Munaimah Khan,
Christopher Klatt,
Mercedith Huang,
Shiyang Su
Abstract:
This paper explores using large isomorphic problem banks to overcome many challenges of traditional exams in large STEM classes, especially the threat of content sharing websites and generative AI to the security of exam items. We first introduce an efficient procedure for creating large numbers of isomorphic physics problems, assisted by the large language model GPT-3 and several other open-source tools. We then propose that if exam items are randomly drawn from large enough problem banks, then giving students open access to problem banks prior to the exam will not dramatically impact students' performance on the exam or lead to widespread rote memorization of solutions. We tested this hypothesis on two mid-term physics exams, comparing students' performance on problems drawn from open isomorphic problem banks to similar transfer problems that were not accessible to students prior to the exam. We found that on both exams, both open bank and transfer problems had the highest difficulty. The differences in percent correct were between 5% and 10%, which is comparable to the differences between different isomorphic versions of the same problem type. Item response theory analysis found that both types of problem have high discrimination (>1.5) with no significant differences. Student performance on open-bank and transfer problems is highly correlated, and the correlations are stronger than average correlations between problems on the exam. Exploratory factor analysis also found that open-bank and transfer problems load on the same factor, and even formed their own factor on the second exam. Those observations all suggest that giving students open access to large isomorphic problem banks only had a small impact on students' performance on the exam but could have significant potential in reforming traditional classroom exams.
Submitted 22 October, 2023;
originally announced October 2023.
-
Semi-relativistic antisymmetrized molecular dynamics for energetic neutron production in intermediate energy heavy-ion reactions
Authors:
Q. Hu,
G. Y. Tian,
R. Wada,
X. Q. Liu,
W. P. Lin,
H. Zheng,
Y. P. Zhang,
Z. Q. Chen,
R. Han,
M. R. Huang
Abstract:
Relativistic corrections have been made in the non-relativistic antisymmetrized molecular dynamics (AMD) simulations to apply to the high energy neutron production in the $^{12}$C+$^{12}$C and $^{16}$O+$^{12}$C collisions at incident energies of 290 and 400 MeV/nucleon. The corrections are made in kinematics alone and neither nucleon-nucleon inelastic scatterings nor meson productions are taken into account; AMD with the relativistic corrections is called semi-relativistic AMD. The three-nucleon collision (3NC) and Fermi boost in the collision processes are taken into account in the non-relativistic AMD. Since the relativistic corrections tend to compensate each other, the difference between the semi-relativistic and non-relativistic results becomes small. High energy tails of the available experimental neutron double differential cross sections, especially at larger angles, are well reproduced by AMD with the 3NC term in both non-relativistic and semi-relativistic simulations. These results indicate that the high energy neutrons are dominantly produced by the 3NC process in this incident energy range.
Submitted 12 October, 2023;
originally announced October 2023.
-
Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality
Authors:
Liyuan Wang,
Jingyi Xie,
Xingxing Zhang,
Mingyi Huang,
Hang Su,
Jun Zhu
Abstract:
Prompt-based continual learning is an emerging direction in leveraging pre-trained knowledge for downstream continual learning, and has almost reached the performance pinnacle under supervised pre-training. However, our empirical research reveals that the current strategies fall short of their full potential under the more realistic self-supervised pre-training, which is essential for handling vas…
▽ More
Prompt-based continual learning is an emerging direction in leveraging pre-trained knowledge for downstream continual learning, and has almost reached the performance pinnacle under supervised pre-training. However, our empirical research reveals that the current strategies fall short of their full potential under the more realistic self-supervised pre-training, which is essential for handling vast quantities of unlabeled data in practice. This is largely due to the difficulty of task-specific knowledge being incorporated into instructed representations via prompt parameters and predicted by uninstructed representations at test time. To overcome the exposed sub-optimality, we conduct a theoretical analysis of the continual learning objective in the context of pre-training, and decompose it into hierarchical components: within-task prediction, task-identity inference, and task-adaptive prediction. Following these empirical and theoretical insights, we propose Hierarchical Decomposition (HiDe-)Prompt, an innovative approach that explicitly optimizes the hierarchical components with an ensemble of task-specific prompts and statistics of both uninstructed and instructed representations, further with the coordination of a contrastive regularization strategy. Our extensive experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning (e.g., up to 15.01% and 9.61% lead on Split CIFAR-100 and Split ImageNet-R, respectively). Our code is available at \url{https://github.com/thu-ml/HiDe-Prompt}.
Submitted 11 October, 2023;
originally announced October 2023.
-
Memory efficient location recommendation through proximity-aware representation
Authors:
Xuan Luo,
Mingqing Huang,
Rui Lv,
Hui Zhao
Abstract:
Sequential location recommendation plays a huge role in modern life: it can enhance user experience, bring more profit to businesses, and assist in government administration. Although methods for location recommendation have evolved significantly thanks to the development of recommendation systems, there is still limited utilization of geographic information, along with the ongoing challenge of addressing data sparsity. In response, we introduce a Proximity-aware based region representation for Sequential Recommendation (PASR for short), built upon the Self-Attention Network architecture. We tackle the sparsity issue through a novel loss function employing importance sampling, which emphasizes informative negative samples during optimization. Moreover, PASR enhances the integration of geographic information by applying a self-attention-based geography encoder to the hierarchical grid and proximity grid at each GPS point. To further leverage geographic information, we utilize proximity-aware negative samplers to enhance the quality of negative samples. We conducted evaluations using three real-world Location-Based Social Networking (LBSN) datasets, demonstrating that PASR surpasses state-of-the-art sequential location recommendation methods.
Submitted 24 October, 2023; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Hierarchical Side-Tuning for Vision Transformers
Authors:
Weifeng Lin,
Ziheng Wu,
Wentao Yang,
Mingxin Huang,
Jun Huang,
Lianwen Jin
Abstract:
Fine-tuning pre-trained Vision Transformers (ViTs) has showcased significant promise in enhancing visual recognition tasks. Yet, the demand for individualized and comprehensive fine-tuning processes for each task entails substantial computational and memory costs, posing a considerable challenge. Recent advancements in Parameter-Efficient Transfer Learning (PETL) have shown potential for achieving high performance with fewer parameter updates compared to full fine-tuning. However, their effectiveness is primarily observed in simple tasks like image classification, while they encounter challenges with more complex vision tasks like dense prediction. To address this gap, this study aims to identify an effective tuning method that caters to a wider range of visual tasks. In this paper, we introduce Hierarchical Side-Tuning (HST), an innovative PETL method facilitating the transfer of ViT models to diverse downstream tasks. Diverging from existing methods that focus solely on fine-tuning parameters within specific input spaces or modules, HST employs a lightweight Hierarchical Side Network (HSN). This network leverages intermediate activations from the ViT backbone to model multi-scale features, enhancing prediction capabilities. To evaluate HST, we conducted comprehensive experiments across a range of visual tasks, including classification, object detection, instance segmentation, and semantic segmentation. Remarkably, HST achieved state-of-the-art performance in 13 out of the 19 tasks on the VTAB-1K benchmark, with the highest average Top-1 accuracy of 76.1%, while fine-tuning a mere 0.78M parameters. When applied to object detection and semantic segmentation tasks on the COCO and ADE20K testdev benchmarks, HST outperformed existing PETL methods and even surpassed full fine-tuning.
Submitted 15 May, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Task-Adaptive Tokenization: Enhancing Long-Form Text Generation Efficacy in Mental Health and Beyond
Authors:
Siyang Liu,
Naihao Deng,
Sahand Sabour,
Yilin Jia,
Minlie Huang,
Rada Mihalcea
Abstract:
We propose task-adaptive tokenization as a way to adapt the generation pipeline to the specifics of a downstream task and enhance long-form generation in mental health. Inspired by insights from cognitive science, our task-adaptive tokenizer samples variable segmentations from multiple outcomes, with sampling probabilities optimized based on task-specific data. We introduce a strategy for building a specialized vocabulary and introduce a vocabulary merging protocol that allows for the integration of task-specific tokens into the pre-trained model's tokenization step. Through extensive experiments on psychological question-answering tasks in both Chinese and English, we find that our task-adaptive tokenization approach brings a significant improvement in generation performance while using up to 60% fewer tokens. Preliminary experiments point to promising results when using our tokenization approach with very large language models.
Submitted 13 November, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
-
Evaluating Hallucinations in Chinese Large Language Models
Authors:
Qinyuan Cheng,
Tianxiang Sun,
Wenwei Zhang,
Siyin Wang,
Xiangyang Liu,
Mozhi Zhang,
Junliang He,
Mianqiu Huang,
Zhangyue Yin,
Kai Chen,
Xipeng Qiu
Abstract:
In this paper, we establish a benchmark named HalluQA (Chinese Hallucination Question-Answering) to measure the hallucination phenomenon in Chinese large language models. HalluQA contains 450 meticulously designed adversarial questions, spanning multiple domains, and takes into account Chinese historical culture, customs, and social phenomena. During the construction of HalluQA, we consider two types of hallucinations: imitative falsehoods and factual errors, and we construct adversarial samples based on GLM-130B and ChatGPT. For evaluation, we design an automated evaluation method using GPT-4 to judge whether a model output is hallucinated. We conduct extensive experiments on 24 large language models, including ERNIE-Bot, Baichuan2, ChatGLM, Qwen, SparkDesk, etc. Out of the 24 models, 18 achieved non-hallucination rates lower than 50%. This indicates that HalluQA is highly challenging. We analyze the primary types of hallucinations in different types of models and their causes. Additionally, we discuss which types of hallucinations should be prioritized for different types of models.
Submitted 25 October, 2023; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Language Model Decoding as Direct Metrics Optimization
Authors:
Haozhe Ji,
Pei Ke,
Hongning Wang,
Minlie Huang
Abstract:
Despite the remarkable advances in language modeling, current mainstream decoding methods still struggle to generate texts that align with human texts across different aspects. In particular, sampling-based methods produce less-repetitive texts which are often disjunctive in discourse, while search-based methods maintain topic coherence at the cost of increased repetition. Overall, these methods fall short in achieving holistic alignment across a broad range of aspects. In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts measured by multiple metrics of desired aspects simultaneously. The resulting decoding distribution enjoys an analytical solution that scales the input language model distribution via a sequence-level energy function defined by these metrics. And most importantly, we prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts. To facilitate tractable sampling from this globally normalized distribution, we adopt the Sampling-Importance-Resampling technique. Experiments on various domains and model scales demonstrate the superiority of our method in metrics alignment with human texts and human evaluation over strong baselines.
Submitted 5 June, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
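The Sampling-Importance-Resampling step named in the abstract above can be illustrated with a minimal sketch. It assumes that candidates are drawn from the base language model (the proposal) and that the target distribution scales the proposal by exp(-E(x)), so each importance weight reduces to exp(-E(x)). The sign convention, the function names, and the toy `repetition_energy` are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def sampling_importance_resampling(candidates, energy, k, rng):
    """Resample k candidates with probability proportional to exp(-energy(x)).

    `candidates` are assumed to be samples from the proposal distribution,
    so the importance weight of each one is exp(-energy(x)) up to a constant.
    """
    log_w = [-energy(x) for x in candidates]
    shift = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    return rng.choices(candidates, weights=weights, k=k)


def repetition_energy(text):
    """Hypothetical energy: 5.0 per repeated word (the coefficient is made up)."""
    words = text.split()
    return 5.0 * (len(words) - len(set(words)))


if __name__ == "__main__":
    rng = random.Random(0)
    pool = ["the cat sat on a mat", "the the the the the the", "a dog ran to me"]
    # The highly repetitive candidate gets a much smaller resampling weight.
    print(sampling_importance_resampling(pool, repetition_energy, k=5, rng=rng))
```

Resampling from a finite candidate pool only approximates the globally normalized target; the approximation generally improves as the pool grows.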
-
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Authors:
Zhibin Gou,
Zhihong Shao,
Yeyun Gong,
Yelong Shen,
Yujiu Yang,
Minlie Huang,
Nan Duan,
Weizhu Chen
Abstract:
Large language models have made significant progress in various language tasks, yet they still struggle with complex mathematics. In this paper, we propose ToRA, a series of Tool-integrated Reasoning Agents designed to solve challenging mathematical problems by seamlessly integrating natural language reasoning with the utilization of external tools (e.g., computation libraries and symbolic solvers), thereby amalgamating the analytical prowess of language and the computational efficiency of tools. To train ToRA, we curate interactive tool-use trajectories on mathematical datasets, apply imitation learning on the annotations, and propose output space shaping to further refine models' reasoning behavior. As a result, ToRA models significantly outperform open-source models on 10 mathematical reasoning datasets across all scales with 13%-19% absolute improvements on average. Notably, ToRA-7B reaches 44.6% on the competition-level dataset MATH, surpassing the best open-source model WizardMath-70B by 22% absolute. ToRA-Code-34B is also the first open-source model that achieves an accuracy exceeding 50% on MATH, which significantly outperforms GPT-4's CoT result, and is competitive with GPT-4 solving problems with programs. Additionally, we conduct a comprehensive analysis of the benefits and remaining challenges of tool interaction for mathematical reasoning, providing valuable insights for future research.
Submitted 21 February, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Non-equilibrium molecular dynamics of steady-state fluid transport through a 2D membrane driven by a concentration gradient
Authors:
Daniel J. Rankin,
David M. Huang
Abstract:
We use a novel non-equilibrium algorithm to simulate steady-state fluid transport through a two-dimensional (2D) membrane due to a concentration gradient by molecular dynamics (MD) for the first time. We confirm that, as required by the Onsager reciprocal relations in the linear-response regime, the solution flux obtained using this algorithm agrees with the excess solute flux obtained from an established non-equilibrium MD algorithm for pressure-driven flow. In addition, we show that the concentration-gradient solution flux in this regime is quantified far more efficiently by explicitly applying a transmembrane concentration difference using our algorithm than by applying Onsager reciprocity to pressure-driven flow. The simulated fluid fluxes are captured with reasonable quantitative accuracy by our previously derived continuum theory of concentration-gradient-driven fluid transport through a 2D membrane [J. Chem. Phys. 151, 044705 (2019)] for a wide range of solution and membrane parameters even though the simulated pore sizes are only several times the size of the fluid particles. The simulations deviate from the theory especially for strong solute--membrane interactions relative to the thermal energy, for which the theoretical approximations break down. Our findings will be beneficial for molecular-level understanding of fluid transport driven by concentration gradients through membranes made from 2D materials, which have diverse applications in energy harvesting, molecular separations, and biosensing.
Submitted 27 September, 2023;
originally announced September 2023.
-
A First Principles Derivation of Energy Conserving Momentum Jumps in Surface Hopping Simulations
Authors:
Dorothy Miaoyu Huang,
Austin T. Green,
Craig C. Martens
Abstract:
The fewest switches surface hopping (FSSH) method proposed by Tully in 1990 [J. C. Tully, J. Chem. Phys. 93, 1061 (1990)] -- along with its many later variations -- is the basis for most practical simulations of molecular dynamics with electronic transitions in realistic systems. Despite its popularity, a rigorous formal derivation of the algorithm has yet to be achieved. In this paper, we derive the energy conserving momentum jumps characterizing FSSH from the perspective of quantum trajectory surface hopping (QTSH) [C. C. Martens, J. Phys. Chem. A 123, 1110 (2019)]. In the limit of localized nonadiabatic transitions, simple mathematical and physical arguments allow the FSSH algorithm to be derived from first principles. For general processes, the quantum forces characterizing the QTSH method provide accurate results for nonadiabatic dynamics with rigorous energy conservation at the ensemble level within the consistency of the underlying stochastic surface hopping without resorting to the artificial momentum rescaling of FSSH.
Submitted 25 September, 2023;
originally announced September 2023.
-
Localization-Guided Track: A Deep Association Multi-Object Tracking Framework Based on Localization Confidence of Detections
Authors:
Ting Meng,
Chunyun Fu,
Mingguang Huang,
Xiyang Wang,
Jiawei He,
Tao Huang,
Wankai Shi
Abstract:
In the currently available literature, no tracking-by-detection (TBD) paradigm-based tracking method has considered the localization confidence of detection boxes. In most TBD-based methods, it is considered that objects of low detection confidence are highly occluded and thus it is a normal practice to directly disregard such objects or to reduce their priority in matching. In addition, appearance similarity is not a factor to consider for matching these objects. However, in terms of the detection confidence fusing classification and localization, objects of low detection confidence may have inaccurate localization but clear appearance; similarly, objects of high detection confidence may have inaccurate localization or unclear appearance; yet these objects are not further classified. In view of these issues, we propose Localization-Guided Track (LG-Track). Firstly, localization confidence is applied in MOT for the first time, with appearance clarity and localization accuracy of detection boxes taken into account, and an effective deep association mechanism is designed; secondly, based on the classification confidence and localization confidence, a more appropriate cost matrix can be selected and used; finally, extensive experiments have been conducted on MOT17 and MOT20 datasets. The results show that our proposed method outperforms the compared state-of-the-art tracking methods. For the benefit of the community, our code has been made publicly available at https://github.com/mengting2023/LG-Track.
Submitted 18 September, 2023;
originally announced September 2023.
-
Large Nonreciprocity of Shear-Horizontal Surface Acoustic Waves induced by Magnetoelastic Bilayers
Authors:
Mingxian Huang,
Yuanyuan Liu,
Wenbin Hu,
Yutong Wu,
Wen Wang,
Wei He,
Huaiwu Zhang,
Feiming Bai
Abstract:
We report large nonreciprocity in the transmission of shear-horizontal surface acoustic waves (SAWs) on a LiTaO3 substrate coated with a FeCoSiB/NiFeCu magnetoelastic bilayer. The large difference in saturation magnetization of the two layers not only brings nonreciprocal spin waves (SWs), but also ensures the phonon-magnon (SAWs-SWs) coupling at relatively low wavenumbers. It is found that the angle between the magnetization and the wavevector plays an important role in determining both the strength of magnetoelastic coupling and the nonreciprocity. A large nonreciprocal transmission of SAWs of about 30 dB (i.e. 60 dB/mm) is demonstrated at 2.33 GHz. In addition, the dispersion relation between coupled SH-SAWs and nonreciprocal SWs is developed, which provides good insight into the observed phenomena. Our results offer a convenient approach to implement nonreciprocal SAW isolators or circulators.
Submitted 18 September, 2023;
originally announced September 2023.
-
Distributional Estimation of Data Uncertainty for Surveillance Face Anti-spoofing
Authors:
Mouxiao Huang
Abstract:
Face recognition systems have become increasingly vulnerable to security threats in recent years, prompting the use of Face Anti-spoofing (FAS) to protect against various types of attacks, such as phone unlocking, face payment, and self-service security inspection. While FAS has demonstrated its effectiveness in traditional settings, securing it in long-distance surveillance scenarios presents a significant challenge. These scenarios often feature low-quality face images, necessitating the modeling of data uncertainty to improve stability under extreme conditions. To address this issue, this work proposes Distributional Estimation (DisE), a method that converts traditional FAS point estimation to distributional estimation by modeling data uncertainty during training, including feature (mean) and uncertainty (variance). By adjusting the learning strength of clean and noisy samples for stability and accuracy, the learned uncertainty enhances DisE's performance. The method is evaluated on SuHiFiMask [1], a large-scale and challenging FAS dataset in surveillance scenarios. Results demonstrate that DisE achieves comparable performance on both ACER and AUC metrics.
Submitted 18 September, 2023;
originally announced September 2023.
-
SafetyBench: Evaluating the Safety of Large Language Models
Authors:
Zhexin Zhang,
Leqi Lei,
Lindong Wu,
Rui Sun,
Yongkang Huang,
Chong Long,
Xiao Liu,
Xuanyu Lei,
Jie Tang,
Minlie Huang
Abstract:
With the rapid development of Large Language Models (LLMs), increasing attention has been paid to their safety concerns. Consequently, evaluating the safety of LLMs has become an essential task for facilitating the broad applications of LLMs. Nevertheless, the absence of comprehensive safety evaluation benchmarks poses a significant impediment to effectively assessing and enhancing the safety of LLMs. In this work, we present SafetyBench, a comprehensive benchmark for evaluating the safety of LLMs, which comprises 11,435 diverse multiple choice questions spanning 7 distinct categories of safety concerns. Notably, SafetyBench also incorporates both Chinese and English data, facilitating the evaluation in both languages. Our extensive tests over 25 popular Chinese and English LLMs in both zero-shot and few-shot settings reveal a substantial performance advantage for GPT-4 over its counterparts, and there is still significant room for improving the safety of current LLMs. We also demonstrate that the measured safety understanding abilities in SafetyBench are correlated with safety generation abilities. Data and evaluation guidelines are available at https://github.com/thu-coai/SafetyBench. Submission entrance and leaderboard are available at https://llmbench.ai/safety.
Submitted 24 June, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Interplay between the muon $g-2$ anomaly and the PTA nHZ gravitational waves from domain walls in next-to minimal supersymmetric standard model
Authors:
Ming Xia Huang,
Fei Wang,
Ying Kai Zhang
Abstract:
With some explicitly $Z_3$ breaking terms in the NMSSM effective superpotential and scalar potential, domain walls (DWs) from the spontaneous breaking of the discrete symmetry in the approximately $Z_3$-invariant NMSSM can collapse and lead to observable stochastic gravitational wave (GW) background signals. In the presence of a hidden sector, such terms may originate from the geometric superconformal breaking with holomorphic quadratic correction to the frame function when the global scale-invariant superpotential is naturally embedded into the canonical superconformal supergravity models. The smallness of such mass parameters in the NMSSM may be traced back to the original superconformal invariance. Naive estimations indicate that a SUSY explanation of the muon $g-2$ anomaly can be in tension with the constraints on SUSY from PTA data, because large SUSY contributions to $Δa_μ$ in general need relatively light superpartners while the present $Ω_{gw}^0$ can set lower bounds on $m_{soft}$. We calculate numerically the signatures of GWs produced from the collapse of DWs and find that the nHz stochastic GW background observed by NANOGrav, etc., can indeed be explained with proper tiny values of $χm_{3/2}\sim 10^{-14}{\rm eV}$ for the $χS^2$ case (and $χm_{3/2}\sim 10^{-10}{\rm eV}$ for the $χH_u H_d$ case). Besides, there are still some parameter points, whose GW spectra intersect with the NANOGrav signal region, that can explain the muon $g-2$ anomaly within the $1σ$ range.
Submitted 17 April, 2024; v1 submitted 12 September, 2023;
originally announced September 2023.
-
$D_{(s)}-$ mesons semileptonic form factors in the 4-flavor holographic QCD
Authors:
Hiwa A. Ahmed,
Yidian Chen,
Mei Huang
Abstract:
We investigate semileptonic form factors of $D_{(s)}$ mesons from a modified soft-wall 4-flavor holographic model. The model successfully reproduces the masses and decay constants of various mesons, including $ρ$, $K^*$, $D^*$, $D_s^*$, $a_1$, $K_1$, $f_1$, $D_1$, $D_{s1}$, $π$, $K$, $η$, $D$, and $D_s$. Moreover, we study the semileptonic decay processes $D^{+} \to (π, K, η) l^{+} ν_{l}$ and $D_{s}^{+} \to ( K, η) l^{+} ν_{l}$, associated with the vector meson exchange, as well as $D_{(s)}^{+} \to K^{*} l^{+} ν_{l}$, associated with the vector and axial vector meson exchange. The form factors $f_{+}(q^{2})$ for $D \to π$ and $D_{(s)}\to K$ decays agree excellently with experimental and lattice data, outperforming other theoretical approaches. The $f_{+}(q^{2})$ form factor for $D^{+} \to η$ is compatible with experimental data, while a slight discrepancy is observed for $D_{s}^{+} \to η$ at large $q^{2}$. Additionally, we predict the vector form factors $V(q^{2})$ and $A_{1}(q^{2})$ for $D \to K^{*}$ and $D_{s} \to K^{*}$ decays, respectively. The results agree well with other approaches and lattice data at maximum recoil ($q^{2}=0$).
Submitted 12 September, 2023;
originally announced September 2023.
-
Detecting Violations of Differential Privacy for Quantum Algorithms
Authors:
Ji Guan,
Wang Fang,
Mingyu Huang,
Mingsheng Ying
Abstract:
Quantum algorithms for solving a wide range of practical problems have been proposed in the last ten years, such as data search and analysis, product recommendation, and credit scoring. Concern about privacy and other ethical issues in quantum computing naturally arises. In this paper, we define a formal framework for detecting violations of differential privacy for quantum algorithms. A detection algorithm is developed to verify whether a (noisy) quantum algorithm is differentially private and automatically generate bugging information when a violation of differential privacy is reported. The information consists of a pair of quantum states that violate the privacy, to illustrate the cause of the violation. Our algorithm is equipped with Tensor Networks, a highly efficient data structure, and executed both on TensorFlow Quantum and TorchQuantum, which are the quantum extensions of famous machine learning platforms -- TensorFlow and PyTorch, respectively. The effectiveness and efficiency of our algorithm are confirmed by the experimental results of almost all types of quantum algorithms already implemented on realistic quantum computers, including quantum supremacy algorithms (beyond the capability of classical algorithms), quantum machine learning models, quantum approximate optimization algorithms, and variational quantum eigensolvers with up to 21 quantum bits.
Submitted 9 September, 2023;
originally announced September 2023.
-
Irreversible entropy transport enhanced by fermionic superfluidity
Authors:
Philipp Fabritius,
Jeffrey Mohan,
Mohsen Talebi,
Simon Wili,
Wilhelm Zwerger,
Meng-Zi Huang,
Tilman Esslinger
Abstract:
The nature of particle and entropy flow between two superfluids is often understood in terms of reversible flow carried by an entropy-free, macroscopic wavefunction. While this wavefunction is responsible for many intriguing properties of superfluids and superconductors, its interplay with excitations in non-equilibrium situations is less understood. Here, we observe large concurrent flows of both particles and entropy through a ballistic channel connecting two strongly interacting fermionic superfluids. Both currents respond nonlinearly to chemical potential and temperature biases. We find that the entropy transported per particle is much larger than the prediction of superfluid hydrodynamics in the linear regime and largely independent of changes in the channel's geometry. In contrast, the timescales of advective and diffusive entropy transport vary significantly with the channel geometry. In our setting, superfluidity counterintuitively increases the speed of entropy transport. Moreover, we develop a phenomenological model describing the nonlinear dynamics within the framework of generalised gradient dynamics. Our approach for measuring entropy currents may help elucidate mechanisms of heat transfer in superfluids and superconducting devices.
Submitted 22 April, 2024; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Large Language Models Are Not Robust Multiple Choice Selectors
Authors:
Chujie Zheng,
Hao Zhou,
Fandong Meng,
Jie Zhou,
Minlie Huang
Abstract:
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs). This work shows that modern LLMs are vulnerable to option position changes in MCQs due to their inherent "selection bias", namely, they prefer to select specific option IDs as answers (like "Option A"). Through extensive empirical analyses with 20 LLMs on three benchmarks, we pinpoint that this behavioral bias primarily stems from LLMs' token bias, where the model a priori assigns more probabilistic mass to specific option ID tokens (e.g., A/B/C/D) when predicting answers from the option IDs. To mitigate selection bias, we propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution. PriDe first estimates the prior by permuting option contents on a small number of test samples, and then applies the estimated prior to debias the remaining samples. We demonstrate that it achieves interpretable and transferable debiasing with high computational efficiency. We hope this work can draw broader research attention to the bias and robustness of modern LLMs.
Submitted 21 February, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
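The two-stage idea described in the abstract above (estimate a prior over option IDs by permuting option contents, then divide it out) can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes the observed probability of an option ID factorizes as prior(ID) times the probability of the content placed at that ID, and it uses cyclic permutations of the contents. All function names and the numbers in the toy example are hypothetical.

```python
import math


def normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def simulate_observed(prior, content_probs, shift):
    """Toy model: P(ID i) is proportional to prior[i] * P(content shown at slot i)."""
    n = len(prior)
    return normalize([prior[i] * content_probs[(i + shift) % n] for i in range(n)])


def estimate_prior(observed_per_permutation):
    """Average the log-probabilities of each ID over all permutations, then renormalize."""
    n = len(observed_per_permutation[0])
    count = len(observed_per_permutation)
    mean_log = [
        sum(math.log(obs[i]) for obs in observed_per_permutation) / count
        for i in range(n)
    ]
    return normalize([math.exp(v) for v in mean_log])


def debias(observed, prior):
    """Divide out the estimated ID prior and renormalize."""
    return normalize([o / p for o, p in zip(observed, prior)])


if __name__ == "__main__":
    true_prior = [0.7, 0.1, 0.1, 0.1]      # model strongly prefers the first ID
    content = [0.3, 0.4, 0.2, 0.1]         # unbiased preference over contents
    observed = [simulate_observed(true_prior, content, s) for s in range(4)]
    prior_hat = estimate_prior(observed)
    print(prior_hat)
    print(debias(observed[0], prior_hat))
```

Under this factorization, averaging log-probabilities over a full cycle of permutations cancels the content term, which is why the prior can be recovered without labels; with real model outputs the factorization holds only approximately.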