Search | arXiv e-print repository

Conformal Prediction for Deep Classifier via Label Ranking

Authors: Jianguo Huang, Huajun Xi, Linjun Zhang, Huaxiu Yao, Yue Qiu, Hongxin Wei

Abstract: Conformal prediction is a statistical framework that generates prediction sets containing ground-truth labels with a desired coverage guarantee. The predicted probabilities produced by machine learning models are generally miscalibrated, leading to large prediction sets in conformal prediction. To address this issue, we propose a novel algorithm named $\textit{Sorted Adaptive Prediction Sets}$ (SAPS), which discards all the probability values except for the maximum softmax probability. The key idea behind SAPS is to minimize the dependence of the non-conformity score on the probability values while retaining the uncertainty information. In this manner, SAPS can produce compact prediction sets and communicate instance-wise uncertainty. Extensive experiments validate that SAPS not only lessens the prediction sets but also broadly enhances the conditional coverage rate of prediction sets.

Submitted 6 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: Accepted by ICML 2024
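
The idea in the abstract above (a non-conformity score that depends only on the maximum softmax probability and on where a label ranks) can be illustrated with a minimal split-conformal sketch. This is not the authors' implementation: the function names, the rank weight `lam`, the randomization variable `u`, and the exact form of the score are assumptions chosen for illustration.

```python
import math
import numpy as np

def rank_based_score(probs, label, u, lam=0.1):
    """Illustrative score: uses only the top probability and the label's rank.

    `lam` (weight per rank step) and `u` (a uniform random draw in [0, 1])
    are hypothetical parameters for this sketch.
    """
    p_max = float(probs.max())
    # rank 1 means the label is the most probable class
    rank = int(np.sum(probs > probs[label])) + 1
    if rank == 1:
        return u * p_max
    return p_max + (rank - 2 + u) * lam

def calibrate(cal_probs, cal_labels, alpha, rng, lam=0.1):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))-th smallest score."""
    n = len(cal_labels)
    scores = sorted(
        rank_based_score(p, y, rng.uniform(), lam)
        for p, y in zip(cal_probs, cal_labels)
    )
    k = math.ceil((n + 1) * (1 - alpha))
    # With too few calibration points the only valid threshold is infinite.
    return math.inf if k > n else scores[k - 1]

def prediction_set(probs, threshold, u, lam=0.1):
    """All labels whose score does not exceed the calibrated threshold."""
    return [k for k in range(len(probs))
            if rank_based_score(probs, k, u, lam) <= threshold]
```

Because the score grows with the rank of a label rather than with the tail probabilities themselves, small miscalibrated probabilities in the tail do not change which labels enter the set in this sketch.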

arXiv:2310.00754 [pdf, other]

Analyzing and Mitigating Object Hallucination in Large Vision-Language Models

Authors: Yiyang Zhou, Chenhang Cui, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, Huaxiu Yao

Abstract: Large vision-language models (LVLMs) have shown remarkable abilities in understanding visual information with human languages. However, LVLMs still suffer from object hallucination, which is the problem of generating descriptions that include objects that do not actually exist in the images. This can negatively impact many vision-language tasks, such as visual summarization and reasoning. To address this issue, we propose a simple yet powerful algorithm, LVLM Hallucination Revisor (LURE), to post-hoc rectify object hallucination in LVLMs by reconstructing less hallucinatory descriptions. LURE is grounded in a rigorous statistical analysis of the key factors underlying object hallucination, including co-occurrence (the frequent appearance of certain objects alongside others in images), uncertainty (objects with higher uncertainty during LVLM decoding), and object position (hallucination often appears in the later part of the generated text). LURE can also be seamlessly integrated with any LVLMs. We evaluate LURE on six open-source LVLMs, achieving a 23% improvement in general object hallucination evaluation metrics over the previous best approach. In both GPT and human evaluations, LURE consistently ranks at the top. Our data and code are available at https://github.com/YiyangZhou/LURE.

Submitted 16 March, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

Comments: Accepted by ICLR 2024

arXiv:2309.14674

Leveraging Herpangina Data to Enhance Hospital-level Prediction of Hand-Foot-and-Mouth Disease Admissions Using UPTST

Authors: Guoqi Yu, Hailun Yao, Huan Zheng, Ximing Xu

Abstract: Outbreaks of hand-foot-and-mouth disease (HFMD) have been associated with significant morbidity and, in severe cases, mortality. Accurate forecasting of daily admissions of pediatric HFMD patients is therefore crucial for aiding the hospital in preparing for potential outbreaks and mitigating nosocomial transmissions. To address this pressing need, we propose a novel transformer-based model with a U-net shape, utilizing the patching strategy and the joint prediction strategy that capitalizes on insights from herpangina, a disease closely correlated with HFMD. This model also integrates representation learning by introducing reconstruction loss as an auxiliary loss. The results show that our U-net Patching Time Series Transformer (UPTST) model outperforms existing approaches in both long- and short-term prediction accuracy of HFMD at the hospital level. Furthermore, the exploratory extension experiments show that the model's capabilities extend beyond prediction of infectious disease, suggesting broader applicability in various domains.

Submitted 6 October, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: Not finished yet

arXiv:2309.13642 [pdf, ps, other]

One sided a_idempotent, one sided a_equivalent and SEP elements in a ring with involution

Authors: Hua Yao, Junchao Wei

Abstract: In order to study the properties of SEP elements, we propose the concepts of one sided a_idempotent and one sided a_equivalent. Under the condition that an element in a ring is both group invertible and MP_invertible, some equivalent conditions of such an element to be an SEP element are given based on these two concepts, as well as based on projections and the second and the third power of some products of some elements.

Submitted 24 September, 2023; originally announced September 2023.

Comments: 11 pages

arXiv:2309.13165 [pdf, other]

Large Language Models Are Also Good Prototypical Commonsense Reasoners

Authors: Chenin Li, Qianglong Chen, Yin Zhang, Yifei Zhang, Hongxiang Yao

Abstract: Commonsense reasoning is a pivotal skill for large language models, yet it presents persistent challenges in specific tasks requiring this competence. Traditional fine-tuning approaches can be resource-intensive and potentially compromise a model's generalization capacity. Furthermore, state-of-the-art language models like GPT-3.5 and Claude are primarily accessible through API calls, which makes fine-tuning models challenging. To address these challenges, we draw inspiration from the outputs of large models for tailored tasks and semi-automatically developed a set of novel prompts from several perspectives, including task-relevance, supportive evidence generation (e.g. chain-of-thought and knowledge), and diverse path decoding to aid the model. Experimental results on the ProtoQA dataset demonstrate that with better designed prompts we can achieve the new state-of-the-art (SOTA) on the ProtoQA leaderboard, improving the Max Answer@1 score by 8% and the Max Incorrect@1 score by 4% (breaking through 50% for the first time) compared to the previous SOTA model, and achieve an improvement on StrategyQA and CommonsenseQA2.0 (3% and 1%, respectively). Furthermore, with the generated Chain-of-Thought and knowledge, we can improve the interpretability of the model while also surpassing the previous SOTA models. We hope that our work can provide insight for the NLP community to develop better prompts and explore the potential of large language models for more complex reasoning tasks.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.10257 [pdf, other]

Observation of universal dissipative dynamics in strongly correlated quantum gas

Authors: Yajuan Zhao, Ye Tian, Jilai Ye, Yue Wu, Zihan Zhao, Zhihao Chi, Tian Tian, Hepeng Yao, Jiazhong Hu, Yu Chen, Wenlan Chen

Abstract: Dissipation is unavoidable in quantum systems. It usually induces decoherence and changes quantum correlations. To access the information of strongly correlated quantum matter, one has to overcome or suppress dissipation to extract the underlying quantum phenomena. However, here we find an opposite effect: dissipation can be utilized as a powerful tool to probe the intrinsic correlations of quantum many-body systems. Applying highly-controllable dissipation in ultracold atomic systems, we observe universal dissipative dynamics in strongly correlated one-dimensional quantum gases. The total particle number of this system follows a universal stretched-exponential decay, and the stretched exponent measures the anomalous dimension of the spectral function, a critical exponent characterizing strong quantum fluctuations of this system. This method could have broad applications in detecting strongly correlated features, including spin-charge separations and Fermi arcs in quantum materials.

Submitted 18 September, 2023; originally announced September 2023.
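
For readers unfamiliar with the term, a stretched-exponential decay of the total particle number generally takes the form below. The symbols $N_0$ (initial particle number), $\tau$ (decay time scale) and $\alpha$ (stretched exponent) are notation assumed here for illustration; the listing does not give the paper's own notation or the explicit relation between the exponent and the anomalous dimension.

```latex
N(t) = N_0 \exp\!\left[-\left(\frac{t}{\tau}\right)^{\alpha}\right],
\qquad 0 < \alpha \le 1 .
```

Setting $\alpha = 1$ recovers ordinary exponential decay, so a fitted $\alpha < 1$ is the signature that distinguishes the stretched case.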

arXiv:2309.07185 [pdf]

A Health Monitoring System Based on Flexible Triboelectric Sensors for Intelligence Medical Internet of Things and its Applications in Virtual Reality

Authors: Junqi Mao, Puen Zhou, Xiaoyao Wang, Hongbo Yao, Liuyang Liang, Yiqiao Zhao, Jiawei Zhang, Dayan Ban, Haiwu Zheng

Abstract: The Internet of Medical Things (IoMT) is a platform that combines Internet of Things (IoT) technology with medical applications, enabling the realization of precision medicine, intelligent healthcare, and telemedicine in the era of digitalization and intelligence. However, the IoMT faces various challenges, including sustainable power supply, human adaptability of sensors and the intelligence of sensors. In this study, we designed a robust and intelligent IoMT system through the synergistic integration of flexible wearable triboelectric sensors and deep learning-assisted data analytics. We embedded four triboelectric sensors into a wristband to detect and analyze limb movements in patients suffering from Parkinson's Disease (PD). By further integrating deep learning-assisted data analytics, we actualized an intelligent healthcare monitoring system for the surveillance and interaction of PD patients, which includes location/trajectory tracking, heart monitoring and identity recognition. This innovative approach enabled us to accurately capture and scrutinize the subtle movements and fine motor of PD patients, thus providing insightful feedback and comprehensive assessment of the patients' conditions. This monitoring system is cost-effective, easily fabricated, highly sensitive, and intelligent, and consequently underscores the immense potential of human body sensing technology in a Health 4.0 society.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2309.07109 [pdf, ps, other]

Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

Authors: Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, et al. (606 additional authors not shown)

Abstract: The core-collapse supernova (CCSN) is considered one of the most energetic astrophysical events in the universe. The early and prompt detection of neutrinos before (pre-SN) and during the supernova (SN) burst presents a unique opportunity for multi-messenger observations of CCSN events. In this study, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton liquid scintillator detector currently under construction in South China. The real-time monitoring system is designed to ensure both prompt alert speed and comprehensive coverage of progenitor stars. It incorporates prompt monitors on the electronic board as well as online monitors at the data acquisition stage. Assuming a false alert rate of 1 per year, this monitoring system exhibits sensitivity to pre-SN neutrinos up to a distance of approximately 1.6 (0.9) kiloparsecs and SN neutrinos up to about 370 (360) kiloparsecs for a progenitor mass of 30 solar masses, considering both normal and inverted mass ordering scenarios. The pointing ability of the CCSN is evaluated by analyzing the accumulated event anisotropy of inverse beta decay interactions from pre-SN or SN neutrinos. This, along with the early alert, can play a crucial role in facilitating follow-up multi-messenger observations of the next galactic or nearby extragalactic CCSN.

Submitted 4 December, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: 24 pages, 9 figures, accepted for the publication at JCAP

arXiv:2309.05211 [pdf, other]

On Quaternion Higher-Order Singular Value Decomposition: Models and Analysis

Authors: Hanxin Ya, Ying Wang, Yuning Yang

Abstract: Higher-order singular value decomposition (HOSVD) is one of the most celebrated tensor decompositions that generalizes matrix SVD to higher-order tensors. It was recently extended to the quaternion domain \cite{miao2023quat} (we refer to it as L-QHOSVD in this work). However, due to the non-commutativity of quaternion multiplications, L-QHOSVD is not consistent with matrix SVD when the order of the quaternion tensor reduces to 2; moreover, theoretically guaranteed truncated L-QHOSVD was not investigated. To derive a more natural higher-order generalization of the quaternion matrix SVD, we first utilize the feature that left and right multiplications of quaternions are inconsistent to define left and right quaternion tensor unfoldings and left and right mode-k products. Then, by using these basic tools, we propose a two-sided quaternion higher-order singular value decomposition (TS-QHOSVD). TS-QHOSVD has the following two main features: 1) it computes two factor matrices at a time from SVDs of left and right unfoldings, inheriting certain parallel properties of the original HOSVD; 2) it is consistent with matrix SVD when the order of the tensor is 2. In addition, we study truncated TS-QHOSVD and establish its error bound measured by the tail energy; correspondingly, we also present truncated L-QHOSVD and its error bound. Deriving the error bounds is nontrivial, as the proofs are more complicated than their real counterparts, again due to the non-commutativity of quaternion multiplications. Finally, we illustrate the derived properties of TS-QHOSVD and its efficacy via some numerical examples.

Submitted 17 October, 2023; v1 submitted 10 September, 2023; originally announced September 2023.
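
As background for the entry above, the classical real-valued HOSVD that the paper generalizes can be sketched in a few lines of NumPy. This is only the standard real case (mode unfoldings, one SVD per mode, core tensor from mode products); it is not the quaternion L-QHOSVD or TS-QHOSVD, and the helper names `unfold`, `mode_product`, `hosvd` and `reconstruct` are chosen here for illustration.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `matrix` (J x I_mode) into `tensor` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def hosvd(tensor):
    """Real HOSVD: left singular vectors of each unfolding plus a core tensor."""
    factors = []
    for mode in range(tensor.ndim):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Invert the decomposition by applying each factor along its mode."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out
```

In the real case the mode products commute with one another, which is why the factors can be applied in any order; the abstract attributes the extra difficulty of the quaternion setting to the loss of commutativity.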

arXiv:2308.14869 [pdf, other]

PROSO Toolbox: a unified protein-constrained genome-scale modelling framework for strain designing and optimization

Authors: Haoyang Yao, Laurence Yang

Abstract: The genome-scale metabolic model with protein constraint (PC-model) has been increasingly popular for microbial metabolic simulations. We present PROSO Toolbox, a unified and simple-to-use PC-model toolbox that takes any high-quality genome-scale metabolic reconstruction as the input. The toolbox can construct a PC-model automatically, apply various algorithms for computational strain design and simulation, and help unveil metabolism from gene expression data through a state-of-the-art OVERLAY workflow. It also has detailed tutorials and documentation for maximum accessibility to researchers from diverse backgrounds. PROSO Toolbox, tutorials, and documentation are freely available online: https://github.com/QCSB/PROSO-Toolbox.

Submitted 28 August, 2023; originally announced August 2023.

Comments: 4 pages, 1 figure

arXiv:2308.12319 [pdf, other]

RemovalNet: DNN Fingerprint Removal Attacks

Authors: Hongwei Yao, Zheng Li, Kunzhe Huang, Jian Lou, Zhan Qin, Kui Ren

Abstract: With the performance of deep neural networks (DNNs) remarkably improving, DNNs have been widely used in many areas. Consequently, the DNN model has become a valuable asset, and its intellectual property is safeguarded by ownership verification techniques (e.g., DNN fingerprinting). However, the feasibility of the DNN fingerprint removal attack and its potential influence remains an open problem. In this paper, we perform the first comprehensive investigation of DNN fingerprint removal attacks. Generally, the knowledge contained in a DNN model can be categorized into general semantic and fingerprint-specific knowledge. To this end, we propose a min-max bilevel optimization-based DNN fingerprint removal attack named RemovalNet, to evade model ownership verification. The lower-level optimization is designed to remove fingerprint-specific knowledge. In the upper-level optimization, we distill the victim model's general semantic knowledge to maintain the surrogate model's performance. We conduct extensive experiments to evaluate the fidelity, effectiveness, and efficiency of the RemovalNet against four advanced defense methods on six metrics. The empirical results demonstrate that (1) the RemovalNet is effective: after our DNN fingerprint removal attack, the model distance between the target and surrogate models is 100 times higher than that of the baseline attacks; (2) the RemovalNet is efficient: it uses only 0.2% (400 samples) of the substitute dataset and 1,000 iterations to conduct our attack and, compared with advanced model stealing attacks, saves nearly 85% of computational resources at most; (3) the RemovalNet achieves high fidelity: the created surrogate model maintains high accuracy after the DNN fingerprint removal process. Our code is available at: https://github.com/grasses/RemovalNet.

Submitted 22 November, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

Comments: Accepted to IEEE TDSC, code is available at: https://github.com/grasses/RemovalNet

arXiv:2308.12213 [pdf, other]

CLIPN for Zero-Shot OOD Detection: Teaching CLIP to Say No

Authors: Hualiang Wang, Yi Li, Huifeng Yao, Xiaomeng Li

Abstract: Out-of-distribution (OOD) detection refers to training the model on an in-distribution (ID) dataset to classify whether the input images come from unknown classes. Considerable effort has been invested in designing various OOD detection methods based on either convolutional neural networks or transformers. However, zero-shot OOD detection methods driven by CLIP, which only require class names for ID, have received less attention. This paper presents a novel method, namely CLIP saying no (CLIPN), which empowers the logic of saying no within CLIP. Our key motivation is to equip CLIP with the capability of distinguishing OOD and ID samples using positive-semantic prompts and negation-semantic prompts. Specifically, we design a novel learnable no prompt and a no text encoder to capture negation semantics within images. Subsequently, we introduce two loss functions: the image-text binary-opposite loss and the text semantic-opposite loss, which we use to teach CLIPN to associate images with no prompts, thereby enabling it to identify unknown samples. Furthermore, we propose two threshold-free inference algorithms to perform OOD detection by utilizing negation semantics from no prompts and the text encoder. Experimental results on 9 benchmark datasets (3 ID datasets and 6 OOD datasets) for the OOD detection task demonstrate that CLIPN, based on ViT-B-16, outperforms 7 well-used algorithms by at least 2.34% and 11.64% in terms of AUROC and FPR95 for zero-shot OOD detection on ImageNet-1K. Our CLIPN can serve as a solid foundation for effectively leveraging CLIP in downstream OOD tasks. The code is available at https://github.com/xmed-lab/CLIPN.

Submitted 23 August, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

Comments: ICCV 2023

MSC Class: 68T45 ACM Class: I.4.9

arXiv:2308.12191 [pdf, other]

Sign Language Translation with Iterative Prototype

Authors: Huijie Yao, Wengang Zhou, Hao Feng, Hezhen Hu, Hao Zhou, Houqiang Li

Abstract: This paper presents IP-SLT, a simple yet effective framework for sign language translation (SLT). Our IP-SLT adopts a recurrent structure and enhances the semantic representation (prototype) of the input sign language video in an iterative refinement manner. Our idea mimics the behavior of human reading, where a sentence can be digested repeatedly until an accurate understanding is reached. Technically, IP-SLT consists of feature extraction, prototype initialization, and iterative prototype refinement. The initialization module generates the initial prototype based on the visual feature extracted by the feature extraction module. Then, the iterative refinement module leverages the cross-attention mechanism to polish the previous prototype by aggregating it with the original video feature. Through repeated refinement, the prototype finally converges to a more stable and accurate state, leading to a fluent and appropriate translation. In addition, to leverage the sequential dependence of prototypes, we further propose an iterative distillation loss to compress the knowledge of the final iteration into previous ones. As the autoregressive decoding process is executed only once in inference, our IP-SLT is ready to improve various SLT systems with acceptable overhead. Extensive experiments are conducted on public benchmarks to demonstrate the effectiveness of the IP-SLT.

Submitted 23 August, 2023; originally announced August 2023.

Comments: Accepted by ICCV 2023

arXiv:2308.11348 [pdf, other]

Careful at Estimation and Bold at Exploration

Authors: Xing Chen, Yijun Liu, Zhaogeng Liu, Hechang Chen, Hengshuai Yao, Yi Chang

Abstract: Exploration strategies in continuous action space are often heuristic due to the infinite actions, and these kinds of methods cannot derive a general conclusion. In prior work, it has been shown that policy-based exploration is beneficial for continuous action space in deterministic policy reinforcement learning (DPRL). However, policy-based exploration in DPRL has two prominent issues: aimless exploration and policy divergence, and the policy gradient for exploration is only sometimes helpful due to inaccurate estimation. Based on the double-Q function framework, we introduce a novel exploration strategy to mitigate these issues, separate from the policy gradient. We first propose the greedy Q softmax update schema for Q value update. The expected Q value is derived as a weighted sum of the conservative Q value over actions, and the weight is the corresponding greedy Q value. Greedy Q takes the maximum value of the two Q functions, and conservative Q takes the minimum value of the two different Q functions. For practicality, this theoretical basis is then extended to allow us to combine action exploration with the Q value update, except for the premise that we have a surrogate policy that behaves like this exploration policy. In practice, we construct such an exploration policy with a few sampled actions, and to meet the premise, we learn such a surrogate policy by minimizing the KL divergence between the target policy and the exploration policy constructed by the conservative Q. We evaluate our method on the MuJoCo benchmark and demonstrate superior performance compared to previous state-of-the-art methods across various environments, particularly in the most complex Humanoid environment.

Submitted 22 August, 2023; originally announced August 2023.

Comments: 20 pages
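
The weighting described in the abstract above (greedy Q as the element-wise maximum of two Q functions, conservative Q as the minimum, and an expected value formed by weighting conservative Q over actions) can be sketched for a finite set of sampled actions. This is a reading of the abstract, not the authors' code: the softmax normalization of the greedy values, the `temperature` parameter, and the function name are assumptions.

```python
import numpy as np

def greedy_q_softmax_value(q1, q2, temperature=1.0):
    """Expected Q over sampled actions, as a sketch of the described update.

    q1, q2: Q estimates of the two critics for the same sampled actions.
    The weights are a softmax over the greedy Q (an assumption here);
    the values being averaged are the conservative Q.
    """
    q1 = np.asarray(q1, dtype=float)
    q2 = np.asarray(q2, dtype=float)
    greedy = np.maximum(q1, q2)        # optimistic estimate per action
    conservative = np.minimum(q1, q2)  # pessimistic estimate per action
    # Subtract the maximum before exponentiating for numerical stability.
    logits = (greedy - greedy.max()) / temperature
    weights = np.exp(logits)
    weights /= weights.sum()
    return float(np.dot(weights, conservative))
```

For example, `greedy_q_softmax_value([2.0, 0.0], [1.0, 0.0])` weights the first action by the softmax of its greedy value 2.0 but credits it only its conservative value 1.0, so optimism steers which actions matter while pessimism sets how much they are worth.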

arXiv:2308.09732 [pdf, other]

Baird Counterexample is Solved: with an example of How to Debug a Two-time-scale Algorithm

Authors: Hengshuai Yao

Abstract: Baird counterexample was proposed by Leemon Baird in 1995, first used to show that the Temporal Difference (TD(0)) algorithm diverges on this example. Since then, it is often used to test and compare off-policy learning algorithms. Gradient TD algorithms solved the divergence issue of TD on Baird counterexample. However, their convergence on this example is still very slow, and the nature of the slowness is not well understood, e.g., see (Sutton and Barto 2018). This note aims to understand, in particular, why TDC is slow on this example, and provides a debugging analysis to understand this behavior. Our debugging technique can be used to study the convergence behavior of two-time-scale stochastic approximation algorithms. We also provide empirical results of the recent Impression GTD algorithm on this example, showing the convergence is very fast, in fact, at a linear rate. We conclude that Baird counterexample is solved, by an algorithm with the convergence guarantee to the TD solution in general, and a fast convergence rate.

Submitted 2 September, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

arXiv:2308.08609 [pdf, other]

Pair density wave superconductivity: a microscopic model in two dimensions

Authors: Yi-Fan Jiang, Hong Yao

Abstract: Pair-density-wave (PDW) superconductivity is a long-sought exotic state with an oscillating superconducting order parameter without the need to apply an external magnetic field. So far it has been rare to establish a two-dimensional (2D) microscopic model with such exotic long-range order in its ground state. Here we propose to study PDW superconductivity in a minimal model of spinless fermions on the honeycomb lattice with nearest-neighbor (NN) and next-nearest-neighbor (NNN) interaction $V_1$ and $V_2$, respectively. By performing a state-of-the-art density-matrix renormalization group (DMRG) study of this $t$-$V_1$-$V_2$ model at finite doping on six-leg and eight-leg honeycomb cylinders, we showed that the ground state exhibits PDW ordering (namely quasi-long-range order with a divergent PDW susceptibility). Remarkably this PDW state persists on the wider cylinder with 2D-like Fermi surfaces (FS). To the best of our knowledge, this is probably the first controlled numerical evidence of PDW in systems with 2D-like FS.

Submitted 16 August, 2023; originally announced August 2023.

Comments: 4 pages, 3 figures

arXiv:2308.06222 [pdf, other]

High-temperature superconductivity induced by the Su-Schrieffer-Heeger electron-phonon coupling

Authors: Xun Cai, Zi-Xiang Li, Hong Yao

Abstract: Experimental quest for high-temperature and room-temperature superconductivity (SC) at ambient pressure has been a long-standing research theme in physics. It has also been desired to construct reliable microscopic mechanisms that may achieve high-temperature SC. Here we systematically explore SC in the Su-Schrieffer-Heeger (SSH) electron-phonon coupling models by performing numerically-exact quantum Monte-Carlo simulations. Our results reliably showed that superconducting $T_c$ of the SSH models is high, remarkably higher than those in the Holstein models, particularly in strong electron-phonon coupling regime. This is mainly because SSH phonons can not only induce strong pairing between electrons but also help the phase coherence of Cooper pairs, thus realizing higher $T_c$. As mechanism of higher-$T_c$ of the SSH models could be potentially relevant to realistic materials, it paves a promising way to find higher-temperature SC in the future.

Submitted 14 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

Comments: 5 pages plus supplemental materials, 4 figures

arXiv:2308.04144 [pdf, other]

Cooling bosons by dimensional reduction

Authors: Yanliang Guo, Hepeng Yao, Sudipta Dhar, Lorenzo Pizzino, Milena Horvath, Thierry Giamarchi, Manuele Landini, Hanns-Christoph Nägerl

Abstract: Cold atomic gases provide a remarkable testbed to study the physics of interacting many-body quantum systems. They have started to play a major role as quantum simulators, given the high degree of control that is possible. A crucial element is given by the necessarily non-zero temperature. However cooling to the required ultralow temperatures or even simply measuring the temperature directly on the system can prove to be very challenging tasks. Here, we implement thermometry on strongly interacting two- and one-dimensional Bose gases with high sensitivity in the nano-Kelvin temperature range. Our method is aided by the fact that the decay of the first-order correlation function is very sensitive to the temperature when interactions are strong. We find that there may be a significant temperature variation when the three-dimensional quantum gas is cut into two-dimensional slices or into one-dimensional tubes. Strikingly, the temperature for the one-dimensional case can be much lower than the initial temperature. Our findings show that this decrease results from the interplay of dimensional reduction and strong interactions.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.02816 [pdf, other]

PromptCARE: Prompt Copyright Protection by Watermark Injection and Verification

Authors: Hongwei Yao, Jian Lou, Kui Ren, Zhan Qin

Abstract: Large language models (LLMs) have witnessed a meteoric rise in popularity among the general public users over the past few months, facilitating diverse downstream tasks with human-level accuracy and proficiency. Prompts play an essential role in this success, which efficiently adapt pre-trained LLMs to task-specific applications by simply prepending a sequence of tokens to the query texts. However, designing and selecting an optimal prompt can be both expensive and demanding, leading to the emergence of Prompt-as-a-Service providers who profit by providing well-designed prompts for authorized use. With the growing popularity of prompts and their indispensable role in LLM-based services, there is an urgent need to protect the copyright of prompts against unauthorized use. In this paper, we propose PromptCARE, the first framework for prompt copyright protection through watermark injection and verification. Prompt watermarking presents unique challenges that render existing watermarking techniques developed for model and dataset copyright verification ineffective. PromptCARE overcomes these hurdles by proposing watermark injection and verification schemes tailor-made for prompts and NLP characteristics. Extensive experiments on six well-known benchmark datasets, using three prevalent pre-trained LLMs (BERT, RoBERTa, and Facebook OPT-1.3b), demonstrate the effectiveness, harmlessness, robustness, and stealthiness of PromptCARE.

Submitted 28 November, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

Comments: To Appear in the 45th IEEE Symposium on Security and Privacy 2024, code is available at: https://github.com/grasses/PromptCARE

arXiv:2308.00411 [pdf, other]

Experimental Observation of the 2D-1D Dimensional Crossover in Strongly Interacting Ultracold Bosons

Authors: Yanliang Guo, Hepeng Yao, Satwik Ramanjanappa, Sudipta Dhar, Milena Horvath, Lorenzo Pizzino, Thierry Giamarchi, Manuele Landini, Hanns-Christoph Nägerl

Abstract: Dimensionality plays an essential role in determining the nature and properties of a physical system. For quantum systems the impact of interactions and fluctuations is enhanced in lower dimensions, leading to a great diversity of genuine quantum effects for reduced dimensionality. In most cases, the dimension is fixed to some integer value. Here, we experimentally probe the dimensional crossover from two to one dimension using strongly interacting ultracold bosons in variable lattice potentials and compare the data to ab-initio theory that takes into account non-homogeneous trapping and non-zero temperature. From a precise measurement of the momentum distribution we analyze the characteristic decay of the one-body correlation function in the two dimensionalities and then track how the decay is modified in the crossover. A varying two-slope structure is revealed, reflecting the fact that the particles see their dimensionality as being one or two depending on whether they are probed on short or long distances, respectively. Our observations demonstrate how quantum properties in the strongly-correlated regime evolve in the dimensional crossover as a result of the interplay between dimensionality, interactions, and temperature.

Submitted 1 August, 2023; originally announced August 2023.

arXiv:2307.15892 [pdf, other]

A new Gradient TD Algorithm with only One Step-size: Convergence Rate Analysis using $L$-$λ$ Smoothness

Authors: Hengshuai Yao

Abstract: Gradient Temporal Difference (GTD) algorithms (Sutton et al., 2008, 2009) are the first $O(d)$ ($d$ is the number of features) algorithms that have convergence guarantees for off-policy learning with linear function approximation. Liu et al. (2015) and Dalal et al. (2018) proved the convergence rates of GTD, GTD2 and TDC are $O(t^{-α/2})$ for some $α\in (0,1)$. This bound is tight (Dalal et al., 2020), and slower than $O(1/\sqrt{t})$. GTD algorithms also have two step-size parameters, which are difficult to tune. In the literature, there is a "single-time-scale" formulation of GTD. However, this formulation still has two step-size parameters. This paper presents a truly single-time-scale GTD algorithm for minimizing the Norm of Expected td Update (NEU) objective, and it has only one step-size parameter. We prove that the new algorithm, called Impression GTD, converges at least as fast as $O(1/t)$. Furthermore, based on a generalization of the expected smoothness (Gower et al. 2019), called $L$-$λ$ smoothness, we are able to prove that the new GTD converges even faster, in fact, with a linear rate. Our rate actually also improves Gower et al.'s result with a tighter bound under a weaker assumption. Besides Impression GTD, we also prove the rates of three other GTD algorithms, one by Yao and Liu (2008), another called A-transpose-TD (Sutton et al., 2008), and a counterpart of A-transpose-TD. The convergence rates of all the four GTD algorithms are proved in a single generic GTD framework to which $L$-$λ$ smoothness applies. Empirical results on Random walks, Boyan chain, and Baird counterexample show that Impression GTD converges much faster than existing GTD algorithms for both on-policy and off-policy learning problems, with well-performing step-sizes in a big range.

Submitted 2 September, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

arXiv:2307.11311 [pdf, ps, other]

doi 10.1017/jfm.2023.808

Entropy and fluctuation relations in isotropic turbulence

Authors: H. Yao, T. A. Zaki, C. Meneveau

Abstract: Based on a generalized local Kolmogorov-Hill equation expressing the evolution of kinetic energy integrated over spheres of size $\ell$ in the inertial range of fluid turbulence, we examine a possible definition of entropy and entropy generation for turbulence. Its measurement from direct numerical simulations in isotropic turbulence leads to confirmation of the validity of the fluctuation relation (FR) from non-equilibrium thermodynamics in the inertial range of turbulent flows. Specifically, the ratio of probability densities of forward and inverse cascade at scale $\ell$ is shown to follow exponential behavior with the entropy generation rate if the latter is defined by including an appropriately defined notion of ``temperature of turbulence'' proportional to the kinetic energy at scale $\ell$.

Submitted 24 September, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

Journal ref: vol. 973, R6, 2023

arXiv:2307.10722 [pdf]

Observation of long-range ferromagnetism via anomalous supercurrents in a spin-orbit coupled superconductor

Authors: B. K. Xiang, Y. S. Lin, Q. S. He, J. J. Zhu, B. R. Chen, Y. F. Wang, K. Y. Liang, Z. J. Li, H. X. Yao, C. X. Wu, T. Y. Zhou, M. H. Fang, Y. Lu, I. V. Tokatly, F. S. Bergeret, Y. H. Wang

Abstract: Conventional superconductors naturally disfavor ferromagnetism because the supercurrent-carrying electrons are paired into anti-parallel spin singlets. In superconductors with strong Rashba spin-orbit coupling, impurity magnetic moments induce supercurrents through the spin-galvanic effect. As a result, long-range ferromagnetic interaction among the impurity moments may be mediated through such anomalous supercurrents in a similar fashion as in itinerant ferromagnets. Fe(Se,Te) is such a superconductor with topological surface bands, previously shown to exhibit quantum anomalous vortices around impurity spins. Here, we take advantage of the flux sensitivity of scanning superconducting quantum interference devices to investigate superconducting Fe(Se,Te) in the regime where supercurrents around impurities overlap. We find homogeneous remanent flux patterns after applying a supercurrent through the sample. The patterns are consistent with anomalous edge and bulk supercurrents generated by in-plane magnetization, which occur above a current threshold and follow hysteresis loops reminiscent of those of a ferromagnet. Similar long-range magnetic orders can be generated by Meissner current under a small out-of-plane magnetic field. The magnetization weakens with increasing temperature and disappears after thermal cycling to above superconducting critical temperature; further suggesting superconductivity is central to establishing and maintaining the magnetic order. These observations demonstrate surface anomalous supercurrents as a mediator for ferromagnetism in a spin-orbit coupled superconductor, which may potentially be utilized for low-power cryogenic memory.

Submitted 20 July, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

arXiv:2307.10568 [pdf, other]

doi 10.1017/jfm.2023.1066

Comparing local energy cascade rates in isotropic turbulence using structure function and filtering formulations

Authors: H. Yao, M. Schnaubelt, A. Szalay, T. Zaki, C. Meneveau

Abstract: Two common definitions of the spatially local rate of kinetic energy cascade at some scale $\ell$ in turbulent flows are (i) the cubic velocity difference term appearing in the generalized Kolmogorov-Hill equation (GKHE) (structure function approach), and (ii) the subfilter-scale energy flux term in the transport equation for subgrid-scale kinetic energy (filtering approach). We perform a comparative study of both quantities based on direct numerical simulation data of isotropic turbulence at Taylor-scale Reynolds number of 1250. While observations of negative subfilter-scale energy flux (backscatter) have in the past led to debates regarding interpretation and relevance of such observations, we argue that the interpretation of the local structure function-based cascade rate definition is unambiguous since it arises from a divergence term in scale space. Conditional averaging is used to explore the relationship between the local cascade rate and the local filtered viscous dissipation rate as well as filtered velocity gradient tensor properties such as its invariants. We find statistically robust evidence of inverse cascade when both the large-scale rotation rate is strong and the large-scale strain rate is weak. Even stronger net inverse cascading is observed in the ``vortex compression'' $R>0$, $Q>0$ quadrant where $R$ and $Q$ are velocity gradient invariants. Qualitatively similar, but quantitatively much weaker trends are observed for the conditionally averaged subfilter scale energy flux. Flow visualizations show consistent trends, namely that spatially the inverse cascade events appear to be located within large-scale vortices, specifically in subregions when $R$ is large.

Submitted 20 July, 2023; originally announced July 2023.

Journal ref: Journal of Fluid Mechanics vol. 980, A42, 2024

arXiv:2307.10150 [pdf]

Direct observation of chiral edge current at zero magnetic field in odd-layer MnBi$_2$Te$_4$

Authors: Jinjiang Zhu, Yang Feng, Xiaodong Zhou, Yongchao Wang, Zichen Lian, Weiyan Lin, Qiushi He, Yishi Lin, Youfang Wang, Hongxu Yao, Hao Li, Yang Wu, Jing Wang, Jian Shen, Jinsong Zhang, Yayu Wang, Yihua Wang

Abstract: The chiral edge current is the boundary manifestation of the Chern number of a quantum anomalous Hall (QAH) insulator. Its direct observation is assumed to require well-quantized Hall conductance, and is so far lacking. The recently discovered van der Waals antiferromagnet MnBi$_2$Te$_4$ is theorized as a QAH in odd-layers but has shown Hall resistivity below the quantization value at zero magnetic field. Here, we perform scanning superconducting quantum interference device (sSQUID) microscopy on these seemingly failed QAH insulators to image their current distribution. When gated to the charge neutral point, our device exhibits edge current, which flows unidirectionally on the odd-layer boundary both with vacuum and with the even-layer. The chirality of such edge current reverses with the magnetization of the bulk. Surprisingly, we find the edge channels coexist with finite bulk conduction even though the bulk chemical potential is in the band gap, suggesting their robustness under significant edge-bulk scattering. Our result establishes the existence of chiral edge currents in a topological antiferromagnet and offers an alternative for identifying QAH states.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2307.06546 [pdf, other]

Forward and inverse energy cascade and fluctuation relation in fluid turbulence adhere to Kolmogorov's refined similarity hypothesis

Authors: H. Yao, P. K. Yeung, T. A. Zaki, C. Meneveau

Abstract: We study fluctuations of the local energy cascade rate $Φ_\ell$ in turbulent flows at scales ($\ell$) in the inertial range. According to the Kolmogorov refined similarity hypothesis (KRSH), relevant statistical properties of $Φ_\ell$ should depend on $ε_\ell$, the viscous dissipation rate locally averaged over a sphere of size $\ell$, rather than on the global average dissipation. However, the validity of KRSH applied to $Φ_\ell$ has not yet been tested from data. Conditional averages such as $\langle Φ_\ell|ε_{\ell}\rangle$ as well as of higher-order moments are measured from Direct Numerical Simulations data, and results clearly adhere to the predictions from KRSH. Remarkably, the same is true when considering forward ($Φ_\ell>0$) and inverse ($Φ_\ell<0$) cascade events separately. Measured ratios of forward and inverse cascade probability densities further show that a fluctuation relation adhering to the KRSH can be observed, raising the hope that important features of turbulence may be described using concepts from non-equilibrium thermodynamics.

Submitted 6 January, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

Journal ref: Phys. Rev. Lett. 132, 164001, 2024

arXiv:2307.02148 [pdf]

Compound Attention and Neighbor Matching Network for Multi-contrast MRI Super-resolution

Authors: Wenxuan Chen, Sirui Wu, Shuai Wang, Zhongsen Li, Jia Yang, Huifeng Yao, Xiaolei Song

Abstract: Multi-contrast magnetic resonance imaging (MRI) reflects information about human tissue from different perspectives and has many clinical applications. By utilizing the complementary information among different modalities, multi-contrast super-resolution (SR) of MRI can achieve better results than single-image super-resolution. However, existing methods of multi-contrast MRI SR have the following shortcomings that may limit their performance: First, existing methods either simply concatenate the reference and degraded features or exploit global feature-matching between them, which are unsuitable for multi-contrast MRI SR. Second, although many recent methods employ transformers to capture long-range dependencies in the spatial dimension, they neglect that self-attention in the channel dimension is also important for low-level vision tasks. To address these shortcomings, we proposed a novel network architecture with compound-attention and neighbor matching (CANM-Net) for multi-contrast MRI SR: The compound self-attention mechanism effectively captures the dependencies in both spatial and channel dimension; the neighborhood-based feature-matching modules are exploited to match degraded features and adjacent reference features and then fuse them to obtain the high-quality images. We conduct experiments of SR tasks on the IXI, fastMRI, and real-world scanning datasets. The CANM-Net outperforms state-of-the-art approaches in both retrospective and prospective experiments. Moreover, the robustness study in our work shows that the CANM-Net still achieves good performance when the reference and degraded images are imperfectly registered, proving good potential in clinical applications.

Submitted 16 September, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2307.00932 [pdf]

A large calcium-imaging dataset reveals a systematic V4 organization for natural scenes

Authors: Tianye Wang, Haoxuan Yao, Tai Sing Lee, Jiayi Hong, Yang Li, Hongfei Jiang, Ian Max Andolina, Shiming Tang

Abstract: The visual system evolved to process natural scenes, yet most of our understanding of the topology and function of visual cortex derives from studies using artificial stimuli. To gain deeper insights into visual processing of natural scenes, we utilized widefield calcium-imaging of primate V4 in response to many natural images, generating a large dataset of columnar-scale responses. We used this dataset to build a digital twin of V4 via deep learning, generating a detailed topographical map of natural image preferences at each cortical position. The map revealed clustered functional domains for specific classes of natural image features. These ranged from surface-related attributes like color and texture to shape-related features such as edges, curvature, and facial features. We validated the model-predicted domains with additional widefield calcium-imaging and single-cell resolution two-photon imaging. Our study illuminates the detailed topological organization and neural codes in V4 that represent natural scenes.

Submitted 23 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: 39 pages, 14 figures

arXiv:2306.11976 [pdf, other]

Interactive Molecular Discovery with Natural Language

Authors: Zheni Zeng, Bangchen Yin, Shipeng Wang, Jiarui Liu, Cheng Yang, Haishen Yao, Xingzhi Sun, Maosong Sun, Guotong Xie, Zhiyuan Liu

Abstract: Natural language is expected to be a key medium for various human-machine interactions in the era of large language models. When it comes to the biochemistry field, a series of tasks around molecules (e.g., property prediction, molecule mining, etc.) are of great significance while having a high technical threshold. Bridging the molecule expressions in natural language and chemical language can not only hugely improve the interpretability and reduce the operation difficulty of these tasks, but also fuse the chemical knowledge scattered in complementary materials for a deeper comprehension of molecules. Based on these benefits, we propose the conversational molecular design, a novel task adopting natural language for describing and editing target molecules. To better accomplish this task, we design ChatMol, a knowledgeable and versatile generative pre-trained model, enhanced by injecting experimental property information, molecular spatial knowledge, and the associations between natural and chemical languages into it. Several typical solutions including large language models (e.g., ChatGPT) are evaluated, proving the challenge of conversational molecular design and the effectiveness of our knowledge enhancement method. Case observations and analysis are conducted to provide directions for further exploration of natural-language interaction in molecular discovery.

Submitted 20 June, 2023; originally announced June 2023.

arXiv:2306.11338 [pdf, other]

FDINet: Protecting against DNN Model Extraction via Feature Distortion Index

Authors: Hongwei Yao, Zheng Li, Haiqin Weng, Feng Xue, Kui Ren, Zhan Qin

Abstract: Machine Learning as a Service (MLaaS) platforms have gained popularity due to their accessibility, cost-efficiency, scalability, and rapid development capabilities. However, recent research has highlighted the vulnerability of cloud-based models in MLaaS to model extraction attacks. In this paper, we introduce FDINET, a novel defense mechanism that leverages the feature distribution of deep neural network (DNN) models. Concretely, by analyzing the feature distribution from the adversary's queries, we reveal that the feature distribution of these queries deviates from that of the model's training set. Based on this key observation, we propose Feature Distortion Index (FDI), a metric designed to quantitatively measure the feature distribution deviation of received queries. The proposed FDINET utilizes FDI to train a binary detector and exploits FDI similarity to identify colluding adversaries from distributed extraction attacks. We conduct extensive experiments to evaluate FDINET against six state-of-the-art extraction attacks on four benchmark datasets and four popular model architectures. Empirical results demonstrate the following findings: FDINET proves to be highly effective in detecting model extraction, achieving a 100% detection accuracy on DFME and DaST. FDINET is highly efficient, using just 50 queries to raise an extraction alarm with an average confidence of 96.08% for GTSRB. FDINET exhibits the capability to identify colluding adversaries with an accuracy exceeding 91%. Additionally, it demonstrates the ability to detect two types of adaptive attacks.

Submitted 21 June, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

Comments: 13 pages, 7 figures

arXiv:2306.09567 [pdf, other]

doi 10.1088/1475-7516/2023/09/001

JUNO sensitivity to the annihilation of MeV dark matter in the galactic halo

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Tsagkarakis Alexandros, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato , et al. (581 additional authors not shown)

Abstract: We discuss JUNO sensitivity to the annihilation of MeV dark matter in the galactic halo via detecting inverse beta decay reactions of electron anti-neutrinos resulting from the annihilation. We study possible backgrounds to the signature, including the reactor neutrinos, diffuse supernova neutrino background, charged- and neutral-current interactions of atmospheric neutrinos, backgrounds from muon-induced fast neutrons and cosmogenic isotopes. A fiducial volume cut, as well as the pulse shape discrimination and the muon veto are applied to suppress the above backgrounds. It is shown that JUNO sensitivity to the thermally averaged dark matter annihilation rate in 10 years of exposure would be significantly better than the present-day best limit set by Super-Kamiokande and would be comparable to that expected by Hyper-Kamiokande.

Submitted 13 September, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: 25 pages, 9 figures, matches the published version

Journal ref: JCAP 09 (2023) 001

arXiv:2306.05718 [pdf, other]

Learning Domain-Aware Detection Head with Prompt Tuning

Authors: Haochen Li, Rui Zhang, Hantao Yao, Xinkai Song, Yifan Hao, Yongwei Zhao, Ling Li, Yunji Chen

Abstract: Domain adaptive object detection (DAOD) aims to generalize detectors trained on an annotated source domain to an unlabelled target domain. However, existing methods focus on reducing the domain bias of the detection backbone by inferring a discriminative visual encoder, while ignoring the domain bias in the detection head. Inspired by the high generalization of vision-language models (VLMs), applying a VLM as the robust detection backbone following a domain-aware detection head is a reasonable way to learn the discriminative detector for each domain, rather than reducing the domain bias in traditional methods. To achieve the above issue, we thus propose a novel DAOD framework named Domain-Aware detection head with Prompt tuning (DA-Pro), which applies the learnable domain-adaptive prompt to generate the dynamic detection head for each domain. Formally, the domain-adaptive prompt consists of the domain-invariant tokens, domain-specific tokens, and the domain-related textual description along with the class label. Furthermore, two constraints between the source and target domains are applied to ensure that the domain-adaptive prompt can capture the domains-shared and domain-specific knowledge. A prompt ensemble strategy is also proposed to reduce the effect of prompt disturbance. Comprehensive experiments over multiple cross-domain adaptation tasks demonstrate that using the domain-adaptive prompt can produce an effectively domain-related detection head for boosting domain-adaptive object detection. Our code is available at https://github.com/Therock90421/DA-Pro.

Submitted 9 October, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: Accepted by NeurIPS 2023

arXiv:2306.04974 [pdf, other]

Conservative Prediction via Data-Driven Confidence Minimization

Authors: Caroline Choi, Fahim Tajwar, Yoonho Lee, Huaxiu Yao, Ananya Kumar, Chelsea Finn

Abstract: In safety-critical applications of machine learning, it is often desirable for a model to be conservative, abstaining from making predictions on unknown inputs which are not well-represented in the training data. However, detecting unknown examples is challenging, as it is impossible to anticipate all potential inputs at test time. To address this, prior work (Hendrycks et al., 2018) minimizes model confidence on an auxiliary outlier dataset carefully curated to be disjoint from the training distribution. We theoretically analyze the choice of auxiliary dataset for confidence minimization, revealing two actionable insights: (1) if the auxiliary set contains unknown examples similar to those seen at test time, confidence minimization leads to provable detection of unknown test examples, and (2) if the first condition is satisfied, it is unnecessary to filter out known examples for out-of-distribution (OOD) detection. Motivated by these guidelines, we propose the Data-Driven Confidence Minimization (DCM) framework, which minimizes confidence on an uncertainty dataset. We apply DCM to two problem settings in which conservative prediction is paramount -- selective classification and OOD detection -- and provide a realistic way to gather uncertainty data for each setting. In our experiments, DCM consistently outperforms existing selective classification approaches on 4 datasets when tested on unseen distributions and outperforms state-of-the-art OOD detection methods on 12 ID-OOD dataset pairs, reducing FPR (at TPR $95\%$) by $6.3\%$ and $58.1\%$ on CIFAR-10 and CIFAR-100 compared to Outlier Exposure.

Submitted 3 June, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

Comments: Transactions on Machine Learning Research (TMLR), 2024

arXiv:2306.04616 [pdf]

doi 10.1088/2053-1583/acfa0f

Dielectric breakdown and sub-wavelength patterning of monolayer hexagonal boron nitride using femtosecond pulses

Authors: Sabeeh Irfan Ahmad, Emmanuel Sarpong, Arpit Dave, Hsin-Yu Yao, Joel M. Solomon, Jing-Kai Jiang, Chih-Wei Luo, Wen-Hao Chang, Tsing-Hua Her

Abstract: Hexagonal boron nitride (hBN) has emerged as a promising two-dimensional (2D) material for many applications in photonics. Although its linear and nonlinear optical properties have been extensively studied, its interaction with high-intensity laser pulses, which is important for high-harmonic generation, fabricating quantum emitters, and maskless patterning of hBN, has not been investigated. Here we report the first study of dielectric breakdown in hBN monolayers induced by single femtosecond laser pulses. We show that hBN has the highest breakdown threshold among all existing 2D materials. This enables us to observe clearly for the first time a linear dependence of breakdown threshold on the bandgap energy for 2D materials, demonstrating such a linear dependency is a universal scaling law independent of the dimensionality. We also observe counter-intuitively that hBN, which has a larger bandgap and mechanical strength than quartz, has a lower breakdown threshold. This implies carrier generation in hBN is much more efficient. Furthermore, we demonstrate the clean removal of hBN without damage to the surrounding hBN film or the substrate, indicating that hBN is optically very robust. The ablated features are shown to possess very small edge roughness, which is attributed to its ultrahigh fracture toughness. Finally, we demonstrate femtosecond laser patterning of hBN with sub-wavelength resolution, including an isolated stripe width of 200 nm. Our work advances the knowledge of light-hBN interaction in the strong field regime and firmly establishes femtosecond lasers as novel and promising tools for one-step deterministic patterning of hBN monolayers.

Submitted 7 June, 2023; originally announced June 2023.

Comments: 21 pages in total. 16 pages in the main text, the rest are supplementary. 6 figures in the main text, 5 figures in the supplementary data

Journal ref: https://iopscience.iop.org/article/10.1088/2053-1583/acfa0f

arXiv:2305.17030 [pdf, other]

doi 10.3847/1538-4365/acfd29

The First LHAASO Catalog of Gamma-Ray Sources

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: We present the first catalog of very-high energy and ultra-high energy gamma-ray sources detected by the Large High Altitude Air Shower Observatory (LHAASO). The catalog was compiled using 508 days of data collected by the Water Cherenkov Detector Array (WCDA) from March 2021 to September 2022 and 933 days of data recorded by the Kilometer Squared Array (KM2A) from January 2020 to September 2022. This catalog represents the main result from the most sensitive large coverage gamma-ray survey of the sky above 1 TeV, covering declination from $-$20$^{\circ}$ to 80$^{\circ}$. In total, the catalog contains 90 sources with an extended size smaller than $2^\circ$ and a significance of detection at $> 5σ$. Based on our source association criteria, 32 new TeV sources are proposed in this study. Among the 90 sources, 43 sources are detected with ultra-high energy ($E > 100$ TeV) emission at $> 4σ$ significance level. We provide the position, extension, and spectral characteristics of all the sources in this catalog.

Submitted 27 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: 40 pages, 13 figures, 4 tables

Journal ref: The Astrophysical Journal Supplement Series, 271 (2024) 25

arXiv:2305.15909 [pdf, other]

Camera-Incremental Object Re-Identification with Identity Knowledge Evolution

Authors: Hantao Yao, Lu Yu, Jifei Luo, Changsheng Xu

Abstract: Object Re-identification (ReID) aims to retrieve the probe object from many gallery images with the ReID model inferred based on a stationary camera-free dataset by associating and collecting the identities across all camera views. When deploying the ReID algorithm in real-world scenarios, the aspect of storage, privacy constraints, and dynamic changes of cameras would degrade its generalizability and applicability. Treating each camera's data independently, we introduce a novel ReID task named Camera-Incremental Object Re-identification (CIOR) by continually optimizing the ReID model from the incoming stream of the camera dataset. Since the identities under different camera views might describe the same object, associating and distilling the knowledge of common identities would boost the discrimination and benefit from alleviating the catastrophic forgetting. In this paper, we propose a novel Identity Knowledge Evolution (IKE) framework for CIOR, consisting of the Identity Knowledge Association (IKA), Identity Knowledge Distillation (IKD), and Identity Knowledge Update (IKU). IKA is proposed to discover the common identities between the current identity and historical identities. IKD is applied to distill historical identity knowledge from common identities and quickly adapt the historical model to the current camera view. After each camera has been trained, IKU is applied to continually expand the identity knowledge by combining the historical and current identity memories. Evaluation on Market-CL and Veri-CL shows the effectiveness of Identity Knowledge Evolution (IKE) for CIOR. Code: https://github.com/htyao89/Camera-Incremental-Object-ReID

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.14975 [pdf, other]

Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback

Authors: Katherine Tian, Eric Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning

Abstract: A trustworthy real-world prediction system should produce well-calibrated confidence scores; that is, its confidence in an answer should be indicative of the likelihood that the answer is correct, enabling deferral to an expert in cases of low-confidence predictions. Recent studies have shown that unsupervised pre-training produces large language models (LMs) whose conditional probabilities are remarkably well-calibrated. However, the most widely-used LMs are fine-tuned with reinforcement learning from human feedback (RLHF-LMs), and some studies have suggested that RLHF-LMs produce conditional probabilities that are very poorly calibrated. In light of this perceived weakness, we conduct a broad evaluation of methods for extracting confidence scores from RLHF-LMs. For RLHF-LMs such as ChatGPT, GPT-4, and Claude, we find that verbalized confidences emitted as output tokens are typically better-calibrated than the model's conditional probabilities on the TriviaQA, SciQ, and TruthfulQA benchmarks, often reducing the expected calibration error by a relative 50%.

Submitted 24 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: EMNLP 2023 Camera Ready
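
Note on the metric: the abstract above reports results in terms of expected calibration error (ECE). As a rough illustration only, the sketch below computes the standard equal-width binned ECE from a list of confidence scores and correctness flags. The function name, the choice of 10 bins, and the left-closed binning are assumptions made for this example; the abstract does not specify the binning, and this is not taken from the paper's code.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - mean confidence| per bin.

    Hypothetical helper for illustration; bins are equal-width and
    left-closed, with a confidence of exactly 1.0 placed in the last bin.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Group (confidence, correctness) pairs by confidence bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1.0 if h else 0.0 for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Four answers, each given with 90% confidence, three of them correct:
    # accuracy is 0.75, so the calibration gap is about 0.15.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

Under this definition, a "relative 50%" reduction would mean, for example, an ECE falling from 0.15 to roughly 0.075.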

arXiv:2305.08382 [pdf, ps, other]

Systematic study of fusion barriers with energy dependent barrier radius

Authors: Yeruoxi Chen, Hong Yao, Min Liu, Junlong Tian, Peiwei Wen, Ning Wang

Abstract: Considering energy dependence of the barrier radius in heavy-ion fusion reactions, a modified Siwek-Wilczyński (MSW) fusion cross section formula is proposed. With the MSW formula, the fusion barrier parameters for 367 reaction systems are systematically extracted, based on 443 datasets of measured cross sections. We find that the fusion excitation functions for about $60\%$ of the reaction systems can be better described by introducing the energy dependence of the barrier radius which is due to the dynamical effects at energies near and below the barrier. Considering both the influence of the geometry radii and that of the reduced de Broglie wavelength of the colliding nuclei, the barrier heights are well reproduced with only one model parameter. The extracted barrier radius parameters linearly decrease with the effective fissility parameter, and the width of the barrier distribution relates to the barrier height as well as the reduced de Broglie wavelength at energies around the Coulomb barrier.

Submitted 29 May, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

Comments: 9 figures, 1 table, accepted for publication in Atomic Data and Nuclear Data Tables

arXiv:2305.05372 [pdf, other]

doi 10.1103/PhysRevLett.131.151001

Measurement of ultra-high-energy diffuse gamma-ray emission of the Galactic plane from 10 TeV to 1 PeV with LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: The diffuse Galactic $γ$-ray emission, mainly produced via interactions between cosmic rays and the interstellar medium and/or radiation field, is a very important probe of the distribution, propagation, and interaction of cosmic rays in the Milky Way. In this work we report the measurements of diffuse $γ$-rays from the Galactic plane between 10 TeV and 1 PeV energies, with the square kilometer array of the Large High Altitude Air Shower Observatory (LHAASO). Diffuse emissions from the inner ($15^{\circ}<l<125^{\circ}$, $|b|<5^{\circ}$) and outer ($125^{\circ}<l<235^{\circ}$, $|b|<5^{\circ}$) Galactic plane are detected with $29.1σ$ and $12.7σ$ significance, respectively. The outer Galactic plane diffuse emission is detected for the first time in the very- to ultra-high-energy domain ($E>10$ TeV). The energy spectrum in the inner Galaxy regions can be described by a power-law function with an index of $-2.99\pm0.04$, which is different from the curved spectrum as expected from hadronic interactions between locally measured cosmic rays and the line-of-sight integrated gas content. Furthermore, the measured flux is higher by a factor of $\sim3$ than the prediction. A similar spectrum with an index of $-2.99\pm0.07$ is found in the outer Galaxy region, and the absolute flux for $10\lesssim E\lesssim60$ TeV is again higher than the prediction for hadronic cosmic ray interactions. The latitude distributions of the diffuse emission are consistent with the gas distribution, while the longitude distributions show clear deviation from the gas distribution. The LHAASO measurements imply that either additional emission sources exist or cosmic ray intensities have spatial variations.

Submitted 19 August, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

Comments: 12 pages, 8 figures, 5 tables; accepted for publication in Physical Review Letters; source mask file provided as ancillary file

Journal ref: Phys. Rev. Lett. 131, 151001 (2023)

arXiv:2304.04214 [pdf, other]

Synthetic turbulence generator for the wall-modeled LES lattice Boltzmann method

Authors: Xiao Xue, Hua-Dong Yao, Lars Davidson

Abstract: The synthetic turbulence generator (STG) lies at the interface of the Reynolds averaged Navier-Stokes (RANS) simulation and large eddy simulation (LES). This paper presents an STG for the multiple-relaxation-time (MRT) lattice Boltzmann method (LBM) framework at high friction Reynolds numbers, with consideration of near wall modeling. The Reichardt wall law, in combination with a force-based method, is used to model the near wall field. The STG wall-modeled (STG-WM) LES results are compared with turbulent channel flow simulations at $Re_τ=1000,2000,5200$ at different resolutions. The results demonstrate good agreement with DNS, with an adaptation length of 6 to 8 boundary layer thicknesses. This method has a wide range of potential uses in hybrid RANS/LES-LBM related applications at high friction Reynolds numbers.

Submitted 9 April, 2023; originally announced April 2023.

arXiv:2304.03935 [pdf, other]

Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks

Authors: Yuzhen Mao, Zhun Deng, Huaxiu Yao, Ting Ye, Kenji Kawaguchi, James Zou

Abstract: As machine learning has been deployed ubiquitously across applications in modern data science, algorithmic fairness has become a great concern. Among them, imposing fairness constraints during learning, i.e. in-processing fair training, has been a popular type of training method because they don't require accessing sensitive attributes during test time in contrast to post-processing methods. While this has been extensively studied in classical machine learning models, their impact on deep neural networks remains unclear. Recent research has shown that adding fairness constraints to the objective function leads to severe over-fitting to fairness criteria in large models, and how to solve this challenge is an important open question. To tackle this, we leverage the wisdom and power of pre-training and fine-tuning and develop a simple but novel framework to train fair neural networks in an efficient and inexpensive way -- last-layer fine-tuning alone can effectively promote fairness in deep neural networks. This framework offers valuable insights into representation learning for training fair neural networks.

Submitted 14 July, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

Comments: Published at the ICML 2023 Workshop on Spurious Correlations, Invariance, and Stability

arXiv:2303.17882 [pdf, other]

Visual Anomaly Detection via Dual-Attention Transformer and Discriminative Flow

Authors: Haiming Yao, Wei Luo, Wenyong Yu

Abstract: In this paper, we introduce the novel state-of-the-art Dual-attention Transformer and Discriminative Flow (DADF) framework for visual anomaly detection. Based on only normal knowledge, visual anomaly detection has wide applications in industrial scenarios and has attracted significant attention. However, most existing methods fail to meet the requirements. In contrast, the proposed DADF presents a new paradigm: it firstly leverages a pre-trained network to acquire multi-scale prior embeddings, followed by the development of a vision Transformer with dual attention mechanisms, namely self-attention and memorial-attention, to achieve two-level reconstruction for prior embeddings with the sequential and normality association. Additionally, we propose using normalizing flow to establish discriminative likelihood for the joint distribution of prior and reconstructions at each scale. The DADF achieves 98.3/98.4 of image/pixel AUROC on Mvtec AD; 83.7 of image AUROC and 67.4 of pixel sPRO on Mvtec LOCO AD benchmarks, demonstrating the effectiveness of our proposed approach.

Submitted 31 March, 2023; originally announced March 2023.

Comments: Submission to IEEE Transactions On Industrial Informatics

arXiv:2303.15978 [pdf, other]

doi 10.1103/PhysRevE.108.024139

Coined quantum walks on the line: Disorder, entanglement, and localization

Authors: Louie Hong Yao, Sascha Wald

Abstract: Disorder in coined quantum walks generally leads to localization. We investigate the influence of the localization on the entanglement properties of coined quantum walks. Specifically, we consider quantum walks on the line and explore the effects of quenched disorder in the coin operations. After confirming that our choice of disorder localizes the walker, we study how the localization affects the properties of the coined quantum walk. We find that the mixing properties of the walk are altered nontrivially with mixing being improved at short time scales. Special focus is given to the influence of coin disorder on the properties of the quantum state and the coin-walker entanglement. We find that disorder alters the quantum state significantly even when the walker probability distribution is still close to the nondisordered case. We observe that, generically, coin disorder decreases the coin-walker entanglement and that the localization leaves distinct traces in the entanglement entropy and the entanglement negativity of the coined quantum walk.

Submitted 29 August, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

Comments: 12 pages, 11 figures, similar to published version

Journal ref: Phys. Rev. E 108, 024139 (2023)

arXiv:2303.13868 [pdf, other]

Physically Adversarial Infrared Patches with Learnable Shapes and Locations

Authors: Wei Xingxing, Yu Jie, Huang Yao

Abstract: Owing to the extensive application of infrared object detectors in safety-critical tasks, it is necessary to evaluate their robustness against adversarial examples in the real world. However, the few existing physical infrared attacks are complicated to implement in practical application because of their complex transformation from the digital world to the physical world. To address this issue, in this paper, we propose a physically feasible infrared attack method called "adversarial infrared patches". Considering the imaging mechanism of infrared cameras by capturing objects' thermal radiation, adversarial infrared patches conduct attacks by attaching a patch of thermal insulation materials on the target object to manipulate its thermal distribution. To enhance adversarial attacks, we present a novel aggregation regularization to guide the simultaneous learning for the patch's shape and location on the target object. Thus, a simple gradient-based optimization can be adapted to solve for them. We verify adversarial infrared patches in different object detection tasks with various object detectors. Experimental results show that our method achieves more than 90% Attack Success Rate (ASR) versus the pedestrian detector and vehicle detector in the physical environment, where the objects are captured in different angles, distances, postures, and scenes. More importantly, adversarial infrared patch is easy to implement, and it only needs 0.5 hours to be constructed in the physical world, which verifies its effectiveness and efficiency.

Submitted 24 March, 2023; originally announced March 2023.

Comments: accepted by CVPR2023

arXiv:2303.13283 [pdf, other]

Visual-Language Prompt Tuning with Knowledge-guided Context Optimization

Authors: Hantao Yao, Rui Zhang, Changsheng Xu

Abstract: Prompt tuning is an effective way to adapt the pre-trained visual-language model (VLM) to the downstream task using task-related textual tokens. Representative CoOp-based work combines the learnable textual tokens with the class tokens to obtain specific textual knowledge. However, the specific textual knowledge generalizes worse to the unseen classes because it forgets the essential general textual knowledge having a strong generalization ability. To tackle this issue, we introduce a novel Knowledge-guided Context Optimization (KgCoOp) to enhance the generalization ability of the learnable prompt for unseen classes. The key insight of KgCoOp is that forgetting about essential knowledge can be alleviated by reducing the discrepancy between the learnable prompt and the hand-crafted prompt. Especially, KgCoOp minimizes the discrepancy between the textual embeddings generated by learned prompts and the hand-crafted prompts. Finally, adding the KgCoOp upon the contrastive loss can make a discriminative prompt for both seen and unseen tasks. Extensive evaluation of several benchmarks demonstrates that the proposed Knowledge-guided Context Optimization is an efficient method for prompt tuning, i.e., it achieves better performance with less training time.

Submitted 23 March, 2023; originally announced March 2023.

Comments: accepted by CVPR23

arXiv:2303.12419 [pdf, other]

BiCro: Noisy Correspondence Rectification for Multi-modality Data via Bi-directional Cross-modal Similarity Consistency

Authors: Shuo Yang, Zhaopan Xu, Kai Wang, Yang You, Hongxun Yao, Tongliang Liu, Min Xu

Abstract: As one of the most fundamental techniques in multimodal learning, cross-modal matching aims to project various sensory modalities into a shared feature space. To achieve this, massive and correctly aligned data pairs are required for model training. However, unlike unimodal datasets, multimodal datasets are much harder to collect and annotate precisely. As an alternative, the co-occurred data pairs (e.g., image-text pairs) collected from the Internet have been widely exploited in the area. Unfortunately, the cheaply collected dataset unavoidably contains many mismatched data pairs, which have been proven to be harmful to the model's performance. To address this, we propose a general framework called BiCro (Bidirectional Cross-modal similarity consistency), which can be easily integrated into existing cross-modal matching models and improve their robustness against noisy data. Specifically, BiCro aims to estimate soft labels for noisy data pairs to reflect their true correspondence degree. The basic idea of BiCro is motivated by that -- taking image-text matching as an example -- similar images should have similar textual descriptions and vice versa. Then the consistency of these two similarities can be recast as the estimated soft labels to train the matching model. The experiments on three popular cross-modal matching datasets demonstrate that our method significantly improves the noise-robustness of various matching models, and surpasses the state-of-the-art by a clear margin.

Submitted 8 June, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: CVPR 2023

arXiv:2303.08713 [pdf, other]

doi 10.1088/1751-8121/acd0e4

Perturbative Field-Theoretical Analysis of Three-Species Cyclic Predator-Prey Models

Authors: Louie Hong Yao, Mohamed Swailem, Ulrich Dobramysl, Uwe C. Täuber

Abstract: We apply a perturbative Doi--Peliti field-theoretical analysis to the stochastic spatially extended symmetric Rock-Paper-Scissors (RPS) and May--Leonard (ML) models, in which three species compete cyclically. Compared to the two-species Lotka--Volterra predator-prey (LV) model, according to numerical simulations, these cyclical models appear to be less affected by intrinsic stochastic fluctuations. Indeed, we demonstrate that the qualitative features of the ML model are insensitive to intrinsic reaction noise. In contrast, and although not yet observed in numerical simulations, we find that the RPS model acquires significant fluctuation-induced renormalizations in the perturbative regime, similar to the LV model. We also study the formation of spatio-temporal structures in the framework of stability analysis and provide a clearcut explanation for the absence of spatial patterns in the RPS system, whereas the spontaneous emergence of spatio-temporal structures features prominently in the LV and the ML models.

Submitted 11 May, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 45 pages, 13 figures

Journal ref: J. Phys. A: Math Theor. 56 (2023) 225001

arXiv:2303.08154 [pdf, other]

doi 10.1103/PhysRevResearch.5.L032040

Training variational quantum algorithms with random gate activation

Authors: Shuo Liu, Shi-Xin Zhang, Shao-Kai Jian, Hong Yao

Abstract: Variational quantum algorithms (VQAs) hold great potentials for near-term applications and are promising to achieve quantum advantage on practical tasks. However, VQAs suffer from severe barren plateau problem as well as have a large probability of being trapped in local minima. In this Letter, we propose a novel training algorithm with random quantum gate activation for VQAs to efficiently address these two issues. This new algorithm processes effectively much fewer training parameters than the conventional plain optimization strategy, which efficiently mitigates barren plateaus with the same expressive capability. Additionally, by randomly adding two-qubit gates to the circuit ansatz, the optimization trajectories can escape from local minima and reach the global minimum more frequently due to more sources of randomness. In real quantum experiments, the new training algorithm can also reduce the quantum computational resources required and be more quantum noise resilient. We apply our training algorithm to solve variational quantum simulation problems for ground states and present convincing results that showcase the advantages of our novel strategy where better performance is achieved by the combination of mitigating barren plateaus, escaping from local minima, and reducing the effect of quantum noises. We further propose that the entanglement phase transition could be one underlying reason why our RA training is so effective.

Submitted 14 March, 2023; originally announced March 2023.

Comments: 4.5 pages + references + supplemental, 4 figures

Journal ref: Phys. Rev. Research 5, L032040 (2023)

arXiv:2303.05768 [pdf, other]

Learning Global-Local Correspondence with Semantic Bottleneck for Logical Anomaly Detection

Authors: Haiming Yao, Wenyong Yu, Wei Luo, Zhenfeng Qiang, Donghao Luo, Xiaotian Zhang

Abstract: This paper presents a novel framework, named Global-Local Correspondence Framework (GLCF), for visual anomaly detection with logical constraints. Visual anomaly detection has become an active research area in various real-world applications, such as industrial anomaly detection and medical disease diagnosis. However, most existing methods focus on identifying local structural degeneration anomalies and often fail to detect high-level functional anomalies that involve logical constraints. To address this issue, we propose a two-branch approach that consists of a local branch for detecting structural anomalies and a global branch for detecting logical anomalies. To facilitate local-global feature correspondence, we introduce a novel semantic bottleneck enabled by the visual Transformer. Moreover, we develop feature estimation networks for each branch separately to detect anomalies. Our proposed framework is validated using various benchmarks, including industrial datasets, Mvtec AD, Mvtec Loco AD, and the Retinal-OCT medical dataset. Experimental results show that our method outperforms existing methods, particularly in detecting logical anomalies.

Submitted 28 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

Comments: Submission to IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

arXiv:2303.05172 [pdf, other]

doi 10.1016/j.nima.2023.168680

The JUNO experiment Top Tracker

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Muhammad Akram, Abid Aleem, Tsagkarakis Alexandros, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, Burin Asavapibhop, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, et al. (592 additional authors not shown)

Abstract: The main task of the Top Tracker detector of the neutrino reactor experiment Jiangmen Underground Neutrino Observatory (JUNO) is to reconstruct and extrapolate atmospheric muon tracks down to the central detector. This muon tracker will help to evaluate the contribution of the cosmogenic background to the signal. The Top Tracker is located above JUNO's water Cherenkov Detector and Central Detector, covering about 60% of the surface above them. The JUNO Top Tracker is constituted by the decommissioned OPERA experiment Target Tracker modules. The technology used consists in walls of two planes of plastic scintillator strips, one per transverse direction. Wavelength shifting fibres collect the light signal emitted by the scintillator strips and guide it to both ends where it is read by multianode photomultiplier tubes. Compared to the OPERA Target Tracker, the JUNO Top Tracker uses new electronics able to cope with the high rate produced by the high rock radioactivity compared to the one in Gran Sasso underground laboratory. This paper will present the new electronics and mechanical structure developed for the Top Tracker of JUNO along with its expected performance based on the current detector simulation.

Submitted 9 March, 2023; originally announced March 2023.

Comments: 20 pages

Journal ref: Nucl.Instrum.Meth.A 1057 (2023) 168680

Showing 101–150 of 611 results for author: Ya, H