-
Electrical Impedance Tomography Based Closed-loop Tumor Treating Fields in Dynamic Lung Tumors
Authors:
Minmin Wang,
Xu Xie,
Yuxi Guo,
Liying Zhu,
Yue Lan,
Haitang Yang,
Yun Pan,
Guangdi Chen,
Shaomin Zhang,
Maomao Zhang
Abstract:
Tumor Treating Fields (TTFields) is a non-invasive anticancer modality that utilizes alternating electric fields to disrupt cancer cell division and growth. While generally well-tolerated with minimal side effects, traditional TTFields therapy for lung tumors faces challenges due to the influence of respiratory motion. We design a novel closed-loop TTFields strategy for lung tumors by incorporating electrical impedance tomography (EIT) for real-time respiratory phase monitoring and dynamic parameter adjustments. Furthermore, we conduct theoretical analysis to evaluate the performance of the proposed method using the lung motion model. Under conventional TTFields settings, we observed that variations in the electrical conductivity of the lung across respiratory phases led to a decrease in the average electric field intensity within lung tumors, from 1.08 V/cm at the end-expiratory phase to 0.87 V/cm at the end-inspiratory phase. Using our proposed closed-loop TTFields approach at the same dose setting (2400 mA, consistent with the traditional TTFields setting), we can achieve a higher and consistent average electric field strength at the tumor site (1.30 V/cm) across different respiratory stages. Our proposed closed-loop TTFields method has the potential to improve lung tumor therapy by mitigating the impact of respiratory motion.
Submitted 9 July, 2024;
originally announced July 2024.
-
Parallel fast random bit generation based on spectrotemporally uncorrelated Brillouin random fiber lasing oscillation
Authors:
Yuxi Pang,
Shaonian Ma,
Qiang Ji,
Xian Zhao,
Zengguang Qin,
Zhaojun Liu,
Ping Lu,
Xiaoyi Bao,
Yanping Xu
Abstract:
Correlations existing between spectral components in multi-wavelength lasers have been the key challenge that hinders these laser sources from being developed to chaotic comb entropy sources for parallel random bit generation. Herein, spectrotemporally uncorrelated multi-order Stokes/anti-Stokes emissions are achieved by cooperatively exploiting nonlinear optical processes including cascaded stimulated Brillouin scattering and quasi-phase-matched four-wave mixing in a Brillouin random fiber laser. Chaotic instabilities induced by random mode resonance are enhanced and disorderly redistributed among different lasing lines through complex nonlinear optical interactions, which comprehensively releases the inherent correlation among multiple Stokes/anti-Stokes emission lines, realizing a chaotic frequency comb with multiple spectrotemporally uncorrelated channels. Parallel fast random bit generation is fulfilled with 31 channels, single-channel bit rate of 35-Gbps and total bit rate of 1.085-Tbps. National Institute of Standards and Technology statistic tests verify the randomness of generated bit streams. This work, in a simple and efficient way, breaks the correlation barrier for utilizing multi-wavelength laser to achieve high-quality spectrotemporally uncorrelated chaotic laser source, opening new avenues for achieving greatly accelerated random bit generation through parallelization and potentially revolutionizing the current architecture of secure communication and high-performance computation.
Submitted 3 July, 2024;
originally announced July 2024.
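The total rate quoted in the abstract above follows from the per-channel figures by simple multiplication. A minimal check, with variable names chosen here purely for illustration:

```python
# Figures are taken from the abstract; the variable names are illustrative.
channels = 31
rate_per_channel_gbps = 35

total_gbps = channels * rate_per_channel_gbps
total_tbps = total_gbps / 1000  # 1 Tbps = 1000 Gbps

print(f"{total_gbps} Gbps = {total_tbps} Tbps")
```

This only confirms that the reported numbers are mutually consistent; it says nothing about how the bits are extracted from each channel.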
-
SlerpFace: Face Template Protection via Spherical Linear Interpolation
Authors:
Zhizhou Zhong,
Yuxi Mi,
Yuge Huang,
Jianqing Xu,
Guodong Mu,
Shouhong Ding,
Jingyun Zhang,
Rizen Guo,
Yunsheng Wu,
Shuigeng Zhou
Abstract:
Contemporary face recognition systems use feature templates extracted from face images to identify persons. To enhance privacy, face template protection techniques are widely employed to conceal sensitive identity and appearance information stored in the template. This paper identifies an emerging privacy attack form utilizing diffusion models that could nullify prior protection, referred to as inversion attacks. The attack can synthesize high-quality, identity-preserving face images from templates, revealing persons' appearance. Based on studies of the diffusion model's generative capability, this paper proposes a defense to deteriorate the attack, by rotating templates to a noise-like distribution. This is achieved efficiently by spherically and linearly interpolating templates, or slerp, on their located hypersphere. This paper further proposes to group-wisely divide and drop out templates' feature dimensions, to enhance the irreversibility of rotated templates. The division of groups and dropouts within each group are learned in a recognition-favored way. The proposed techniques are concretized as a novel face template protection technique, SlerpFace. Extensive experiments show that SlerpFace provides satisfactory recognition accuracy and comprehensive privacy protection against inversion and other attack forms, superior to prior arts.
Submitted 3 July, 2024;
originally announced July 2024.
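For readers unfamiliar with the slerp operation named in the SlerpFace abstract, the textbook formula can be sketched as follows. This is the generic spherical linear interpolation between two unit vectors, not the authors' implementation; the function name, the example vectors, and the handling of degenerate inputs are assumptions made for illustration.

```python
import math

def slerp(a, b, t):
    """Spherical linear interpolation between unit vectors a and b, 0 <= t <= 1."""
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))    # guard against rounding outside [-1, 1]
    omega = math.acos(dot)            # angle between the two vectors
    s = math.sin(omega)
    if s < 1e-8:
        if dot > 0:                   # nearly identical vectors: nothing to interpolate
            return list(a)
        raise ValueError("antipodal vectors: the interpolation path is not unique")
    wa = math.sin((1.0 - t) * omega) / s
    wb = math.sin(t * omega) / s
    return [wa * x + wb * y for x, y in zip(a, b)]

# Made-up 3-D stand-ins for a face template and a rotation target.
template = [1.0, 0.0, 0.0]
target = [0.0, 1.0, 0.0]
mid = slerp(template, target, 0.5)
mid_norm = math.sqrt(sum(x * x for x in mid))
print(mid, mid_norm)
```

Unlike plain linear interpolation, the result stays on the unit hypersphere for every value of t, which is the property the abstract relies on when it speaks of rotating templates on their hypersphere.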
-
Supersolid Phase in the Diluted Holstein Model
Authors:
Jingyao Meng,
Yuxi Zhang,
Rafael M. Fernandes,
Tianxing Ma,
R. T. Scalettar
Abstract:
The Holstein model on a square lattice at half-filling has a well-established finite temperature phase transition to an insulating state with long range charge density wave (CDW) order. Because this CDW formation suppresses pairing, a superconducting (SC) phase emerges only with doping. In this work, we study the effects of dilution of the local phonon degrees of freedom in the Holstein model while keeping the system at half-filling. We find not only that the CDW remains present up to a dilution fraction $f \sim 0.15$, but also that long range pairing is stabilized with increasing $f$, resulting in a {\it supersolid} regime centered at $f \approx 0.10$, where long range diagonal and off-diagonal correlations coexist. Further dilution results in a purely SC phase, and ultimately in a normal metal. Our results provide a new route to the supersolid phase via the introduction of impurities at fixed positions which both increase quantum fluctuations and are immune to the competing tendency to phase separation often observed in the doped case.
Submitted 2 July, 2024;
originally announced July 2024.
-
Instance Temperature Knowledge Distillation
Authors:
Zhengbo Zhang,
Yuxi Zhou,
Jia Gong,
Jun Liu,
Zhigang Tu
Abstract:
Knowledge distillation (KD) enhances the performance of a student network by allowing it to learn the knowledge transferred from a teacher network incrementally. Existing methods dynamically adjust the temperature to enable the student network to adapt to the varying learning difficulties at different learning stages of KD. KD is a continuous process, but when adjusting the temperature, these methods consider only the immediate benefits of the operation in the current learning phase and fail to take into account its future returns. To address this issue, we formulate the adjustment of temperature as a sequential decision-making task and propose a method based on reinforcement learning, termed RLKD. Importantly, we design a novel state representation to enable the agent to take more informed actions (i.e., instance temperature adjustment). To handle the problem of delayed rewards in our method due to the KD setting, we explore an instance reward calibration approach. In addition, we devise an efficient exploration strategy that enables the agent to learn a valuable instance temperature adjustment policy more efficiently. Our framework can serve as a plug-and-play technique to be inserted into various KD methods easily, and we validate its effectiveness on both image classification and object detection tasks. Our project is at https://www.zayx.me/ITKD.github.io/.
Submitted 7 July, 2024; v1 submitted 27 June, 2024;
originally announced July 2024.
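The "temperature" that the abstract above adjusts is, in standard knowledge distillation, the divisor applied to logits before the softmax. The sketch below shows only that standard building block; it does not reproduce the reinforcement-learning agent of RLKD, and the function name and logit values are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature; temperature must be positive."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

teacher_logits = [4.0, 1.0, 0.5]          # hypothetical values
sharp = softmax_with_temperature(teacher_logits, 1.0)
soft = softmax_with_temperature(teacher_logits, 4.0)
print(sharp)
print(soft)
```

A higher temperature flattens the distribution, so the student sees more of the teacher's relative preferences among non-target classes. A per-instance scheme, as the abstract describes, would choose a different temperature for each training sample rather than one global value.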
-
Global solutions and uniform convergence stability for compressible Navier-Stokes equations with Oldroyd-type constitutive law
Authors:
Sébastien Boyaval,
Yuxi Hu,
Na Wang
Abstract:
We consider one-dimensional isentropic compressible Navier-Stokes equations with Oldroyd-type constitutive law. By establishing uniform a priori estimates (with respect to relaxation time), we show global existence of smooth solutions with small initial data. Moreover, we get global-in-time convergence of the system towards the classical isentropic compressible Navier-Stokes equations.
Submitted 21 June, 2024;
originally announced June 2024.
-
Solving Co-Path/Cycle Packing Faster than $3^k$
Authors:
Yuxi Liu,
Mingyu Xiao
Abstract:
The \textsc{Co-Path/Cycle Packing} problem asks whether we can delete at most $k$ vertices from the input graph such that the remaining graph is a collection of induced paths and cycles. \textsc{Co-Path/Cycle Packing} is a fundamental graph problem that has important applications in bioinformatics. Although this problem has been extensively studied in parameterized algorithms, it seems hard to break the running time bound $3^k$. In 2015, Feng et al. provided an $O^*(3^k)$-time randomized algorithm. Recently, Tsur showed that this problem can be solved in $O^*(3^k)$ time deterministically. In this paper, by combining several techniques such as path decomposition, dynamic programming, and branch-and-search methods, we show that \textsc{Co-Path/Cycle Packing} can be solved in $O^*(2.8192^k)$ time. As a by-product, we also show that the \textsc{$d$-Bounded-Degree Vertex Deletion} problem, a generalization of \textsc{Co-Path/Cycle Packing}, can be solved in $O^*((d + 2)^p)$ time if a path decomposition of width $p$ is given, which implies that \textsc{$d$-Bounded-Degree Vertex Deletion} is FPT with parameter $p+d$.
Submitted 16 June, 2024;
originally announced June 2024.
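To make the problem statement in the abstract above concrete: a simple graph is a disjoint collection of paths and cycles exactly when every vertex has degree at most 2, so a candidate deletion set can be verified by a degree check. The brute-force search below is only an illustration of the problem definition on tiny graphs; it is not the $O^*(2.8192^k)$ algorithm from the paper, and all function names are hypothetical.

```python
from itertools import combinations

def is_paths_and_cycles(adj, deleted):
    """True if every remaining vertex has at most 2 remaining neighbours."""
    for v, nbrs in adj.items():
        if v in deleted:
            continue
        if sum(1 for u in nbrs if u not in deleted) > 2:
            return False
    return True

def brute_force_co_path_cycle(adj, k):
    """Try every vertex subset of size <= k; exponential, for tiny graphs only."""
    vertices = list(adj)
    for size in range(k + 1):
        for subset in combinations(vertices, size):
            if is_paths_and_cycles(adj, set(subset)):
                return set(subset)
    return None

# A star with centre 0 and four leaves: deleting the centre suffices.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(brute_force_co_path_cycle(star, 1))
print(brute_force_co_path_cycle(star, 0))
```

The degree bound of 2 is the special case $d = 2$ of the $d$-Bounded-Degree Vertex Deletion problem mentioned in the abstract.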
-
DeTriever: Decoder-representation-based Retriever for Improving NL2SQL In-Context Learning
Authors:
Yuxi Feng,
Raymond Li,
Zhenan Fan,
Giuseppe Carenini,
Mohammadreza Pourreza,
Weiwei Zhang,
Yong Zhang
Abstract:
While in-context Learning (ICL) has proven to be an effective technique to improve the performance of Large Language Models (LLMs) in a variety of complex tasks, notably in translating natural language questions into Structured Query Language (NL2SQL), the question of how to select the most beneficial demonstration examples remains an open research problem. While prior works often adapted off-the-shelf encoders to retrieve examples dynamically, an inherent discrepancy exists in the representational capacities between the external retrievers and the LLMs. Further, optimizing the selection of examples is a non-trivial task, since there are no straightforward methods to assess the relative benefits of examples without performing pairwise inference. To address these shortcomings, we propose DeTriever, a novel demonstration retrieval framework that learns a weighted combination of LLM hidden states, where rich semantic information is encoded. To train the model, we propose a proxy score that estimates the relative benefits of examples based on the similarities between output queries. Experiments on two popular NL2SQL benchmarks demonstrate that our method significantly outperforms the state-of-the-art baselines on one-shot NL2SQL tasks.
Submitted 12 June, 2024;
originally announced June 2024.
-
Simulation Models for Exploring Magnetic Reconnection
Authors:
Michael Shay,
Subash Adhikari,
Naoki Beesho,
Joachim Birn,
Jorg Buechner,
Paul Cassak,
Li-Jen Chen,
Yuxi Chen,
Giulia Cozzani,
Jim Drake,
Fan Guo,
Michael Hesse,
Neeraj Jain,
Yann Pfau-Kempf,
Yu Lin,
Yi-Hsin Liu,
Mitsuo Oka,
Yuri A. Omelchenko,
Minna Palmroth,
Oreste Pezzi,
Patricia H. Reiff,
Marc Swisdak,
Frank Toffoletto,
Gabor Toth,
Richard A. Wolf
Abstract:
Simulations have played a critical role in the advancement of our knowledge of magnetic reconnection. However, due to the inherently multiscale nature of reconnection, it is impossible to simulate all physics at all scales. For this reason, a wide range of simulation methods have been crafted to study particular aspects and consequences of magnetic reconnection. This chapter reviews many of these methods, laying out critical assumptions, numerical techniques, and giving examples of scientific results. Plasma models described include magnetohydrodynamics (MHD), Hall MHD, Hybrid, kinetic particle-in-cell (PIC), kinetic Vlasov, Fluid models with embedded PIC, Fluid models with direct feedback from energetic populations, and the Rice Convection Model (RCM).
Submitted 9 June, 2024;
originally announced June 2024.
-
Phonon heat conduction across slippery interfaces in twisted graphite
Authors:
Fuwei Yang,
Wenjiang Zhou,
Zhibin Zhang,
Xuanyu Huang,
Jingwen Zhang,
Nianjie Liang,
Wujuan Yan,
Yuxi Wang,
Mingchao Ding,
Quanlin Guo,
Yu Han,
Te-Huan Liu,
Kaihui Liu,
Quanshui Zheng,
Bai Song
Abstract:
Interlayer rotation in van der Waals (vdW) materials offers great potential for manipulating phonon dynamics and heat flow in advanced electronics with ever higher compactness and power density. However, despite extensive theoretical efforts in recent years, experimental measurements remain scarce, especially due to the critical challenges of preparing single-crystalline twisted interfaces and probing interfacial thermal transport with sufficient resolution. Here, we exploited the intrinsic twisted interfaces in highly oriented pyrolytic graphite (HOPG). By developing novel experimental schemes based on microfabricated mesas, we managed to achieve simultaneous mechanical characterizations and thermal measurements. In particular, we pushed the HOPG mesas with a microprobe to identify and rotate single-crystalline intrinsic interfaces owing to their slippery nature, as is well known in structural superlubricity. Remarkably, we observed over 30-fold suppression of thermal conductance for the slippery interfaces by using epitaxial graphite as a control. Nonetheless, the interfacial conductance remains around 600 $\mathrm{MWm^{-2}K^{-1}}$, which surpasses the highest values for artificially stacked vdW structures by more than five times. Further, atomic simulations revealed the predominant role of the transverse acoustic phonons. Together, our findings highlight a general physical picture that directly correlates interfacial thermal transport with sliding resistance, and lay the foundation for twist-enabled thermal management, which is particularly beneficial to twistronics and slidetronics.
Submitted 6 June, 2024;
originally announced June 2024.
-
Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption
Authors:
Anqi Li,
Yuxi Liu,
Huihui Bai,
Feng Li,
Runmin Cong,
Meng Wang,
Yao Zhao
Abstract:
Although recent generative image compression methods have demonstrated impressive potential in optimizing the rate-distortion-perception trade-off, they still face the critical challenge of flexible rate adaption to diverse compression necessities and scenarios. To overcome this challenge, this paper proposes a Controllable Generative Image Compression framework, Control-GIC, the first capable of fine-grained bitrate adaption across a broad spectrum while ensuring high-fidelity and generality compression. We base Control-GIC on a VQGAN framework representing an image as a sequence of variable-length codes (i.e. VQ-indices), which can be losslessly compressed and exhibits a direct positive correlation with the bitrates. Therefore, drawing inspiration from the classical coding principle, we naturally correlate the information density of local image patches with their granular representations, to achieve dynamic adjustment of the code quantity following different granularity decisions. This implies we can flexibly determine a proper allocation of granularity for the patches to acquire desirable compression rates. We further develop a probabilistic conditional decoder that can trace back to historic encoded multi-granularity representations according to transmitted codes, and then reconstruct hierarchical granular features in the formalization of conditional probability, enabling more informative aggregation to improve reconstruction realism. Our experiments show that Control-GIC allows highly flexible and controllable bitrate adaption and even once compression on an entire dataset to fulfill constrained bitrate conditions. Experimental results demonstrate its superior performance over recent state-of-the-art methods.
Submitted 5 June, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
SelfGNN: Self-Supervised Graph Neural Networks for Sequential Recommendation
Authors:
Yuxi Liu,
Lianghao Xia,
Chao Huang
Abstract:
Sequential recommendation effectively addresses information overload by modeling users' temporal and sequential interaction patterns. To overcome the limitations of supervision signals, recent approaches have adopted self-supervised learning techniques in recommender systems. However, there are still two critical challenges that remain unsolved. Firstly, existing sequential models primarily focus on long-term modeling of individual interaction sequences, overlooking the valuable short-term collaborative relationships among the behaviors of different users. Secondly, real-world data often contain noise, particularly in users' short-term behaviors, which can arise from temporary intents or misclicks. Such noise negatively impacts the accuracy of both graph and sequence models, further complicating the modeling process. To address these challenges, we propose a novel framework called Self-Supervised Graph Neural Network (SelfGNN) for sequential recommendation. The SelfGNN framework encodes short-term graphs based on time intervals and utilizes Graph Neural Networks (GNNs) to learn short-term collaborative relationships. It captures long-term user and item representations at multiple granularity levels through interval fusion and dynamic behavior modeling. Importantly, our personalized self-augmented learning structure enhances model robustness by mitigating noise in short-term graphs based on long-term user interests and personal stability. Extensive experiments conducted on four real-world datasets demonstrate that SelfGNN outperforms various state-of-the-art baselines. Our model implementation codes are available at https://github.com/HKUDS/SelfGNN.
Submitted 31 May, 2024;
originally announced May 2024.
-
SMPLX-Lite: A Realistic and Drivable Avatar Benchmark with Rich Geometry and Texture Annotations
Authors:
Yujiao Jiang,
Qingmin Liao,
Zhaolong Wang,
Xiangru Lin,
Zongqing Lu,
Yuxi Zhao,
Hanqing Wei,
Jingrui Ye,
Yu Zhang,
Zhijing Shao
Abstract:
Recovering photorealistic and drivable full-body avatars is crucial for numerous applications, including virtual reality, 3D games, and tele-presence. Most methods, whether reconstruction or generation, require large numbers of human motion sequences and corresponding textured meshes. To easily learn a drivable avatar, a reasonable parametric body model with unified topology is paramount. However, existing human body datasets either have images or textured models and lack parametric models which fit clothes well. We propose a new parametric model SMPLX-Lite-D, which can fit detailed geometry of the scanned mesh while maintaining stable geometry in the face, hand and foot regions. We present SMPLX-Lite dataset, the most comprehensive clothing avatar dataset with multi-view RGB sequences, keypoints annotations, textured scanned meshes, and textured SMPLX-Lite-D models. With the SMPLX-Lite dataset, we train a conditional variational autoencoder model that takes human pose and facial keypoints as input, and generates a photorealistic drivable human avatar.
Submitted 29 May, 2024;
originally announced May 2024.
-
CiliaGraph: Enabling Expression-enhanced Hyper-Dimensional Computation in Ultra-Lightweight and One-Shot Graph Classification on Edge
Authors:
Yuxi Han,
Jihe Wang,
Danghui Wang
Abstract:
Graph Neural Networks (GNNs) are computationally demanding and inefficient when applied to graph classification tasks in resource-constrained edge scenarios due to their inherent process, involving multiple rounds of forward and backward propagation. As a lightweight alternative, Hyper-Dimensional Computing (HDC), which leverages high-dimensional vectors for data encoding and processing, offers a more efficient solution by addressing the computational bottleneck. However, current HDC methods primarily focus on static graphs and neglect to effectively capture node attributes and structural information, which leads to poor accuracy. In this work, we propose CiliaGraph, an enhanced expressive yet ultra-lightweight HDC model for graph classification. This model introduces a novel node encoding strategy that preserves relative distance isomorphism for accurate node connection representation. In addition, node distances are utilized as edge weights for information aggregation, and the encoded node attributes and structural information are concatenated to obtain a comprehensive graph representation. Furthermore, we explore the relationship between orthogonality and dimensionality to reduce the dimensions, thereby further enhancing computational efficiency. Compared to the SOTA GNNs, extensive experiments show that CiliaGraph reduces memory usage and accelerates training speed by an average of 292 times (up to 2341 times) and 103 times (up to 313 times), respectively, while maintaining comparable accuracy.
Submitted 29 May, 2024;
originally announced May 2024.
-
AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval
Authors:
Sihe Zhang,
Qingdong He,
Jinlong Peng,
Yuxi Li,
Zhengkai Jiang,
Jiafu Wu,
Mingmin Chi,
Yabiao Wang,
Chengjie Wang
Abstract:
Image retrieval aims to identify visually similar images within a database using a given query image. Traditional methods typically employ both global and local features extracted from images for matching, and may also apply re-ranking techniques to enhance accuracy. However, these methods often fail to account for the noise present in query images, which can stem from natural or human-induced factors, thereby negatively impacting retrieval performance. To mitigate this issue, we introduce a novel setting for low-quality image retrieval, and propose an Adaptive Noise-Based Network (AdapNet) to learn robust abstract representations. Specifically, we devise a quality compensation block trained to compensate for various low-quality factors in input images. Besides, we introduce an innovative adaptive noise-based loss function, which dynamically adjusts its focus on the gradient in accordance with image quality, thereby augmenting the learning of unknown noisy samples during training and enhancing intra-class compactness. To assess the performance, we construct two datasets with low-quality queries, which are built by applying various types of noise to clean query images from the standard Revisited Oxford and Revisited Paris datasets. Comprehensive experimental results illustrate that AdapNet surpasses state-of-the-art methods on the Noise Revisited Oxford and Noise Revisited Paris benchmarks, while maintaining competitive performance on high-quality datasets. The code and constructed datasets will be made available.
Submitted 27 May, 2024;
originally announced May 2024.
-
Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation
Authors:
Yuxi Li,
Yi Liu,
Yuekang Li,
Ling Shi,
Gelei Deng,
Shengquan Chen,
Kailong Wang
Abstract:
Large language models (LLMs) have transformed the field of natural language processing, but they remain susceptible to jailbreaking attacks that exploit their capabilities to generate unintended and potentially harmful content. Existing token-level jailbreaking techniques, while effective, face scalability and efficiency challenges, especially as models undergo frequent updates and incorporate advanced defensive measures. In this paper, we introduce JailMine, an innovative token-level manipulation approach that addresses these limitations effectively. JailMine employs an automated "mining" process to elicit malicious responses from LLMs by strategically selecting affirmative outputs and iteratively reducing the likelihood of rejection. Through rigorous testing across multiple well-known LLMs and datasets, we demonstrate JailMine's effectiveness and efficiency, achieving a significant average reduction of 86% in time consumed while maintaining high success rates averaging 95%, even in the face of evolving defensive strategies. Our work contributes to the ongoing effort to assess and mitigate the vulnerability of LLMs to jailbreaking attacks, underscoring the importance of continued vigilance and proactive measures to enhance the security and reliability of these powerful language models.
Submitted 19 June, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
Evaluating and Modeling Social Intelligence: A Comparative Study of Human and AI Capabilities
Authors:
Junqi Wang,
Chunhui Zhang,
Jiapeng Li,
Yuxi Ma,
Lixing Niu,
Jiaheng Han,
Yujia Peng,
Yixin Zhu,
Lifeng Fan
Abstract:
Facing the current debate on whether Large Language Models (LLMs) attain near-human intelligence levels (Mitchell & Krakauer, 2023; Bubeck et al., 2023; Kosinski, 2023; Shiffrin & Mitchell, 2023; Ullman, 2023), the current study introduces a benchmark for evaluating social intelligence, one of the most distinctive aspects of human cognition. We developed a comprehensive theoretical framework for social dynamics and introduced two evaluation tasks: Inverse Reasoning (IR) and Inverse Inverse Planning (IIP). Our approach also encompassed a computational model based on recursive Bayesian inference, adept at elucidating diverse human behavioral patterns. Extensive experiments and detailed analyses revealed that humans surpassed the latest GPT models in overall performance, zero-shot learning, one-shot generalization, and adaptability to multi-modalities. Notably, GPT models demonstrated social intelligence only at the most basic order (order = 0), in stark contrast to human social intelligence (order >= 2). Further examination indicated a propensity of LLMs to rely on pattern recognition for shortcuts, casting doubt on their possession of authentic human-level social intelligence. Our codes, dataset, appendix and human data are released at https://github.com/bigai-ai/Evaluate-n-Model-Social-Intelligence.
Submitted 20 May, 2024;
originally announced May 2024.
-
Early phase simultaneous multi-band observations of Type II supernova SN 2024ggi with Mephisto
Authors:
Xinlei Chen,
Brajesh Kumar,
Xinzhong Er,
Helong Guo,
Yuan-Pei Yang,
Weikang Lin,
Yuan Fang,
Guowang Du,
Chenxu Liu,
Jiewei Zhao,
Tianyu Zhang,
Yuxi Bao,
Xingzhu Zou,
Yu Pan,
Yu Wang,
Xufeng Zhu,
Kaushik Chatterjee,
Xiangkun Liu,
Dezi Liu,
Edoardo P. Lagioia,
Geeta Rangwal,
Shiyan Zhong,
Jinghua Zhang,
Jianhui Lian,
Yongzhi Cai
, et al. (2 additional authors not shown)
Abstract:
We present early-phase good-cadence simultaneous multi-band ($ugi$, $vrz$ bands) imaging of the nearby supernova SN 2024ggi, which exploded in the galaxy NGC~3621. A quick follow-up began less than a day after the explosion and continued for $\sim$23 days. The $uvg$-band light curves display a rapid rise ($\sim$1.4 mag day$^{-1}$) to maximum in $\sim$4 days, reaching an absolute magnitude of $M_{g}\sim-17.75$ mag. The post-peak decay rate in redder bands is $\sim$0.01 mag day$^{-1}$. Different colors (e.g., $u-g$ and $v-r$) of SN~2024ggi are slightly redder than those of SN~2023ixf. A significant rise ($\sim$12.5 kK) in black-body temperature (optical) was noticed within $\sim$2 days after the explosion, which subsequently decreased, indicating shock breakout inside a dense circumstellar medium (CSM) surrounding the progenitor. Using semi-analytical modeling, the ejecta mass and progenitor radius were estimated as 1.2 M$_{\odot}$ and $\sim$550 R$_{\odot}$, respectively. The archival deep images ($g,r,i,z$-bands) from the Dark Energy Camera Legacy Survey (DECaLS) were examined, and a possible progenitor was detected in each band ($\sim$22--22.5 mag), with an inferred mass range of 14--17 M$_{\odot}$.
Submitted 13 May, 2024;
originally announced May 2024.
-
StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework
Authors:
Yiheng Huang,
Hui Yang,
Chuanchen Luo,
Yuxi Wang,
Shibiao Xu,
Zhaoxiang Zhang,
Man Zhang,
Junran Peng
Abstract:
Thanks to the powerful generative capacity of diffusion models, recent years have witnessed rapid progress in human motion generation. Existing diffusion-based methods employ disparate network architectures and training strategies, and the effect of each component's design is still unclear. In addition, the iterative denoising process incurs considerable computational overhead, which is prohibitive for real-time scenarios such as virtual characters and humanoid robots. For this reason, we first conduct a comprehensive investigation into network architectures, training strategies, and inference processes. Based on this analysis, we tailor each component for efficient high-quality human motion generation. Despite its promising performance, the tailored model still suffers from foot skating, which is a ubiquitous issue in diffusion-based solutions. To eliminate foot skating, we identify foot-ground contact and correct foot motions along the denoising process. By combining these well-designed components, we present StableMoFusion, a robust and efficient framework for human motion generation. Extensive experimental results show that StableMoFusion performs favorably against current state-of-the-art methods. Project page: https://h-y1heng.github.io/StableMoFusion-page/
Submitted 9 May, 2024;
originally announced May 2024.
-
Mack modes in supersonic boundary layer
Authors:
Nader Masmoudi,
Yuxi Wang,
Di Wu,
Zhifei Zhang
Abstract:
Understanding the transition mechanism of boundary layer flows is of great significance in physics and engineering, especially given the current development of supersonic and hypersonic aircraft. In this paper, we construct multiple unstable acoustic modes, the so-called Mack modes, which play a crucial role during the early stage of transition in the supersonic boundary layer. To this end, we develop an inner-outer gluing iteration to solve a singular system of mixed hyperbolic-elliptic type.
Submitted 8 May, 2024;
originally announced May 2024.
-
Deep Learning for Detecting and Early Predicting Chronic Obstructive Pulmonary Disease from Spirogram Time Series: A UK Biobank Study
Authors:
Shuhao Mei,
Yuxi Zhou,
Jiahao Xu,
Yuxuan Wan,
Shan Cao,
Qinghao Zhao,
Shijia Geng,
Junqing Xie,
Shenda Hong
Abstract:
Chronic Obstructive Pulmonary Disease (COPD) is a chronic inflammatory lung condition that causes airflow obstruction. Existing methods can only detect patients who already have COPD based on obvious features shown in the spirogram (in this article, the spirogram specifically refers to the Volume-Flow curve time series). Early prediction of COPD risk is vital for monitoring COPD disease progression, slowing it down, or even preventing its onset. However, existing methods fail to predict an individual's future probability of COPD from subtle features in the spirogram. To address this gap, we propose DeepSpiro, which is, to our knowledge, the first deep learning method for early prediction of future COPD risk. DeepSpiro consists of four parts. First, we construct Volume-Flow curves guided by Time-Volume instability smoothing (SpiroSmoother) to enhance the stability of the original Volume-Flow curves. Second, we extract critical features from the evolution of varied-length key patches (SpiroEncoder) to capture the key temporal evolution from original high-dimensional dynamic sequences to a unified low-dimensional temporal representation. Third, we explain the model based on temporal attention and heterogeneous feature fusion (SpiroExplainer), which integrates information from heterogeneous data such as spirogram and demographic information. Fourth, we predict the risk of COPD based on the evolution of key patch concavity (SpiroPredictor), enabling accurate prediction of the risk of disease in high-risk patients who are not yet diagnosed, for up to 1, 2, 3, 4, 5 years, and beyond. We conduct experiments on the UK Biobank dataset. Results show that DeepSpiro achieves an AUC value of 0.8328 in the task of detecting COPD. In early prediction tasks, high-risk and low-risk groups show significant differences in the future, with a p-value of <0.001.
Submitted 6 May, 2024;
originally announced May 2024.
-
Exploring prompts to elicit memorization in masked language model-based named entity recognition
Authors:
Yuxi Xia,
Anastasiia Sedova,
Pedro Henrique Luz de Araujo,
Vasiliki Kougia,
Lisa Nußbaumer,
Benjamin Roth
Abstract:
Training data memorization in language models impacts model capability (generalization) and safety (privacy risk). This paper focuses on analyzing prompts' impact on detecting the memorization of 6 masked language model-based named entity recognition models. Specifically, we employ a diverse set of 400 automatically generated prompts, and a pairwise dataset where each pair consists of one person's name from the training set and another name out of the set. A prompt completed with a person's name serves as input for getting the model's confidence in predicting this name. Finally, the prompt performance of detecting model memorization is quantified by the percentage of name pairs for which the model has higher confidence for the name from the training set. We show that the performance of different prompts varies by as much as 16 percentage points on the same model, and prompt engineering further increases the gap. Moreover, our experiments demonstrate that prompt performance is model-dependent but does generalize across different name sets. A comprehensive analysis indicates how prompt performance is influenced by prompt properties, contained tokens, and the model's self-attention weights on the prompt.
Submitted 5 May, 2024;
originally announced May 2024.
-
Quantitative homogenization of state-constraint Hamilton--Jacobi equations on perforated domains and applications
Authors:
Yuxi Han,
Wenjia Jing,
Hiroyoshi Mitake,
Hung V. Tran
Abstract:
We study the periodic homogenization problem of state-constraint Hamilton--Jacobi equations on perforated domains in the convex setting and obtain the optimal convergence rate. We then consider a dilute situation in which the holes' diameter is much smaller than the microscopic scale. Finally, a homogenization problem with domain defects where some holes are missing is analyzed.
Submitted 2 May, 2024;
originally announced May 2024.
-
The impact of stellar metallicity on rotation and activity evolution in the Kepler field using gyro-kinematic ages
Authors:
Victor See,
Yuxi Lu,
Louis Amard,
Julia Roquette
Abstract:
In recent years, there has been a push to understand how chemical composition affects the magnetic activity levels of main sequence low-mass stars. Results indicate that more metal-rich stars are more magnetically active for a given stellar mass and rotation period. This metallicity dependence has implications for how the rotation periods and activity levels of low-mass stars evolve over their lifetimes. Numerical modelling suggests that at late ages more metal-rich stars should be rotating more slowly and be more magnetically active. In this work, we study the rotation and activity evolution of low-mass stars using a sample of Kepler field stars. We use the gyro-kinematic age dating technique to estimate ages for our sample and use the photometric activity index as our proxy for magnetic activity. We find clear evidence that, at late ages, more metal-rich stars have spun down to slower rotation in agreement with the theoretical modeling. However, further investigation is required to definitively determine whether the magnetic activity evolution occurs in a metallicity dependent way.
Submitted 1 May, 2024;
originally announced May 2024.
-
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning
Authors:
Yuxi Xie,
Anirudh Goyal,
Wenyue Zheng,
Min-Yen Kan,
Timothy P. Lillicrap,
Kenji Kawaguchi,
Michael Shieh
Abstract:
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process inspired by the successful strategy employed by AlphaZero. Our work leverages Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals. To enhance consistency in intermediate steps, we combine outcome validation and stepwise self-evaluation, continually updating the quality assessment of newly generated data. The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data. Theoretical analysis reveals the importance of using on-policy sampled data for successful self-improving. Extensive evaluations on various arithmetic and commonsense reasoning tasks demonstrate remarkable performance improvements over existing models. For instance, our approach outperforms the Mistral-7B Supervised Fine-Tuning (SFT) baseline on GSM8K, MATH, and ARC-C, with substantial increases in accuracy to $81.8\%$ (+$5.9\%$), $34.7\%$ (+$5.8\%$), and $76.4\%$ (+$15.8\%$), respectively. Additionally, our research delves into the training and inference compute tradeoff, providing insights into how our method effectively maximizes performance gains. Our code is publicly available at https://github.com/YuxiXie/MCTS-DPO.
Submitted 17 June, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
-
Asymptotic stability of composite waves of viscous shock and rarefaction for relaxed compressible Navier-Stokes equations
Authors:
Renyong Guan,
Yuxi Hu
Abstract:
The time-asymptotic stability for the one-dimensional relaxed compressible Navier-Stokes equations is studied. We show that the composite waves of viscous shock and rarefaction are nonlinearly asymptotically stable for both small wave strength and small initial perturbations. Moreover, as the relaxation parameter goes to zero, the solutions of the relaxed system are shown to converge globally in time to those of the classical system. The methods are based on relative entropy, the theory of a-contraction with shifts, and basic energy estimates.
Submitted 29 April, 2024;
originally announced April 2024.
-
Improved Algorithm for Reachability in $d$-VASS
Authors:
Yuxi Fu,
Qizhe Yang,
Yangluo Zheng
Abstract:
An $\mathsf{F}_{d}$ upper bound for the reachability problem in vector addition systems with states (VASS) in fixed dimension is given, where $\mathsf{F}_d$ is the $d$-th level of the Grzegorczyk hierarchy of complexity classes. The new algorithm combines the idea of the linear path scheme characterization of reachability in $2$-dimensional VASSes with the general decomposition algorithm by Mayr, Kosaraju and Lambert. The result improves the $\mathsf{F}_{d + 4}$ upper bound due to Leroux and Schmitz (LICS 2019).
Submitted 23 April, 2024;
originally announced April 2024.
-
Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction
Authors:
Zheye Deng,
Chunkit Chan,
Weiqi Wang,
Yuxi Sun,
Wei Fan,
Tianshi Zheng,
Yauwai Yim,
Yangqiu Song
Abstract:
The task of condensing large chunks of textual information into concise and structured tables has gained attention recently due to the emergence of Large Language Models (LLMs) and their potential benefit for downstream tasks, such as text summarization and text mining. Previous approaches often generate tables that directly replicate information from the text, limiting their applicability in broader contexts, as text-to-table generation in real-life scenarios necessitates information extraction, reasoning, and integration. However, both datasets and methodologies for this task are lacking. In this paper, we introduce LiveSum, a new benchmark dataset created for generating summary tables of competitions based on real-time commentary texts. We evaluate the performance of state-of-the-art LLMs on this task in both fine-tuning and zero-shot settings, and additionally propose a novel pipeline called $T^3$ (Text-Tuple-Table) to improve their performance. Extensive experimental results demonstrate that LLMs still struggle with this task even after fine-tuning, while our approach can offer substantial performance gains without explicit training. Further analyses demonstrate that our method exhibits strong generalization abilities, surpassing previous approaches on several other text-to-table datasets. Our code and data can be found at https://github.com/HKUST-KnowComp/LiveSum-TTT.
Submitted 22 April, 2024;
originally announced April 2024.
-
MaterialSeg3D: Segmenting Dense Materials from 2D Priors for 3D Assets
Authors:
Zeyu Li,
Ruitong Gan,
Chuanchen Luo,
Yuxi Wang,
Jiaheng Liu,
Ziwei Zhu,
Man Zhang,
Qing Li,
Xucheng Yin,
Zhaoxiang Zhang,
Junran Peng
Abstract:
Driven by powerful image diffusion models, recent research has achieved the automatic creation of 3D objects from textual or visual guidance. By performing score distillation sampling (SDS) iteratively across different views, these methods succeed in lifting the 2D generative prior to 3D space. However, such a 2D generative image prior bakes the effect of illumination and shadow into the texture. As a result, material maps optimized by SDS inevitably contain spuriously correlated components. The absence of a precise material definition makes it infeasible to relight the generated assets reasonably in novel scenes, which limits their application in downstream scenarios. In contrast, humans can effortlessly circumvent this ambiguity by deducing the material of an object from its appearance and semantics. Motivated by this insight, we propose MaterialSeg3D, a 3D asset material generation framework that infers the underlying material from a 2D semantic prior. Based on such a prior model, we devise a mechanism to parse material in 3D space. We maintain a UV stack, each map of which is unprojected from a specific viewpoint. After traversing all viewpoints, we fuse the stack through a weighted voting scheme and then employ region unification to ensure the coherence of the object parts. To fuel the learning of the semantic prior, we collect a material dataset, named Materialized Individual Objects (MIO), which features abundant images, diverse categories, and accurate annotations. Extensive quantitative and qualitative experiments demonstrate the effectiveness of our method.
Submitted 16 May, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
Authors:
Yuxi Ren,
Xin Xia,
Yanzuo Lu,
Jiacheng Zhang,
Jie Wu,
Pan Xie,
Xing Wang,
Xuefeng Xiao
Abstract:
Recently, a series of diffusion-aware distillation algorithms have emerged to alleviate the computational overhead associated with the multi-step inference process of Diffusion Models (DMs). Current distillation techniques often dichotomize into two distinct aspects: i) ODE Trajectory Preservation; and ii) ODE Trajectory Reformulation. However, these approaches suffer from severe performance degradation or domain shifts. To address these limitations, we propose Hyper-SD, a novel framework that synergistically amalgamates the advantages of ODE Trajectory Preservation and Reformulation, while maintaining near-lossless performance during step compression. Firstly, we introduce Trajectory Segmented Consistency Distillation to progressively perform consistent distillation within pre-defined time-step segments, which facilitates the preservation of the original ODE trajectory from a higher-order perspective. Secondly, we incorporate human feedback learning to boost the performance of the model in a low-step regime and mitigate the performance loss incurred by the distillation process. Thirdly, we integrate score distillation to further improve the low-step generation capability of the model and offer the first attempt to leverage a unified LoRA to support the inference process at all steps. Extensive experiments and user studies demonstrate that Hyper-SD achieves SOTA performance from 1 to 8 inference steps for both SDXL and SD1.5. For example, Hyper-SDXL surpasses SDXL-Lightning by +0.68 in CLIP Score and +0.51 in Aes Score in the 1-step inference.
Submitted 22 May, 2024; v1 submitted 21 April, 2024;
originally announced April 2024.
-
MESSENGER observations of Mercury's planetary ion escape rates and their dependence on true anomaly angle
Authors:
Weijie Sun,
Ryan M. Dewey,
Xianzhe Jia,
Jim M. Raines,
James A. Slavin,
Yuxi Chen,
Tai Phan,
Gangkai Poh,
Shaosui Xu,
Anna Milillo,
Robert Lillis,
Yoshifumi Saito,
Stefano Livi,
Stefano Orsini
Abstract:
This study investigates the escape of Mercury's sodium-group ions (Na+-group, including ions with m/q from 21 to 30 amu/e) and their dependence on true anomaly angle (TAA), i.e., Mercury's orbital phase around the Sun, using measurements from MESSENGER. The measurements are categorized into solar wind, magnetosheath, and magnetosphere, and further divided into four TAA intervals. Na+-group ions form escape plumes in the solar wind and magnetosheath, with higher fluxes along the solar wind's motional electric field. The total escape rates vary from 0.2 to 1 times 10^{25} atoms/s, with the magnetosheath being the main escaping region. These rates exhibit a TAA dependence, peaking near perihelion and remaining similar over the rest of Mercury's orbit. Despite Mercury's tenuous exosphere, the Na+-group ion escape rate is comparable to that of other inner planets. This can be attributed to several processes, including that the Na+-group may include several ion species and that elements within the Na+-group have an efficient photoionization frequency.
Submitted 20 April, 2024;
originally announced April 2024.
-
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Authors:
Zhenyang Ni,
Rui Ye,
Yuxi Wei,
Zhen Xiang,
Yanfeng Wang,
Siheng Chen
Abstract:
Vision-Large-Language-models (VLMs) have great application prospects in autonomous driving. Despite the ability of VLMs to comprehend and make decisions in complex scenarios, their integration into safety-critical autonomous driving systems poses serious security risks. In this paper, we propose BadVLMDriver, the first backdoor attack against VLMs for autonomous driving that can be launched in practice using physical objects. Unlike existing backdoor attacks against VLMs that rely on digital modifications, BadVLMDriver uses common physical items, such as a red balloon, to induce unsafe actions like sudden acceleration, highlighting a significant real-world threat to autonomous vehicle safety. To execute BadVLMDriver, we develop an automated pipeline utilizing natural language instructions to generate backdoor training samples with embedded malicious behaviors. This approach allows for flexible trigger and behavior selection, enhancing the stealth and practicality of the attack in diverse scenarios. We conduct extensive experiments to evaluate BadVLMDriver for two representative VLMs, five different trigger objects, and two types of malicious backdoor behaviors. BadVLMDriver achieves a 92% attack success rate in inducing a sudden acceleration when coming across a pedestrian holding a red balloon. Thus, BadVLMDriver not only demonstrates a critical security risk but also emphasizes the urgent need for developing robust defense mechanisms to protect against such vulnerabilities in autonomous driving technologies.
Submitted 22 April, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models
Authors:
Yuzhu Cai,
Sheng Yin,
Yuxi Wei,
Chenxin Xu,
Weibo Mao,
Felix Juefei-Xu,
Siheng Chen,
Yanfeng Wang
Abstract:
The burgeoning landscape of text-to-image models, exemplified by innovations such as Midjourney and DALLE 3, has revolutionized content creation across diverse sectors. However, these advancements bring forth critical ethical concerns, particularly with the misuse of open-source models to generate content that violates societal norms. Addressing this, we introduce Ethical-Lens, a framework designed to facilitate the value-aligned usage of text-to-image tools without necessitating internal model revision. Ethical-Lens ensures value alignment in text-to-image models across toxicity and bias dimensions by refining user commands and rectifying model outputs. Systematic evaluation metrics, combining GPT4-V, HEIM, and FairFace scores, assess alignment capability. Our experiments reveal that Ethical-Lens enhances alignment capabilities to levels comparable with or superior to commercial models like DALLE 3, ensuring user-generated content adheres to ethical standards while maintaining image quality. This study indicates the potential of Ethical-Lens to ensure the sustainable development of open-source text-to-image tools and their beneficial integration into society. Our code is available at https://github.com/yuzhu-cai/Ethical-Lens.
Submitted 18 April, 2024;
originally announced April 2024.
-
Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data
Authors:
Ivan DeAndres-Tame,
Ruben Tolosana,
Pietro Melzi,
Ruben Vera-Rodriguez,
Minchul Kim,
Christian Rathgeb,
Xiaoming Liu,
Aythami Morales,
Julian Fierrez,
Javier Ortega-Garcia,
Zhizhou Zhong,
Yuge Huang,
Yuxi Mi,
Shouhong Ding,
Shuigeng Zhou,
Shuai He,
Lingzhi Fu,
Heng Cong,
Rongyu Zhang,
Zhihong Xiao,
Evgeny Smirnov,
Anton Pimenov,
Aleksei Grigorev,
Denis Timoshenko,
Kaleb Mesfin Asfaw
, et al. (33 additional authors not shown)
Abstract:
Synthetic data is gaining increasing relevance for training machine learning models. This is mainly motivated by several factors, such as the lack of real data and intra-class variability, the time and errors involved in manual labeling, and in some cases privacy concerns, among others. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn) organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which only synthetic data from the DCFace and GANDiffFace methods was allowed for training face recognition systems, in this 2nd edition we propose new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking, contribute significantly to the application of synthetic data to face recognition.
Submitted 16 April, 2024;
originally announced April 2024.
-
Glitch Tokens in Large Language Models: Categorization Taxonomy and Effective Detection
Authors:
Yuxi Li,
Yi Liu,
Gelei Deng,
Ying Zhang,
Wenjia Song,
Ling Shi,
Kailong Wang,
Yuekang Li,
Yang Liu,
Haoyu Wang
Abstract:
With the expanding application of Large Language Models (LLMs) in various domains, it becomes imperative to comprehensively investigate their unforeseen behaviors and consequent outcomes. In this study, we introduce and systematically explore the phenomenon of "glitch tokens", which are anomalous tokens produced by established tokenizers and could potentially compromise the models' quality of response. Specifically, we experiment on seven popular LLMs utilizing three distinct tokenizers and involving a total of 182,517 tokens. We present categorizations of the identified glitch tokens and symptoms exhibited by LLMs when interacting with glitch tokens. Based on our observation that glitch tokens tend to cluster in the embedding space, we propose GlitchHunter, a novel iterative clustering-based technique, for efficient glitch token detection. The evaluation shows that our approach notably outperforms three baseline methods on eight open-source LLMs. To the best of our knowledge, we present the first comprehensive study on glitch tokens. Our new detection further provides valuable insights into mitigating tokenization-related errors in LLMs.
Submitted 19 April, 2024; v1 submitted 15 April, 2024;
originally announced April 2024.
-
UniFL: Improve Stable Diffusion via Unified Feedback Learning
Authors:
Jiacheng Zhang,
Jie Wu,
Yuxi Ren,
Xin Xia,
Huafeng Kuang,
Pan Xie,
Jiashi Li,
Xuefeng Xiao,
Min Zheng,
Lean Fu,
Guanbin Li
Abstract:
Diffusion models have revolutionized the field of image generation, leading to the proliferation of high-quality models and diverse downstream applications. However, despite these significant advancements, the current competitive solutions still suffer from several limitations, including inferior visual quality, a lack of aesthetic appeal, and inefficient inference, without a comprehensive solution in sight. To address these challenges, we present UniFL, a unified framework that leverages feedback learning to enhance diffusion models comprehensively. UniFL stands out as a universal, effective, and generalizable solution applicable to various diffusion models, such as SD1.5 and SDXL. Notably, UniFL incorporates three key components: perceptual feedback learning, which enhances visual quality; decoupled feedback learning, which improves aesthetic appeal; and adversarial feedback learning, which optimizes inference speed. In-depth experiments and extensive user studies validate the superior performance of our proposed method in enhancing both the quality of generated models and their acceleration. For instance, UniFL surpasses ImageReward by 17% user preference in terms of generation quality and outperforms LCM and SDXL Turbo by 57% and 20% in 4-step inference. Moreover, we have verified the efficacy of our approach in downstream tasks, including Lora, ControlNet, and AnimateDiff.
Submitted 22 May, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Authors:
Yuxi Ren,
Jie Wu,
Yanzuo Lu,
Huafeng Kuang,
Xin Xia,
Xionghui Wang,
Qianqian Wang,
Yixing Zhu,
Pan Xie,
Shiyin Wang,
Xuefeng Xiao,
Yitong Wang,
Min Zheng,
Lean Fu
Abstract:
Recent advancements in diffusion-based generative image editing have sparked a profound revolution, reshaping the landscape of image outpainting and inpainting tasks. Despite these strides, the field grapples with inherent challenges, including: i) inferior quality; ii) poor consistency; iii) insufficient instruction adherence; iv) suboptimal generation efficiency. To address these obstacles, we present ByteEdit, an innovative feedback learning framework meticulously designed to Boost, Comply, and Accelerate Generative Image Editing tasks. ByteEdit seamlessly integrates image reward models dedicated to enhancing aesthetics and image-text alignment, while also introducing a dense, pixel-level reward model tailored to foster coherence in the output. Furthermore, we propose a pioneering adversarial and progressive feedback learning strategy to expedite the model's inference speed. Through extensive large-scale user evaluations, we demonstrate that ByteEdit surpasses leading generative image editing products, including Adobe, Canva, and MeiTu, in both generation quality and consistency. ByteEdit-Outpainting exhibits a remarkable enhancement of 388% and 135% in quality and consistency, respectively, when compared to the baseline model. Experiments also verified that our acceleration models maintain excellent performance in terms of quality and consistency.
Submitted 7 April, 2024;
originally announced April 2024.
-
SpatialTracker: Tracking Any 2D Pixels in 3D Space
Authors:
Yuxi Xiao,
Qianqian Wang,
Shangzhan Zhang,
Nan Xue,
Sida Peng,
Yujun Shen,
Xiaowei Zhou
Abstract:
Recovering dense and long-range pixel motion in videos is a challenging problem. Part of the difficulty arises from the 3D-to-2D projection process, leading to occlusions and discontinuities in the 2D motion domain. While 2D motion can be intricate, we posit that the underlying 3D motion can often be simple and low-dimensional. In this work, we propose to estimate point trajectories in 3D space to mitigate the issues caused by image projection. Our method, named SpatialTracker, lifts 2D pixels to 3D using monocular depth estimators, represents the 3D content of each frame efficiently using a triplane representation, and performs iterative updates using a transformer to estimate 3D trajectories. Tracking in 3D allows us to leverage as-rigid-as-possible (ARAP) constraints while simultaneously learning a rigidity embedding that clusters pixels into different rigid parts. Extensive evaluation shows that our approach achieves state-of-the-art tracking performance both qualitatively and quantitatively, particularly in challenging scenarios such as out-of-plane rotation.
Submitted 5 April, 2024;
originally announced April 2024.
-
SQL-Encoder: Improving NL2SQL In-Context Learning Through a Context-Aware Encoder
Authors:
Mohammadreza Pourreza,
Davood Rafiei,
Yuxi Feng,
Raymond Li,
Zhenan Fan,
Weiwei Zhang
Abstract:
Detecting structural similarity between queries is essential for selecting examples in in-context learning models. However, assessing structural similarity based solely on the natural language expressions of queries, without considering SQL queries, presents a significant challenge. This paper explores the significance of this similarity metric and proposes a model for accurately estimating it. To achieve this, we leverage a dataset comprising 170k question pairs, meticulously curated to train a similarity prediction model. Our comprehensive evaluation demonstrates that the proposed model adeptly captures the structural similarity between questions, as evidenced by improvements in Kendall-Tau distance and precision@k metrics. Notably, our model outperforms strong competitive embedding models from OpenAI and Cohere. Furthermore, compared to these competitive models, our proposed encoder enhances the downstream performance of NL2SQL models in 1-shot in-context learning scenarios by 1-2% for GPT-3.5-turbo, 4-8% for CodeLlama-7B, and 2-3% for CodeLlama-13B.
Submitted 24 March, 2024;
originally announced March 2024.
-
SceneX: Procedural Controllable Large-scale Scene Generation via Large-language Models
Authors:
Mengqi Zhou,
Jun Hou,
Chuanchen Luo,
Yuxi Wang,
Zhaoxiang Zhang,
Junran Peng
Abstract:
Due to its great application potential, large-scale scene generation has drawn extensive attention in academia and industry. Recent research employs powerful generative models to create desired scenes and achieves promising results. However, most of these methods represent the scene using 3D primitives (e.g. point cloud or radiance field) incompatible with the industrial pipeline, which leads to a substantial gap between academic research and industrial deployment. Procedural Controllable Generation (PCG) is an efficient technique for creating scalable and high-quality assets, but it is unfriendly for ordinary users as it demands profound domain expertise. To address these issues, we resort to using the large language model (LLM) to drive the procedural modeling. In this paper, we introduce a large-scale scene generation framework, SceneX, which can automatically produce high-quality procedural models according to designers' textual descriptions. Specifically, the proposed method comprises two components, PCGBench and PCGPlanner. The former encompasses an extensive collection of accessible procedural assets and thousands of hand-crafted API documents. The latter aims to generate executable actions for Blender to produce controllable and precise 3D assets guided by the user's instructions. Our SceneX can generate a city spanning 2.5 km × 2.5 km with delicate layout and geometric structures, drastically reducing the time cost from several weeks for professional PCG engineers to just a few hours for an ordinary user. Extensive experiments demonstrated the capability of our method in controllable large-scale scene generation and editing, including asset placement and season translation.
Submitted 22 March, 2024;
originally announced March 2024.
-
Evaluating the impact of instrumental variables in propensity score models using synthetic and negative control experiments
Authors:
Yuxi Tian,
Nicole Pratt,
Laura L Hester,
George Hripcsak,
Martijn J Schuemie,
Marc A Suchard
Abstract:
In pharmacoepidemiology research, instrumental variables (IVs) are variables that strongly predict treatment but have no causal effect on the outcome of interest except through the treatment. There remain concerns about the inclusion of IVs in propensity score (PS) models amplifying estimation bias and reducing precision. Some PS modeling approaches attempt to address the potential effects of IVs, including selecting only covariates for the PS model that are strongly associated to the outcome of interest, thus screening out IVs. We conduct a study utilizing simulations and negative control experiments to evaluate the effect of IVs on PS model performance and to uncover best PS practices for real-world studies. We find that simulated IVs have a weak effect on bias and precision in both simulations and negative control experiments based on real-world data. In simulation experiments, PS methods that utilize outcome data, including the high-dimensional propensity score, produce the least estimation bias. However, in real-world settings underlying causal structures are unknown, and negative control experiments can illustrate a PS model's ability to minimize systematic bias. We find that large-scale, regularized regression based PS models in this case provide the most centered negative control distributions, suggesting superior performance in real-world scenarios.
Submitted 21 March, 2024;
originally announced March 2024.
-
Dual-sided transparent display
Authors:
Suman Halder,
Yunho Shin,
Yidan Peng,
Long Wang,
Liye Duan,
Paul Schmalenberg,
Guangkui Qin,
Yuxi Gao,
Ercan M. Dede,
Deng-Ke Yang,
Sean P. Rodrigues
Abstract:
In the past decade, display technology has been reimagined to meet the needs of the virtual world. By mapping information onto a scene through a transparent display, users can simultaneously visualize both the real world and layers of virtual elements. However, advances in augmented reality (AR) technology have primarily focused on wearable gear or personal devices. Here we present a single display capable of delivering visual information to observers positioned on either side of the transparent device. This dual-sided display system employs a polymer stabilized liquid crystal waveguide technology to achieve a transparency window of 65% while offering active-matrix control. An early-stage prototype exhibits full-color information via time-sequential processing of a red-green-blue (RGB) light-emitting diode (LED) strip. The dual-sided display provides a perspective on transparent mediums as display devices for human-centric and service-related experiences that can support both enhanced bi-directional user interactions and new media platforms.
Submitted 7 March, 2024;
originally announced March 2024.
-
Privacy-Preserving Face Recognition Using Trainable Feature Subtraction
Authors:
Yuxi Mi,
Zhizhou Zhong,
Yuge Huang,
Jiazhen Ji,
Jianqing Xu,
Jun Wang,
Shaoming Wang,
Shouhong Ding,
Shuigeng Zhou
Abstract:
The widespread adoption of face recognition has led to increasing privacy concerns, as unauthorized access to face images can expose sensitive personal information. This paper explores face image protection against viewing and recovery attacks. Inspired by image compression, we propose creating a visually uninformative face image through feature subtraction between an original face and its model-produced regeneration. Recognizable identity features within the image are encouraged by co-training a recognition model on its high-dimensional feature representation. To enhance privacy, the high-dimensional representation is crafted through random channel shuffling, resulting in randomized recognizable images devoid of attacker-leverageable texture details. We distill our methodologies into a novel privacy-preserving face recognition method, MinusFace. Experiments demonstrate its high recognition accuracy and effective privacy protection. Its code is available at https://github.com/Tencent/TFace.
Submitted 19 March, 2024;
originally announced March 2024.
-
A kinetic-magnetohydrodynamic model with adaptive mesh refinement for modeling heliosphere neutral-plasma interaction
Authors:
Yuxi Chen,
Gabor Toth,
Erick Powell,
Talha Arshad,
Ethan Bair,
Marc Kornbleuth,
Merav Opher
Abstract:
The charge exchange between the interstellar medium (ISM) and the solar wind plasma is crucial for determining the structures of the heliosphere. Since both the neutral-ion and neutral-neutral collision mean free paths are either comparable to or larger than the size of the heliosphere, the neutral phase space distribution can deviate far away from the Maxwellian distribution. A kinetic description for the neutrals is crucial for accurately modeling the heliosphere. It is computationally challenging to run three-dimensional (3D) time-dependent kinetic simulations due to the large number of macro-particles. In this paper, we present the new highly efficient SHIELD-2 model with a kinetic model of neutrals and a magnetohydrodynamic (MHD) model for the ions and electrons. To improve the simulation efficiency, we implement adaptive mesh refinement (AMR) and particle splitting and merging algorithms for the neutral particles to reduce the particle number that is required for an accurate simulation. We present several tests to verify and demonstrate the capabilities of the model.
Submitted 18 March, 2024;
originally announced March 2024.
-
Continual Forgetting for Pre-trained Vision Models
Authors:
Hongbo Zhao,
Bolin Ni,
Haochen Wang,
Junsong Fan,
Fei Zhu,
Yuxi Wang,
Yuntao Chen,
Gaofeng Meng,
Zhaoxiang Zhang
Abstract:
For privacy and security concerns, the need to erase unwanted information from pre-trained vision models is becoming evident nowadays. In real-world scenarios, erasure requests originate at any time from both users and model owners. These requests usually form a sequence. Therefore, under such a setting, selective information is expected to be continuously removed from a pre-trained model while maintaining the rest. We define this problem as continual forgetting and identify two key challenges. (i) For unwanted knowledge, efficient and effective deleting is crucial. (ii) For remaining knowledge, the impact brought by the forgetting procedure should be minimal. To address them, we propose Group Sparse LoRA (GS-LoRA). Specifically, towards (i), we use LoRA modules to fine-tune the FFN layers in Transformer blocks for each forgetting task independently, and towards (ii), a simple group sparse regularization is adopted, enabling automatic selection of specific LoRA groups and zeroing out the others. GS-LoRA is effective, parameter-efficient, data-efficient, and easy to implement. We conduct extensive experiments on face recognition, object detection and image classification and demonstrate that GS-LoRA manages to forget specific classes with minimal impact on other classes. Codes will be released on https://github.com/bjzhb666/GS-LoRA.
Submitted 18 March, 2024;
originally announced March 2024.
-
A Deep Learning Method for Beat-Level Risk Analysis and Interpretation of Atrial Fibrillation Patients during Sinus Rhythm
Authors:
Jun Lei,
Yuxi Zhou,
Xue Tian,
Qinghao Zhao,
Qi Zhang,
Shijia Geng,
Qingbo Wu,
Shenda Hong
Abstract:
Atrial Fibrillation (AF) is a common cardiac arrhythmia. Many AF patients experience complications such as stroke and other cardiovascular issues. Early detection of AF is crucial. Existing algorithms can only distinguish "AF rhythm in AF patients" from "sinus rhythm in normal individuals". However, AF patients do not always exhibit AF rhythm, posing a challenge for diagnosis when the AF rhythm is absent. To address this, this paper proposes a novel artificial intelligence (AI) algorithm to distinguish "sinus rhythm in AF patients" from "sinus rhythm in normal individuals" at the beat level. We introduce beat-level risk interpreters and trend risk interpreters, addressing the interpretability issues of deep learning models and the difficulty in explaining AF risk trends. Additionally, the beat-level information fusion decision is presented to enhance model accuracy. The experimental results demonstrate that the average AUC for single beats used as testing data from the CPSC 2021 dataset is 0.7314. By employing 150 beats for the information fusion decision algorithm, the average AUC can reach 0.7591. Compared to previous segment-level algorithms, we utilize beats as input, reducing data dimensionality and making the model more lightweight, facilitating deployment on portable medical devices. Furthermore, we draw new and interesting findings through average beat analysis and subgroup analysis, considering varying risk levels.
Submitted 17 March, 2024;
originally announced March 2024.
-
zoomies: A tool to infer stellar age from vertical action in Gaia data
Authors:
Sheila Sagear,
Adrian M. Price-Whelan,
Sarah Ballard,
Yuxi Lu,
Ruth Angus,
David W. Hogg
Abstract:
Stellar age measurements are fundamental to understanding a wide range of astronomical processes, including galactic dynamics, stellar evolution, and planetary system formation. However, extracting age information from Main Sequence stars is complicated, with techniques often relying on age proxies in the absence of direct measurements. The Gaia data releases have enabled detailed studies of the dynamical properties of stars within the Milky Way, offering new opportunities to understand the relationship between stellar age and dynamics. In this study, we leverage high-precision astrometric data from Gaia DR3 to construct a stellar age prediction model based only on stellar dynamical properties; namely, the vertical action. We calibrate two distinct, hierarchical stellar age–vertical action relations, first employing asteroseismic ages for red giant branch stars, then isochrone ages for main-sequence turn-off stars. We describe a framework called "zoomies" based on this calibration, by which we can infer ages for any star given its vertical action. This tool is open-source and intended for community use. We compare dynamical age estimates from "zoomies" with ages derived from other techniques for a sample of open clusters and main-sequence stars with asteroseismic age measurements. We also compare dynamical age estimates for stellar samples from the Kepler, K2, and TESS exoplanet transit surveys. While dynamical age relations are associated with large uncertainty, they are generally mass-independent and depend on homogeneously measured astrometric data. These age predictions are uniquely useful for large-scale demographic investigations, especially in disentangling the relationship between planet occurrence, metallicity, and age for low-mass stars.
Submitted 14 March, 2024;
originally announced March 2024.
-
Stellar Mergers or Truly Young? Intermediate-Age Stars on Highly-Radial Orbits in the Milky Way's Stellar Halo
Authors:
Danny Horta,
Yuxi Lu,
Melissa K. Ness,
Mariangela Lisanti,
Adrian M. Price-Whelan
Abstract:
Reconstructing the mass assembly history of the Milky Way relies on obtaining detailed measurements of the properties of many stars in the Galaxy, especially in the stellar halo. One of the most constraining quantities is stellar age, as it can shed light on the accretion time and quenching of star formation in merging satellites. However, obtaining reliable age estimates for large samples of halo stars is difficult. We report published ages of 120 subgiant halo stars with highly-radial orbits that likely belong to the debris of the Gaia-Enceladus/Sausage (GES) galaxy. The majority of these halo stars are old, with an age distribution characterized by a median of 11.6 Gyr and 16th (84th) percentile of 10.5 (12.7) Gyr. However, the distribution is skewed, with a tail of younger stars that span ages down to ~6–9 Gyr. All highly-radial halo stars have chemical and kinematic/orbital quantities that associate them with the GES debris. Initial results suggest that these intermediate-age stars are not a product of mass transfer and/or stellar mergers, which can bias their age determination low. If this conclusion is upheld by upcoming spectro-photometric studies, then the presence of these stars will pose an important challenge for constraining the properties of the GES merger and the accretion history of the Galaxy.
Submitted 10 June, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
Specification Overfitting in Artificial Intelligence
Authors:
Benjamin Roth,
Pedro Henrique Luz de Araujo,
Yuxi Xia,
Saskia Kaltenbrunner,
Christoph Korab
Abstract:
Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle with containing this technology's potential negative side effects. High-level requirements such as fairness and robustness need to be formalized into concrete specification metrics, imperfect proxies that capture isolated aspects of the underlying requirements. Given possible trade-offs between different metrics and their vulnerability to over-optimization, integrating specification metrics in system development processes is not trivial. This paper defines specification overfitting, a scenario where systems focus excessively on specified metrics to the detriment of high-level requirements and task performance. We present an extensive literature survey to categorize how researchers propose, measure, and optimize specification metrics in several AI fields (e.g., natural language processing, computer vision, reinforcement learning). Using a keyword-based search on papers from major AI conferences and journals between 2018 and mid-2023, we identify and analyze 74 papers that propose or optimize specification metrics. We find that although most papers implicitly address specification overfitting (e.g., by reporting more than one specification metric), they rarely discuss which role specification metrics should play in system development or explicitly define the scope and assumptions behind metric formulations.
Submitted 13 March, 2024;
originally announced March 2024.
-
Topological Protection of Optical Skyrmions through Complex Media
Authors:
An Aloysius Wang,
Zimo Zhao,
Yifei Ma,
Yuxi Cai,
Tade Marozsak,
Binguo Chen,
Honghui He,
Lin Luo,
Martin J Booth,
Steve J Elston,
Stephen M Morris,
Chao He
Abstract:
Recent experimental realizations of optical Skyrmions through the techniques of structured light have opened the doors to a completely new way of representing data in electromagnetic fields, namely its topology. Apart from potentially enhancing the bandwidth of optical systems, the intrinsically discrete nature of the topological number allows Skyrmions to naturally interface with the digital world. However, investigations into the topological protection of optical Skyrmions through various media remain limited to date. Here, we rigorously define the optical Skyrmion and establish a framework that can be used to analyze the effects of complex media on the topology of Skyrmion fields. Using this framework, we provide simple criteria for spatially varying retarders, diattenuators, depolarizers, and combinations of the former under which topological protection is guaranteed. We then present experimental results validating the robustness of the Skyrmion number against corrupting media and discuss ways of extending the optical Skyrmion to more general settings. We believe that the work presented in this paper provides a theoretical underpinning for the use of Skyrmions in practical applications ranging from optical communications to photonic computing.
Submitted 12 March, 2024;
originally announced March 2024.