Search | arXiv e-print repository

Autonomous Shuttle Operation for Vulnerable Populations: Lessons and Experiences

Authors: Ren Zhong, Zhaofeng Tian, Jinghui Liao, Weisong Shi

Abstract: The increasing shortage of drivers poses a significant threat to vulnerable populations, particularly seniors and disabled individuals who heavily depend on public transportation for accessing healthcare services and social events. Autonomous Vehicles (AVs) emerge as a promising alternative, offering potential improvements in accessibility and independence for these groups. However, current designs and studies often overlook the unique needs and experiences of these populations, leading to potential accessibility barriers. This paper presents a detailed case study of an autonomous shuttle test specifically tailored for seniors and disabled individuals, conducted during the early stages of the COVID-19 pandemic. The service, which lasted 13 weeks, catered to approximately 1500 passengers in an urban setting, aiming to facilitate access to essential services. Drawing from the safety operator's experiences and direct observations, we identify critical user experience and safety challenges faced by vulnerable passengers. Based on our findings, we propose targeted initiatives to enhance the safety, accessibility, and user education of AV technology for seniors and disabled individuals. These include increasing educational opportunities to familiarize these groups with AV technology, designing AVs with a focus on diversity and inclusion, and improving training programs for AV operators to address the unique needs of vulnerable populations. Through these initiatives, we aim to bridge the gap in AV accessibility and ensure that these technologies benefit all members of society.

Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.16864 [pdf, other]

doi 10.1109/LWC.2024.3355397

Joint Resource Allocation and Trajectory Design for Resilient Multi-UAV Communication Networks

Authors: Linghui Ge, Xiao Liang, Hua Zhang, Peihao Dong, Jianxin Liao, Jingyu Wang

Abstract: In contrast to terrestrial wireless networks, dynamic Unmanned Aerial Vehicle (UAV) networks are susceptible to unexpected link failures arising from UAV breakdowns or the depletion of their batteries. Drastic user rate fluctuations and sum rate drops can occur due to the unexpected UAV link failures. Previous research has focused primarily on re-establishing these links to maintain service continuity, while neglecting overall system performance, including sum rate and user rate fluctuations. This letter proposes a resilient UAV network design utilizing modern portfolio theory (MPT), which jointly optimizes the bandwidth allocation, UAV-user association, and UAV trajectories to enhance the overall service stability. Specifically, the design incorporates a novel utility function based on MPT to achieve a better balance between the sum rate and user rate fluctuations. To solve the joint optimization problem, we propose an iterative algorithm based on alternating optimization (AO) and successive convex approximation (SCA). Simulation results show that our scheme outperforms the other two baselines in terms of sum rate and user rate fluctuations. Furthermore, the resilience requirement in terms of sum rate, user rate fluctuations and user fairness can be achieved by flexibly tuning the weight factor in our proposed algorithm.

Submitted 20 January, 2024; originally announced February 2024.

Journal ref: IEEE Wireless Communications Letters, 2024

arXiv:2402.16379 [pdf, other]

TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement

Authors: Zhaopeng Feng, Yan Zhang, Hao Li, Bei Wu, Jiayu Liao, Wenqiang Liu, Jun Lang, Yang Feng, Jian Wu, Zuozhu Liu

Abstract: Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT). However, careful evaluations by humans reveal that the translations produced by LLMs still contain multiple errors. Importantly, feeding back such error information into the LLMs can lead to self-refinement and result in improved translation performance. Motivated by these insights, we introduce a systematic LLM-based self-refinement translation framework, named TEaR, which stands for Translate, Estimate, and Refine, marking a significant step forward in this direction. Our findings demonstrate that 1) our self-refinement framework successfully assists LLMs in improving their translation quality across a wide range of languages, whether it's from high-resource languages to low-resource ones or whether it's English-centric or centered around other languages; 2) TEaR exhibits superior systematicity and interpretability; 3) different estimation strategies yield varied impacts, directly affecting the effectiveness of the final corrections. Additionally, traditional neural translation models and evaluation models operate separately, often focusing on singular tasks due to their limited capabilities, while general-purpose LLMs possess the capability to undertake both tasks simultaneously. We further conduct cross-model correction experiments to investigate the potential relationship between the translation and evaluation capabilities of general-purpose LLMs. Our code and data are available at https://github.com/fzp0424/self_correct_mt

Submitted 21 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: Our code and data are available at https://github.com/fzp0424/self_correct_mt

arXiv:2402.15673 [pdf, other]

Studying exotic hadrons in high energy nuclear collisions

Authors: Xingyu Guo, Jinfeng Liao, Hongxi Xing

Abstract: Studies of exotic hadrons such as the $χ_{c1}(3872)$ state provide crucial insights into the fundamental force governing the strong interaction dynamics, with an emerging new frontier to investigate their production in high energy nuclear collisions where a partonic medium is present. This contribution discusses the production mechanisms of exotic hadrons in such collisions and analyzes novel effects from the partonic medium, demonstrating the potential to use heavy ion measurements for deciphering their internal structure and understanding their in-medium evolutions.

Submitted 23 February, 2024; originally announced February 2024.

Comments: 4 pages, 4 figures, contribution to the proceedings of Quark Matter 2023. arXiv admin note: text overlap with arXiv:2302.03828

arXiv:2402.12657 [pdf, ps, other]

Coded Backscattering Communication with LTE Pilots as Ambient Signal

Authors: Jingyi Liao, Kalle Ruttik, Riku Jantti, Phan-Huy Dinh-Thuy

Abstract: The 3GPP has recently conducted a study on the Ambient Internet of Things (AIoT), with a particular emphasis on examining backscatter communications as one of the primary techniques under consideration. Previous investigations into Ambient Backscatter Communications (AmBC) within the long term evolution (LTE) downlink have shown that it is feasible to utilize the user equipment channel estimator as a receiver for demodulating frequency shift keyed (FSK) messages transmitted by the backscatter devices. In practical deployment scenarios, the backscattered link often experiences a low signal-to-noise ratio, leading to subpar bit error rate (BER) performance in the case of uncoded transmissions. In this paper, we propose the adoption of the same convolutional coding methodology for backscatter links that is already employed for LTE downlink control signals. This approach facilitates the reuse of identical demodulation functions at the modem for both control signals and backscattered AIoT messages. To assess the performance of the proposed scheme, we conducted experiments utilizing real LTE downlink signals generated by a mobile operator within an office environment. When compared to uncoded FSK, convolutional channel coding delivers a notable gain of approximately 6 dB at a BER of $10^{-3}$. Consequently, the AmBC system demonstrates a high level of reliability, achieving a BER of $10^{-3}$ at a Signal-to-Noise Ratio (SNR) of 5 dB.

Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.11629 [pdf, ps, other]

Criteria for nilpotency of fusion systems

Authors: Jie Jian, Jun Liao, Heguo Liu

Abstract: Let $p$ be an odd prime and let $\mathcal{F}$ be a fusion system over a finite $p$-group $P$. A fusion system $\mathcal{F}$ is said to be nilpotent if $\mathcal{F}=\mathcal{F}_{P}(P)$. In this paper we provide new criteria for saturated fusion systems $\mathcal{F}$ to be nilpotent, which can be viewed as an extension of the $p$-nilpotency theorem of Glauberman and Thompson for fusion systems attributed to Kessar and Linckelmann.

Submitted 18 February, 2024; originally announced February 2024.

MSC Class: 20F19; 20J15

arXiv:2402.06700 [pdf, other]

Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement

Authors: Muning Wen, Junwei Liao, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen

Abstract: Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks. Traditional approaches often depend on meticulously designed prompts, high-quality examples, or additional reward models for in-context learning, supervised fine-tuning, or RLHF. Reinforcement learning (RL) presents a dynamic alternative for LLMs to overcome these dependencies by engaging directly with task-specific environments. Nonetheless, it faces significant hurdles: 1) instability stemming from the exponentially vast action space requiring exploration; 2) challenges in assigning token-level credit based on action-level reward signals, resulting in discord between maximizing rewards and accurately modeling corpus data. In response to these challenges, we introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level. At the heart of ETPO is our novel per-token soft Bellman update, designed to harmonize the RL process with the principles of language modeling. This methodology decomposes the Q-function update from a coarse action-level view to a more granular token-level perspective, backed by theoretical proof of optimization consistency. Crucially, this decomposition renders linear time complexity in action exploration. We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks; results underline ETPO's potential as a robust method for refining the interactive decision-making capabilities of language agents. For a more detailed preliminary work describing our motivation for token-level decomposition and applying it in PPO methods, please refer to arXiv:2405.15821.

Submitted 6 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

arXiv:2402.03162 [pdf, other]

doi 10.1145/3641519.3657481

Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion

Authors: Shiyuan Yang, Liang Hou, Haibin Huang, Chongyang Ma, Pengfei Wan, Di Zhang, Xiaodong Chen, Jing Liao

Abstract: Recent text-to-video diffusion models have achieved impressive progress. In practice, users often desire the ability to control object motion and camera movement independently for customized video creation. However, current methods lack the focus on separately controlling object motion and camera movement in a decoupled manner, which limits the controllability and flexibility of text-to-video models. In this paper, we introduce Direct-a-Video, a system that allows users to independently specify motions for multiple objects as well as the camera's pan and zoom movements, as if directing a video. We propose a simple yet effective strategy for the decoupled control of object motion and camera movement. Object motion is controlled through spatial cross-attention modulation using the model's inherent priors, requiring no additional optimization. For camera movement, we introduce new temporal cross-attention layers to interpret quantitative camera movement parameters. We further employ an augmentation-based approach to train these layers in a self-supervised manner on a small-scale dataset, eliminating the need for explicit motion annotation. Both components operate independently, allowing individual or combined control, and can generalize to open-domain scenarios. Extensive experiments demonstrate the superiority and effectiveness of our method. Project page and code are available at https://direct-a-video.github.io/.

Submitted 6 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.03025 [pdf, other]

Understanding and Guiding Weakly Supervised Entity Alignment with Potential Isomorphism Propagation

Authors: Yuanyi Wang, Wei Tang, Haifeng Sun, Zirui Zhuang, Xiaoyuan Fu, Jingyu Wang, Qi Qi, Jianxin Liao

Abstract: Weakly Supervised Entity Alignment (EA) is the task of identifying equivalent entities across diverse knowledge graphs (KGs) using only a limited number of seed alignments. Despite substantial advances in aggregation-based weakly supervised EA, the underlying mechanisms in this setting remain unexplored. In this paper, we present a propagation perspective to analyze weakly supervised EA and explain the existing aggregation-based EA models. Our theoretical analysis reveals that these models essentially seek propagation operators for pairwise entity similarities. We further prove that, despite the structural heterogeneity of different KGs, the potentially aligned entities within aggregation-based EA models have isomorphic subgraphs, which is the core premise of EA but has not been investigated. Leveraging this insight, we introduce a potential isomorphism propagation operator to enhance the propagation of neighborhood information across KGs. We develop a general EA framework, PipEA, incorporating this operator to improve the accuracy of every type of aggregation-based model without altering the learning process. Extensive experiments substantiate our theoretical findings and demonstrate PipEA's significant performance gains over state-of-the-art weakly supervised EA methods. Our work not only advances the field but also enhances our comprehension of aggregation-based weakly supervised EA.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2401.17859 [pdf, other]

Towards Semantic Consistency: Dirichlet Energy Driven Robust Multi-Modal Entity Alignment

Authors: Yuanyi Wang, Haifeng Sun, Jiabo Wang, Jingyu Wang, Wei Tang, Qi Qi, Shaoling Sun, Jianxin Liao

Abstract: In Multi-Modal Knowledge Graphs (MMKGs), Multi-Modal Entity Alignment (MMEA) is crucial for identifying identical entities across diverse modal attributes. However, semantic inconsistency, mainly due to missing modal attributes, poses a significant challenge. Traditional approaches rely on attribute interpolation, but this often introduces modality noise, distorting the original semantics. Moreover, the lack of a universal theoretical framework limits advancements in achieving semantic consistency. This study introduces a novel approach, DESAlign, which addresses these issues by applying a theoretical framework based on Dirichlet energy to ensure semantic consistency. We discover that semantic inconsistency leads to model overfitting to modality noise, causing performance fluctuations, particularly when modalities are missing. DESAlign innovatively combats over-smoothing and interpolates absent semantics using existing modalities. Our approach includes a multi-modal knowledge graph learning strategy and a propagation technique that employs existing semantic features to compensate for missing ones, providing explicit Euler solutions. Comprehensive evaluations across 60 benchmark splits, including monolingual and bilingual scenarios, demonstrate that DESAlign surpasses existing methods, setting a new standard in performance. Further testing with high rates of missing modalities confirms its robustness, offering an effective solution to semantic inconsistency in real-world MMKGs.

Submitted 19 March, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

Comments: arXiv admin note: text overlap with arXiv:2307.16210 by other authors

arXiv:2401.17807 [pdf, other]

Advances in 3D Generation: A Survey

Authors: Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan

Abstract: Generating 3D models lies at the core of computer graphics and has been the focus of decades of research. With the emergence of advanced neural representations and generative models, the field of 3D content generation is developing rapidly, enabling the creation of increasingly high-quality and diverse 3D models. The rapid growth of this field makes it difficult to stay abreast of all recent developments. In this survey, we aim to introduce the fundamental methodologies of 3D generation methods and establish a structured roadmap, encompassing 3D representation, generation methods, datasets, and corresponding applications. Specifically, we introduce the 3D representations that serve as the backbone for 3D generation. Furthermore, we provide a comprehensive overview of the rapidly growing literature on generation methods, categorized by the type of algorithmic paradigms, including feedforward generation, optimization-based generation, procedural generation, and generative novel view synthesis. Lastly, we discuss available datasets, applications, and open challenges. We hope this survey will help readers explore this exciting topic and foster further advancements in the field of 3D content generation.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 33 pages, 12 figures

arXiv:2401.16995 [pdf, other]

Compton Scattering on 4He with Nuclear One- and Two-Body Densities

Authors: Harald W. Griesshammer, Junjie Liao, Judith A. McGovern, Andreas Nogga, Daniel R. Phillips

Abstract: We present the first ab initio calculation of elastic Compton scattering from 4He. It is carried out to $\mathcal{O}(e^2 δ^3)$ [N3LO] in the $δ$ expansion of $χ$EFT. At this order and for this target, the only free parameters are the scalar-isoscalar electric and magnetic dipole polarisabilities of the nucleon. Adopting current values for these yields a parameter-free prediction. This compares favourably with the world data from HI$γ$S, Illinois and Lund for photon energies $50\;\mathrm{MeV} \lesssim ω \lesssim 120\;\mathrm{MeV}$ within our theoretical uncertainties of $\pm10\%$. We predict a cross section up to 7 times that for deuterium. As in 3He, this emphasises and tests the key role of meson-exchange currents between np pairs in Compton scattering on light nuclei. We assess the sensitivity of the cross section and beam asymmetry to the nucleon polarisabilities, providing clear guidance to future experiments seeking to further constrain them. The calculation becomes tractable by use of the Transition Density Method. The one- and two-body densities generated from 5 chiral potentials and the AV18+UIX potential are available using the python package provided at https://pypi.org/project/nucdens/.

Submitted 20 June, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: 38 pages LaTeX2e (pdflatex) including 13 figures as 14 .pdf files using includegraphics. Minor grammatical/typographical changes, figures 3, 4 and 7 typographically corrected without changes of substance. Text- and figure-identical to published version

arXiv:2401.15673 [pdf, other]

Using the Transition-Density Formalism in the First Computation of 4He Compton Scattering

Authors: Harald W. Griesshammer, Junjie Liao, Judith A. McGovern, Andreas Nogga, Daniel R. Phillips

Abstract: The method and results of the first theory description of 4He Compton scattering at nuclear energies are presented, with a focus on figures. An upcoming publication [1] contains details and a comprehensive list of references.

Submitted 28 January, 2024; originally announced January 2024.

Comments: 6 pages LaTeX2e (pdflatex) including 4 figures as 6 .pdf files using includegraphics and webofc class. Contribution to the Proceedings of the 16th International Conference on Meson-Nucleon Physics and the Structure of the Nucleon (MENU 2023), Mainz 16-20 October 2023

arXiv:2401.12798 [pdf, other]

Gradient Flow of Energy: A General and Efficient Approach for Entity Alignment Decoding

Authors: Yuanyi Wang, Haifeng Sun, Jingyu Wang, Qi Qi, Shaoling Sun, Jianxin Liao

Abstract: Entity alignment (EA), a pivotal process in integrating multi-source Knowledge Graphs (KGs), seeks to identify equivalent entity pairs across these graphs. Most existing approaches regard EA as a graph representation learning task, concentrating on enhancing graph encoders. However, the decoding process in EA, essential for effective operation and alignment accuracy, has received limited attention and remains tailored to specific datasets and model architectures, necessitating both entity and additional explicit relation embeddings. This specificity limits its applicability, particularly in GNN-based models. To address this gap, we introduce a novel, generalized, and efficient decoding approach for EA, relying solely on entity embeddings. Our method optimizes the decoding process by minimizing Dirichlet energy, leading to the gradient flow within the graph, to maximize graph homophily. The discretization of the gradient flow produces a fast and scalable approach, termed Triple Feature Propagation (TFP). TFP innovatively generalizes adjacency matrices to multi-view matrices: entity-to-entity, entity-to-relation, relation-to-entity, and relation-to-triple. The gradient flow through generalized matrices enables TFP to harness the multi-view structural information of KGs. Rigorous experimentation on diverse public datasets demonstrates that our approach significantly enhances various EA methods. Notably, the approach achieves these advancements with less than 6 seconds of additional computational time, establishing a new benchmark in efficiency and adaptability for future EA methods.

Submitted 17 April, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.02143 [pdf, other]

Graph Neural Networks for Tabular Data Learning: A Survey with Taxonomy and Directions

Authors: Cheng-Te Li, Yu-Che Tsai, Chih-Yao Chen, Jay Chiehen Liao

Abstract: In this survey, we dive into Tabular Data Learning (TDL) using Graph Neural Networks (GNNs), a domain where deep learning-based approaches have increasingly shown superior performance in both classification and regression tasks compared to traditional methods. The survey highlights a critical gap in deep neural TDL methods: the underrepresentation of latent correlations among data instances and feature values. GNNs, with their innate capability to model intricate relationships and interactions between diverse elements of tabular data, have garnered significant interest and application across various TDL domains. Our survey provides a systematic review of the methods involved in designing and implementing GNNs for TDL (GNN4TDL). It encompasses a detailed investigation into the foundational aspects and an overview of GNN-based TDL methods, offering insights into their evolving landscape. We present a comprehensive taxonomy focused on constructing graph structures and representation learning within GNN-based TDL methods. In addition, the survey examines various training plans, emphasizing the integration of auxiliary tasks to enhance the effectiveness of instance representations. A critical part of our discussion is dedicated to the practical application of GNNs across a spectrum of GNN4TDL scenarios, demonstrating their versatility and impact. Lastly, we discuss the limitations and propose future research directions, aiming to spur advancements in GNN4TDL. This survey serves as a resource for researchers and practitioners, offering a thorough understanding of GNNs' role in revolutionizing TDL and pointing towards future innovations in this promising area.

Submitted 4 January, 2024; originally announced January 2024.

Comments: Under review, ongoing work, Github page: https://github.com/Roytsai27/awesome-GNN4TDL

arXiv:2401.02017 [pdf, other]

The origin of High-velocity stars considering the impact of the Large Magellanic Cloud

Authors: Jiwei Liao, Cuihua Du, Mingji Deng, Dashuang Ye, Hefan Li, Yang Huang, Jianrong Shi, Jun Ma

Abstract: Utilizing astrometric parameters sourced from Gaia Data Release 3 and radial velocities obtained from various spectroscopic surveys, we identify 519 high-velocity stars (HiVels) with a total velocity in the Galactocentric rest frame greater than 70% of their local escape velocity under the Gala MilkyWayPotential. Our analysis reveals that the majority of these HiVels are metal-poor late-type giants, and we show 9 HiVels that are unbound candidates to the Galaxy with escape probabilities of 50%. To investigate the origins of these HiVels, we classify them into four categories and consider the impact of the Large Magellanic Cloud (LMC) potential on their backward-integration trajectories. Specifically, we find that one of the HiVels can track back to the Galactic Center, and three HiVels may originate from the Sagittarius dwarf spheroidal galaxy (Sgr dSph). Furthermore, some HiVels appear to be ejected from the Galactic disk, while others formed within the Milky Way or have an extragalactic origin. Given that the LMC has a significant impact on the orbits of Sgr dSph, we examine the reported HiVels that originate from the Sgr dSph, with a few of them passing within the half-light radius of the Sgr dSph.

Submitted 3 January, 2024; originally announced January 2024.

Comments: 17 pages, 5 figures, accepted for publication in AJ

arXiv:2401.01491

A Hybrid Neural Network Model For Predicting The Nitrate Concentration In The Recirculating Aquaculture System

Authors: Xiangyu Fan, Jiaxin Lia, Yingzhe Wang, Yingsha Qu, Hao Li, Keming Qu, Zhengguo Cui

Abstract: This study was groundbreaking in its application of neural network models for nitrate management in the Recirculating Aquaculture System (RAS). A hybrid neural network model was proposed, which accurately predicted daily nitrate concentration and its trends using six water quality parameters. We conducted a 105-day aquaculture experiment, during which we collected 450 samples from five sets of RAS to train our model (C-L-A model) which incorporates Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and self-Attention. Furthermore, we obtained 90 samples from a standalone RAS as the testing data to evaluate the performance of the model in practical applications. The experimental results proved that the C-L-A model accurately predicted nitrate concentration in RAS and maintained good performance even with a reduced proportion of training data. We recommend using water quality parameters from the past 7 days to forecast future nitrate concentration, as this timeframe allows the model to achieve maximum generalization capability. Additionally, we compared the performance of the C-L-A model with three basic neural network models (CNN, LSTM, self-Attention) as well as three hybrid neural network models (CNN-LSTM, CNN-Attention, LSTM-Attention). The results demonstrated that the C-L-A model (R2=0.956) significantly outperformed the other neural network models (R2=0.901-0.927). Our study suggests that the utilization of neural network models, specifically the C-L-A model, could potentially assist the RAS industry in conserving resources for daily nitrate monitoring.

Submitted 15 January, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

Comments: The content of this paper needs to be further filled and improved

arXiv:2312.15961 [pdf, other]

Observation of Magnon Damping Minimum Induced by Kondo Coupling in a van der Waals Ferromagnet Fe$_{3-x}$GeTe$_{2}$

Authors: Song Bao, Junsen Wang, Shin-ichiro Yano, Yanyan Shangguan, Zhentao Huang, Junbo Liao, Wei Wang, Yuan Gao, Bo Zhang, Shufan Cheng, Hao Xu, Zhao-Yang Dong, Shun-Li Yu, Wei Li, Jian-Xin Li, Jinsheng Wen

Abstract: In heavy-fermion systems with $f$ electrons, there is an intricate interplay between Kondo screening and magnetic correlations, which can give rise to various exotic phases. Recently, similar interplay appears to also occur in $d$-electron systems, but the underlying mechanism remains elusive. Here, using inelastic neutron scattering, we investigate the temperature evolution of the low-energy spin waves in a metallic van der Waals ferromagnet Fe$_{3-x}$GeTe$_{2}$ (Curie temperature $T_{\rm C}\sim160$ K), where the Kondo-lattice behavior emerges in the ferromagnetic phase below a characteristic temperature $T^*\sim90$ K. We observe that the magnon damping constant diverges at both low and high temperatures, exhibiting a minimum coincidentally around $T^*$. Such an observation is analogous to the resistivity minimum due to the single-impurity Kondo effect. This unusual behavior is described by a formula that combines logarithmic and power terms, representing the dominant contributions from Kondo screening and thermal fluctuations, respectively. Furthermore, we find that the magnon damping increases with momentum below $T_{\rm C}$. These findings can be explained by considering spin-flip electron-magnon scattering, which serves as a magnonic analog of the Kondo-impurity scattering, and thus provides a measure of the Kondo coupling through magnons. Our results provide critical insights into how Kondo coupling manifests itself in a system with magnetic ordering and shed light on the coexistence of and interplay between magnetic order and Kondo effect in itinerant 3$d$-electron systems.

Submitted 26 December, 2023; originally announced December 2023.

Comments: 8 pages, 4 figures

arXiv:2312.15943 [pdf, other]

doi 10.1038/s41467-023-41791-9

Direct observation of topological magnon polarons in a multiferroic material

Authors: Song Bao, Zhao-Long Gu, Yanyan Shangguan, Zhentao Huang, Junbo Liao, Xiaoxue Zhao, Bo Zhang, Zhao-Yang Dong, Wei Wang, Ryoichi Kajimoto, Mitsutaka Nakamura, Tom Fennell, Shun-Li Yu, Jian-Xin Li, Jinsheng Wen

Abstract: Magnon polarons are novel elementary excitations possessing hybrid magnonic and phononic signatures, and are responsible for many exotic spintronic and magnonic phenomena. Despite long-term sustained experimental efforts in chasing for magnon polarons, direct spectroscopic evidence of their existence is hardly observed. Here, we report the direct observation of magnon polarons using neutron spectroscopy on a multiferroic Fe$_{2}$Mo$_{3}$O$_{8}$ possessing strong magnon-phonon coupling. Specifically, below the magnetic ordering temperature, a gap opens at the nominal intersection of the original magnon and phonon bands, leading to two separated magnon-polaron bands. Each of the bands undergoes mixing, interconverting and reversing between its magnonic and phononic components. We attribute the formation of magnon polarons to the strong magnon-phonon coupling induced by Dzyaloshinskii-Moriya interaction. Intriguingly, we find that the band-inverted magnon polarons are topologically nontrivial. These results uncover exotic elementary excitations arising from the magnon-phonon coupling, and offer a new route to topological states by considering hybridizations between different types of fundamental excitations.

Submitted 26 December, 2023; originally announced December 2023.

Comments: 11 pages, 5 figures, published in Nature Communications

Journal ref: Nature Communications 14, 6093 (2023)

arXiv:2312.15932 [pdf, other]

doi 10.1038/s41567-023-02212-2

Observation of a 1/3 Magnetisation Plateau Phase as Evidence for the Kitaev Interaction in a Honeycomb-Lattice Antiferromagnet

Authors: Yanyan Shangguan, Song Bao, Zhao-Yang Dong, Ning Xi, Yi-Peng Gao, Zhen Ma, Wei Wang, Zhongyuan Qi, Shuai Zhang, Zhentao Huang, Junbo Liao, Xiaoxue Zhao, Bo Zhang, Shufan Cheng, Hao Xu, Dehong Yu, Richard A. Mole, Naoki Murai, Seiko Ohira-Kawamura, Lunhua He, Jiazheng Hao, Qing-Bo Yan, Fengqi Song, Wei Li, Shun-Li Yu, et al. (2 additional authors not shown)

Abstract: Fractional magnetisation plateaus, in which the magnetisation is pinned at a fraction of its saturated value within a range of external magnetic field, are spectacular macroscopic manifestations of the collective quantum behaviours. One prominent example of the plateau phase is found in spin-1/2 triangular-lattice antiferromagnets featuring strong geometrical frustration, and is often interpreted as quantum-fluctuation-stabilised state in magnetic field via the "order-by-disorder" mechanism. Here, we observe an unprecedented 1/3 magnetisation plateau between 5.2 and 7.4 T at 2 K in a spin-1 antiferromagnet Na$_3$Ni$_2$BiO$_6$ with a honeycomb lattice, where conventionally no geometrical frustration is anticipated. By carrying out elastic neutron scattering measurements, we propose the spin structure of the plateau phase to be an unusual partial spin-flop ferrimagnetic order, transitioning from the zigzag antiferromagnetic order in zero field. Our theoretical calculations show that the plateau phase is stabilised by the bond-anisotropic Kitaev interaction. These results provide a new paradigm for the exploration of rich quantum phases in frustrated magnets and exotic Kitaev physics in high-spin systems.

Submitted 26 December, 2023; originally announced December 2023.

Comments: Submitted version, 10 pages, 5 figures. Final version has been published in Nature Physics

Journal ref: Nature Physics 19, 1883-1889 (2023)

arXiv:2312.15898 [pdf, ps, other]

doi 10.1103/PhysRevA.109.053521

Simultaneous ground-state cooling of two levitated nanoparticles by coherent scattering

Authors: Yi Xu, Yu-Hong Liu, Cheng Liu, Jie-Qiao Liao

Abstract: Simultaneous ground-state cooling of two levitated nanoparticles is a crucial prerequisite for investigation of macroscopic quantum effects such as quantum entanglement and quantum correlation involving translational motion of particles. Here we consider a coupled cavity-levitated-particle system and present a detailed derivation of its Hamiltonian. We find that the $y$-direction motions of the two particles are decoupled from the cavity field and both the $x$- and $z$-direction motions, and that the $z$-direction motions can be further decoupled from the cavity field and the $x$-direction motions by choosing proper locations of the particles. We study the simultaneous cooling of these mechanical modes in both the three-mode and five-mode cavity-levitated optomechanical models. It is found that there exists the dark-mode effect when the two tweezers have the same powers, which suppresses the simultaneous ground-state cooling. Nevertheless, the simultaneous ground-state cooling of these modes can be realized by breaking the dark-mode effect under proper parameters. Our system provides a versatile platform to study quantum effects and applications in cavity-levitated optomechanical systems.

Submitted 19 May, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

Comments: 18 pages, 8 figures

Journal ref: Phys. Rev. A 109, 053521 (2024)

arXiv:2312.14389 [pdf, other]

StyleRetoucher: Generalized Portrait Image Retouching with GAN Priors

Authors: Wanchao Su, Can Wang, Chen Liu, Hangzhou Han, Hongbo Fu, Jing Liao

Abstract: Creating fine-retouched portrait images is tedious and time-consuming even for professional artists. There exist automatic retouching methods, but they either suffer from over-smoothing artifacts or lack generalization ability. To address such issues, we present StyleRetoucher, a novel automatic portrait image retouching framework, leveraging StyleGAN's generation and generalization ability to improve an input portrait image's skin condition while preserving its facial details. Harnessing the priors of pretrained StyleGAN, our method shows superior robustness: (a) performing stably with fewer training samples and (b) generalizing well on the out-domain data. Moreover, by blending the spatial features of the input image and intermediate features of the StyleGAN layers, our method preserves the input characteristics to the largest extent. We further propose a novel blemish-aware feature selection mechanism to effectively identify and remove the skin blemishes, improving the image skin condition. Qualitative and quantitative evaluations validate the great generalization capability of our method. Further experiments show StyleRetoucher's superior performance to the alternative solutions in the image retouching task. We also conduct a user perceptive study to confirm the superior retouching performance of our method over the existing state-of-the-art alternatives.

Submitted 21 December, 2023; originally announced December 2023.

Comments: 13 pages, 15 figures

arXiv:2312.07539 [pdf, other]

HeadArtist: Text-conditioned 3D Head Generation with Self Score Distillation

Authors: Hongyu Liu, Xuan Wang, Ziyu Wan, Yujun Shen, Yibing Song, Jing Liao, Qifeng Chen

Abstract: This work presents HeadArtist for 3D head generation from text descriptions. With a landmark-guided ControlNet serving as the generative prior, we come up with an efficient pipeline that optimizes a parameterized 3D head model under the supervision of the prior distillation itself. We call such a process self score distillation (SSD). In detail, given a sampled camera pose, we first render an image and its corresponding landmarks from the head model, and add some particular level of noise onto the image. The noisy image, landmarks, and text condition are then fed into the frozen ControlNet twice for noise prediction. Two different classifier-free guidance (CFG) weights are applied during these two predictions, and the prediction difference offers a direction on how the rendered image can better match the text of interest. Experimental results suggest that our approach delivers high-quality 3D head sculptures with adequate geometry and photorealistic appearance, significantly outperforming state-of-the-art methods. We also show that the same pipeline well supports editing the generated heads, including both geometry deformation and appearance change.

Submitted 8 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: Amazing results are shown in https://kumapowerliu.github.io/HeadArtist. Accepted by SIGGRAPH 2024

arXiv:2312.06663 [pdf, other]

CAD: Photorealistic 3D Generation via Adversarial Distillation

Authors: Ziyu Wan, Despoina Paschalidou, Ian Huang, Hongyu Liu, Bokui Shen, Xiaoyu Xiang, Jing Liao, Leonidas Guibas

Abstract: The increased demand for 3D data in AR/VR, robotics and gaming applications gave rise to powerful generative pipelines capable of synthesizing high-quality 3D objects. Most of these models rely on the Score Distillation Sampling (SDS) algorithm to optimize a 3D representation such that the rendered image maintains a high likelihood as evaluated by a pre-trained diffusion model. However, finding a correct mode in the high-dimensional distribution produced by the diffusion model is challenging and often leads to issues such as over-saturation, over-smoothing, and Janus-like artifacts. In this paper, we propose a novel learning paradigm for 3D synthesis that utilizes pre-trained diffusion models. Instead of focusing on mode-seeking, our method directly models the distribution discrepancy between multi-view renderings and diffusion priors in an adversarial manner, which unlocks the generation of high-fidelity and photorealistic 3D content, conditioned on a single image and prompt. Moreover, by harnessing the latent space of GANs and expressive diffusion model priors, our method facilitates a wide variety of 3D applications including single-view reconstruction, high diversity generation and continuous 3D interpolation in the open domain. The experiments demonstrate the superiority of our pipeline compared to previous works in terms of generation quality and diversity.

Submitted 11 December, 2023; originally announced December 2023.

Comments: Project page: http://raywzy.com/CAD/

arXiv:2312.06274 [pdf, other]

Dark-Mode Theorems for Quantum Networks

Authors: Jian Huang, Cheng Liu, Xun-Wei Xu, Jie-Qiao Liao

Abstract: We propose and prove two theorems for determining the number of dark modes in linear two-component quantum networks composed of two types of bosonic modes. This is achieved by diagonalizing the two sub-networks of the same type of modes, mapping the networks to either a standard or a thick arrowhead matrix, and analyzing the linear dependence and independence between the column vectors associated with degenerate normal modes in the coupling matrix. We confirm the two theorems by checking the simultaneous ground-state cooling of the mechanical modes in linearized optomechanical networks. These results also work for linear fermionic networks and other networks described by quadratic coupled-mode Hamiltonian. The present method can be extended to study the dark-state effect in driven atom systems and to construct large decoherence-free subspaces for processing quantum information. This work will initiate the studies on dynamical, transport, and statistical properties of linear networks with decoupled subspaces.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 43 pages, 17 figures

arXiv:2312.02445 [pdf, other]

LLaRA: Large Language-Recommendation Assistant

Authors: Jiayi Liao, Sihang Li, Zhengyi Yang, Jiancan Wu, Yancheng Yuan, Xiang Wang, Xiangnan He

Abstract: Sequential recommendation aims to predict users' next interaction with items based on their past engagement sequence. Recently, the advent of Large Language Models (LLMs) has sparked interest in leveraging them for sequential recommendation, viewing it as language modeling. Previous studies represent items within LLMs' input prompts as either ID indices or textual metadata. However, these approaches often fail to either encapsulate comprehensive world knowledge or exhibit sufficient behavioral understanding. To combine the complementary strengths of conventional recommenders in capturing behavioral patterns of users and LLMs in encoding world knowledge about items, we introduce Large Language-Recommendation Assistant (LLaRA). Specifically, it uses a novel hybrid prompting method that integrates ID-based item embeddings learned by traditional recommendation models with textual item features. Treating the "sequential behaviors of users" as a distinct modality beyond texts, we employ a projector to align the traditional recommender's ID embeddings with the LLM's input space. Moreover, rather than directly exposing the hybrid prompt to LLMs, a curriculum learning strategy is adopted to gradually ramp up training complexity. Initially, we warm up the LLM using text-only prompts, which better suit its inherent language modeling ability. Subsequently, we progressively transition to the hybrid prompts, training the model to seamlessly incorporate the behavioral knowledge from the traditional sequential recommender into the LLM. Empirical results validate the effectiveness of our proposed framework. Codes are available at https://github.com/ljy0ustc/LLaRA.

Submitted 4 May, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: 11 pages, 5 figures

arXiv:2312.02157 [pdf, other]

Mesh-Guided Neural Implicit Field Editing

Authors: Can Wang, Mingming He, Menglei Chai, Dongdong Chen, Jing Liao

Abstract: Neural implicit fields have emerged as a powerful 3D representation for reconstructing and rendering photo-realistic views, yet they possess limited editability. Conversely, explicit 3D representations, such as polygonal meshes, offer ease of editing but may not be as suitable for rendering high-quality novel views. To harness the strengths of both representations, we propose a new approach that employs a mesh as a guiding mechanism in editing the neural radiance field. We first introduce a differentiable method using marching tetrahedra for polygonal mesh extraction from the neural implicit field and then design a differentiable color extractor to assign colors obtained from the volume renderings to this extracted mesh. This differentiable colored mesh allows gradient back-propagation from the explicit mesh to the implicit fields, empowering users to easily manipulate the geometry and color of neural implicit fields. To enhance user control from coarse-grained to fine-grained levels, we introduce an octree-based structure into its optimization. This structure prioritizes the edited regions and the surface part, making our method achieve fine-grained edits to the neural implicit field and accommodate various user modifications, including object additions, component removals, specific area deformations, and adjustments to local and global colors. Through extensive experiments involving diverse scenes and editing operations, we have demonstrated the capabilities and effectiveness of our method. Our project page is: https://cassiepython.github.io/MNeuEdit/

Submitted 4 December, 2023; originally announced December 2023.

Comments: Project page: https://cassiepython.github.io/MNeuEdit/

arXiv:2311.16961 [pdf, other]

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Authors: Jingbo Zhang, Xiaoyu Li, Qi Zhang, Yanpei Cao, Ying Shan, Jing Liao

Abstract: Generating a 3D human model from a single reference image is challenging because it requires inferring textures and geometries in invisible views while maintaining consistency with the reference image. Previous methods utilizing 3D generative models are limited by the availability of 3D training data. Optimization-based methods that lift text-to-image diffusion models to 3D generation often fail to preserve the texture details of the reference image, resulting in inconsistent appearances in different views. In this paper, we propose HumanRef, a 3D human generation framework from a single-view input. To ensure the generated 3D model is photorealistic and consistent with the input image, HumanRef introduces a novel method called reference-guided score distillation sampling (Ref-SDS), which effectively incorporates image guidance into the generation process. Furthermore, we introduce region-aware attention to Ref-SDS, ensuring accurate correspondence between different body regions. Experimental results demonstrate that HumanRef outperforms state-of-the-art methods in generating 3D clothed humans with fine geometry, photorealistic textures, and view-consistent appearances.

Submitted 28 November, 2023; originally announced November 2023.

Comments: Homepage: https://eckertzhang.github.io/HumanRef.github.io/

arXiv:2311.15872 [pdf, other]

Case study of the validity of truncation schemes of kinetic equations of motion: few magnetic impurities in a semiconductor quantum ring

Authors: J. M. Lia, P. I. Tamborenea

Abstract: We carry out a study on the validity and limitations of truncation schemes customarily employed to treat the quantum kinetic equations of motion of complex interacting systems. Our system of choice is a semiconductor quantum ring with one electron interacting with few magnetic impurities via a Kondo-like Hamiltonian. This system is an interesting prototype which displays the necessary complexity when suitably scaled (large number of magnetic impurities) but can also be solved exactly when few impurities are present. The complexity in this system comes from the indirect electron-mediated impurity-impurity interaction and is reflected in the Heisenberg equations of motion, which form an infinite hierarchy. For the cases of two and three magnetic impurities, we solve for the quantum dynamics of our system both exactly and following a truncation scheme developed for diluted magnetic semiconductors in the bulk. We find an excellent agreement between the two approaches when physical observables like the impurities' spin angular momentum are computed for times that well exceed the time window of validity of perturbation theory. On the other hand, we find that within time ranges of physical interest, the truncation scheme introduces negative populations, which represent a serious methodological drawback.

Submitted 27 November, 2023; originally announced November 2023.

Comments: 15 pages, 3 figures

arXiv:2311.11484 [pdf, other]

Tripartite quantum entanglement with squeezed optomechanics

Authors: Ya-Feng Jiao, Yun-Lan Zuo, Yan Wang, Wangjun Lu, Jie-Qiao Liao, Le-Man Kuang, Hui Jing

Abstract: The ability to engineer entangled states that involve macroscopic objects is of particular importance for a wide variety of quantum-enabled technologies, ranging from quantum information processing to quantum sensing. Here we propose how to achieve coherent manipulation and enhancement of quantum entanglement in a hybrid optomechanical system, which consists of a Fabry-Pérot cavity with two movable mirrors, an optical parametric amplifier (OPA), and an injected squeezed vacuum reservoir. We show that the advantages of this system are twofold: (i) one can effectively regulate the light-mirror interactions by introducing a squeezed intracavity mode via the OPA; (ii) when properly matching the squeezing parameters between the squeezed cavity mode and the injected squeezed vacuum reservoir, the optical input noises can be suppressed completely. These peculiar features of this system allow us to generate and manipulate quantum entanglement in a coherent and controllable way. More importantly, we also find that such controllable entanglement, under some specific squeezing parameters, can be considerably enhanced in comparison with those of the conventional optomechanical system. Our work, providing a promising method to regulate and tailor the light-mirror interaction, is poised to serve as a useful tool for engineering various quantum effects which are based on cavity optomechanics.

Submitted 19 November, 2023; originally announced November 2023.

Comments: 11 pages, 3 figures

arXiv:2311.10497 [pdf, other]

Photo-Detection Efficiency measurement for FBK HD Near-UV sensitive SiPMs at 10 K temperature

Authors: Meiyuenan Ma, Jiangfeng Zhou, Fengbo Gu, Junhui Liao, Yuanning Gao, Zhaohua Peng, Jian Zheng, Guangpeng An, Lifeng Zhang, Lei Zhang, Zhuo Liang, Xiuliang Zhao

Abstract: We report the characterization of the FBK ``NUV-HD-Cryo'' SiPMs at 10 K temperature. With 405 nm and 530 nm light, we measured the photo-detection efficiency (PDE) at bias voltages between 6 and 11 V overvoltage (OV). The PDE reaches $\sim$ 40\% for 405 nm and 530 nm light with a bias voltage of OV 9 V. A bias voltage higher than 9 V leads to a slightly greater PDE. We also measured the SiPMs' PDE at room temperature (RT). The results are consistent with the measurements on the similar model SiPMs by other groups. The I-V curve of the SiPMs differs significantly from the conventional one measured at RT. The dark current ratio (DCR) is tested to be $\sim$ 1 Hz for the 92 mm$^2$ SiPMs, or 0.01 Hz/mm$^2$, which is $\sim$ 7 orders lower than the ratio tested at RT. The SiPMs' performance at 10 K demonstrated that it could be equipped on a liquid helium detector as the photosensor to search for rare events, including but not limited to dark matter searches.

Submitted 29 December, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

arXiv:2311.10334 [pdf]

Mirrored Transformation Optics

Authors: Junke Liao, Pengfei Zhao, Zhinbing Zhang, Wen Xiao, Huanyang Chen

Abstract: A mirrored transformation optics (MTO) approach is presented to overcome the material mismatch in transformation optics. It makes good use of the reflection behavior and introduces a mirrored medium to offset the phase discontinuities. Using this approach, a high-performance planar focusing lens of transmission-type is designed, which has a larger concentration ratio than other focusing lenses obtained by the generalized Snell law. The MTO will not change any functionality of the original lens and has promising potential applications in imaging and light energy harvesting.

Submitted 22 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

arXiv:2311.06529 [pdf, ps, other]

doi 10.1007/s41605-023-00409-w

Insight-HXMT on-orbit thermal control status and thermal deformation impact analysis

Authors: Aimei Zhang, Yifan Zhang, Jinyuan Liao, Yupeng Xu, Yusa Wang, Wenbo Luo, Yupeng Zhou, Zhiying Qian, Xiaobo Li, Fangjun Lu, Shuangnan Zhang, Liming Song, Congzhan Liu, Fan Zhang, Jianyin Nie, Juan Wang, Sheng Yang, Tong Zhang, Xiaojing Liu, Ruijie Wang, Xufang Li, Yifei Zhang, Zhengwei Li, Xuefeng Lu, He Xu, et al. (1 additional author not shown)

Abstract: Purpose: The Hard X-ray Modulation Telescope is China's first X-ray astronomy satellite launched on June 15th, 2017, dubbed Insight-HXMT. Active and passive thermal control measures are employed to keep devices at suitable temperatures. In this paper, we analyzed the on-orbit thermal monitoring data of the first 5 years and investigated the effect of thermal deformation on the point spread function (PSF) of the telescopes. Methods: We examined the data of the on-orbit temperatures measured using 157 thermistors placed on the collimators, detectors and their support structures and compared the results with the thermal control requirements. The thermal deformation was evaluated by the relative orientation of the two star sensors installed on the main support structure. Its effect was estimated from the evolution of the PSF obtained with calibration scanning observations of the Crab nebula. Conclusion: The on-orbit temperatures met the thermal control requirements thus far, and the effect of thermal deformation on the PSF was negligible after the on-orbit pointing calibration.

Submitted 11 November, 2023; originally announced November 2023.

Comments: 25 pages, 35 figures, submitted

arXiv:2311.05415 [pdf, other]

EEG-DG: A Multi-Source Domain Generalization Framework for Motor Imagery EEG Classification

Authors: Xiao-Cong Zhong, Qisong Wang, Dan Liu, Zhihuang Chen, Jing-Xiao Liao, Jinwei Sun, Yudong Zhang, Feng-Lei Fan

Abstract: Motor imagery EEG classification plays a crucial role in non-invasive Brain-Computer Interface (BCI) research. However, the classification is affected by the non-stationarity and individual variations of EEG signals. Simply pooling EEG data with different statistical distributions to train a classification model can severely degrade the generalization performance. To address this issue, the existing methods primarily focus on domain adaptation, which requires access to the target data during training. This is unrealistic in many EEG application scenarios. In this paper, we propose a novel multi-source domain generalization framework called EEG-DG, which leverages multiple source domains with different statistical distributions to build generalizable models on unseen target EEG data. We optimize both the marginal and conditional distributions to ensure the stability of the joint distribution across source domains and extend it to a multi-source domain generalization framework to achieve domain-invariant feature representation, thereby alleviating calibration efforts. Systematic experiments on a simulative dataset and BCI competition datasets IV-2a and IV-2b demonstrate the superiority of our proposed EEG-DG over state-of-the-art methods. Specifically, EEG-DG achieves an average classification accuracy/kappa value of 81.79%/0.7572 and 87.12%/0.7424 on datasets IV-2a and IV-2b, respectively, which even outperforms some domain adaptation methods. Our code is available at https://github.com/XC-ZhongHIT/EEG-DG for free download and evaluation.

Submitted 9 November, 2023; originally announced November 2023.

arXiv:2311.05196 [pdf]

Solving Combinatorial Optimization Problems on Fujitsu Digital Annealer

Authors: Yu-Ting Kao, Jia-Le Liao, Hsiu-Chuan Hsu

Abstract: Combinatorial optimization problems are ubiquitous in various disciplines and applications. Many heuristic algorithms have been devoted to solving these types of problems. In order to increase the efficiency of finding the optimal solutions, application-specific hardware called the digital annealer (DA) has been developed for solving combinatorial optimization problems using quadratic unconstrained binary optimization (QUBO) formulations. In this study, we formulated the number partitioning problem and the graph partitioning problem into QUBO forms and solved such problems with the DA developed by Fujitsu Ltd. The QUBO formulation of the number partitioning problem is fully connected. The DA found the overall runtime for the optimal solution to be less than 30 seconds for 6500 binary variables. For the graph partitioning problem, we adopted modularity as the metric for determining the quality of the partitions. For Zachary's Karate Club graph, the modularity obtained was 0.445, a 6% increase against D-wave Quantum Annealer and Simulated Annealing. Moreover, to explore the DA's potential applications to real-world problems, we used the search for communities or virtual microgrids in a power distribution network as an example. The problem was formulated into graph partitioning. It is shown that the DA effectively identified community structures in the IEEE 33-bus and IEEE 118-bus network.

Submitted 9 November, 2023; originally announced November 2023.

Comments: Proceedings of 2023 TANET & NCS
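
Illustrative note on the entry above: the abstract states that the QUBO form of number partitioning is fully connected. The sketch below is a generic textbook mapping, not the authors' code and not the Fujitsu DA interface. It assumes the standard objective $(\sum_i a_i s_i)^2$ with $s_i = 2x_i - 1$; the function names (`number_partition_qubo`, `qubo_energy`, `brute_force_minimum`) are hypothetical, and the brute-force search only stands in for an annealer on tiny inputs.

```python
from itertools import product


def number_partition_qubo(numbers):
    """Build an upper-triangular QUBO for (sum_i a_i * (2*x_i - 1))**2.

    Expanding with x_i**2 == x_i gives
      diagonal:     Q[i][i] = 4 * a_i * (a_i - S)
      off-diagonal: Q[i][j] = 8 * a_i * a_j   (i < j)
      constant:     S**2
    where S is the sum of all numbers. Every pair (i, j) gets a
    non-zero coupling, which is why the problem is fully connected.
    """
    total = sum(numbers)
    n = len(numbers)
    q = [[0] * n for _ in range(n)]
    for i in range(n):
        q[i][i] = 4 * numbers[i] * (numbers[i] - total)
        for j in range(i + 1, n):
            q[i][j] = 8 * numbers[i] * numbers[j]
    return q, total * total


def qubo_energy(q, offset, x):
    """Evaluate offset + sum_{i<=j} Q[i][j] * x_i * x_j for a 0/1 vector x."""
    n = len(x)
    return offset + sum(
        q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n)
    )


def brute_force_minimum(q, offset):
    """Exhaustive search over all 2**n assignments (tiny n only)."""
    n = len(q)
    return min(
        product((0, 1), repeat=n),
        key=lambda x: qubo_energy(q, offset, x),
    )
```

An energy of zero means the two subsets have equal sums; for example, `[4, 5, 6, 7, 8]` splits into `{7, 8}` and `{4, 5, 6}`.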

arXiv:2311.02305 [pdf, other]

OSM vs HD Maps: Map Representations for Trajectory Prediction

Authors: Jing-Yan Liao, Parth Doshi, Zihan Zhang, David Paz, Henrik Christensen

Abstract: While High Definition (HD) Maps have long been favored for their precise depictions of static road elements, their accessibility constraints and susceptibility to rapid environmental changes impede the widespread deployment of autonomous driving, especially in the motion forecasting task. In this context, we propose to leverage OpenStreetMap (OSM) as a promising alternative to HD Maps for long-term motion forecasting. The contributions of this work are threefold: firstly, we extend the application of OSM to long-horizon forecasting, doubling the forecasting horizon compared to previous studies. Secondly, through an expanded receptive field and the integration of intersection priors, our OSM-based approach exhibits competitive performance, narrowing the gap with HD Map-based models. Lastly, we conduct an exhaustive context-aware analysis, providing deeper insights in motion forecasting across diverse scenarios as well as conducting class-aware comparisons. This research not only advances long-term motion forecasting with coarse map representations but additionally offers a potential scalable solution within the domain of autonomous driving.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2311.01759 [pdf, other]

TinyFormer: Efficient Transformer Design and Deployment on Tiny Devices

Authors: Jianlei Yang, Jiacheng Liao, Fanding Lei, Meichen Liu, Junyi Chen, Lingkun Long, Han Wan, Bei Yu, Weisheng Zhao

Abstract: Developing deep learning models on tiny devices (e.g. Microcontroller units, MCUs) has attracted much attention in various embedded IoT applications. However, it is challenging to efficiently design and deploy recent advanced models (e.g. transformers) on tiny devices due to their severe hardware resource constraints. In this work, we propose TinyFormer, a framework specifically designed to develop and deploy resource-efficient transformers on MCUs. TinyFormer mainly consists of SuperNAS, SparseNAS and SparseEngine. Separately, SuperNAS aims to search for an appropriate supernet from a vast search space. SparseNAS evaluates the best sparse single-path model including transformer architecture from the identified supernet. Finally, SparseEngine efficiently deploys the searched sparse models onto MCUs. To the best of our knowledge, SparseEngine is the first deployment framework capable of performing inference of sparse models with transformer on MCUs. Evaluation results on the CIFAR-10 dataset demonstrate that TinyFormer can develop efficient transformers with an accuracy of $96.1\%$ while adhering to hardware constraints of $1$MB storage and $320$KB memory. Additionally, TinyFormer achieves significant speedups in sparse inference, up to $12.2\times$, when compared to the CMSIS-NN library. TinyFormer is believed to bring powerful transformers into TinyML scenarios and greatly expand the scope of deep learning applications.

Submitted 3 November, 2023; originally announced November 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2310.19331 [pdf, other]

AdapINT: A Flexible and Adaptive In-Band Network Telemetry System Based on Deep Reinforcement Learning

Authors: Penghui Zhang, Hua Zhang, Yibo Pi, Zijian Cao, Jingyu Wang, Jianxin Liao

Abstract: In-band Network Telemetry (INT) has emerged as a promising network measurement technology. However, existing network telemetry systems lack the flexibility to meet diverse telemetry requirements and are also difficult to adapt to dynamic network environments. In this paper, we propose AdapINT, a versatile and adaptive in-band network telemetry framework assisted by dual-timescale probes, including long-period auxiliary probes (APs) and short-period dynamic probes (DPs). Technically, the APs collect basic network status information, which is used for the path planning of DPs. To achieve full network coverage, we propose an auxiliary probes path deployment (APPD) algorithm based on the Depth-First-Search (DFS). The DPs collect specific network information for telemetry tasks. To ensure that the DPs can meet diverse telemetry requirements and adapt to dynamic network environments, we apply the deep reinforcement learning (DRL) technique and transfer learning method to design the dynamic probes path deployment (DPPD) algorithm. The evaluation results show that AdapINT can redesign the telemetry system according to telemetry requirements and network environments. AdapINT can reduce telemetry latency by 75\% in online games and video conferencing scenarios. For overhead-aware networks, AdapINT can reduce control overheads by 34\% in cloud computing services.

Submitted 30 October, 2023; originally announced October 2023.

Comments: 14 pages, 19 figures

arXiv:2310.12504 [pdf, other]

Conceptual design and progress of transmitting $\sim$ MV DC HV into 4 K LHe detectors

Authors: Zhuo Liang, Fengbo Gu, Jiangfeng Zhou, Junhui Liao, Yuanning Gao, Zhaohua Peng, Jian Zheng, Guangpeng An, Meiyuenan Ma, Lifeng Zhang, Lei Zhang, Xiuliang Zhao, Junfeng Xia, Gang Liu, Shangmao Hu

Abstract: A dual-phase TPC (Time Projection Chamber) is more advanced in characterizing an event than a single-phase one because it can, in principle, reconstruct the 3D (X-Y-Z) image of the event, while a single-phase detector can only show a 2D (X-Y) picture. As a result, more enriched physics is expected for a dual-phase detector than a single-phase one. However, to build such a detector, DC HV (High Voltage) must be delivered into the chamber (to have a static electric field), which is a challenging task, especially for an LHe detector due to the extremely low temperature, $\sim$ 4 K, and the very high voltage, $\sim$ MV (Million Volts). This article introduces a convincing design for transmitting $\sim$ MV DC into a 4 K LHe detector. We also report the progress of manufacturing a 100 kV DC feedthrough capable of working at 4 K. Surprisingly, we realized that the technology we developed here might be a valuable reference to the scientists and engineers aiming to build residential bases on the Moon or Mars.

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2310.12496 [pdf, other]

doi 10.1140/epjp/s13360-024-05195-y

A novel nuclear recoil calibration for liquid noble gas detectors

Authors: Fengbo Gu, Jiangfeng Zhou, Junhui Liao, Yuanning Gao, Zhuo Liang, Meiyuenan Ma, Zhaohua Peng, Lifeng Zhang, Lei Zhang, Jian Zheng

Abstract: According to many dark matter models, a potential signal registered in a detector would feature a single-scattering nuclear recoil (NR). So, it is crucial to calibrate the detector's response to NR events. The conventional calibrations implement $\sim$ keV to MeV neutrons, which can be produced by an accelerator, a neutron generator, or a radioactive source. Although the calibrating methods have been widely employed, they could be improved in several ways: (a) the incident neutron energy should be more monoenergetic, (b) the calibrating NR energy should line up with the region of interest (ROI) of the experiment, and (c) the intensity of the beam should be appropriate. In the paper, we introduce a novel NR calibration method for liquid helium detectors, in which a helium beam ($α$ particles) will be implemented to calibrate the detectors. The helium beam can (i) be tuned precisely to have a jitter of $\lesssim $ 4\% (the $α$ beam's kinetic energy is equivalent to the recoil energy in the conventional calibrations with fast neutrons); (ii) have an energy between $\sim$ 100 eV and tens of keV; and (iii) provide a tunable flux from nA to 100 $μ$A, which presents convenience in beam pipe configuration to obtain a $\sim$ 100 Hz events rate so that the events pileup would be ignorable.

Submitted 23 May, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

Journal ref: Eur. Phys. J. Plus (2024) 139:437

arXiv:2310.11864 [pdf, other]

VQ-NeRF: Neural Reflectance Decomposition and Editing with Vector Quantization

Authors: Hongliang Zhong, Jingbo Zhang, Jing Liao

Abstract: We propose VQ-NeRF, a two-branch neural network model that incorporates Vector Quantization (VQ) to decompose and edit reflectance fields in 3D scenes. Conventional neural reflectance fields use only continuous representations to model 3D scenes, despite the fact that objects are typically composed of discrete materials in reality. This lack of discretization can result in noisy material decomposition and complicated material editing. To address these limitations, our model consists of a continuous branch and a discrete branch. The continuous branch follows the conventional pipeline to predict decomposed materials, while the discrete branch uses the VQ mechanism to quantize continuous materials into individual ones. By discretizing the materials, our model can reduce noise in the decomposition process and generate a segmentation map of discrete materials. Specific materials can be easily selected for further editing by clicking on the corresponding area of the segmentation outcomes. Additionally, we propose a dropout-based VQ codeword ranking strategy to predict the number of materials in a scene, which reduces redundancy in the material segmentation process. To improve usability, we also develop an interactive interface to further assist material editing. We evaluate our model on both computer-generated and real-world scenes, demonstrating its superior performance. To the best of our knowledge, our model is the first to enable discrete material editing in 3D scenes.

Submitted 10 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: Accepted by TVCG. Project Page: https://jtbzhl.github.io/VQ-NeRF.github.io/
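
Illustrative note on the entry above: the abstract describes a discrete branch that quantizes continuous material predictions into individual materials. The sketch below shows only the generic nearest-codeword step of vector quantization, which also yields a segmentation index per input. It is not the VQ-NeRF implementation; the function name `quantize`, the squared Euclidean distance, and the toy two-entry codebook are all assumptions made for illustration.

```python
def quantize(vectors, codebook):
    """Map each continuous vector to its nearest codebook entry.

    Returns (indices, quantized): indices[k] is the codeword id chosen
    for vectors[k] (usable as a segmentation label), and quantized[k]
    is the codeword itself. Distance is squared Euclidean; ties go to
    the lowest codeword index.
    """
    indices = []
    quantized = []
    for v in vectors:
        # Squared distance from v to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook]
        best = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(best)
        quantized.append(codebook[best])
    return indices, quantized


# Toy usage: two hypothetical "materials" in a 2-D feature space.
codebook = [(0.0, 0.0), (1.0, 1.0)]
labels, snapped = quantize([(0.1, 0.2), (0.9, 0.8), (0.4, 0.4)], codebook)
```

In a trained model the codebook would be learned rather than fixed by hand; the hand-written codebook here only keeps the example self-contained.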

arXiv:2310.10651 [pdf, other]

HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending

Authors: Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Weiming Zhang, Gang Hua, Nenghai Yu

Abstract: Hair editing has made tremendous progress in recent years. Early hair editing methods use well-drawn sketches or masks to specify the editing conditions. Even though they can enable very fine-grained local control, such interaction modes are inefficient for the editing conditions that can be easily specified by language descriptions or reference images. Thanks to the recent breakthrough of cross-modal models (e.g., CLIP), HairCLIP is the first work that enables hair editing based on text descriptions or reference images. However, such text-driven and reference-driven interaction modes make HairCLIP unable to support fine-grained controls specified by sketch or mask. In this paper, we propose HairCLIPv2, aiming to support all the aforementioned interactions with one unified framework. Simultaneously, it improves upon HairCLIP with better irrelevant attributes (e.g., identity, background) preservation and unseen text descriptions support. The key idea is to convert all the hair editing tasks into hair transfer tasks, with editing conditions converted into different proxies accordingly. The editing effects are added upon the input image by blending the corresponding proxy features within the hairstyle or hair color feature spaces. Besides the unprecedented user interaction mode support, quantitative and qualitative experiments demonstrate the superiority of HairCLIPv2 in terms of editing effects, irrelevant attribute preservation and visual naturalness. Our code is available at \url{https://github.com/wty-ustc/HairCLIPv2}.

Submitted 16 October, 2023; originally announced October 2023.

Comments: ICCV 2023, code is available at https://github.com/wty-ustc/HairCLIPv2

arXiv:2310.10522 [pdf, other]

Observation of GRB 221009A early afterglow in X/$γ$-ray energy band

Authors: Chao Zheng, Yan-Qiu Zhang, Shao-Lin Xiong, Cheng-Kui Li, He Gao, Wang-Chen Xue, Jia-Cong Liu, Chen-Wei Wang, Wen-Jun Tan, Wen-Xi Peng, Zheng-Hua An, Ce Cai, Ming-Yu Ge, Dong-Ya Guo, Yue Huang, Bing Li, Ti-Pei Li, Xiao-Bo Li, Xin-Qiao Li, Xu-Fang Li, Jin-Yuan Liao, Cong-Zhan Liu, Fang-Jun Lu, Xiang Ma, Rui Qiao, et al. (23 additional authors not shown)

Abstract: The early afterglow of a Gamma-ray burst (GRB) can provide critical information on the jet and progenitor of the GRB. The extreme brightness of GRB 221009A allows us to probe its early afterglow in unprecedented detail. In this letter, we report comprehensive observation results of the early afterglow of GRB 221009A (from $T_0$+660 s to $T_0$+1860 s, where $T_0$ is the \textit{Insight}-HXMT/HE trigger time) in X/$γ$-ray energy band (from 20 keV to 20 MeV) by \textit{Insight}-HXMT/HE, GECAM-C and \textit{Fermi}/GBM. We find that the spectrum of the early afterglow in 20 keV-20 MeV could be well described by a cutoff power-law with an extra power-law which dominates the low and high energy bands respectively. The cutoff power-law $E_{\rm peak}$ is $\sim$ 30 keV and the power-law photon index is $\sim$ 1.8 throughout the early afterglow phase. By fitting the light curves in different energy bands, we find that a significant achromatic break (from keV to TeV) is required at $T_0$ + 1246$^{+27}_{-26}$ s (i.e. 1021 s since the afterglow starting time $T_{\rm AG}$=$T_0$+225 s), providing compelling evidence of a jet break. Interestingly, both the pre-break and post-break decay slopes vary with energy, and these two slopes become closer in the lower energy band, making the break less identifiable. Intriguingly, the spectrum of the early afterglow experienced a slight hardening before the break and a softening after the break. These results provide new insights into the understanding of this remarkable GRB.

Submitted 19 January, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted for publication in ApJ Letters on 19-Jan-2024, 11 pages, 7 figures and 2 tables

arXiv:2310.07222 [pdf, other]

doi 10.1145/3581783.3612200

Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model

Authors: Shiyuan Yang, Xiaodong Chen, Jing Liao

Abstract: Recently, text-to-image denoising diffusion probabilistic models (DDPMs) have demonstrated impressive image generation capabilities and have also been successfully applied to image inpainting. However, in practice, users often require more control over the inpainting process beyond textual guidance, especially when they want to composite objects with customized appearance, color, shape, and layout. Unfortunately, existing diffusion-based inpainting methods are limited to single-modal guidance and require task-specific training, hindering their cross-modal scalability. To address these limitations, we propose Uni-paint, a unified framework for multimodal inpainting that offers various modes of guidance, including unconditional, text-driven, stroke-driven, exemplar-driven inpainting, as well as a combination of these modes. Furthermore, our Uni-paint is based on pretrained Stable Diffusion and does not require task-specific training on specific datasets, enabling few-shot generalizability to customized images. We have conducted extensive qualitative and quantitative evaluations that show our approach achieves comparable results to existing single-modal methods while offering multimodal inpainting capabilities not available in other methods. Code will be available at https://github.com/ysy31415/unipaint.

Submitted 11 October, 2023; originally announced October 2023.

Comments: Accepted by ACMMM'23

arXiv:2310.05202 [pdf, other]

doi 10.1109/TIV.2024.3412198

Enhancing Cross-Dataset Performance of Distracted Driving Detection With Score-Softmax Classifier

Authors: Cong Duan, Zixuan Liu, Jiahao Xia, Minghai Zhang, Jiacai Liao, Libo Cao

Abstract: Deep neural networks enable real-time monitoring of in-vehicle drivers, facilitating the timely prediction of distractions, fatigue, and potential hazards. This technology is now integral to intelligent transportation systems. Recent research has exposed unreliable cross-dataset end-to-end driver behavior recognition due to overfitting, often referred to as ``shortcut learning'', resulting from limited data samples. In this paper, we introduce the Score-Softmax classifier, which addresses this issue by enhancing inter-class independence and intra-class uncertainty. Motivated by human rating patterns, we designed a two-dimensional supervisory matrix based on marginal Gaussian distributions to train the classifier. Gaussian distributions help amplify intra-class uncertainty while ensuring the Score-Softmax classifier learns accurate knowledge. Furthermore, leveraging the summation of independent Gaussian distributed random variables, we introduced a multi-channel information fusion method. This strategy effectively resolves the multi-information fusion challenge for the Score-Softmax classifier. Concurrently, we substantiate the necessity of transfer learning and multi-dataset combination. We conducted cross-dataset experiments using the SFD, AUCDD-V1, and 100-Driver datasets, demonstrating that Score-Softmax improves cross-dataset performance without modifying the model architecture. This provides a new approach for enhancing neural network generalization. Additionally, our information fusion approach outperforms traditional methods.

Submitted 20 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2310.03810 [pdf, other]

Influence of disorder on antidot vortex Majorana states in 3D topological insulators

Authors: Rafał Rechciński, Aleksei Khindanov, Dmitry I. Pikulin, Jian Liao, Leonid P. Rokhinson, Yong P. Chen, Roman M. Lutchyn, Jukka I. Väyrynen

Abstract: Topological insulator/superconductor two-dimensional heterostructures are promising candidates for realizing topological superconductivity and Majorana modes. In these systems, a vortex pinned by a pre-fabricated antidot in the superconductor can host Majorana zero-energy modes (MZMs), which are exotic quasiparticles that may enable quantum information processing. However, a major challenge is to design devices that can manipulate the information encoded in these MZMs. One of the key factors is to create small and clean antidots, so that the MZMs, localized in the vortex core, have a large gap to other excitations. If the antidot is too large or too disordered, the level spacing for the subgap vortex states may become smaller than temperature. In this paper, we numerically investigate the effects of disorder, chemical potential, and antidot size on the subgap vortex spectrum, using a two-dimensional effective model of the topological insulator surface. Our model allows us to simulate large system sizes with vortices up to 1.8 $μ$m in diameter. We also compare our disorder model with the transport data from existing experiments. We find that the spectral gap can exhibit a non-monotonic behavior as a function of disorder strength, and that it can be tuned by applying a gate voltage.

Submitted 5 October, 2023; originally announced October 2023.

Comments: 10 pages, 6 figures

arXiv:2309.16600 [pdf, other]

doi 10.1038/s42005-024-01713-7

Constraining Ultralight Dark Matter through an Accelerated Resonant Search

Authors: Zitong Xu, Xiaolin Ma, Kai Wei, Yuxuan He, Xing Heng, Xiaofei Huang, Tengyu Ai, Jian Liao, Wei Ji, Jia Liu, Xiao-** Wang, Dmitry Budker

Abstract: Experiments aimed at detecting ultralight dark matter typically rely on resonant effects, which are sensitive to the dark matter mass that matches the resonance frequency. In this study, we investigate the nucleon couplings of ultralight axion dark matter using a magnetometer operating in a nuclear magnetic resonance (NMR) mode. Our approach involves the use of a $^{21}$Ne spin-based sensor, which features the lowest nuclear magnetic moment among noble-gas spins. This configuration allows us to achieve an ultrahigh sensitivity of 0.73 fT/Hz$^{1/2}$ at around 5 Hz, corresponding to energy resolution of approximately 1.5$\times 10^{-23}\,\rm{eV/Hz^{1/2}}$. Our analysis reveals that under certain conditions it is beneficial to scan the frequency with steps significantly larger than the resonance width. The analytical results are in agreement with experimental data and the scan strategy is potentially applicable to other resonant searches. Further, our study establishes stringent constraints on axion-like particles (ALP) in the 4.5--15.5 Hz Compton-frequency range coupling to neutrons and protons, improving on prior work by several-fold. Within a band around 4.6--6.6 Hz and around 7.5 Hz, our laboratory findings surpass astrophysical limits derived from neutron-star cooling. Hence, we demonstrate an accelerated resonance search for ultralight dark matter, achieving an approximately 30-fold increase in scanning step while maintaining competitive sensitivity.

Submitted 11 July, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: 13 pages, 11 figures, accepted by Communications Physics

arXiv:2309.15333 [pdf]

Three steps towards dose optimization for oncology dose finding

Authors: Jason J. Z. Liao, Ekaterine Asatiani, Qingyang Liu, Kevin Hou

Abstract: Traditional dose selection for oncology registration trials typically employs a one- or two-step single maximum tolerated dose (MTD) approach. However, this approach may not be appropriate for molecularly targeted therapy that tends to have toxicity profiles that are markedly different to cytotoxic agents. The US Food and Drug Administration launched Project Optimus to reform dose optimization in oncology drug development and has recently released a related Guidance for Industry. In response to these initiatives, we propose a "three steps towards dose optimization" procedure and discuss the details in dose optimization designs and analyses in this manuscript. The first step is dose-escalation to identify the MTD or maximum administered dose with an efficient hybrid design, which can offer good overdose control and increases the likelihood of the recommended MTD being close to the true MTD. The second step is the selection of appropriate recommended doses for expansion (RDEs), based on all available data including emerging safety, pharmacokinetics, pharmacodynamics, and other biomarker information. The third step is dose optimization, which uses data from a randomized fractional factorial design with multiple RDEs explored in multiple tumor cohorts during the expansion phase to ensure a feasible dose is selected for registration trials, and that the tumor type most sensitive to the investigative treatment is identified.
We believe using this three-step approach can increase the likelihood of selecting the optimal dose for registration trial, one that demonstrates a balanced safety profile while retaining much of the efficacy observed at the MTD.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 22 pages, 7 figures and 2 tables

arXiv:2309.15040 [pdf]

Zero-Energy-Device for 6G: First Real-Time Backscatter Communication thanks to the Detection of Pilots from an Ambient Commercial Cellular Network

Authors: Papis Ndiaye, Dinh-Thuy Phan-Huy, Ayman Hassan, **gyi Liao, Xiyu Wang, Kalle Ruttik, Riku Jantti

Abstract: Ambient backscatter communication technology (AmBC) and a novel device category called zero-energy devices (ZED) have recently emerged as potential components for the forthcoming 6th generation (6G) networks. A ZED communicates with a smartphone without emitting additional radio waves, by backscattering ambient waves from base stations. Thanks to its very low consumption, a ZED powers itself by harvesting ambient light energy. However, the time variations of data traffic in cellular networks prevent AmBC from working properly. Recent works have demonstrated experimentally that a backscatter device could be detected by listening only to ambient pilot signals (which are steady) instead of the whole ambient signal (which is bursty) of 4G. However, these experiments were run with a 4G base station emulator and a bulky, energy-greedy backscatter device. In this paper, for the first time, we demonstrate real-time AmBC in the field, with the Orange commercial 4G network as the ambient source and an Orange Zero-Energy Device.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 3 pages, 7 figures, 6Get2023

arXiv:2309.14623 [pdf, other]

Text-to-Image Generation for Abstract Concepts

Authors: Jiayi Liao, Xu Chen, Qiang Fu, Lun Du, Xiangnan He, Xiang Wang, Shi Han, Dongmei Zhang

Abstract: Recent years have witnessed the substantial progress of large-scale models across various domains, such as natural language processing and computer vision, facilitating the expression of concrete concepts. Unlike concrete concepts that are usually directly associated with physical objects, expressing abstract concepts through natural language requires considerable effort, which results from their intricate semantics and connotations. An alternative approach is to leverage images to convey rich visual information as a supplement. Nevertheless, existing Text-to-Image (T2I) models are primarily trained on concrete physical objects and tend to fail to visualize abstract concepts. Inspired by the three-layer artwork theory that identifies critical factors, intent, object and form during artistic creation, we propose a framework of Text-to-Image generation for Abstract Concepts (TIAC). The abstract concept is clarified into a clear intent with a detailed definition to avoid ambiguity. LLMs then transform it into semantic-related physical objects, and the concept-dependent form is retrieved from an LLM-extracted form pattern set. Information from these three aspects will be integrated to generate prompts for T2I models via LLM. Evaluation results from human assessments and our newly designed metric concept score demonstrate the effectiveness of our framework in creating images that can sufficiently express abstract concepts.

Submitted 27 September, 2023; v1 submitted 25 September, 2023; originally announced September 2023.
