Search | arXiv e-print repository

Error mitigated shadow estimation based on virtual distillation

Authors: Ruyu Yang, Xiaoming Sun, Hongyi Zhou

Abstract: Shadow estimation is a method for deducing numerous properties of an unknown quantum state through a limited set of measurements, but it suffers from noise in quantum devices. In this paper, we introduce an error-mitigated shadow estimation approach based on virtual distillation, tailored for applications in near-term quantum devices. Our methodology leverages the qubit reset technique, thereby reducing the associated qubit overhead. Crucially, our approach ensures that the required qubit resources remain independent of the desired accuracy and avoids an exponential measurement overhead, marking a substantial advancement in practical applications. Furthermore, our technique accommodates a mixed Clifford and Pauli-type shadow, which can result in a reduction in the number of required measurements across various scenarios. We also study the trade-off between circuit depth and measurement overhead quantitatively. Through numerical simulations, we substantiate the efficacy of our error mitigation method, establishing its utility in enhancing the robustness of shadow estimations on near-term quantum devices.

Submitted 15 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: 16 pages, 6 figures
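
As a worked illustration of the virtual distillation idea named in the abstract above, the following sketch evaluates the standard estimator Tr(O ρ^M) / Tr(ρ^M) for a single qubit under depolarizing noise. It is only a numerical toy, not the authors' shadow-estimation protocol; the noise level, the observable Z, and all function names are assumptions chosen for the example.

```python
# Toy virtual distillation on one qubit (real-valued 2x2 matrices suffice here).

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

def matrix_power(m, k):
    """Compute m**k for an integer k >= 1 by repeated multiplication."""
    result = m
    for _ in range(k - 1):
        result = matmul(result, m)
    return result

def virtual_distillation_expectation(rho, observable, copies=2):
    """Return Tr(O rho^M) / Tr(rho^M) with M = copies."""
    rho_m = matrix_power(rho, copies)
    return trace(matmul(observable, rho_m)) / trace(rho_m)

# Assumed example: ideal state |0><0| mixed with 20% depolarizing noise.
p = 0.2
rho_noisy = [[1 - p / 2, 0.0], [0.0, p / 2]]
pauli_z = [[1.0, 0.0], [0.0, -1.0]]

noisy_value = trace(matmul(pauli_z, rho_noisy))
mitigated_value = virtual_distillation_expectation(rho_noisy, pauli_z, copies=2)
```

In this toy case the ideal expectation value of Z is 1; the unmitigated value is 1 - p, and the distilled value lies closer to 1 because squaring the state suppresses the weight of the noisy eigenvector.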

arXiv:2402.18571 [pdf, other]

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

Authors: Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang

Abstract: Fine-grained control over large language models (LLMs) remains a significant challenge, hindering their adaptability to diverse user needs. While Reinforcement Learning from Human Feedback (RLHF) shows promise in aligning LLMs, its reliance on scalar rewards often limits its ability to capture diverse user preferences in real-world applications. To address this limitation, we introduce the Directional Preference Alignment (DPA) framework. Unlike the scalar-reward RLHF, DPA incorporates multi-objective reward modeling to represent diverse preference profiles. Additionally, DPA models user preferences as directions (i.e., unit vectors) in the reward space to achieve user-dependent preference control. Our method involves training a multi-objective reward model and then fine-tuning the LLM with a preference-conditioned variant of Rejection Sampling Finetuning (RSF), an RLHF method adopted by Llama 2. This method enjoys a better performance trade-off across various reward objectives. In comparison with the scalar-reward RLHF, DPA offers users intuitive control over LLM generation: they can arithmetically specify their desired trade-offs (e.g., more helpfulness with less verbosity). We also validate the effectiveness of DPA with real-world alignment experiments on Mistral-7B. Our method provides straightforward arithmetic control over the trade-off between helpfulness and verbosity while maintaining competitive performance with strong baselines such as Direct Preference Optimization (DPO).

Submitted 6 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: The code and model are released at https://github.com/Haoxiang-Wang/directional-preference-alignment
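
The notion of a preference "direction" in the abstract above can be illustrated with a minimal sketch. It assumes a two-dimensional reward vector (helpfulness, verbosity), normalizes a user's weights to a unit vector, and ranks candidate responses by the dot product with that vector, loosely in the spirit of rejection sampling. The function names, reward values, and candidate texts are hypothetical; this is not the released DPA code.

```python
import math

def unit_direction(helpfulness_weight, verbosity_weight):
    """Normalize a user's trade-off weights into a unit vector."""
    norm = math.hypot(helpfulness_weight, verbosity_weight)
    if norm == 0:
        raise ValueError("at least one weight must be non-zero")
    return (helpfulness_weight / norm, verbosity_weight / norm)

def directional_score(direction, rewards):
    """Project a multi-objective reward vector onto the preference direction."""
    return sum(d * r for d, r in zip(direction, rewards))

def pick_best(direction, candidates):
    """Keep the candidate whose reward vector scores highest along the direction."""
    return max(candidates, key=lambda item: directional_score(direction, item[1]))

# Hypothetical candidates with made-up (helpfulness, verbosity) rewards.
candidates = [
    ("short answer", (0.6, 0.2)),
    ("long answer", (0.8, 0.9)),
]

helpful_only = unit_direction(1.0, 0.0)    # cares only about helpfulness
concise_pref = unit_direction(1.0, -1.0)   # more helpfulness, less verbosity

choice_helpful = pick_best(helpful_only, candidates)[0]
choice_concise = pick_best(concise_pref, candidates)[0]
```

Changing the direction changes which response wins, which is the kind of arithmetic control over trade-offs that the abstract describes.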

arXiv:2402.18152 [pdf, other]

Boosting Neural Representations for Videos with a Conditional Decoder

Authors: Xinjie Zhang, Ren Yang, Dailan He, Xingtong Ge, Tongda Xu, Yan Wang, Hongwei Qin, Jun Zhang

Abstract: Implicit neural representations (INRs) have emerged as a promising approach for video storage and processing, showing remarkable versatility across various video tasks. However, existing methods often fail to fully leverage their representation capabilities, primarily due to inadequate alignment of intermediate features during target frame decoding. This paper introduces a universal boosting framework for current implicit video representation approaches. Specifically, we utilize a conditional decoder with a temporal-aware affine transform module, which uses the frame index as a prior condition to effectively align intermediate features with target frames. Besides, we introduce a sinusoidal NeRV-like block to generate diverse intermediate features and achieve a more balanced parameter distribution, thereby enhancing the model's capacity. With a high-frequency information-preserving reconstruction loss, our approach successfully boosts multiple baseline INRs in the reconstruction quality and convergence speed for video regression, and exhibits superior inpainting and interpolation results. Further, we integrate a consistent entropy minimization technique and develop video codecs based on these boosted INRs. Experiments on the UVG dataset confirm that our enhanced codecs significantly outperform baseline INRs and offer competitive rate-distortion performance compared to traditional and learning-based codecs. Code is available at https://github.com/Xinjie-Q/Boosting-NeRV.

Submitted 16 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: Accepted by CVPR 2024

arXiv:2402.16796 [pdf, other]

Expressive Whole-Body Control for Humanoid Robots

Authors: Xuxin Cheng, Yandong Ji, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang

Abstract: Can we enable humanoid robots to generate rich, diverse, and expressive motions in the real world? We propose to learn a whole-body control policy on a human-sized robot to mimic human motions as realistically as possible. To train such a policy, we leverage the large-scale human motion capture data from the graphics community in a Reinforcement Learning framework. However, directly performing imitation learning with the motion capture dataset would not work on the real humanoid robot, given the large gap in degrees of freedom and physical capabilities. Our method, Expressive Whole-Body Control (Exbody), tackles this problem by encouraging the upper humanoid body to imitate a reference motion, while relaxing the imitation constraint on its two legs and only requiring them to follow a given velocity robustly. With training in simulation and Sim2Real transfer, our policy can control a humanoid robot to walk in different styles, shake hands with humans, and even dance with a human in the real world. We conduct extensive studies and comparisons on diverse motions in both simulation and the real world to show the effectiveness of our approach.

Submitted 5 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: Website: https://expressive-humanoid.github.io

arXiv:2402.14293 [pdf, other]

Leveraging Large Language Models for Concept Graph Recovery and Question Answering in NLP Education

Authors: Rui Yang, Boming Yang, Sixun Ouyang, Tianwei She, Aosong Feng, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li

Abstract: In the domain of Natural Language Processing (NLP), Large Language Models (LLMs) have demonstrated promise in text-generation tasks. However, their educational applications, particularly for domain-specific queries, remain underexplored. This study investigates LLMs' capabilities in educational scenarios, focusing on concept graph recovery and question-answering (QA). We assess LLMs' zero-shot performance in creating domain-specific concept graphs and introduce TutorQA, a new expert-verified NLP-focused benchmark for scientific graph reasoning and QA. TutorQA consists of five tasks with 500 QA pairs. To tackle TutorQA queries, we present CGLLM, a pipeline integrating concept graphs with LLMs for answering diverse questions. Our results indicate that LLMs' zero-shot concept graph recovery is competitive with supervised methods, showing an average 3% F1 score improvement. In TutorQA tasks, LLMs achieve up to 26% F1 score enhancement. Moreover, human evaluation and analysis show that CGLLM generates answers with more fine-grained concepts.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.13529 [pdf]

doi 10.1155/2021/6638730

Multitier Service Migration Framework Based on Mobility Prediction in Mobile Edge Computing

Authors: Run Yang, Hui He, Weizhe Zhang

Abstract: Mobile edge computing (MEC) pushes computing resources to the edge of the network and distributes them at the edge of the mobile network. Offloading computing tasks to the edge instead of the cloud can reduce computing latency and backhaul load simultaneously. However, new challenges arise from user mobility and the limited coverage of MEC servers. As users move, services should be dynamically migrated between multiple MEC servers to maintain service performance. Tackling this problem is nontrivial because it is arduous to predict user movement, and service migration will generate service interruptions and redundant network traffic. Service interruption time must be minimized, and redundant network traffic should be reduced to ensure service quality. In this paper, prediction-based container live migration technology is studied, and an online prediction method based on map data that does not rely on prior knowledge such as user trajectories is proposed to address this challenge in terms of mobility prediction accuracy.

Submitted 20 February, 2024; originally announced February 2024.

Comments: 13 pages, 9 figures

Journal ref: Wireless Communications and Mobile Computing, 2021

arXiv:2402.12948 [pdf, other]

GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick

Authors: Jiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao

Abstract: Large language models (LLMs) excellently generate human-like text, but also raise concerns about misuse in fake news and academic dishonesty. Decoding-based watermarks, particularly the GumbelMax-trick-based watermark (GM watermark), are a standout solution for safeguarding machine-generated texts due to their notable detectability. However, the GM watermark encounters a major challenge with generation diversity: it always yields identical outputs for the same prompt, which harms the user experience. To overcome this limitation, we propose a new type of GM watermark, the Logits-Addition watermark, and its three variants, specifically designed to enhance diversity. Among these, the GumbelSoft watermark (a softmax variant of the Logits-Addition watermark) demonstrates superior performance in high diversity settings, with its AUROC score outperforming those of the two alternative variants by 0.1 to 0.3 and surpassing other decoding-based watermarking methods by a minimum of 0.1.

Submitted 28 May, 2024; v1 submitted 20 February, 2024; originally announced February 2024.
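
To make the diversity problem in the abstract above concrete, here is a minimal sketch of the plain Gumbel-max trick with keyed noise: the next token is the argmax of log p_i + g_i, where g_i = -log(-log u_i) and the uniforms u_i come from a pseudo-random generator seeded by a secret key and the context. Because the noise is a deterministic function of key and context, the same prompt always yields the same pick. The key, the hashing scheme, and the function names are assumptions for illustration; this is not the GumbelSoft method or the authors' code.

```python
import hashlib
import math
import random

def keyed_uniforms(key, context, vocab_size):
    """Derive reproducible uniforms in [0, 1) from a secret key and the context."""
    digest = hashlib.sha256((key + "|" + context).encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.random() for _ in range(vocab_size)]

def gumbel_max_pick(probs, uniforms):
    """Return argmax_i of log(p_i) + g_i with Gumbel noise g_i = -log(-log(u_i))."""
    best_index, best_score = 0, -math.inf
    for index, (p, u) in enumerate(zip(probs, uniforms)):
        if p <= 0:
            continue  # zero-probability tokens can never be selected
        u = min(max(u, 1e-12), 1 - 1e-12)  # keep both logarithms finite
        score = math.log(p) - math.log(-math.log(u))
        if score > best_score:
            best_index, best_score = index, score
    return best_index

# Hypothetical next-token distribution over a four-token vocabulary.
probs = [0.1, 0.5, 0.3, 0.1]
first = gumbel_max_pick(probs, keyed_uniforms("secret-key", "the prompt", 4))
second = gumbel_max_pick(probs, keyed_uniforms("secret-key", "the prompt", 4))
```

Here `first` and `second` are always equal, which is the identical-output behavior the abstract says the proposed variants are designed to relax.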

arXiv:2402.12626 [pdf, other]

Indiscriminate Data Poisoning Attacks on Pre-trained Feature Extractors

Authors: Yiwei Lu, Matthew Y. R. Yang, Gautam Kamath, Yaoliang Yu

Abstract: Machine learning models have achieved great success in supervised learning tasks for end-to-end training, which requires a large amount of labeled data that is not always feasible. Recently, many practitioners have shifted to self-supervised learning methods that utilize cheap unlabeled data to learn a general feature extractor via pre-training, which can be further applied to personalized downstream tasks by simply training an additional linear layer with limited labeled data. However, such a process may also raise concerns regarding data poisoning attacks. For instance, indiscriminate data poisoning attacks, which aim to decrease model utility by injecting a small number of poisoned data into the training set, pose a security risk to machine learning models, but have only been studied for end-to-end supervised learning. In this paper, we extend the exploration of the threat of indiscriminate attacks on downstream tasks that apply pre-trained feature extractors. Specifically, we propose two types of attacks: (1) the input space attacks, where we modify existing attacks to directly craft poisoned data in the input space. However, due to the difficulty of optimization under constraints, we further propose (2) the feature targeted attacks, where we mitigate the challenge with three stages, firstly acquiring target parameters for the linear head; secondly finding poisoned features by treating the learned feature representations as a dataset; and thirdly inverting the poisoned features back to the input space.
Our experiments examine such attacks in popular downstream tasks of fine-tuning on the same dataset and transfer learning that considers domain adaptation. Empirical results reveal that transfer learning is more vulnerable to our attacks. Additionally, input space attacks are a strong threat if no countermeasures are posed, but are otherwise weaker than feature targeted attacks.

Submitted 19 February, 2024; originally announced February 2024.

Comments: Accepted to SaTML 2024

arXiv:2402.12623

Effective Edge Ranking via Random Walk with Restart

Authors: Renchi Yang

Abstract: Given a network G, edge centrality is a metric used to evaluate the importance of edges in G, which is a key concept in analyzing networks and finds vast applications involving edge ranking. In spite of a wealth of research on devising edge centrality measures, they incur either prohibitively high computation costs or varied deficiencies that lead to sub-optimal ranking quality. To overcome their limitations, this paper proposes EdgeRAKE, a new centrality measure for edge ranking that leverages the novel notion of the edgewise random walk with restart. Based thereon, we present a linear-complexity algorithm for EdgeRAKE approximation, followed by an in-depth theoretical analysis in terms of various aspects. Extensive experiments comparing EdgeRAKE against six edge centrality metrics in graph analytics tasks on real networks showcase that EdgeRAKE offers superior practical effectiveness without significantly reducing computation efficiency.

Submitted 6 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: Incomplete theory and experiments. Will upload a new version later
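
For readers unfamiliar with random walk with restart (RWR), the standard node-level version can be sketched as a power iteration, shown below. The edgewise variant and the EdgeRAKE approximation algorithm mentioned in the abstract above are not reproduced here; the restart probability of 0.15, the adjacency-list format, and the function name are illustrative assumptions.

```python
def random_walk_with_restart(adjacency, source, restart_prob=0.15, iterations=100):
    """Approximate RWR scores from `source` by power iteration.

    `adjacency` maps each node to a list of its out-neighbors.
    """
    nodes = list(adjacency)
    scores = {node: 0.0 for node in nodes}
    scores[source] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in nodes}
        # With probability restart_prob the walker jumps back to the source.
        updated[source] += restart_prob * sum(scores.values())
        for node in nodes:
            mass = (1.0 - restart_prob) * scores[node]
            neighbors = adjacency[node]
            if not neighbors:
                # Dangling node: send the remaining mass back to the source.
                updated[source] += mass
                continue
            share = mass / len(neighbors)
            for neighbor in neighbors:
                updated[neighbor] += share
        scores = updated
    return scores

# Small undirected path graph a - b - c, written as an adjacency list.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
rwr_scores = random_walk_with_restart(path_graph, source="a")
```

The scores form a probability distribution over nodes, and nodes that are easier to reach from the source (here `a` itself compared with the far end `c`) receive more mass.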

arXiv:2402.12132 [pdf, other]

SSTKG: Simple Spatio-Temporal Knowledge Graph for Intepretable and Versatile Dynamic Information Embedding

Authors: Ruiyi Yang, Flora D. Salim, Hao Xue

Abstract: Knowledge graphs (KGs) have been increasingly employed for link prediction and recommendation using real-world datasets. However, the majority of current methods rely on static data, neglecting the dynamic nature and the hidden spatio-temporal attributes of real-world scenarios. This often results in suboptimal predictions and recommendations. Although there are effective spatio-temporal inference methods, they face challenges such as scalability with large datasets and inadequate semantic understanding, which impede their performance. To address these limitations, this paper introduces a novel framework - Simple Spatio-Temporal Knowledge Graph (SSTKG), for constructing and exploring spatio-temporal KGs. To integrate spatial and temporal data into KGs, our framework uses a new 3-step embedding method. Output embeddings can be used for future temporal sequence prediction and spatial information recommendation, providing valuable insights for various applications such as retail sales forecasting and traffic volume prediction. Our framework offers a simple but comprehensive way to understand the underlying patterns and trends in dynamic KGs, thereby enhancing the accuracy of predictions and the relevance of recommendations. This work paves the way for more effective utilization of spatio-temporal data in KGs, with potential impacts across a wide range of sectors.

Submitted 19 February, 2024; originally announced February 2024.

Comments: for Web conf 2024. 8 pages context

arXiv:2402.12082 [pdf, other]

X-ray multibeam ptychography at up to 20 keV: nano-lithography enhances X-ray nano-imaging

Authors: Tang Li, Maik Kahnt, Thomas L. Sheppard, Runqing Yang, Ken Vidar Falch, Roman Zvagelsky, Pablo Villanueva-Perez, Martin Wegener, Mikhail Lyubomirskiy

Abstract: Non-destructive nano-imaging of the internal structure of solid matter is only feasible using hard X-rays due to their high penetration. The highest resolution images are achieved at synchrotron radiation sources (SRF), offering superior spectral brightness and enabling methods such as X-ray ptychography delivering single-digit nm resolution. However, the resolution or field of view is ultimately constrained by the available coherent flux. To address this, the beam's incoherent fraction can be exploited using multiple parallel beams in an approach known as X-ray multibeam ptychography (MBP). This expands the domain of X-ray ptychography to larger samples or more rapid measurements. Both qualities favor the study of complex composite or functional samples, such as catalysts, energy materials, or electronic devices. The challenges of performing ptychography at high energy and with many parallel beams must be overcome to extract the full advantages for extended samples while minimizing beam attenuation. Here, we report the application of MBP with up to 12 beams and at photon energies of 13 and 20 keV. We demonstrate performance for various samples: a Siemens star test pattern, a porous Ni/Al2O3 catalyst, a microchip, and gold nano-crystal clusters, exceeding the measurement limits of conventional hard X-ray ptychography without compromising image quality.

Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.11286 [pdf, other]

Prospects for joint reconstruction of imaging air Cherenkov Telescope array and extensive air shower array

Authors: Zhipeng Zhang, Ruizhi Yang, Shoushan Zhang, Liqiao Yin, Jiali Liu, Yudong Wang, Lingling Ma, Zhen Cao

Abstract: In this paper we propose a joint reconstruction of gamma-ray events using both an extensive air shower (EAS) array and an imaging air Cherenkov telescope (IACT) array. We consider eight Cherenkov telescopes to be built on the LHAASO (Large High Altitude Air Shower Observatory) site and investigate the improvement in differential sensitivity when combining the information from both the IACTs and the muon detectors of LHAASO-KM2A. We find that, due to the higher cosmic ray background rejection power and higher gamma ray retention ratio provided by the muon detectors of LHAASO, such a joint reconstruction can significantly improve the sensitivity of IACTs, especially for extended sources and long exposure times. In this article, we show the performance of an eight-telescope mini array, and our results indicate that above 10 TeV, the muon detectors can improve the sensitivity by 25% to 60% in different energy ranges.

Submitted 28 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

Comments: 6 pages, 6 figures

arXiv:2402.10441 [pdf, other]

Barrier-Enhanced Homotopic Parallel Trajectory Optimization for Safety-Critical Autonomous Driving

Authors: Lei Zheng, Rui Yang, Michael Yu Wang, Jun Ma

Abstract: Enforcing safety while preventing overly conservative behaviors is essential for autonomous vehicles to achieve high task performance. In this paper, we propose a barrier-enhanced homotopic parallel trajectory optimization (BHPTO) approach with over-relaxed alternating direction method of multipliers (ADMM) for real-time integrated decision-making and planning. To facilitate safety interactions between the ego vehicle (EV) and surrounding vehicles, a spatiotemporal safety module exhibiting bi-convexity is developed on the basis of barrier function. Varying barrier coefficients are adopted for different time steps in a planning horizon to account for the motion uncertainties of surrounding HVs and mitigate conservative behaviors. Additionally, we exploit the discrete characteristics of driving maneuvers to initialize nominal behavior-oriented free-end homotopic trajectories based on reachability analysis, and each trajectory is locally constrained to a specific driving maneuver while sharing the same task objectives. By leveraging the bi-convexity of the safety module and the kinematics of the EV, we formulate the BHPTO as a bi-convex optimization problem. Then constraint transcription and over-relaxed ADMM are employed to streamline the optimization process, such that multiple trajectories are generated in real time with feasibility guarantees. Through a series of experiments, the proposed development demonstrates improved task accuracy, stability, and consistency in various traffic scenarios using synthetic and real-world traffic datasets.

Submitted 26 February, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.10207 [pdf, other]

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment

Authors: Rui Yang, Xiaoman Pan, Feng Luo, Shuang Qiu, Han Zhong, Dong Yu, Jianshu Chen

Abstract: We consider the problem of multi-objective alignment of foundation models with human preferences, which is a critical step towards helpful and harmless AI systems. However, it is generally costly and unstable to fine-tune large foundation models using reinforcement learning (RL), and the multi-dimensionality, heterogeneity, and conflicting nature of human preferences further complicate the alignment process. In this paper, we introduce Rewards-in-Context (RiC), which conditions the response of a foundation model on multiple rewards in its prompt context and applies supervised fine-tuning for alignment. The salient features of RiC are simplicity and adaptivity, as it only requires supervised fine-tuning of a single foundation model and supports dynamic adjustment for user preferences during inference time. Inspired by the analytical solution of an abstracted convex optimization problem, our dynamic inference-time adjustment method approaches the Pareto-optimal solution for multiple objectives. Empirical evidence demonstrates the efficacy of our method in aligning both Large Language Models (LLMs) and diffusion models to accommodate diverse rewards with only around 10% GPU hours compared with multi-objective RL baseline.

Submitted 5 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: Accepted by ICML 2024

arXiv:2402.07205 [pdf, ps, other]

Baryogenesis in quantum fluctuation modified gravity

Authors: Rong-Jia Yang, Yong-Ben Shi

Abstract: We consider baryogenesis in quantum fluctuation modified gravity. We explore three forms (two are newly proposed here) of baryogenesis interaction and discuss the effect of these interaction terms on the baryon-to-entropy ratio during the radiation era of the expanding universe. We constrain the model parameters with the current observational data, implying that this modified gravity is capable of successfully addressing the issue of baryon asymmetry.

Submitted 17 April, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

Comments: 11 pages, 6 figures

arXiv:2402.01138 [pdf, other]

Graph Neural Networks in EEG-based Emotion Recognition: A Survey

Authors: Chenyu Liu, Xinliang Zhou, Yihao Wu, Ruizhi Yang, Liming Zhai, Ziyu Jia, Yang Liu

Abstract: Compared to other modalities, EEG-based emotion recognition can intuitively respond to the emotional patterns in the human brain and, therefore, has become one of the most concerning tasks in the brain-computer interfaces field. Since dependencies within brain regions are closely related to emotion, a significant trend is to develop Graph Neural Networks (GNNs) for EEG-based emotion recognition. However, brain region dependencies in emotional EEG have physiological bases that distinguish GNNs in this field from those in other time series fields. Besides, there is neither a comprehensive review nor guidance for constructing GNNs in EEG-based emotion recognition. In this survey, our categorization reveals the commonalities and differences of existing approaches under a unified framework of graph construction. We analyze and categorize methods from three stages in the framework to provide clear guidance on constructing GNNs in EEG-based emotion recognition. In addition, we discuss several open challenges and future directions, such as temporal fully-connected graphs and graph condensation.

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2402.00744 [pdf, other]

BATON: Aligning Text-to-Audio Model with Human Preference Feedback

Authors: Huan Liao, Haonan Han, Kai Yang, Tianjiao Du, Rui Yang, Zunnan Xu, Qinmei Xu, Jingquan Liu, Jiasheng Lu, Xiu Li

Abstract: With the development of AI-Generated Content (AIGC), text-to-audio models are gaining widespread attention. However, it is challenging for these models to generate audio aligned with human preference due to the inherent information density of natural language and limited model understanding ability. To alleviate this issue, we formulate the BATON, a framework designed to enhance the alignment between generated audio and text prompt using human preference feedback. Our BATON comprises three key stages: Firstly, we curated a dataset containing both prompts and the corresponding generated audio, which was then annotated based on human feedback. Secondly, we introduced a reward model using the constructed dataset, which can mimic human preference by assigning rewards to input text-audio pairs. Finally, we employed the reward model to fine-tune an off-the-shelf text-to-audio model. The experiment results demonstrate that our BATON can significantly improve the generation quality of the original text-to-audio models, concerning audio integrity, temporal relationship, and alignment with human preference.

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.16150 [pdf]

Sliding ferroelectric memories and synapses

Authors: Xiuzhen Li, Biao Qin, Yaxian Wang, Yue Xi, Zhiheng Huang, Mengze Zhao, Yalin Peng, Zitao Chen, Zitian Pan, Jundong Zhu, Chenyang Cui, Rong Yang, Wei Yang, Sheng Meng, Dongxia Shi, Xuedong Bai, Can Liu, Na Li, Jianshi Tang, Kaihui Liu, Luojun Du, Guangyu Zhang

Abstract: Ferroelectric materials with switchable electric polarization hold great promise for a plethora of emergent applications, such as post-Moore's law nanoelectronics, beyond-Boltzmann transistors, non-volatile memories, and above-bandgap photovoltaic devices. Recent advances have uncovered an exotic sliding ferroelectric mechanism, which enables the design of atomically thin ferroelectrics from non-ferroelectric parent monolayers. Although notable progress has been made in understanding its fundamental properties, functional devices based on sliding ferroelectrics, the key touchstone toward applications, remain elusive. Here, we demonstrate rewritable, non-volatile memory devices at room temperature utilizing a two-dimensional (2D) sliding ferroelectric semiconductor of rhombohedral-stacked bilayer molybdenum disulfide. The 2D sliding ferroelectric memories (SFeMs) show superior performance with a large memory window of >8 V, a high conductance ratio above $10^6$, a long retention time of >10 years, and a programming endurance greater than $10^4$ cycles. Remarkably, flexible SFeMs are achieved with state-of-the-art performance competitive with their rigid counterparts and maintain their performance after more than $10^3$ bending cycles. Furthermore, synapse-specific Hebbian forms of plasticity and image recognition with a high accuracy of 97.81% are demonstrated based on flexible SFeMs. Our work demonstrates sliding ferroelectric memories and synaptic plasticity on both rigid and flexible substrates, highlighting the great potential of sliding ferroelectrics for emerging technological applications in brain-inspired in-memory computing, edge intelligence and energy-efficient wearable electronics.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 16 pages, 4 figures

arXiv:2401.15654 [pdf, ps, other]

doi 10.1140/epjc/s10052-024-12856-w

Accretion of matter by a Charged dilaton black hole

Authors: Yinan Jia, Tong-Yu He, Wen-Qian Wang, Zhan-Wen Han, Rong-Jia Yang

Abstract: Considering accretion onto a charged dilaton black hole, the fundamental equations governing accretion, general analytic expressions for critical points, critical velocity, critical speed of sound, and ultimately the mass accretion rate are obtained. A new constraint on the dilaton parameter coming from string theory is found, and the case of polytropic gas is discussed in detail. It is found that the dilaton and the adiabatic index of the accreted material have deep effects on the accretion process.

Submitted 18 May, 2024; v1 submitted 28 January, 2024; originally announced January 2024.

Comments: 9 pages, 3 figures

Journal ref: Eur.Phys.J.C 84 (2024) 5, 501

arXiv:2401.14589 [pdf]

Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias

Authors: Yu He Ke, Rui Yang, Sui An Lie, Taylor Xin Yi Lim, Hairil Rizal Abdullah, Daniel Shu Wei Ting, Nan Liu

Abstract: Background: Cognitive biases in clinical decision-making significantly contribute to errors in diagnosis and suboptimal patient outcomes. Addressing these biases presents a formidable challenge in the medical field. Objective: This study explores the role of large language models (LLMs) in mitigating these biases through the utilization of a multi-agent framework. We simulate the clinical decision-making processes through multi-agent conversation and evaluate its efficacy in improving diagnostic accuracy. Methods: A total of 16 published and unpublished case reports where cognitive biases have resulted in misdiagnoses were identified from the literature. In the multi-agent framework, we leveraged GPT-4 to facilitate interactions among four simulated agents to replicate clinical team dynamics. Each agent has a distinct role: 1) to make the final diagnosis after considering the discussions, 2) to act as devil's advocate and correct confirmation and anchoring bias, 3) to tutor and facilitate the discussion to reduce premature closure bias, and 4) to record and summarize the findings. A total of 80 simulations were evaluated for the accuracy of initial diagnosis, top differential diagnosis and final two differential diagnoses. Results: In a total of 80 responses evaluating both initial and final diagnoses, the initial diagnosis had an accuracy of 0% (0/80), but following multi-agent discussions, the accuracy for the top differential diagnosis increased to 71.3% (57/80), and for the final two differential diagnoses, to 80.0% (64/80).
Conclusions: The framework demonstrated an ability to re-evaluate and correct misconceptions, even in scenarios with misleading initial investigations. The LLM-driven multi-agent conversation framework shows promise in enhancing diagnostic accuracy in diagnostically challenging medical scenarios.

Submitted 12 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: 21 pages, 3 figures

arXiv:2401.13488 [pdf, other]

Fast Inverse Model Transformation: Algebraic Framework for Fast Data Plane Verification

Authors: Shenshen Chen, Jian Luo, Dong Guo, Kai Gao, Yang Richard Yang

Abstract: Data plane verification (DPV) analyzes routing tables and detects routing abnormalities and policy violations during network operation and planning. Thus, it has become an important tool to harden the networking infrastructure and the computing systems built on top of it. Substantial advancements have been made in the last decade, and state-of-the-art DPV systems can achieve sub-microsecond verification for an update of a single forwarding rule. In this paper, we introduce fast inverse model transformation (FIMT), the first theoretical framework to systematically model and analyze centralized DPV systems. FIMT reveals the algebraic structure in the model update process, a key step in fast DPV systems. Thus, it can systematically analyze the correctness of several DPV systems, using algebraic properties. The theory also guides the design and implementation of NeoFlash, a refactored version of Flash with new optimization techniques. Evaluations show that NeoFlash outperforms existing state-of-the-art centralized DPV systems on various datasets and reveal insights into key techniques towards fast DPV.

Submitted 26 February, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: 12 pages pre-reference

arXiv:2401.13321

Temperature Compensation Method of Fluxgate Sensor Based on Polynomial Fitting

Authors: Ruiping Yang, Huan Liu, Jian Ge, Daisuke Chugo, Haobin Dong

Abstract: Fluxgate sensors are widely used in the field of low frequency and weak vector magnetic field measurement because of their good performance, such as high resolution and low power consumption. However, during long-term continuous observation, drift errors of the fluxgate sensor occur due to the variable ambient temperature. This paper proposes a temperature compensation method for fluxgate sensors based on polynomial fitting. First, a physical model of the temperature and fluxgate sensor was established on the COMSOL Multiphysics simulation platform, and the influence of temperature on the measurement performance of the fluxgate sensor was analyzed. Second, according to the existing temperature-magnetic field data, a temperature compensation model of the fluxgate sensor was constructed. A comparison with other temperature compensation methods shows that the proposed method is relatively simple and can better achieve real-time compensation for sensor application scenarios. Finally, to verify the effectiveness of the proposed method, numerous laboratory experiments were implemented. The temperature drift is reduced from more than 500 nT before compensation to about 1 nT. The results show that the proposed method has a good temperature compensation effect on the data measured by the fluxgate sensor within a variable temperature background.

Submitted 14 March, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

Comments: An error occurred in the model section

arXiv:2401.13298 [pdf, other]

Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models

Authors: Hongzhan Lin, Ziyang Luo, Wei Gao, Jing Ma, Bo Wang, Ruichao Yang

Abstract: The age of social media is flooded with Internet memes, necessitating a clear grasp and effective identification of harmful ones. This task presents a significant challenge due to the implicit meaning embedded in memes, which is not explicitly conveyed through the surface text and image. However, existing harmful meme detection methods do not present readable explanations that unveil such implicit meaning to support their detection decisions. In this paper, we propose an explainable approach to detect harmful memes, achieved through reasoning over conflicting rationales from both harmless and harmful positions. Specifically, inspired by the powerful capacity of Large Language Models (LLMs) on text generation and reasoning, we first elicit multimodal debate between LLMs to generate the explanations derived from the contradictory arguments. Then we propose to fine-tune a small language model as the debate judge for harmfulness inference, to facilitate multimodal fusion between the harmfulness rationales and the intrinsic multimodal information within memes. In this way, our model is empowered to perform dialectical reasoning over intricate and implicit harm-indicative patterns, utilizing multimodal explanations originating from both harmless and harmful arguments. Extensive experiments on three public meme datasets demonstrate that our harmful meme detection approach achieves much better performance than state-of-the-art methods and exhibits a superior capacity for explaining the meme harmfulness of the model predictions.

Submitted 24 January, 2024; originally announced January 2024.

Comments: The first work towards explainable harmful meme detection by harnessing advanced LLMs

Journal ref: The ACM Web Conference 2024

arXiv:2401.11521 [pdf, other]

Quantum-enhanced Green's function Monte Carlo for excited states of nuclear shell model

Authors: Yongdan Yang, Ruyu Yang, Xiaosi Xu

Abstract: We present a hybrid quantum-classical Green's function Monte Carlo (GFMC) algorithm for estimating the excited states of the nuclear shell model. The conventional GFMC method, widely used to find the ground state of a quantum many-body system, is plagued by the sign problem, which leads to an exponentially increasing variance with the growth of system size and evolution time. This issue is typically mitigated by applying classical constraints but at the cost of introducing bias. Our approach uses quantum subspace diagonalization (QSD) on a quantum computer to prepare a quantum trial state, replacing the classical trial state in the GFMC process. We also incorporated a modified classical shadow technique in the implementation of QSD to optimize quantum resource utilization. Besides, we extend our hybrid GFMC algorithm to find the excited states of a given quantum system. Numerical results suggest our method largely enhances accuracy in determining excited state energies, offering an improvement over the conventional method.

Submitted 21 January, 2024; originally announced January 2024.

Comments: 9 pages, 4 figures

arXiv:2401.06440 [pdf, other]

Limits on the Primordial Black Holes Dark Matter with future MeV detectors

Authors: Zhen Xie, Bing Liu, Jiahao Liu, Yi-Fu Cai, Ruizhi Yang

Abstract: Primordial black holes (PBHs) are a compelling candidate for Dark Matter (DM). Significant regions of parameter space remain to be explored, although current astrophysical observations have set strong limits. Utilizing advanced MeV observation instruments, we have statistically established the upper limit of Hawking radiation emitted by PBHs in DM-dense systems, such as galaxy clusters or dwarf galaxies. These results can set a stringent upper limit on the ratio of PBH to DM, expressed as $f_{\rm PBH}$. Our results highlight the efficacy of MeV observations in DM-dense environments. The constraints on $f_{\rm PBH}$ for PBHs in the mass range of $10^{16}-10^{17} ~\rm g$ can be improved significantly compared with the current observations.

Submitted 12 January, 2024; originally announced January 2024.

Comments: 7 pages, 6 figures, accepted by PRD

arXiv:2401.05607 [pdf, other]

Room-temperature Magnetic Thermal Switching by Suppressing Phonon-Magnon Scattering

Authors: Fanghao Zhang, Lokanath Patra, Yubi Chen, Wenkai Ouyang, Paul Sarte, Shantal Adajian, Xiangying Zuo, Runqing Yang, Tengfei Luo, Bolin Liao

Abstract: Thermal switching materials, whose thermal conductivity can be controlled externally, show great potential in contemporary thermal management. Manipulating thermal transport properties through magnetic fields has been accomplished in materials that exhibit a high magnetoresistance. However, it is generally understood that the lattice thermal conductivity attributed to phonons is not significantly impacted by the magnetic fields. In this study, we experimentally demonstrate the significant impact of phonon-magnon scattering on the thermal conductivity of the rare-earth metal gadolinium near room temperature, which can be controlled by a magnetic field to realize thermal switching. Using first-principles lattice dynamics and spin-lattice dynamics simulations, we attribute the observed change in phononic thermal conductivity to field-suppressed phonon-magnon scattering. This research suggests that phonon-magnon scattering in ferromagnetic materials is crucial for determining their thermal conductivity, opening the door to innovative magnetic-field-controlled thermal switching materials.

Submitted 10 January, 2024; originally announced January 2024.

arXiv:2401.05584 [pdf]

FourCastNeXt: Optimizing FourCastNet Training for Limited Compute

Authors: Edison Guo, Maruf Ahmed, Yue Sun, Rui Yang, Harrison Cook, Tennessee Leeuwenburg, Ben Evans

Abstract: FourCastNeXt is an optimization of FourCastNet - a global machine learning weather forecasting model - that performs with a comparable level of accuracy and can be trained using around 5% of the original FourCastNet computational requirements. This technical report presents strategies for model optimization that maintain similar performance as measured by the root-mean-square error (RMSE) of the modelled variables. By providing a model with very low comparative training costs, FourCastNeXt makes Neural Earth System Modelling much more accessible to researchers looking to conduct training experiments and ablation studies. FourCastNeXt training and inference code are available at https://github.com/nci/FourCastNeXt

Submitted 20 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

Comments: Major revision. All prior content (text, figures, table) has been updated. Additionally, new text, tables and figures have been added. Updated title. Updated author list

arXiv:2401.04886 [pdf]

Thermodynamics of Ionic Thermoelectrics for Low-Grade Heat Harvesting

Authors: Xin Qian, Zhihao Ma, Qiangqiang Huang, Haoran Jiang, Ronggui Yang

Abstract: More than half of the waste heat rejected into the environment has temperatures lower than 100 $^\circ$C, which accounts for nearly 85 PWh/year worldwide. Efficiently harvesting low-grade heat could be a promising step toward carbon neutrality. Recent developments of ionic thermoelectrics (i-TE) with giant thermopower have provoked intensive interest in using ions as energy and charge carriers for efficient thermal energy harvesting. However, current literature primarily focuses on improving thermopower only, while the ion transport and thermodynamics affecting the efficiencies have been largely neglected. This review article clarifies the fundamentals of electrochemistry and thermodynamics for developing highly efficient i-TE devices. Two major types of i-TE devices, thermo-ionic capacitors (TIC) and thermogalvanic cells (TGC), are discussed in detail. The article analyzes the methods of enhancing ionic thermopower in the literature by taking an entropic point of view. We also derived modified thermoelectric factor Z for both TICs and TGCs that fully incorporate the dynamics of ion transport and electrochemical reactions. Recent developments of hybrid devices showing improved efficiencies, power density, and multifunctionality are reviewed. Finally, we comment on the remaining challenges and provide an outlook on future directions.

Submitted 9 January, 2024; originally announced January 2024.

Comments: This review article is accepted by ACS Energy Letters. (76 manuscript pages, 23 Figures.)

arXiv:2401.03214 [pdf, other]

Understanding Representation Learnability of Nonlinear Self-Supervised Learning

Authors: Ruofeng Yang, Xiangyuan Li, Bo Jiang, Shuai Li

Abstract: Self-supervised learning (SSL) has empirically shown its data representation learnability in many downstream tasks. There are only a few theoretical works on data representation learnability, and many of those focus on final data representation, treating the nonlinear neural network as a "black box". However, the accurate learning results of neural networks are crucial for describing the data distribution features learned by SSL models. Our paper is the first to analyze the learning results of the nonlinear SSL model accurately. We consider a toy data distribution that contains two features: the label-related feature and the hidden feature. Unlike previous linear setting work that depends on closed-form solutions, we use the gradient descent algorithm to train a 1-layer nonlinear SSL model with a certain initialization region and prove that the model converges to a local minimum. Furthermore, different from the complex iterative analysis, we propose a new analysis process which uses the exact version of Inverse Function Theorem to accurately describe the features learned by the local minimum. With this local minimum, we prove that the nonlinear SSL model can capture the label-related feature and hidden feature at the same time. In contrast, the nonlinear supervised learning (SL) model can only learn the label-related feature. We also present the learning processes and results of the nonlinear SSL and SL model via simulation experiments.

Submitted 6 January, 2024; originally announced January 2024.

arXiv:2401.01523 [pdf, other]

GOAT-Bench: Safety Insights to Large Multimodal Models through Meme-Based Social Abuse

Authors: Hongzhan Lin, Ziyang Luo, Bo Wang, Ruichao Yang, Jing Ma

Abstract: The exponential growth of social media has profoundly transformed how information is created, disseminated, and absorbed, exceeding any precedent in the digital age. Regrettably, this explosion has also spawned a significant increase in the online abuse of memes. Evaluating the negative impact of memes is notably challenging, owing to their often subtle and implicit meanings, which are not directly conveyed through the overt text and imagery. In light of this, large multimodal models (LMMs) have emerged as a focal point of interest due to their remarkable capabilities in handling diverse multimodal tasks. In response to this development, our paper aims to thoroughly examine the capacity of various LMMs (e.g., GPT-4V) to discern and respond to the nuanced aspects of social abuse manifested in memes. We introduce the comprehensive meme benchmark, GOAT-Bench, comprising over 6K varied memes encapsulating themes such as implicit hate speech, sexism, and cyberbullying, etc. Utilizing GOAT-Bench, we delve into the ability of LMMs to accurately assess hatefulness, misogyny, offensiveness, sarcasm, and harmful content. Our extensive experiments across a range of LMMs reveal that current models still exhibit a deficiency in safety awareness, showing insensitivity to various forms of implicit abuse. We posit that this shortfall represents a critical impediment to the realization of safe artificial intelligence. The GOAT-Bench and accompanying resources are publicly accessible at https://goatlmm.github.io/, contributing to ongoing research in this vital field.

Submitted 1 March, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

Comments: The first work to benchmark Large Multimodal Models in safety insight on social media

arXiv:2401.01045 [pdf, other]

High-dimensional FGM-ResNet modelling of turbulent spray combustion: Effects of evaporation non-adiabacity and scalar correlation

Authors: Dong Wang, Haixiao Wang, Min Zhang, Ruixin Yang, Zhi X. Chen

Abstract: In the stratified or partially premixed piloted jet flames, previous experimental and a priori studies have identified a strong correlation between mixture fraction and progress variable. In the framework of large-eddy simulation (LES) and flamelet-generated manifolds (FGM) approach, a joint probability density function (PDF) method is constructed to characterize subgrid correlations. To pave the way for high dimensional tabulation modeling, a deep residual network (ResNet) is trained, dramatically reducing the memory footprint of tabulation. The Message Passing Interface (MPI) shared memory technique is applied to load the original chemical table during parallel computations. Application of LES to a partially pre-vaporized ethanol spray flame demonstrates good agreement with experimental results. Consideration of the subgrid correlation results in a noticeable improvement in temperature prediction. Calculations using ResNet show a notable consistency with those using chemical tables. Visualization of enthalpy highlights the significance of non-adiabatic tabulation in modeling liquid fuel combustion. The unscaled progress variable is selected to better describe the chemical reaction rate in the blending zone of an air stream and a pilot stream with the product of a fully burnt lean fuel mixture. The impact of the source term due to evaporation in the transport equation of the progress variable is validated. The correlation coefficient is found to significantly influence the chemical reaction rate. The subgrid-scale interaction between liquid fuel evaporation and subgrid correlation is elucidated.

Submitted 2 January, 2024; originally announced January 2024.

arXiv:2401.00001 [pdf, other]

Sector Rotation by Factor Model and Fundamental Analysis

Authors: Runjia Yang, Beining Shi

Abstract: This study presents an analytical approach to sector rotation, leveraging both factor models and fundamental metrics. We begin with a systematic classification of sectors, followed by an empirical investigation into their returns. Through factor analysis, the paper underscores the significance of momentum and short-term reversion in dictating sectoral shifts. A subsequent in-depth fundamental analysis evaluates metrics such as PE, PB, EV-to-EBITDA, Dividend Yield, among others. Our primary contribution lies in developing a predictive framework based on these fundamental indicators. The constructed models, post rigorous training, exhibit noteworthy predictive capabilities. The findings furnish a nuanced understanding of sector rotation strategies, with implications for asset management and portfolio construction in the financial domain.

Submitted 18 November, 2023; originally announced January 2024.

arXiv:2312.17061 [pdf, other]

Bayesian Analysis of High Dimensional Vector Error Correction Model

Authors: Parley R Yang, Alexander Y Shestopaloff

Abstract: The Vector Error Correction Model (VECM) is a classic method to analyse cointegration relationships amongst multivariate non-stationary time series. In this paper, we focus on the high dimensional setting and seek a sample-size-efficient methodology to determine the level of cointegration. Our investigation centres on a Bayesian approach to analyse the cointegration matrix, thereby determining the cointegration rank. We design two algorithms and implement them on simulated examples, yielding promising results particularly when dealing with a high number of variables and a relatively low number of observations. Furthermore, we extend this methodology to empirically investigate the constituents of the S&P 500 index, where low-volatility portfolios can be found during both in-sample training and out-of-sample testing periods.

Submitted 12 March, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

arXiv:2312.16926 [pdf]

Efficient High-Quality Clustering for Large Bipartite Graphs

Authors: Renchi Yang, Jieming Shi

Abstract: A bipartite graph contains inter-set edges between two disjoint vertex sets, and is widely used to model real-world data, such as user-item purchase records, author-article publications, and biological interactions between drugs and proteins. k-Bipartite Graph Clustering (k-BGC) is to partition the target vertex set in a bipartite graph into k disjoint clusters. The clustering quality is important to the utility of k-BGC in various applications like social network analysis, recommendation systems, text mining, and bioinformatics, to name a few. Existing approaches to k-BGC either output clustering results with compromised quality due to inadequate exploitation of high-order information between vertices, or fail to handle sizable bipartite graphs with billions of edges. Motivated by this, this paper presents two efficient k-BGC solutions, HOPE and HOPE+, which achieve state-of-the-art performance on large-scale bipartite graphs. HOPE obtains high scalability and effectiveness through a new k-BGC problem formulation based on the novel notion of high-order perspective (HOP) vectors and an efficient technique for low-rank approximation of HOP vectors. HOPE+ further elevates the k-BGC performance to another level with a judicious problem transformation and a highly efficient two-stage optimization framework. Two variants, HOPE+ (FNEM) and HOPE+ (SNEM) are designed when either the Frobenius norm or spectral norm is applied in the transformation.
Extensive experiments, comparing HOPE and HOPE+ against 13 competitors on 10 real-world datasets, exhibit that our solutions, especially HOPE+, are superior to existing methods in terms of result quality, while being up to orders of magnitude faster. On the largest dataset MAG with 1.1 billion edges, HOPE+ is able to produce clusters with the highest clustering accuracy within 31 minutes, which is unmatched by any existing solution for k-BGC.

Submitted 28 December, 2023; originally announced December 2023.

Comments: A paper accepted in SIGMOD 2024

arXiv:2312.16387 [pdf, other]

A comprehensive study on the accuracy and generalization of deep learning-generated chemical ODE integrators

Authors: Han Li, Ruixin Yang, Min Zhang, Runze Mao, Zhi X. Chen

Abstract: The application of deep neural networks (DNNs) holds considerable promise as a substitute for the direct integration of chemical source terms in combustion simulations. However, challenges persist in ensuring high precision and generalisation across various different fuels and flow conditions. In this study, we propose and validate a consistent DNN approach for chemistry integration in a range of fuels and premixed flame configurations. This approach generates thermochemical base state from a set of low-dimensional laminar flames, followed by an effective perturbation strategy to enhance the coverage of the composition space for higher generalisation ability. A constraint criterion based on heat release rate is then employed to remove the nonphysical perturbed states for improved accuracy. Without specific tuning, three DNNs are consistently trained for three representative fuels, i.e., hydrogen, ethylene and Jet-A. Comprehensive validations are conducted using 1-D laminar flames and two typical turbulent premixed flames. The DNN model predictions on various physical characteristics, including laminar and turbulent flame speeds, dynamic flame structures influenced by turbulence-chemistry interactions, and conditional scalar profiles, all exhibit good agreement with the results obtained from direct integration. This demonstrates the exceptional accuracy and generalisation ability of the proposed DNN approach. Furthermore, when the DNN is used in the simulation, a significant speed-up for the chemistry integration is achieved, approximately 50 for the ethylene/air flame and 90 for the Jet-A/air flame.

Submitted 26 December, 2023; originally announced December 2023.

arXiv:2312.15742 [pdf, other]

DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

Authors: Li Xiang, Junbo Yin, Wei Li, Cheng-Zhong Xu, Ruigang Yang, Jianbing Shen

Abstract: Vehicle-to-Everything (V2X) collaborative perception has recently gained significant attention due to its capability to enhance scene understanding by integrating information from various agents, e.g., vehicles, and infrastructure. However, current works often treat the information from each agent equally, ignoring the inherent domain gap caused by the utilization of different LiDAR sensors of each agent, thus leading to suboptimal performance. In this paper, we propose DI-V2X, which aims to learn Domain-Invariant representations through a new distillation framework to mitigate the domain discrepancy in the context of V2X 3D object detection. DI-V2X comprises three essential components: a domain-mixing instance augmentation (DMA) module, a progressive domain-invariant distillation (PDD) module, and a domain-adaptive fusion (DAF) module. Specifically, DMA builds a domain-mixing 3D instance bank for the teacher and student models during training, resulting in aligned data representation. Next, PDD encourages the student models from different domains to gradually learn a domain-invariant feature representation towards the teacher, where the overlapping regions between agents are employed as guidance to facilitate the distillation process. Furthermore, DAF closes the domain gap between the students by incorporating calibration-aware domain-adaptive attention. Extensive experiments on the challenging DAIR-V2X and V2XSet benchmark datasets demonstrate DI-V2X achieves remarkable performance, outperforming all the previous V2X models. Code is available at https://github.com/Serenos/DI-V2X

Submitted 25 December, 2023; originally announced December 2023.

Comments: AAAI 2024

arXiv:2312.13877 [pdf, other]

A complete continuous-variable quantum computation architecture: from cluster state generation to fault-tolerant accomplishment

Authors: Peilin Du, **g Zhang, Tiancai Zhang, Rongguo Yang, Jiangrui Gao

Abstract: Continuous-variable measurement-based quantum computation, which requires deterministically generated large-scale cluster state, is a promising candidate for practical, scalable, universal, and fault-tolerant quantum computation. In this work, a complete architecture including cluster state preparation, gate implementations, and error correction, is demonstrated. First, a scheme for generating two-dimensional large-scale continuous-variable cluster state by multiplexing both the temporal and spatial domains is proposed. Then, the corresponding gate implementations for universal quantum computation by gate teleportation are discussed and the actual gate noise from the generated cluster state and Gottesman-Kitaev-Preskill (GKP) state are considered. After that, the quantum error correction can be further achieved by utilizing the square-lattice GKP code. Finally, a fault-tolerant quantum computation can be realized by introducing bias into the square-lattice GKP code (to protect against phase-flips) and concatenating a classical repetition code (to handle the residual bit-flip errors), with a squeezing threshold of 12.3 dB. Our work provides a possible option for a complete fault-tolerant quantum computation architecture in the future.

Submitted 31 January, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 12 pages, 12 figures

arXiv:2312.11908 [pdf, ps, other]

doi 10.1007/JHEP05(2024)115

Stability of the de-Sitter universe: One-loop nonlocal $f(R)$ gravity

Authors: Haiyuan Feng, Yi Liao, Rong-Jia Yang

Abstract: With the method of the background field expansion, we investigate the one-loop quantization of the Euclidean nonlocal $f(R)$ model in the de-Sitter universe. We obtain the ghost-free condition (GFC) based on the transformation from the Jordan frame to the Einstein frame and the classical stability condition (CSC) satisfied $f^{(0)}_{RR}-φ_0F^{(0)}_{RR}<0$. We present the on-shell and off-shell one-loop effective action and quantum stability condition (QSC) by utilizing the generalized zeta-function. We find that under the fulfillment of GFC, CSC and QSC are inconsistent.

Submitted 10 May, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

Comments: 26 pages, 0 figures

Journal ref: JHEP 05 (2024) 115

arXiv:2312.10917 [pdf, other]

Semi-Supervised Clustering via Structural Entropy with Different Constraints

Authors: Guangjie Zeng, Hao Peng, Angsheng Li, Zhiwei Liu, Runze Yang, Chunyang Liu, Lifang He

Abstract: Semi-supervised clustering techniques have emerged as valuable tools for leveraging prior information in the form of constraints to improve the quality of clustering outcomes. Despite the proliferation of such methods, the ability to seamlessly integrate various types of constraints remains limited. While structural entropy has proven to be a powerful clustering approach with wide-ranging applications, it has lacked a variant capable of accommodating these constraints. In this work, we present Semi-supervised clustering via Structural Entropy (SSE), a novel method that can incorporate different types of constraints from diverse sources to perform both partitioning and hierarchical clustering. Specifically, we formulate a uniform view for the commonly used pairwise and label constraints for both types of clustering. Then, we design objectives that incorporate these constraints into structural entropy and develop tailored algorithms for their optimization. We evaluate SSE on nine clustering datasets and compare it with eleven semi-supervised partitioning and hierarchical clustering methods. Experimental results demonstrate the superiority of SSE on clustering accuracy with different types of constraints. Additionally, the functionality of SSE for biological data analysis is demonstrated by cell clustering experiments conducted on four single-cell RNAseq datasets.

Submitted 17 December, 2023; originally announced December 2023.

Comments: 9 pages, 3 figures, accepted by SDM 2024

arXiv:2312.10317 [pdf, other]

Spatial-Temporal DAG Convolutional Networks for End-to-End Joint Effective Connectivity Learning and Resting-State fMRI Classification

Authors: Rui Yang, Wenrui Dai, Huajun She, Yi** P. Du, Dapeng Wu, Hongkai Xiong

Abstract: Building comprehensive brain connectomes has proved of fundamental importance in resting-state fMRI (rs-fMRI) analysis. Based on the foundation of brain network, spatial-temporal-based graph convolutional networks have dramatically improved the performance of deep learning methods in rs-fMRI time series classification. However, existing works either pre-define the brain network as the correlation matrix derived from the raw time series or jointly learn the connectome and model parameters without any topology constraint. These methods could suffer from degraded classification performance caused by the deviation from the intrinsic brain connectivity and lack biological interpretability of demonstrating the causal structure (i.e., effective connectivity) among brain regions. Moreover, most existing methods for effective connectivity learning are unaware of the downstream classification task and cannot sufficiently exploit useful rs-fMRI label information. To address these issues in an end-to-end manner, we model the brain network as a directed acyclic graph (DAG) to discover direct causal connections between brain regions and propose Spatial-Temporal DAG Convolutional Network (ST-DAGCN) to jointly infer effective connectivity and classify rs-fMRI time series by learning brain representations based on nonlinear structural equation model. The optimization problem is formulated into a continuous program and solved with score-based learning method via gradient descent. We evaluate ST-DAGCN on two public rs-fMRI databases. Experiments show that ST-DAGCN outperforms existing models by evident margins in rs-fMRI classification and simultaneously learns meaningful edges of effective connectivity that help understand brain activity patterns and pathological mechanisms in brain disease.

Submitted 15 December, 2023; originally announced December 2023.

Comments: Accepted by NeurIPS 2023 Temporal Graph Learning Workshop

arXiv:2312.10310 [pdf, other]

scBiGNN: Bilevel Graph Representation Learning for Cell Type Classification from Single-cell RNA Sequencing Data

Authors: Rui Yang, Wenrui Dai, Chenglin Li, Junni Zou, Dapeng Wu, Hongkai Xiong

Abstract: Single-cell RNA sequencing (scRNA-seq) technology provides high-throughput gene expression data to study the cellular heterogeneity and dynamics of complex organisms. Graph neural networks (GNNs) have been widely used for automatic cell type classification, which is a fundamental problem to solve in scRNA-seq analysis. However, existing methods do not sufficiently exploit both gene-gene and cell-cell relationships, and thus the true potential of GNNs is not realized. In this work, we propose a bilevel graph representation learning method, named scBiGNN, to simultaneously mine the relationships at both gene and cell levels for more accurate single-cell classification. Specifically, scBiGNN comprises two GNN modules to identify cell types. A gene-level GNN is established to adaptively learn gene-gene interactions and cell representations via the self-attention mechanism, and a cell-level GNN builds on the cell-cell graph that is constructed from the cell representations generated by the gene-level GNN. To tackle the scalability issue for processing a large number of cells, scBiGNN adopts an Expectation Maximization (EM) framework in which the two modules are alternately trained via the E-step and M-step to learn from each other. Through this interaction, the gene- and cell-level structural information is integrated to gradually enhance the classification performance of both GNN modules. Experiments on benchmark datasets demonstrate that our scBiGNN outperforms a variety of existing methods for cell type classification from scRNA-seq data.

Submitted 15 December, 2023; originally announced December 2023.

Comments: Accepted by NeurIPS 2023 AI for Science Workshop

arXiv:2312.09225 [pdf, ps, other]

Gaussian Process Regression under Computational and Epistemic Misspecification

Authors: Daniel Sanz-Alonso, Ruiyi Yang

Abstract: Gaussian process regression is a classical kernel method for function estimation and data interpolation. In large data applications, computational costs can be reduced using low-rank or sparse approximations of the kernel. This paper investigates the effect of such kernel approximations on the interpolation error. We introduce a unified framework to analyze Gaussian process regression under important classes of computational misspecification: Karhunen-Loève expansions that result in low-rank kernel approximations, multiscale wavelet expansions that induce sparsity in the covariance matrix, and finite element representations that induce sparsity in the precision matrix. Our theory also accounts for epistemic misspecification in the choice of kernel parameters.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.07904 [pdf, other]

ForMAX -- a beamline for multiscale and multimodal structural characterization of hierarchical materials

Authors: K. Nygård, S. A. McDonald, J. B. González, V. Haghighat, C. Appel, E. Larsson, R. Ghanbari, M. Viljanen, J. Silva, S. Malki, Y. Li, V. Silva, C. Weninger, F. Engelmann, T. Jeppsson, G. Felcsuti, T. Rosén, K. Gordeyeva, L. D. Söderberg, H. Dierks, Y. Zhang, Z. Yao, R. Yang, E. M. Asimakopoulou, J. K. Rogalinski, et al. (13 additional authors not shown)

Abstract: The ForMAX beamline at the MAX IV Laboratory provides multiscale and multimodal structural characterization of hierarchical materials in the nm to mm range by combining small- and wide-angle x-ray scattering with full-field microtomography. The modular design of the beamline is optimized for easy switching between different experimental modalities. The beamline has a special focus on the development of novel, fibrous materials from forest resources, but it is also well suited for studies within, e.g., food science and biomedical research.

Submitted 2 February, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Comments: Accepted for publication in J. Synchrotron Rad

arXiv:2312.06934

Toward Real Text Manipulation Detection: New Dataset and New Solution

Authors: Dongliang Luo, Yuliang Liu, Rui Yang, Xian** Liu, Jishen Zeng, Yu Zhou, Xiang Bai

Abstract: With the surge in realistic text tampering, detecting fraudulent text in images has gained prominence for maintaining information security. However, the high costs associated with professional text manipulation and annotation limit the availability of real-world datasets, with most relying on synthetic tampering, which inadequately replicates real-world tampering attributes. To address this issue, we present the Real Text Manipulation (RTM) dataset, encompassing 14,250 text images, which include 5,986 manually and 5,258 automatically tampered images, created using a variety of techniques, alongside 3,006 unaltered text images for evaluating solution stability. Our evaluations indicate that existing methods falter in text forgery detection on the RTM dataset. We propose a robust baseline solution featuring a Consistency-aware Aggregation Hub and a Gated Cross Neighborhood-attention Fusion module for efficient multi-modal information fusion, supplemented by a Tampered-Authentic Contrastive Learning module during training, enriching feature representation distinction. This framework, extendable to other dual-stream architectures, demonstrated notable localization performance improvements of 7.33% and 6.38% on manual and overall manipulations, respectively. Our contributions aim to propel advancements in real-world text tampering detection. Code and dataset will be made available at https://github.com/DrLuo/RTM

Submitted 23 January, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

Comments: The paper needs to be improved

arXiv:2312.06724 [pdf, other]

doi 10.1145/3485447.3511959

Efficient and Effective Similarity Search over Bipartite Graphs

Authors: Renchi Yang

Abstract: Similarity search over a bipartite graph aims to retrieve from the graph the nodes that are similar to each other, which finds applications in various fields such as online advertising, recommender systems etc. Existing similarity measures either (i) overlook the unique properties of bipartite graphs, or (ii) fail to capture high-order information between nodes accurately, leading to suboptimal result quality. Recently, Hidden Personalized PageRank (HPP) is applied to this problem and found to be more effective compared with prior similarity measures. However, existing solutions for HPP computation incur significant computational costs, rendering it inefficient especially on large graphs. In this paper, we first identify an inherent drawback of HPP and overcome it by proposing bidirectional HPP (BHPP). Then, we formulate similarity search over bipartite graphs as the problem of approximate BHPP computation, and present an efficient solution Approx-BHPP. Specifically, Approx-BHPP offers rigorous theoretical accuracy guarantees with optimal computational complexity by combining deterministic graph traversal with matrix operations in an optimized and non-trivial way. Moreover, our solution achieves significant gain in practical efficiency due to several carefully-designed optimizations. Extensive experiments, comparing BHPP against 8 existing similarity measures over 7 real bipartite graphs, demonstrate the effectiveness of BHPP on query rewriting and item recommendation. Moreover, Approx-BHPP outperforms baseline solutions often by up to orders of magnitude in terms of computational time on both small and large datasets.

Submitted 10 December, 2023; originally announced December 2023.

Comments: Best Paper Award Nominee in WWW 2022. Fixing the incorrect figures

arXiv:2312.06639 [pdf, other]

Harmonic Mobile Manipulation

Authors: Ruihan Yang, Ye** Kim, Aniruddha Kembhavi, Xiaolong Wang, Kiana Ehsani

Abstract: Recent advancements in robotics have enabled robots to navigate complex scenes or manipulate diverse objects independently. However, robots are still impotent in many household tasks requiring coordinated behaviors such as opening doors. The factorization of navigation and manipulation, while effective for some tasks, fails in scenarios requiring coordinated actions. To address this challenge, we introduce HarmonicMM, an end-to-end learning method that optimizes both navigation and manipulation, showing notable improvement over existing techniques in everyday tasks. This approach is validated in simulated and real-world environments and adapts to novel unseen settings without additional tuning. Our contributions include a new benchmark for mobile manipulation and the successful deployment in a real unseen apartment, demonstrating the potential for practical indoor robot deployment in daily life. More results are on our project site: https://rchalyang.github.io/HarmonicMM/

Submitted 11 December, 2023; originally announced December 2023.

Comments: More results are on our project site: https://rchalyang.github.io/HarmonicMM/

arXiv:2312.06331 [pdf, other]

Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation

Authors: Dong Zhao, Ruizhi Yang, Shuang Wang, Qi Zang, Yang Hu, Licheng Jiao, Nicu Sebe, Zhun Zhong

Abstract: Presently, self-training stands as a prevailing approach in cross-domain semantic segmentation, enhancing model efficacy by training with pixels assigned with reliable pseudo-labels. However, we find two critical limitations in this paradigm. (1) The majority of reliable pixels exhibit a speckle-shaped pattern and are primarily located in the central semantic region. This presents challenges for the model in accurately learning semantics. (2) Category noise in speckle pixels is difficult to locate and correct, leading to error accumulation in self-training. To address these limitations, we propose a novel approach called Semantic Connectivity-driven pseudo-labeling (SeCo). This approach formulates pseudo-labels at the connectivity level and thus can facilitate learning structured and low-noise semantics. Specifically, SeCo comprises two key components: Pixel Semantic Aggregation (PSA) and Semantic Connectivity Correction (SCC). Initially, PSA divides semantics into 'stuff' and 'things' categories and aggregates speckled pseudo-labels into semantic connectivity through efficient interaction with the Segment Anything Model (SAM). This enables us not only to obtain accurate boundaries but also simplifies noise localization. Subsequently, SCC introduces a simple connectivity classification task, which enables locating and correcting connectivity noise with the guidance of loss distribution. Extensive experiments demonstrate that SeCo can be flexibly applied to various cross-domain semantic segmentation tasks, including traditional unsupervised, source-free, and black-box domain adaptation, significantly improving the performance of existing state-of-the-art methods. The code is available at https://github.com/DZhaoXd/SeCo.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.06123 [pdf, other]

doi 10.1145/3588696

Efficient Estimation of Pairwise Effective Resistance

Authors: Renchi Yang, **g Tang

Abstract: Given an undirected graph G, the effective resistance r(s,t) measures the dissimilarity of node pair s,t in G, which finds numerous applications in real-world problems, such as recommender systems, combinatorial optimization, molecular chemistry, and electric power networks. Existing techniques towards pairwise effective resistance estimation either trade approximation guarantees for practical efficiency, or vice versa. In particular, the state-of-the-art solution is based on a multitude of Monte Carlo random walks, rendering it rather inefficient in practice, especially on large graphs. Motivated by this, this paper first presents an improved Monte Carlo approach, AMC, which reduces both the length and amount of random walks required without degrading the theoretical accuracy guarantee, through careful theoretical analysis and an adaptive sampling scheme. Further, we develop a greedy approach, GEER, which combines AMC with sparse matrix-vector multiplications in an optimized and non-trivial way. GEER offers significantly improved practical efficiency over AMC without compromising its asymptotic performance and accuracy guarantees. Extensive experiments on multiple benchmark datasets reveal that GEER is orders of magnitude faster than the state of the art in terms of computational time when achieving the same accuracy.

Submitted 11 December, 2023; originally announced December 2023.

Comments: A paper published in SIGMOD 2023

arXiv:2312.06071 [pdf, other]

Precipitation Downscaling with Spatiotemporal Video Diffusion

Authors: Prakhar Srivastava, Ruihan Yang, Gavin Kerrigan, Gideon Dresdner, Jeremy McGibbon, Christopher Bretherton, Stephan Mandt

Abstract: In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods. Statistical downscaling, or super-resolution, is a common workaround where a low-resolution prediction is improved using statistical approaches. Unlike traditional computer vision tasks, weather and climate applications require capturing the accurate conditional distribution of high-resolution given low-resolution patterns to assure reliable ensemble averages and unbiased estimates of extreme events, such as heavy rain. This work extends recent video diffusion models to precipitation super-resolution, employing a deterministic downscaler followed by a temporally-conditioned diffusion model to capture noise characteristics and high-frequency patterns. We test our approach on FV3GFS output, an established large-scale global atmosphere model, and compare it against six state-of-the-art baselines. Our analysis, capturing CRPS, MSE, precipitation distributions, and qualitative aspects using California and the Himalayas as examples, establishes our method as a new standard for data-driven precipitation downscaling.

Submitted 20 June, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

arXiv:2312.05912 [pdf, other]

Circular objects do not melt the slowest in water

Authors: Rui Yang, Thijs van den Ham, Roberto Verzicco, Detlef Lohse, Sander G. Huisman

Abstract: We report on the melting dynamics of ice suspended in fresh water and subject to natural convective flows. Using direct numerical simulations we investigate the melt rate of ellipsoidal objects for $2.32\times 10^4 \leq \text{Ra} \leq 7.61\times 10^8$, where $\text{Ra}$ is the Rayleigh number defined with the temperature difference between the ice and the surrounding water. We reveal that the system exhibits non-monotonic behavior in three control parameters. As a function of the aspect ratio of the ellipsoid, the melting time shows a distinct minimum that is different from a disk which has the minimum perimeter. Furthermore, also with $\text{Ra}$ the system shows a non-monotonic trend, since for large $\text{Ra}$ and large aspect ratio the flow separates, leading to distinctly different dynamics. Lastly, since the density of water is non-monotonic with temperature, the melt rate depends non-monotonically also on the ambient temperature, as for intermediate temperatures ($\unit{4}{\celsius}$--$\unit{7}{\celsius}$) the flow is (partially) reversed. In general, the shape which melts the slowest is quite distinct from that of a disk.

Submitted 10 December, 2023; originally announced December 2023.

Showing 101–150 of 1,309 results for author: Yang, R