Bunny-VisionPro: Real-Time Bimanual Dexterous Teleoperation for Imitation Learning

Authors: Runyu Ding, Yuzhe Qin, Jiyue Zhu, Chengzhe Jia, Shiqi Yang, Ruihan Yang, Xiaojuan Qi, Xiaolong Wang

Abstract: Teleoperation is a crucial tool for collecting human demonstrations, but controlling robots with bimanual dexterous hands remains a challenge. Existing teleoperation systems struggle to handle the complexity of coordinating two hands for intricate manipulations. We introduce Bunny-VisionPro, a real-time bimanual dexterous teleoperation system that leverages a VR headset. Unlike previous vision-based teleoperation systems, we design novel low-cost devices to provide haptic feedback to the operator, enhancing immersion. Our system prioritizes safety by incorporating collision and singularity avoidance while maintaining real-time performance through innovative designs. Bunny-VisionPro outperforms prior systems on a standard task suite, achieving higher success rates and reduced task completion times. Moreover, the high-quality teleoperation demonstrations improve downstream imitation learning performance, leading to better generalizability. Notably, Bunny-VisionPro enables imitation learning with challenging multi-stage, long-horizon dexterous manipulation tasks, which have rarely been addressed in previous work. Our system's ability to handle bimanual manipulations while prioritizing safety and real-time performance makes it a powerful tool for advancing dexterous manipulation and imitation learning.

Submitted 3 July, 2024; originally announced July 2024.

Comments: project page: https://dingry.github.io/projects/bunny_visionpro.html

arXiv:2406.19414 [pdf, other]

Stock Volume Forecasting with Advanced Information by Conditional Variational Auto-Encoder

Authors: Parley R Yang, Alexander Y Shestopaloff

Abstract: We demonstrate the use of Conditional Variational Encoder (CVAE) to improve the forecasts of daily stock volume time series in both short and long term forecasting tasks, with the use of advanced information of input variables such as rebalancing dates. CVAE generates non-linear time series as out-of-sample forecasts, which have better accuracy and closer fit of correlation to the actual data, compared to traditional linear models. These generative forecasts can also be used for scenario generation, which aids interpretation. We further discuss correlations in non-stationary time series and other potential extensions from the CVAE forecasts.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.19034 [pdf, other]

Extended GeV $γ$-ray emission around the star forming region of the W3 complex

Authors: Qihang Wu, Xiaona Sun, Ruizhi Yang, Tingting Ge, Yunfeng Liang, Enwei Liang

Abstract: We analyze the GeV $γ$-ray emission from the W3 complex using about 14 years of Pass 8 data recorded by the $\it Fermi$ Large Area Telescope (\textit{Fermi}-LAT). We resolve the $γ$-ray emissions around W3 into two components: an elliptical Gaussian overlapping with the molecular gas and a point-like source near the cluster W3 Main. The pion-bump feature of SED for the elliptical Gaussian together with the better fitting result of pion decay model favor the hadronic origin. We further argue that the cosmic rays (CRs) could originate from the interactions between cluster winds and the shock produced by the SNR HB3. The point-like source positionally coincident with the star cluster W3 Main indicates it may be directly powered by near clusters, while its fainter $γ$-ray emissions below 10 GeV is possibly due to the shelter from dense gas making the low-energy CRs incapable of penetrating the dense materials. Meanwhile, we cannot rule out that the $γ$-ray emissions originate from the interaction of accelerated protons in SNR with the ambient gas.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.17624 [pdf, other]

Self-assessment, Exhibition, and Recognition: a Review of Personality in Large Language Models

Authors: Zhiyuan Wen, Yu Yang, Jiannong Cao, Haoming Sun, Ruosong Yang, Shuaiqi Liu

Abstract: As large language models (LLMs) appear to behave increasingly human-like in text-based interactions, more and more researchers become interested in investigating personality in LLMs. However, the diversity of psychological personality research and the rapid development of LLMs have led to a broad yet fragmented landscape of studies in this interdisciplinary field. Extensive studies across different research focuses, different personality psychometrics, and different LLMs make it challenging to have a holistic overview and further pose difficulties in applying findings to real-world applications. In this paper, we present a comprehensive review by categorizing current studies into three research problems: self-assessment, exhibition, and recognition, based on the intrinsic characteristics and external manifestations of personality in LLMs. For each problem, we provide a thorough analysis and conduct in-depth comparisons of their corresponding solutions. Besides, we summarize research findings and open challenges from current studies and further discuss their underlying causes. We also collect extensive publicly available resources to facilitate interested researchers and developers. Lastly, we discuss the potential future research directions and application scenarios. Our paper is the first comprehensive survey of up-to-date literature on personality in LLMs. By presenting a clear taxonomy, in-depth analysis, promising future directions, and extensive resource collections, we aim to provide a better understanding and facilitate further advancements in this emerging field.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17274 [pdf, other]

Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?

Authors: Jianfeng He, Runing Yang, Linlin Yu, Changbin Li, Ruoxi Jia, Feng Chen, Ming Jin, Chang-Tien Lu

Abstract: Text summarization, a key natural language generation (NLG) task, is vital in various domains. However, the high cost of inaccurate summaries in risk-critical applications, particularly those involving human-in-the-loop decision-making, raises concerns about the reliability of uncertainty estimation on text summarization (UE-TS) evaluation methods. This concern stems from the dependency of uncertainty model metrics on diverse and potentially conflicting NLG metrics. To address this issue, we introduce a comprehensive UE-TS benchmark incorporating 31 NLG metrics across four dimensions. The benchmark evaluates the uncertainty estimation capabilities of two large language models and one pre-trained language model on three datasets, with human-annotation analysis incorporated where applicable. We also assess the performance of 14 common uncertainty estimation methods within this benchmark. Our findings emphasize the importance of considering multiple uncorrelated NLG metrics and diverse uncertainty estimation methods to ensure reliable and efficient evaluation of UE-TS techniques.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 63 pages, 41 figures, 11 tables

arXiv:2406.14128 [pdf, ps, other]

Identifying Three New AGNs Among Fermi Unidentified Gigaelectronvolt Sources

Authors: Shunhao Ji, Zhongxiang Wang, Qiangmeng Huang, Ruoheng Yang

Abstract: We report our identification of three gigaelectronvolt $γ$-ray sources, 4FGL J0502.6+0036, 4FGL J1055.9+6507, and 4FGL J1708.2+5519, as Active Galactic Nuclei (AGNs). They are listed in the latest Fermi-LAT source catalog as unidentified ones. We find that the sources all showed $γ$-ray flux variations in recent years. Using different survey catalogs, we are able to find a radio source within the error circle of each source's position. Further analysis of optical sources in the fields allows us to determine the optical counterparts, which showed similar variation patterns to those seen in $γ$-rays. The optical counterparts have reported redshifts of 0.6, 1.5, and 2.3, respectively, estimated from photometric measurements. In addition, we also obtain an X-ray spectrum of 4FGL J0502.6+0036 and a flux upper limit on the X-ray emission of 4FGL J1055.9+6507 by analyzing the archival data. The broadband spectral energy distributions of the three sources from radio to $γ$-rays are constructed. Comparing mainly the $γ$-ray properties of the three sources with those of different sub-classes of AGNs, we tentatively identify them as blazars. Followup optical spectroscopy is highly warranted for obtaining their spectral features and thus verifying the identification.

Submitted 20 June, 2024; originally announced June 2024.

Comments: 15 pages, 7 figures, 2 tables, accepted to be published in RAA

arXiv:2406.13369 [pdf, other]

Effective Edge-wise Representation Learning in Edge-Attributed Bipartite Graphs

Authors: Hewen Wang, Renchi Yang, Xiaokui Xiao

Abstract: Graph representation learning (GRL) is to encode graph elements into informative vector representations, which can be used in downstream tasks for analyzing graph-structured data and has seen extensive applications in various domains. However, the majority of extant studies on GRL are geared towards generating node representations, which cannot be readily employed to perform edge-based analytics tasks in edge-attributed bipartite graphs (EABGs) that pervade the real world, e.g., spam review detection in customer-product reviews and identifying fraudulent transactions in user-merchant networks. Compared to node-wise GRL, learning edge representations (ERL) on such graphs is challenging due to the need to incorporate the structure and attribute semantics from the perspective of edges while considering the separate influence of two heterogeneous node sets U and V in bipartite graphs. To our knowledge, despite its importance, limited research has been devoted to this frontier, and existing workarounds all suffer from sub-par results. Motivated by this, this paper designs EAGLE, an effective ERL method for EABGs. Building on an in-depth and rigorous theoretical analysis, we propose the factorized feature propagation (FFP) scheme for edge representations with adequate incorporation of long-range dependencies of edges/features without incurring tremendous computation overheads. We further ameliorate FFP as a dual-view FFP by taking into account the influences from nodes in U and V severally in ERL. Extensive experiments on 5 real datasets showcase the effectiveness of the proposed EAGLE models in semi-supervised edge classification tasks. In particular, EAGLE can attain a considerable gain of at most 38.11% in AP and 1.86% in AUC when compared to the best baselines.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 11 pages. Full version of the research paper accepted to KDD 2024

arXiv:2406.12556 [pdf, other]

Towards Deep Application-Network Integration: Architectures, Progress and Opportunities

Authors: Berta Serracanta, Kai Gao, Jordi Ros-Giralt, Alberto Rodriguez-Natal, Luis M. Contreras, Richard Yang, Albert Cabellos

Abstract: With the rise of a new generation of applications (e.g., virtual and augmented reality, artificial intelligence, etc) demanding stringent performance requirements, the need for networking solutions and architectures that can enable a higher Quality of Experience (QoE) is becoming increasingly important. While jointly optimizing application and network may increase the applications' QoE and simultaneously improve the utilization of network resources, such a paradigm has had limited success in real production networks. However, with the combination of revolutionary trends in (1) compute processing demands, (2) networking capabilities, and (3) sustainable business models, it is high time the community explores the full potential of deeper integration between application and network. In this paper, recent trends observed over the past few years are systematically reviewed. These include the paradigm shift in modern communication services towards computing-driven applications, such as on-site AI training, advances in programmable network technologies like Software Defined Networking (SDN), and new business models incentivizing collaboration and cooperation between parties. Following this, successful scenarios that benefit from various forms of deeper network-application integration are reported, highlighting their considerable potential. A unified framework is then introduced, providing an overview of possible architecture paradigms for network-application integration and bringing awareness to existing abstractions, mechanisms, tools, and their potential combinations. The paper concludes with a discussion of several remaining challenges in building practical network-application integrated systems.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.12449 [pdf]

Retrieval-Augmented Generation for Generative Artificial Intelligence in Medicine

Authors: Rui Yang, Yilin Ning, Emilia Keppo, Mingxuan Liu, Chuan Hong, Danielle S Bitterman, Jasmine Chiat Ling Ong, Daniel Shu Wei Ting, Nan Liu

Abstract: Generative artificial intelligence (AI) has brought revolutionary innovations in various fields, including medicine. However, it also exhibits limitations. In response, retrieval-augmented generation (RAG) provides a potential solution, enabling models to generate more accurate contents by leveraging the retrieval of external knowledge. With the rapid advancement of generative AI, RAG can pave the way for connecting this transformative technology with medical applications and is expected to bring innovations in equity, reliability, and personalization to health care.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.12367 [pdf, other]

Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines

Authors: Honglei Zhang, Jukka I. Ahonen, Nam Le, Ruiying Yang, Francesco Cricri

Abstract: This paper investigates the efficacy of jointly optimizing content-specific post-processing filters to adapt a human oriented video/image codec into a codec suitable for machine vision tasks. By observing that artifacts produced by video/image codecs are content-dependent, we propose a novel training strategy based on competitive learning principles. This strategy assigns training samples to filters dynamically, in a fuzzy manner, which further optimizes the winning filter on the given sample. Inspired by simulated annealing optimization techniques, we employ a softmax function with a temperature variable as the weight allocation function to mitigate the effects of random initialization. Our evaluation, conducted on a system utilizing multiple post-processing filters within a Versatile Video Coding (VVC) codec framework, demonstrates the superiority of content-specific filters trained with our proposed strategies, specifically, when images are processed in blocks. Using VVC reference software VTM 12.0 as the anchor, experiments on the OpenImages dataset show an improvement in the BD-rate reduction from -41.3% and -44.6% to -42.3% and -44.7% for object detection and instance segmentation tasks, respectively, compared to independently trained filters. The statistics of the filter usage align with our hypothesis and underscore the importance of jointly optimizing filters for both content and reconstruction quality. Our findings pave the way for further improving the performance of video/image codecs.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Accepted to be presented in ICIP 2024

arXiv:2406.12061 [pdf, ps, other]

doi 10.13140/RG.2.2.30051.16163

Pluriharmonic solutions to Yang-Mills equations: a $C^*$-algebras approach

Authors: Marius Beceanu, Sachin Munshi, Rongwei Yang

Abstract: This partially expository paper provides a view of Yang-Mills equations from the perspective of complex variables, operator theory, and $C^{*}$-algebras. Through operator-valued pluriharmonic and skew-Hermitian differential forms, it constructs a new class of instanton solutions. Furthermore, it provides a complex variable version of the Yang-Mills Lagrangian and the Belavin-Polyakov-Schwartz-Tyupkin instanton.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.12053 [pdf, other]

InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States

Authors: Mohammad Beigi, Ying Shen, Runing Yang, Zihao Lin, Qifan Wang, Ankith Mohan, Jianfeng He, Ming Jin, Chang-Tien Lu, Lifu Huang

Abstract: Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. Addressing this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states including attention states, feed-forward states, and activation states of all layers. Unlike existing methods that primarily focus on the final activation state, InternalInspector conducts a comprehensive analysis across all internal states of every layer to accurately identify both correct and incorrect prediction processes. By benchmarking InternalInspector against existing confidence estimation methods across various natural language understanding and generation tasks, including factual question answering, commonsense reasoning, and reading comprehension, InternalInspector achieves significantly higher accuracy in aligning the estimated confidence scores with the correctness of the LLM's predictions and lower calibration error. Furthermore, InternalInspector excels at HaluEval, a hallucination detection benchmark, outperforming other internal-based confidence estimation methods in this task.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 8 pages

arXiv:2406.10983 [pdf, ps, other]

Expressibility of linear combination of ansatz circuits

Authors: Peng Wang, Ruyu Yang

Abstract: Variational Quantum Eigensolver is considered promising for medium-scale noisy quantum computers. Expressibility is an important metric for measuring the capability of a variational quantum Ansatz circuit. A commonly used method to increase expressibility is to increase the circuit depth. However, increasing the circuit depth also introduces more noise. We propose to use a linear combination of ansatzes to improve the expressibility of variational circuits, thus avoiding the increase of circuit depth. Concurrently, we introduce a novel measurement strategy that circumvents the necessity for the Hadamard test, thereby significantly diminishing the reliance on two-qubit gates, which are presently the predominant contributors to quantum noise. We also provide a corresponding gradient calculation method, which makes it convenient to update the parameters. Compared with the method of increasing the circuit depth, our method of improving expressibility is more practical. Numerical simulations demonstrate the effectiveness of our method.

Submitted 16 June, 2024; originally announced June 2024.

Comments: 10 pages, 9 figures

arXiv:2406.10216 [pdf, other]

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs

Authors: Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang

Abstract: Reward models trained on human preference data have been proven to be effective for aligning Large Language Models (LLMs) with human intent within the reinforcement learning from human feedback (RLHF) framework. However, the generalization capabilities of current reward models to unseen prompts and responses are limited. This limitation can lead to an unexpected phenomenon known as reward over-optimization, where excessive optimization of rewards results in a decline in actual performance. While previous research has advocated for constraining policy optimization, our study proposes a novel approach to enhance the reward model's generalization ability against distribution shifts by regularizing the hidden states. Specifically, we retain the base model's language model head and incorporate a suite of text-generation losses to preserve the hidden states' text generation capabilities, while concurrently learning a reward head behind the same hidden states. Our experimental results demonstrate that the introduced regularization technique markedly improves the accuracy of learned reward models across a variety of out-of-distribution (OOD) tasks and effectively alleviate the over-optimization issue in RLHF, offering a more reliable and robust preference learning paradigm.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 21 pages

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes of astrophysical $γ$-ray background while large amount of dark matter. By analyzing more than 700 days observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2406.08301 [pdf, other]

Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri , et al. (510 additional authors not shown)

Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

arXiv:2406.07801 [pdf, other]

PolySpeech: Exploring Unified Multitask Speech Models for Competitiveness with Single-task Models

Authors: Runyan Yang, Huibao Yang, Xiqing Zhang, Tiantian Ye, Ying Liu, Yingying Gao, Shilei Zhang, Chao Deng, Junlan Feng

Abstract: Recently, there have been attempts to integrate various speech processing tasks into a unified model. However, few previous works directly demonstrated that joint optimization of diverse tasks in multitask speech models has positive influence on the performance of individual tasks. In this paper we present a multitask speech model -- PolySpeech, which supports speech recognition, speech synthesis, and two speech classification tasks. PolySpeech takes multi-modal language model as its core structure and uses semantic representations as speech inputs. We introduce semantic speech embedding tokenization and speech reconstruction methods to PolySpeech, enabling efficient generation of high-quality speech for any given speaker. PolySpeech shows competitiveness across various tasks compared to single-task models. In our experiments, multitask optimization achieves performance comparable to single-task optimization and is especially beneficial for specific tasks.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 5 pages, 2 figures

arXiv:2406.05521 [pdf, other]

Gaiotto Conjecture for $\mathrm{Rep}_q(\mathrm{F}(4))$

Authors: Michael Finkelberg, Roman Travkin, Ruotao Yang

Abstract: This paper is a part of the series proving the Gaiotto conjecture for basic classical quantum supergroups. The previous part arXiv:2107.02653 [math.RT], arXiv:2306.09556 [math.RT], proved the Gaiotto conjecture for the general linear quantum supergroups $U_q(\mathfrak{gl}(N|M))$. Here we deal with the exceptional quantum supergroup $U_q(\mathfrak{f}(4))$.

Submitted 8 June, 2024; originally announced June 2024.

Comments: Comments welcome! 50 pages

arXiv:2406.05482 [pdf, other]

Efficient Topology-aware Data Augmentation for High-Degree Graph Neural Networks

Authors: Yurui Lai, Xiaoyang Lin, Renchi Yang, Hongtao Wang

Abstract: In recent years, graph neural networks (GNNs) have emerged as a potent tool for learning on graph-structured data and won fruitful successes in varied fields. The majority of GNNs follow the message-passing paradigm, where representations of each node are learned by recursively aggregating features of its neighbors. However, this mechanism brings severe over-smoothing and efficiency issues over high-degree graphs (HDGs), wherein most nodes have dozens (or even hundreds) of neighbors, such as social networks, transaction graphs, power grids, etc. Additionally, such graphs usually encompass rich and complex structure semantics, which are hard to capture merely by feature aggregations in GNNs. Motivated by the above limitations, we propose TADA, an efficient and effective front-mounted data augmentation framework for GNNs on HDGs. Under the hood, TADA includes two key modules: (i) feature expansion with structure embeddings, and (ii) topology- and attribute-aware graph sparsification. The former obtains augmented node features and enhanced model capacity by encoding the graph structure into high-quality structure embeddings with our highly-efficient sketching method. Further, by exploiting task-relevant features extracted from graph structures and attributes, the second module enables the accurate identification and reduction of numerous redundant/noisy edges from the input graph, thereby alleviating over-smoothing and facilitating faster feature aggregations over HDGs. Empirically, TADA considerably improves the predictive performance of mainstream GNN models on 8 real homophilic/heterophilic HDGs in terms of node classification, while achieving efficient training and inference processes.

Submitted 17 June, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

Comments: This is the technical report for the paper accepted to KDD 2024. 16 pages

arXiv:2406.04784 [pdf, other]

SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals

Authors: Ruihan Yang, Jiangjie Chen, Yikai Zhang, Siyu Yuan, Aili Chen, Kyle Richardson, Yanghua Xiao, Deqing Yang

Abstract: Language agents powered by large language models (LLMs) are increasingly valuable as decision-making tools in domains such as gaming and programming. However, these agents often face challenges in achieving high-level goals without detailed instructions and in adapting to environments where feedback is delayed. In this paper, we present SelfGoal, a novel automatic approach designed to enhance agents' capabilities to achieve high-level goals with limited human prior and environmental feedback. The core concept of SelfGoal involves adaptively breaking down a high-level goal into a tree structure of more practical subgoals during the interaction with environments while identifying the most useful subgoals and progressively updating this structure. Experimental results demonstrate that SelfGoal significantly enhances the performance of language agents across various tasks, including competitive, cooperative, and deferred feedback environments. Project page: https://selfgoal-agent.github.io.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Preprint

arXiv:2406.03691 [pdf, other]

Precise measurement of pion-bump structure using future MeV gamma-ray detectors

Authors: Jiahao Liu, Bing Liu, Ruizhi Yang

Abstract: The pion-bump structure in the gamma-ray spectrum is a direct proof for the hadronic origin of the gamma rays, and thus the decisive evidence for the acceleration of hadronic cosmic rays in astrophysical objects. However, the identification of such a spectral feature is limited by the resolution and energy coverage of current gamma-ray instruments. Furthermore, there are unavoidable bremsstrahlung emissions from secondary and primary electrons, which may dominate the gamma-ray emission below the pion-bump. Thus, the study of this gamma-ray emission component can provide unique information on the acceleration and confinement of high-energy particles. In this paper, we studied the predicted gamma-ray spectrum assuming both hadronic or leptonic origin in mid-aged supernova remnants W44, we discuss the detection potential of future MeV missions on these emissions and possible implications.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 7 pages, 4 figures, submitted to PRD

arXiv:2406.03320 [pdf, other]

Detection of extended gamma-ray emission in the vicinity of Cl Danks 1 and 2

Authors: Jiahao Liu, Bing Liu, Ruizhi Yang

Abstract: We report the detection of high-energy gamma-ray emission towards the G305 star-forming region. Using almost 15 years of observation data from Fermi Large Area Telescope, we detected an extended gamma-ray source in this region with a significance of $\sim 13 σ$. The gamma-ray radiation reveals a clear pion-bump feature and can be fitted with the power law parent proton spectrum with an index of $-2.5$. The total cosmic ray (CR) proton energy in the gamma-ray production region is estimated to be the order of $10^{49}\ \rm erg$. We further derived the CR radial distribution from both the gamma-ray emission and gas distribution and found it roughly obeys the $1/r$ type profile, which is consistent with other similar systems and expected from the continuous injection of CRs by the central powerful young massive star cluster Danks 1 or Danks 2 in this region. Together with former detections of similar gamma-ray structures, such as Cygnus cocoon, Westerlund 1, Westerlund 2, NGC 3603, and W40, the detection supports the hypothesis that young massive star clusters are CR accelerators.

Submitted 5 June, 2024; originally announced June 2024.

Comments: 9 pages, 4 figures, submitted to APJL

arXiv:2406.02222 [pdf, other]

Towards an Extensible Model-Based Digital Twin Framework for Space Launch Vehicles

Authors: Ran Wei, Ruizhe Yang, Shijun Liu, Chongsheng Fan, Rong Zhou, Zekun Wu, Haochi Wang, Yifan Cai, Zhe Jiang

Abstract: The concept of Digital Twin (DT) is increasingly applied to systems on different levels of abstraction across domains, to support monitoring, analysis, diagnosis, decision making and automated control. Whilst the interest in applying DT is growing, the definition of DT is unclear, neither is there a clear pathway to develop DT to fully realise its capacities. In this paper, we revise the concept of DT and its categorisation. We propose a DT maturity matrix, based on which we propose a model-based DT development methodology. We also discuss how model-based tools can be used to support the methodology and present our own supporting tool. We report our preliminary findings with a discussion on a case study, in which we use our proposed methodology and our supporting tool to develop an extensible DT platform for the assurance of Electrical and Electronics systems of space launch vehicles.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02143 [pdf, other]

Reinforcement Tuning for Detecting Stances and Debunking Rumors Jointly with Large Language Models

Authors: Ruichao Yang, Wei Gao, Jing Ma, Hongzhan Lin, Bo Wang

Abstract: Learning multi-task models for jointly detecting stance and verifying rumors poses challenges due to the need for training data of stance at post level and rumor veracity at claim level, which are difficult to obtain. To address this issue, we leverage large language models (LLMs) as the foundation annotators for the joint stance detection (SD) and rumor verification (RV) tasks, dubbed as JSDRV. We introduce a novel reinforcement tuning framework to enhance the joint predictive capabilities of LLM-based SD and RV components. Specifically, we devise a policy for selecting LLM-annotated data at the two levels, employing a hybrid reward mechanism to choose high-quality labels for effective LLM fine-tuning on both tasks. Results demonstrate that JSDRV improves the capabilities of LLMs in the joint tasks, not only outperforming state-of-the-art methods but also generalizing to non-LLMs accommodated as task models.

Submitted 4 June, 2024; originally announced June 2024.

Comments: ACL 2024 (Findings)

arXiv:2406.02038 [pdf, other]

Leveraging Predicate and Triplet Learning for Scene Graph Generation

Authors: Jiankai Li, Yunhong Wang, Xiefan Guo, Ruijie Yang, Weixin Li

Abstract: Scene Graph Generation (SGG) aims to identify entities and predict the relationship triplets <subject, predicate, object> in visual scenes. Given the prevalence of large visual variations of subject-object pairs even in the same predicate, it can be quite challenging to model and refine predicate representations directly across such pairs, which is however a common strategy adopted by most existing SGG methods. We observe that visual variations within the identical triplet are relatively small and certain relation cues are shared in the same type of triplet, which can potentially facilitate the relation learning in SGG. Moreover, for the long-tail problem widely studied in SGG task, it is also crucial to deal with the limited types and quantity of triplets in tail predicates. Accordingly, in this paper, we propose a Dual-granularity Relation Modeling (DRM) network to leverage fine-grained triplet cues besides the coarse-grained predicate ones. DRM utilizes contexts and semantics of predicate and triplet with Dual-granularity Constraints, generating compact and balanced representations from two perspectives to facilitate relation recognition. Furthermore, a Dual-granularity Knowledge Transfer (DKT) strategy is introduced to transfer variation from head predicates/triplets to tail ones, aiming to enrich the pattern diversity of tail classes to alleviate the long-tail problem. Extensive experiments demonstrate the effectiveness of our method, which establishes new state-of-the-art performance on Visual Genome, Open Image, and GQA datasets. Our code is available at https://github.com/jkli1998/DRM

Submitted 4 June, 2024; originally announced June 2024.

Comments: CVPR 2024

arXiv:2406.01584 [pdf, other]

SpatialRGPT: Grounded Spatial Reasoning in Vision Language Model

Authors: An-Chieh Cheng, Hongxu Yin, Yang Fu, Qiushan Guo, Ruihan Yang, Jan Kautz, Xiaolong Wang, Sifei Liu

Abstract: Vision Language Models (VLMs) have demonstrated remarkable performance in 2D vision and language tasks. However, their ability to reason about spatial arrangements remains limited. In this work, we introduce Spatial Region GPT (SpatialRGPT) to enhance VLMs' spatial perception and reasoning capabilities. SpatialRGPT advances VLMs' spatial understanding through two key innovations: (1) a data curation pipeline that enables effective learning of regional representation from 3D scene graphs, and (2) a flexible plugin module for integrating depth information into the visual encoder of existing VLMs. During inference, when provided with user-specified region proposals, SpatialRGPT can accurately perceive their relative directions and distances. Additionally, we propose SpatialRGBT-Bench, a benchmark with ground-truth 3D annotations encompassing indoor, outdoor, and simulated environments, for evaluating 3D spatial cognition in VLMs. Our results demonstrate that SpatialRGPT significantly enhances performance in spatial reasoning tasks, both with and without local region prompts. The model also exhibits strong generalization capabilities, effectively reasoning about complex spatial relations and functioning as a region-aware dense reward annotator for robotic tasks. Code, dataset, and benchmark will be released at https://www.anjiecheng.me/SpatialRGPT

Submitted 18 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

Comments: Project Page: https://www.anjiecheng.me/SpatialRGPT

arXiv:2406.01069 [pdf, other]

UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment

Authors: Hantao Zhou, Longxiang Tang, Rui Yang, Guanyi Qin, Yan Zhang, Runze Hu, Xiu Li

Abstract: Image Quality Assessment (IQA) and Image Aesthetic Assessment (IAA) aim to simulate human subjective perception of image visual quality and aesthetic appeal. Existing methods typically address these tasks independently due to distinct learning objectives. However, they neglect the underlying interconnectedness of both tasks, which hinders the learning of task-agnostic shared representations for human subjective perception. To confront this challenge, we propose Unified vision-language pre-training of Quality and Aesthetics (UniQA), to learn general perceptions of two tasks, thereby benefiting them simultaneously. Addressing the absence of text in the IQA datasets and the presence of textual noise in the IAA datasets, (1) we utilize multimodal large language models (MLLMs) to generate high-quality text descriptions; (2) the generated text for IAA serves as metadata to purify noisy IAA data. To effectively adapt the pre-trained UniQA to downstream tasks, we further propose a lightweight adapter that utilizes versatile cues to fully exploit the extensive knowledge of the pre-trained model. Extensive experiments demonstrate that our approach attains a new state-of-the-art performance on both IQA and IAA tasks, while concurrently showcasing exceptional zero-shot and few-label image assessment capabilities. The source code will be available at https://github.com/zht8506/UniQA.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2405.19624 [pdf, other]

Single spin asymmetry $ A _{ U L } ^ { \sin ( 2 φ_ { h } ) }$ in dihadron production in SIDIS

Authors: Ren Yang, Yangyang Yu, Qihang Zhou, Gang Li, Mao Song, Xuan Luo

Abstract: The paper calculates the helicity-dependent dihadron fragmentation function (DiFF) by extending the dihadron spectator model and examines the single longitudinal spin asymmetry $A^{\sin(2φ_h)}_{UL}$ from dihadron in semi-inclusive inelastic scattering (SIDIS). This function elucidates the relationship between the longitudinal polarization of the fragmented quark and the transverse momentum of the resulting hadron pairs. A study by the COMPASS collaboration detected a minimal signal in their experimental search for this azimuthal asymmetry in SIDIS. Here, we use the spectator model to calculate the unknown T-odd dihadron fragmentation function $H_1^\perp$. Adopting collinear factorization to describe the data, and thereby avoiding the transverse momentum dependent factorization and the associated resummation effects, helps us understand the asymmetry and explain why the signal is so weak. We involve the approach of transverse momentum dependence in the model calculations, in order to formulate the differential cross sections and the spin asymmetries in terms of the collinear parton distributions and the collinear DiFFs. A transverse momentum factor analysis method was used, in which the transverse momentum of the final hadron pairs was not integrated. The $\sin(2φ_h)$ asymmetry in COMPASS kinematics was calculated and compared with experimental data. In addition, predictions for the same asymmetry are also presented for HERMES and the Electron Ion Collider.

Submitted 24 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: 10 pages,10 figures. Version appearing in PRD

Journal ref: Phys. Rev. D 109, 114038 (2024)

arXiv:2405.19372 [pdf, other]

On a conjecture about generalized integration operators on Hardy spaces

Authors: Rong Yang, Songxiao Li

Abstract: A conjecture posed by Chalmoukis in 2020 states that if $T_{g,a}:H^p\to H^q(0<q<p<\infty)$ is bounded, then $g$ must be in $H^{\frac{pq}{p-q}}$. In this article, we provide a positive answer to the aforementioned conjecture. We also consider the compactness of $T_{g,a}:H^p\to H^q(0<q<p<\infty)$.

Submitted 28 May, 2024; originally announced May 2024.

Comments: This paper was finished and submitted to manuscripta mathematica on April 24, 2024. In May 24, we found that Nikolaos Chalmoukis and Georgios Nikolaidis have also independently proven this conjecture on arXiv. See arXiv:2405.13920. arXiv admin note: substantial text overlap with arXiv:2405.16278

arXiv:2405.18959 [pdf, other]

Transcending Fusion: A Multi-Scale Alignment Method for Remote Sensing Image-Text Retrieval

Authors: Rui Yang, Shuang Wang, Yingping Han, Yuanheng Li, Dong Zhao, Dou Quan, Yanhe Guo, Licheng Jiao

Abstract: Remote Sensing Image-Text Retrieval (RSITR) is pivotal for knowledge services and data mining in the remote sensing (RS) domain. Considering the multi-scale representations in image content and text vocabulary can enable the models to learn richer representations and enhance retrieval. Current multi-scale RSITR approaches typically align multi-scale fused image features with text features, but overlook aligning image-text pairs at distinct scales separately. This oversight restricts their ability to learn joint representations suitable for effective retrieval. We introduce a novel Multi-Scale Alignment (MSA) method to overcome this limitation. Our method comprises three key innovations: (1) Multi-scale Cross-Modal Alignment Transformer (MSCMAT), which computes cross-attention between single-scale image features and localized text features, integrating global textual context to derive a matching score matrix within a mini-batch, (2) a multi-scale cross-modal semantic alignment loss that enforces semantic alignment across scales, and (3) a cross-scale multi-modal semantic consistency loss that uses the matching matrix from the largest scale to guide alignment at smaller scales. We evaluated our method across multiple datasets, demonstrating its efficacy with various visual backbones and establishing its superiority over existing state-of-the-art methods. The GitHub URL for our project is: https://github.com/yr666666/MSA

Submitted 29 May, 2024; originally announced May 2024.

Comments: 16 pages, 9 figures

arXiv:2405.18525 [pdf, other]

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

Authors: Haonan Han, Rui Yang, Huan Liao, Jiankai Xing, Zunnan Xu, Xiaoming Yu, Junwei Zha, Xiu Li, Wanhua Li

Abstract: Traditional image-to-3D models often struggle with scenes containing multiple objects due to biases and occlusion complexities. To address this challenge, we present REPARO, a novel approach for compositional 3D asset generation from single images. REPARO employs a two-step process: first, it extracts individual objects from the scene and reconstructs their 3D meshes using off-the-shelf image-to-3D models; then, it optimizes the layout of these meshes through differentiable rendering techniques, ensuring coherent scene composition. By integrating optimal transport-based long-range appearance loss term and high-level semantic loss term in the differentiable rendering, REPARO can effectively recover the layout of 3D assets. The proposed method can significantly enhance object independence, detail accuracy, and overall scene coherence. Extensive evaluation of multi-object scenes demonstrates that our REPARO offers a comprehensive approach to address the complexities of multi-object 3D scene generation from single images.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.17673 [pdf, other]

Fast Samplers for Inverse Problems in Iterative Refinement Models

Authors: Kushagra Pandey, Ruihan Yang, Stephan Mandt

Abstract: Constructing fast samplers for unconditional diffusion and flow-matching models has received much attention recently; however, existing methods for solving inverse problems, such as super-resolution, inpainting, or deblurring, still require hundreds to thousands of iterative steps to obtain high-quality results. We propose a plug-and-play framework for constructing efficient samplers for inverse problems, requiring only pre-trained diffusion or flow-matching models. We present Conditional Conjugate Integrators, which leverage the specific form of the inverse problem to project the respective conditional diffusion/flow dynamics into a more amenable space for sampling. Our method complements popular posterior approximation methods for solving inverse problems using diffusion/flow models. We evaluate the proposed method's performance on various linear image restoration tasks across multiple datasets, employing diffusion and flow-matching models. Notably, on challenging inverse problems like 4$\times$ super-resolution on the ImageNet dataset, our method can generate high-quality samples in as few as 5 conditional sampling steps and outperforms competing baselines requiring 20-1000 steps. Our code and models will be publicly available at https://github.com/mandt-lab/CI2RM.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.17212 [pdf, ps, other]

A new parametrization of Hubble function and Hubble tension

Authors: Tong-Yu He, Jia-Jun Yin, Zhen-Yu Wang, Zhan-Wen Han, Rong-Jia Yang

Abstract: We present a new Hubble parameterization method and employ observational data from Hubble, Pantheon, and Baryon Acoustic Oscillations to constrain model parameters. The proposed method is thoroughly validated against these datasets, demonstrating a robust fit to the observational data. The obtained best-fit values are $H_0 = 67.5^{+1.3}_{-1.6}$ $\text{km s}^{-1} \text{Mpc}^{-1}$, $Ω_{\rm{m0}} = 0.2764\pm{0.0094}$, and $α= 0.33\pm{0.22}$, consistent with the Planck 2018 results, highlighting the existence of Hubble tension.

Submitted 16 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.16850 [pdf, other]

UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation

Authors: Runzhao Yang, Yinda Chen, Zhihong Zhang, Xiaoyu Liu, Zongren Li, Kunlun He, Zhiwei Xiong, Jinli Suo, Qionghai Dai

Abstract: In the field of medical image compression, Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios, yet they are constrained by a one-to-one fitting approach that results in lengthy encoding times. Our novel method, "UniCompress", innovatively extends the compression capabilities of INR by being the first to compress multiple medical data blocks using a single INR network. By employing wavelet transforms and quantization, we introduce a codebook containing frequency domain information as a prior input to the INR network. This enhances the representational power of INR and provides distinctive conditioning for different image blocks. Furthermore, our research introduces a new technique for the knowledge distillation of implicit representations, simplifying complex model knowledge into more manageable formats to improve compression ratios. Extensive testing on CT and electron microscopy (EM) datasets has demonstrated that UniCompress outperforms traditional INR methods and commercial compression solutions like HEVC, especially in complex and high compression scenarios. Notably, compared to existing INR techniques, UniCompress achieves a 4$\sim$5 times increase in compression speed, marking a significant advancement in the field of medical image compression. Codes will be publicly available.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.16726 [pdf, other]

Exploring Edge Probability Graph Models Beyond Edge Independency: Concepts, Analyses, and Algorithms

Authors: Fanchen Bu, Ruochen Yang, Paul Bogdan, Kijung Shin

Abstract: Desirable random graph models (RGMs) should (i) be tractable so that we can compute and control graph statistics, and (ii) generate realistic structures such as high clustering (i.e., high subgraph densities). A popular category of RGMs (e.g., Erdos-Renyi and stochastic Kronecker) outputs edge probabilities, and we need to realize (i.e., sample from) the edge probabilities to generate graphs. Typically, each edge (in)existence is assumed to be determined independently. However, with edge independency, RGMs theoretically cannot produce high subgraph densities unless they "replicate" input graphs. In this work, we explore realization beyond edge independence that can produce more realistic structures while ensuring high tractability. Specifically, we propose edge-dependent realization schemes called binding and derive closed-form tractability results on subgraph (e.g., triangle) densities in graphs generated with binding. We propose algorithms for graph generation with binding and parameter fitting of binding. We empirically validate that binding exhibits high tractability and generates realistic graphs with high clustering, significantly improving upon existing RGMs assuming edge independency.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.16376 [pdf, other]

STRIDE: A Tool-Assisted LLM Agent Framework for Strategic and Interactive Decision-Making

Authors: Chuanhao Li, Runhan Yang, Tiankai Li, Milad Bafarassat, Kourosh Sharifi, Dirk Bergemann, Zhuoran Yang

Abstract: Large Language Models (LLMs) like GPT-4 have revolutionized natural language processing, showing remarkable linguistic proficiency and reasoning capabilities. However, their application in strategic multi-agent decision-making environments is hampered by significant limitations including poor mathematical reasoning, difficulty in following instructions, and a tendency to generate incorrect information. These deficiencies hinder their performance in strategic and interactive tasks that demand adherence to nuanced game rules, long-term planning, exploration in unknown environments, and anticipation of opponents' moves. To overcome these obstacles, this paper presents a novel LLM agent framework equipped with memory and specialized tools to enhance their strategic decision-making capabilities. We deploy the tools in a number of economically important environments, in particular bilateral bargaining and multi-agent and dynamic mechanism design. We employ quantitative metrics to assess the framework's performance in various strategic decision-making problems. Our findings establish that our enhanced framework significantly improves the strategic decision-making capability of LLMs. While we highlight the inherent limitations of current LLM models, we demonstrate the improvements through targeted enhancements, suggesting a promising direction for future developments in LLM applications for interactive environments.

Submitted 27 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: 39 pages, 4 figures

arXiv:2405.16278 [pdf, ps, other]

Generalized integration operators on analytic tent spaces

Authors: Rong Yang, Lian Hu, Songxiao Li

Abstract: In this paper, the boundedness and compactness of generalized integration operators $T_g^{n,k}$ between different analytic tent spaces in the unit disc are completely characterized.

Submitted 25 May, 2024; originally announced May 2024.

arXiv:2405.16209 [pdf]

Analytical photoresponses of Schottky contact MoS2 phototransistors

Authors: Jianyong Wei, Yumeng Liu, Yizhuo Wang, Kai Li, Zhentao Lian, Maosong Xie, Xinhan Yang, Seyed Saleh Mousavi Khaleghi, Fuxing Dai, Weida Hu, Xuejiao Gao, Rui Yang, Yaping Dan

Abstract: High-gain photodetectors based on two-dimensional (2D) semiconductors, in particular those in photoconductive mode, have been extensively investigated in the past decade. However, the classical photoconductive theory was derived on two misplaced assumptions. In this work, we established an explicit analytical device model for Schottky contact MoS2 phototransistors that fits well with experimental data. From the fitting results, we found that the Richardson constant of the MoS2 Schottky contact is temperature dependent, indicating that the Schottky contacts for the 2D material are best described by the mixed thermionic emission and diffusion model. Based on this device model, we further established an analytical photoresponse for the few-layer MoS2 phototransistors, from which we found the voltage distribution on the two Schottky contacts and the channel, and extracted the minority carrier recombination lifetimes. The lifetimes are comparable with the values found from transient photoluminescence measurements, which therefore validates our analytical photoresponses for Schottky contact 2D semiconducting phototransistors.

Submitted 25 May, 2024; originally announced May 2024.

Comments: 15 pages, 6 figures

arXiv:2405.16030 [pdf, other]

Constrained Ensemble Exploration for Unsupervised Skill Discovery

Authors: Chenjia Bai, Rushuai Yang, Qiaosheng Zhang, Kang Xu, Yi Chen, Ting Xiao, Xuelong Li

Abstract: Unsupervised Reinforcement Learning (RL) provides a promising paradigm for learning useful behaviors via reward-free pre-training. Existing methods for unsupervised RL mainly conduct empowerment-driven skill discovery or entropy-based exploration. However, empowerment often leads to static skills, and pure exploration only maximizes the state coverage rather than learning useful behaviors. In this paper, we propose a novel unsupervised RL framework via an ensemble of skills, where each skill performs partition exploration based on the state prototypes. Thus, each skill can explore the clustered area locally, and the ensemble skills maximize the overall state coverage. We adopt state-distribution constraints for the skill occupancy and the desired cluster for learning distinguishable skills. Theoretical analysis is provided for the state entropy and the resulting skill distributions. Based on extensive experiments on several challenging tasks, we find our method learns well-explored ensemble skills and achieves superior performance in various downstream tasks compared to previous methods.

Submitted 24 May, 2024; originally announced May 2024.

Comments: Accepted by ICML 2024

arXiv:2405.15385 [pdf, other]

CPT-Interp: Continuous sPatial and Temporal Motion Modeling for 4D Medical Image Interpolation

Authors: Xia Li, Runzhao Yang, Xiangtai Li, Antony Lomax, Ye Zhang, Joachim Buhmann

Abstract: Motion information from 4D medical imaging offers critical insights into dynamic changes in patient anatomy for clinical assessments and radiotherapy planning and, thereby, enhances the capabilities of 3D image analysis. However, inherent physical and technical constraints of imaging hardware often necessitate a compromise between temporal resolution and image quality. Frame interpolation emerges as a pivotal solution to this challenge. Previous methods often suffer from discretization when they estimate the intermediate motion and execute the forward warping. In this study, we draw inspiration from fluid mechanics to propose a novel approach for continuously modeling patient anatomic motion using implicit neural representation. It ensures both spatial and temporal continuity, effectively bridging Eulerian and Lagrangian specifications together to naturally facilitate continuous frame interpolation. Our experiments across multiple datasets underscore the method's superior accuracy and speed. Furthermore, as a case-specific optimization (training-free) approach, it circumvents the need for extensive datasets and addresses model generalization issues.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.11922 [pdf, other]

Effective Clustering on Large Attributed Bipartite Graphs

Authors: Renchi Yang, Yidu Wu, Xiaoyang Lin, Qichen Wang, Tsz Nam Chan, Jieming Shi

Abstract: Attributed bipartite graphs (ABGs) are an expressive data model for describing the interactions between two sets of heterogeneous nodes that are associated with rich attributes, such as customer-product purchase networks and author-paper authorship graphs. Partitioning the target node set in such graphs into k disjoint clusters (referred to as k-ABGC) finds widespread use in various domains, including social network analysis, recommendation systems, information retrieval, and bioinformatics. However, the majority of existing solutions towards k-ABGC either overlook attribute information or fail to capture bipartite graph structures accurately, engendering severely compromised result quality. The severity of these issues is accentuated in real ABGs, which often encompass millions of nodes and a sheer volume of attribute data, rendering effective k-ABGC over such graphs highly challenging. In this paper, we propose TPO, an effective and efficient approach to k-ABGC that achieves superb clustering performance on multiple real datasets. TPO obtains high clustering quality through two major contributions: (i) a novel formulation and transformation of the k-ABGC problem based on multi-scale attribute affinity specialized for capturing attribute affinities between nodes with the consideration of their multi-hop connections in ABGs, and (ii) a highly efficient solver that includes a suite of carefully-crafted optimizations for sidestepping explicit affinity matrix construction and facilitating faster convergence.
Extensive experiments, comparing TPO against 19 baselines over 5 real ABGs, showcase the superior clustering quality of TPO measured against ground-truth labels. Moreover, compared to the state of the art, TPO is often more than 40x faster over both small and large ABGs.

Submitted 20 May, 2024; originally announced May 2024.

Comments: The technical report for the paper was accepted to KDD 2024. 14 pages

arXiv:2405.11921 [pdf, other]

MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

Authors: Jiayue Liu, Xiao Tang, Freeman Cheng, Roy Yang, Zhihao Li, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan

Abstract: 3D Gaussian Splatting showcases notable advancements in photo-realistic and real-time novel view synthesis. However, it faces challenges in modeling mirror reflections, which exhibit substantial appearance variations from different viewpoints. To tackle this problem, we present MirrorGaussian, the first method for mirror scene reconstruction with real-time rendering based on 3D Gaussian Splatting. The key insight is grounded on the mirror symmetry between the real-world space and the virtual mirror space. We introduce an intuitive dual-rendering strategy that enables differentiable rasterization of both the real-world 3D Gaussians and the mirrored counterpart obtained by reflecting the former about the mirror plane. All 3D Gaussians are jointly optimized with the mirror plane in an end-to-end framework. MirrorGaussian achieves high-quality and real-time rendering in scenes with mirrors, empowering scene editing like adding new mirrors and objects. Comprehensive experiments on multiple datasets demonstrate that our approach significantly outperforms existing methods, achieving state-of-the-art results. Project page: https://mirror-gaussian.github.io/.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.11754 [pdf, other]

Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation

Authors: Runou Yang, Tian Tian, Jinwen Tian

Abstract: Addressing the challenge of domain shift between datasets is vital in maintaining model performance. In the context of cross-domain object detection, the teacher-student framework, a widely-used semi-supervised model, has shown significant accuracy improvements. However, existing methods often overlook class differences, treating all classes equally, resulting in suboptimal results. Furthermore, the integration of instance-level alignment with a one-stage detector, essential due to the absence of a Region Proposal Network (RPN), remains unexplored in this framework. In response to these shortcomings, we introduce a novel teacher-student model named Versatile Teacher (VT). VT differs from previous works by considering class-specific detection difficulty and employing a two-step pseudo-label selection mechanism, referred to as Class-aware Pseudo-label Adaptive Selection (CAPS), to generate more reliable pseudo labels. These labels are leveraged as saliency matrices to guide the discriminator for targeted instance-level alignment. Our method demonstrates promising results on three benchmark datasets, and extends the alignment methods for widely-used one-stage detectors, presenting significant potential for practical applications. Code is available at https://github.com/RicardooYoung/VersatileTeacher.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11225 [pdf, other]

SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection

Authors: Yingguang Yang, Qi Wu, Buyun He, Hao Peng, Renyu Yang, Zhifeng Hao, Yong Liao

Abstract: Recent advancements in social bot detection have been driven by the adoption of Graph Neural Networks. The social graph, constructed from social network interactions, contains benign and bot accounts that influence each other. However, previous graph-based detection methods that follow the transductive message-passing paradigm may not fully utilize hidden graph information and are vulnerable to adversarial bot behavior. The indiscriminate message passing between nodes from different categories and communities results in excessively homogeneous node representations, ultimately reducing the effectiveness of social bot detectors. In this paper, we propose SEBot, a novel multi-view graph-based contrastive learning-enabled social bot detector. In particular, we use structural entropy as an uncertainty metric to optimize the entire graph's structure and subgraph-level granularity, revealing the implicitly existing hierarchical community structure. We also design an encoder to enable message passing beyond the homophily assumption, enhancing robustness to adversarial behaviors of social bots. Finally, we employ multi-view contrastive learning to maximize mutual information between different views and enhance the detection performance through multi-task learning. Experimental results demonstrate that our approach significantly improves the performance of social bot detection compared with SOTA methods.

Submitted 18 May, 2024; originally announced May 2024.

Comments: KDD 2024

arXiv:2405.09357 [pdf, ps, other]

A universal optimization framework based on cycle ranking for influence maximization in complex networks

Authors: Wenfeng Shi, Tianlong Fan, Shuqi Xu, Rongmei Yang, Linyuan Lü

Abstract: Influence maximization aims to identify a set of influential individuals, referred to as influencers, as information sources to maximize the spread of information within networks, constituting a vital combinatorial optimization problem with extensive practical applications and sustained interdisciplinary interest. Diverse approaches have been devised to efficiently address this issue, one of which involves selecting the influencers from a given centrality ranking. In this paper, we propose a novel optimization framework based on ranking basic cycles in networks, capable of selecting the influencers from diverse centrality measures. The experimental results demonstrate that, compared to directly selecting the top-k nodes from centrality sequences and other state-of-the-art optimization approaches, the new framework can expand the dissemination range by 1.5 to 3 times. Counterintuitively, it exhibits minimal hub property, with the average distance between influencers being only one-third of alternative approaches, regardless of the centrality metrics or network types. Our study not only paves the way for novel strategies in influence maximization but also underscores the unique potential of underappreciated cycle structures.

Submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.07800 [pdf, other]

Data Imputation by Pursuing Better Classification: A Supervised Kernel-Based Method

Authors: Ruikai Yang, Fan He, Mingzhen He, Kaijie Wang, Xiaolin Huang

Abstract: Data imputation, the process of filling in missing feature elements for incomplete data sets, plays a crucial role in data-driven learning. A fundamental belief is that data imputation is helpful for learning performance, and it follows that the pursuit of better classification can guide the data imputation process. While some works consider using label information to assist in this task, their simplistic utilization of labels lacks flexibility and may rely on strict assumptions. In this paper, we propose a new framework that effectively leverages supervision information to complete missing data in a manner conducive to classification. Specifically, this framework operates in two stages. Firstly, it leverages labels to supervise the optimization of similarity relationships among data, represented by the kernel matrix, with the goal of enhancing classification accuracy. To mitigate overfitting that may occur during this process, a perturbation variable is introduced to improve the robustness of the framework. Secondly, the learned kernel matrix serves as additional supervision information to guide data imputation through regression, utilizing the block coordinate descent method. The superiority of the proposed method is evaluated on four real-world data sets by comparing it with state-of-the-art imputation methods. Remarkably, our algorithm significantly outperforms other methods when the data is missing more than 60\% of the features.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07791 [pdf, ps, other]

Decentralized Kernel Ridge Regression Based on Data-dependent Random Feature

Authors: Ruikai Yang, Fan He, Mingzhen He, Jie Yang, Xiaolin Huang

Abstract: Random feature (RF) has been widely used for node consistency in decentralized kernel ridge regression (KRR). Currently, the consistency is guaranteed by imposing constraints on coefficients of features, necessitating that the random features on different nodes are identical. However, in many applications, data on different nodes varies significantly in number or distribution, which calls for adaptive and data-dependent methods that generate different RFs. To tackle the essential difficulty, we propose a new decentralized KRR algorithm that pursues consensus on decision functions, which allows great flexibility and adapts well to the data on nodes. The convergence is rigorously given and the effectiveness is numerically verified: by capturing the characteristics of the data on each node, while maintaining the same communication costs as other methods, we achieved an average regression accuracy improvement of 25.5\% across six real-world data sets.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degree. Variation analysis shows an indication of the variability at a few months level in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2405.07464 [pdf]

Atomic-scale tunable phonon transport at tailored grain boundaries

Authors: Xiaowang Wang, Chaitanya A. Gadre, Runqing Yang, Wanjuan Zou, Xing Bin, Christopher Addiego, Toshihiro Aoki, Yujie Quan, Wei-Tao Peng, Yifeng Huang, Chaojie Du, Mingjie Xu, Xingxu Yan, Ruqian Wu, Shyue Ping Ong, Bolin Liao, Penghui Cao, Xiaoqing Pan

Abstract: Manipulating thermal properties in materials has been of fundamental importance for advancing innovative technologies. Heat carriers such as phonons are impeded by breaking crystal symmetry or periodicity. Notable methods of impeding the phonon propagation include varying the density of defects, interfaces, and nanostructures, as well as changing composition. However, a robust link between the individual nanoscale defect structures, phonon states, and macroscopic thermal conductivity is lacking. Here we reveal the nanoscale structure-phonon mechanisms by which the grain boundary (GB) tilt and twist angles fundamentally drive the changes in atom rearrangements, exotic vibrational states, and finally macroscopic heat transport at different bicrystal strontium titanate GBs, using emerging atomic resolution vibrational spectroscopy. The 10 deg and 22 deg tilt GBs exhibit reduced phonon populations by 54% and 16% compared to the bulk value, respectively, consistent with measured thermal conductivities. A tiny twist angle further introduces a fine and local tuning of thermal conductivity by introducing twist induced defects periodically embedded with the tilt induced GB defects. Our results demonstrate that varying the tilt angle coarsely modifies the phonon population along the entire GB while varying the twist angle incurs a finer adjustment at periodic locations on the GB. Our study offers a systematic approach to understanding and manipulating cross GB thermal transport of arbitrary GBs predictably and precisely.

Submitted 13 May, 2024; originally announced May 2024.

Showing 1–50 of 1,309 results for author: Yang, R