Search | arXiv e-print repository

ZBanner: Fast Stateless Scanning Capable of Obtaining Responses over TCP

Authors: Chiyu Chen, Yuliang Lu, Guozheng Yang, Yi Xie, Shasha Guo

Abstract: Fast large-scale network scanning is an important way to understand internet service configurations and security in real time, with stateless scanning as a representative approach. Existing stateless scanners can perform single-packet scans for internet-wide network measurements but are limited to host discovery or port scanning. To obtain further information over TCP, slower stateful scanners must be used in conjunction, which spend more time and memory because of connection state maintenance. By simplifying the TCP finite state machine, this paper proposes a novel stateless scanning model, which can establish TCP connections and obtain further responses in a completely stateless manner. Based on this model, we implement ZBanner, an improved modular stateless scanner that utilizes user-defined probes for identifying services and versions, fingerprinting TLS servers, etc. We present the unique design of ZBanner and experimentally characterize its feasibility and performance. Experiments show that ZBanner performs better than current state-of-the-art solutions in terms of scan rate and memory usage. ZBanner is at least three times faster than current tools for generic ports and over 90 times faster for open ports while keeping a minimum and stable memory usage.

Submitted 12 May, 2024; originally announced May 2024.

Comments: The paper has been submitted and the code will be published later

arXiv:2405.06964 [pdf, other]

ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots

Authors: Zhixuan Xu, Chongkai Gao, Zixuan Liu, Gang Yang, Chenrui Tie, Haozhuo Zheng, Haoyu Zhou, Weikun Peng, Debang Wang, Tianyi Chen, Zhouliang Yu, Lin Shao

Abstract: To substantially enhance robot intelligence, there is a pressing need to develop a large model that enables general-purpose robots to proficiently undertake a broad spectrum of manipulation tasks, akin to the versatile task-planning ability exhibited by LLMs. The vast diversity in objects, robots, and manipulation tasks presents huge challenges. Our work introduces a comprehensive framework to develop a foundation model for general robotic manipulation that formalizes a manipulation task as contact synthesis. Specifically, our model takes as input object and robot manipulator point clouds, object physical attributes, target motions, and manipulation region masks. It outputs contact points on the object and associated contact forces or post-contact motions for robots to achieve the desired manipulation task. We perform extensive experiments both in the simulation and real-world settings, manipulating articulated rigid objects, rigid objects, and deformable objects that vary in dimensionality, ranging from one-dimensional objects like ropes to two-dimensional objects like cloth and extending to three-dimensional objects such as plasticine. Our model achieves average success rates of around 90%. Supplementary materials and videos are available on our project website at https://manifoundationmodel.github.io/.

Submitted 11 May, 2024; originally announced May 2024.

arXiv:2405.05030 [pdf]

Functional Specifications and Testing Requirements of Grid-Forming Type-IV Offshore Wind Power

Authors: Sulav Ghimire, Gabriel M. G. Guerreiro, Kanakesh V. K., Emerson D. Guest, Kim H. Jensen, Guangya Yang, Xiongfei Wang

Abstract: Throughout the past few years, various transmission system operators (TSOs) and research institutes have defined several functional specifications for grid-forming (GFM) converters via grid codes, white papers, and technical documents. These institutes and organisations also proposed testing requirements for general inverter-based resources (IBRs) and specific GFM converters. This paper initially reviews functional specifications and testing requirements from several sources to create an understanding of GFM capabilities in general. Furthermore, it proposes an outlook of the defined GFM capabilities, functional specifications, and testing requirements for offshore wind power plant (OF WPP) applications from an original equipment manufacturer (OEM) perspective. Finally, this paper briefly establishes the relevance of new testing methodologies for equipment-level certification and model validation, focusing on GFM functional specifications.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.04753 [pdf, other]

AttacKG+: Boosting Attack Knowledge Graph Construction with Large Language Models

Authors: Yongheng Zhang, Tingwen Du, Yunshan Ma, Xiang Wang, Yi Xie, Guozheng Yang, Yuliang Lu, Ee-Chien Chang

Abstract: Attack knowledge graph construction seeks to convert textual cyber threat intelligence (CTI) reports into structured representations, portraying the evolutionary traces of cyber attacks. Even though previous research has proposed various methods to construct attack knowledge graphs, they generally suffer from limited generalization capability to diverse knowledge types as well as requirement of expertise in model design and tuning. Addressing these limitations, we seek to utilize Large Language Models (LLMs), which have achieved enormous success in a broad range of tasks given exceptional capabilities in both language understanding and zero-shot task fulfillment. Thus, we propose a fully automatic LLM-based framework to construct attack knowledge graphs named: AttacKG+. Our framework consists of four consecutive modules: rewriter, parser, identifier, and summarizer, each of which is implemented by instruction prompting and in-context learning empowered by LLMs. Furthermore, we upgrade the existing attack knowledge schema and propose a comprehensive version. We represent a cyber attack as a temporally unfolding event, each temporal step of which encapsulates three layers of representation, including behavior graph, MITRE TTP labels, and state summary. Extensive evaluation demonstrates that: 1) our formulation seamlessly satisfies the information needs in threat event analysis, 2) our construction framework is effective in faithfully and accurately extracting the information defined by AttacKG+, and 3) our attack graph directly benefits downstream security practices such as attack reconstruction. All the code and datasets will be released upon acceptance.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 20 pages, 5 figures

arXiv:2405.03851 [pdf, other]

Querying in Constant Time with Learned Indexes

Authors: Luis Croquevielle, Guang Yang, Liang Lian, Ali Hadian, Thomas Heinis

Abstract: Learned indexes leverage machine learning models to accelerate query answering in databases, showing impressive practical performance. However, theoretical understanding of these methods remains incomplete. Existing research suggests that learned indexes have superior asymptotic complexity compared to their non-learned counterparts, but these findings have been established under restrictive probabilistic assumptions. Specifically, for a sorted array with $n$ elements, it has been shown that learned indexes can find a key in $O(\log(\log n))$ expected time using at most linear space, compared with $O(\log n)$ for non-learned methods. In this work, we prove $O(1)$ expected time can be achieved with at most linear space, thereby establishing the tightest upper bound so far for the time complexity of an asymptotically optimal learned index. Notably, we use weaker probabilistic assumptions than prior work, meaning our results generalize previous efforts. Furthermore, we introduce a new measure of statistical complexity for data. This metric exhibits an information-theoretical interpretation and can be estimated in practice. This characterization provides further theoretical understanding of learned indexes, by helping to explain why some datasets seem to be particularly challenging for these methods.

Submitted 13 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.01520 [pdf]

AI for Manufacturing and Healthcare: a chemistry and engineering perspective

Authors: Jihua Chen, Yue Yuan, Amir Koushyar Ziabari, Xuan Xu, Honghai Zhang, Panagiotis Christakopoulos, Peter V. Bonnesen, Ilia N. Ivanov, Panchapakesan Ganesh, Chen Wang, Karen Patino Jaimes, Guang Yang, Rajeev Kumar, Bobby G. Sumpter, Rigoberto Advincula

Abstract: Artificial Intelligence (AI) approaches are increasingly being applied to more and more domains of Science, Engineering, Chemistry, and Industries to not only improve efficiencies and enhance productivity, but also enable new capabilities. The new opportunities range from automated molecule design and screening, properties prediction, gaining insights into chemical reactions, to computer-aided design, predictive maintenance of systems, robotics, and autonomous vehicles. This review focuses on the new applications of AI in manufacturing and healthcare. For the Manufacturing Industries, we focus on AI and algorithms for (1) Battery, (2) Flow Chemistry, (3) Additive Manufacturing, (4) Sensors, and (5) Machine Vision. For Healthcare applications, we focus on: (1) Medical Vision, (2) Diagnosis, (3) Protein Design, and (4) Drug Discovery. In the end, related topics are discussed, including physics integrated machine learning, model explainability, security, and governance during model deployment.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.18209 [pdf, other]

4DBInfer: A 4D Benchmarking Toolbox for Graph-Centric Predictive Modeling on Relational DBs

Authors: Minjie Wang, Quan Gan, David Wipf, Zhenkun Cai, Ning Li, Jianheng Tang, Yanlin Zhang, Zizhao Zhang, Zunyao Mao, Yakun Song, Yanbo Wang, Jiahang Li, Han Zhang, Guang Yang, Xiao Qin, Chuan Lei, Muhan Zhang, Weinan Zhang, Christos Faloutsos, Zheng Zhang

Abstract: Although RDBs store vast amounts of rich, informative data spread across interconnected tables, the progress of predictive machine learning models as applied to such tasks arguably falls well behind advances in other domains such as computer vision or natural language processing. This deficit stems, at least in part, from the lack of established/public RDB benchmarks as needed for training and evaluation purposes. As a result, related model development thus far often defaults to tabular approaches trained on ubiquitous single-table benchmarks, or on the relational side, graph-based alternatives such as GNNs applied to a completely different set of graph datasets devoid of tabular characteristics. To more precisely target RDBs lying at the nexus of these two complementary regimes, we explore a broad class of baseline models predicated on: (i) converting multi-table datasets into graphs using various strategies equipped with efficient subsampling, while preserving tabular characteristics; and (ii) trainable models with well-matched inductive biases that output predictions based on these input subgraphs. Then, to address the dearth of suitable public benchmarks and reduce siloed comparisons, we assemble a diverse collection of (i) large-scale RDB datasets and (ii) coincident predictive tasks. From a delivery standpoint, we operationalize the above four dimensions (4D) of exploration within a unified, scalable open-source toolbox called 4DBInfer. We conclude by presenting evaluations using 4DBInfer, the results of which highlight the importance of considering each such dimension in the design of RDB predictive models, as well as the limitations of more naive approaches such as simply joining adjacent tables. Our source code is released at https://github.com/awslabs/multi-table-benchmark .

Submitted 28 April, 2024; originally announced April 2024.

Comments: Under review

arXiv:2404.18051 [pdf, ps, other]

Liouville type theorems for the 3D stationary MHD and Hall-MHD equations with non-zero constant vectors at infinity

Authors: Wendong Wang, Guoxu Yang

Abstract: In this paper, we investigate Liouville type theorems for the three-dimensional steady-state MHD or Hall-MHD system under some asymptotic assumptions at infinity. Firstly, for the Hall-MHD system we obtain that $u$ and $B$ are constant vectors for any fluid viscosity, magnetic resistivity or Hall-coefficient when the magnetic field $B$ tends to a non-zero constant vector at infinity while the velocity field $u$ tends to $0$. Secondly, it also follows that $u$ and $B$ are constant for the Hall-MHD system when the velocity field tends to a constant vector at infinity while the magnetic field tends to $0$ without any assumptions on viscosity, magnetic resistivity or Hall-coefficient. One main difficulty lies in the Hall term, and we obtain the $L^p$ estimates of a generalized Oseen system with some supercritical terms via Lizorkin's theory and prove that the operator is stable by exploring Kato's stability theorem. Moreover, some similar results for the degenerate fluid viscosity or magnetic resistivity for the MHD system are also obtained, which are of independent interest.

Submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.17672 [pdf, other]

BlenderAlchemy: Editing 3D Graphics with Vision-Language Models

Authors: Ian Huang, Guandao Yang, Leonidas Guibas

Abstract: Graphics design is important for various applications, including movie production and game design. To create a high-quality scene, designers usually need to spend hours in software like Blender, in which they might need to interleave and repeat operations, such as connecting material nodes, hundreds of times. Moreover, slightly different design goals may require completely different sequences, making automation difficult. In this paper, we propose a system that leverages Vision-Language Models (VLMs), like GPT-4V, to intelligently search the design action space to arrive at an answer that can satisfy a user's intent. Specifically, we design a vision-based edit generator and state evaluator to work together to find the correct sequence of actions to achieve the goal. Inspired by the role of visual imagination in the human design process, we supplement the visual reasoning capabilities of VLMs with "imagined" reference images from image-generation models, providing visual grounding of abstract language descriptions. In this paper, we provide empirical evidence suggesting our system can produce simple but tedious Blender editing sequences for tasks such as editing procedural materials from text and/or reference images, as well as adjusting lighting configurations for product renderings in complex scenes.

Submitted 22 May, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.16938 [pdf, other]

IRX-CIGALE: a tailored module for Low-Luminosity AGN

Authors: I. E. López, G. Yang, G. Mountrichas, M. Brusa, D. M. Alexander, R. D. Baldi, E. Bertola, S. Bonoli, A. Comastri, F. Shankar, N. Acharya, A. V. Alonso Tetilla, A. Lapi, B. Laloux, X. López López, I. Muñoz Rodríguez, B. Musiimenta, N. Osorio Clavijo, L. Sala, D. Sengupta

Abstract: The spectral energy distribution (SED) of low-luminosity active galactic nuclei (LLAGN) presents unique challenges due to their comparable radiation output to their host galaxies and complex accretion dynamics. We introduce a novel module within the CIGALE framework specifically designed for SED fitting of LLAGN, incorporating both empirical relationships like $L_\mathrm{X}$--$L_\mathrm{12μm}$ and physically-based accretion models such as advection-dominated accretion flows (ADAFs) and truncated accretion disks. This allows for more accurate depiction of LLAGN central emissions. Using this module, we analyzed a set of 52 X-ray-detected local galaxies, primarily LINERs and Seyferts, and compared its performance to higher-luminosity AGN from the COSMOS and SDSS datasets. Our results show that the module adeptly estimates bolometric luminosities with high precision, despite significant galaxy contamination. It also introduces a versatile X-ray bolometric correction formula covering a vast range of luminosities. Further, our study explored the $α_\mathrm{ox}$ index, which measures the UV to X-ray emission slope, showing that unlike quasars, LLAGN display either stable or only slightly varying $α_\mathrm{ox}$ values, indicating differing accretion and photon production processes in the low luminosity regime. Additionally, we observed a significant drop of 1.4 dex in specific star formation rates when moving from whole galaxies to a central 9-arcsecond aperture in LLAGN, suggesting potential feedback mechanisms at play. Overall, our findings underscore the importance of a multiwavelength approach in AGN studies, highlighting distinct behaviors of LLAGN compared to quasars, thus enhancing our understanding of LLAGN and providing a framework for future comprehensive AGN population studies.

Submitted 25 April, 2024; originally announced April 2024.

Comments: Submitted to A&A

arXiv:2404.16264 [pdf]

Realisation of de Gennes' Absolute Superconducting Switch with a Heavy Metal Interface

Authors: Hisakazu Matsuki, Alberto Hijano, Grzegorz P. Mazur, Stefan Ilic, Binbin Wang, Yuliya Alekhina, Kohei Ohnishi, Sachio Komori, Yang Li, Nadia Stelmashenko, Niladri Banerjee, Lesley F. Cohen, David W. McComb, F. Sebastian Bergeret, Guang Yang, Jason W. A. Robinson

Abstract: In 1966, Pierre-Gilles de Gennes proposed a non-volatile mechanism for switching superconductivity on and off in a magnetic device. This involved a superconductor (S) sandwiched between ferromagnetic (F) insulators in which the net magnetic exchange field could be controlled through the magnetisation-orientation of the F layers. Because superconducting switches are attractive for a range of applications, extensive studies have been carried out on $F/S/F$ structures. Although these have demonstrated a sensitivity of the superconducting critical temperature ($T_{c}$) to parallel (P) and antiparallel (AP) magnetisation-orientations of the F layers, corresponding shifts in $T_c$ (i.e., $ΔT_c = T_{c,AP} - T_{c,P}$) are lower than predicted with $ΔT_c$ only a small fraction of $T_{c,AP}$, precluding the development of applications. Here, we report $EuS/Au/Nb/EuS$ structures where EuS is an insulating ferromagnet, Nb is a superconductor and Au is a heavy metal. For P magnetisations, the superconducting state in this structure is quenched down to the lowest measured temperature of 20 mK meaning that $ΔT_c/T_{c,AP}$ is practically 1. The key to this so-called absolute switching effect is a sizable spin-mixing conductance at the $EuS/Au$ interface which ensures a robust magnetic proximity effect, unlocking the potential of $F/S/F$ switches for low power electronics.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.12670 [pdf, other]

Towards Human-centered Proactive Conversational Agents

Authors: Yang Deng, Lizi Liao, Zhonghua Zheng, Grace Hui Yang, Tat-Seng Chua

Abstract: Recent research on proactive conversational agents (PCAs) mainly focuses on improving the system's capabilities in anticipating and planning action sequences to accomplish tasks and achieve goals before users articulate their requests. This perspectives paper highlights the importance of moving towards building human-centered PCAs that emphasize human needs and expectations, and that consider ethical and social implications of these agents, rather than solely focusing on technological capabilities. The distinction between a proactive and a reactive system lies in the proactive system's initiative-taking nature. Without thoughtful design, proactive systems risk being perceived as intrusive by human users. We address the issue by establishing a new taxonomy concerning three key dimensions of human-centered PCAs, namely Intelligence, Adaptivity, and Civility. We discuss potential research opportunities and challenges based on this new taxonomy upon the five stages of PCA system construction. This perspectives paper lays a foundation for the emerging area of conversational information retrieval research and paves the way towards advancing human-centered proactive conversational systems.

Submitted 19 April, 2024; originally announced April 2024.

Comments: Accepted by SIGIR 2024 (Perspectives Track)

arXiv:2404.10253 [pdf, other]

Kilometer-Level Coupled Modeling Using 40 Million Cores: An Eight-Year Journey of Model Development

Authors: Xiaohui Duan, Yuxuan Li, Zhao Liu, Bin Yang, Juepeng Zheng, Haohuan Fu, Shaoqing Zhang, Shiming Xu, Yang Gao, Wei Xue, Di Wei, Xiaojing Lv, Lifeng Yan, Haopeng Huang, Haitian Lu, Lingfeng Wan, Haoran Lin, Qixin Chang, Chenlin Li, Quanjie He, Zeyu Song, Xuantong Wang, Yangyang Yu, Xilong Fan, Zhaopeng Qu, et al. (16 additional authors not shown)

Abstract: With current and future leading systems adopting heterogeneous architectures, adapting existing models for heterogeneous supercomputers is of urgent need for improving model resolution and reducing modeling uncertainty. This paper presents our three-week effort on porting a complex earth system model, CESM 2.2, to a 40-million-core Sunway supercomputer. Taking a non-intrusive approach that tries to minimize manual code modifications, our project tries to achieve both improvement of performance and consistency of the model code. By using a hierarchical grid system and an OpenMP-based offloading toolkit, our porting and parallelization effort covers over 80% of the code, and achieves a simulation speed of 340 SDPD (simulated days per day) for 5-km atmosphere, 265 SDPD for 3-km ocean, and 222 SDPD for a coupled model, thus making multi-year or even multi-decadal experiments at such high resolution possible.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 18 pages, 13 figures

arXiv:2404.06187 [pdf, other]

High-Fidelity CZ Gates in Double Quantum Dot -- Circuit QED Systems Beyond the Rotating-Wave Approximation

Authors: Guangzhao Yang, Marek Gluza, Si Yan Koh, Calvin Pei Yu Wong, Kuan Eng Johnson Goh, Bent Weber, Hui Khoon Ng, Teck Seng Koh

Abstract: Semiconductor double quantum dot (DQD) qubits coupled via superconducting microwave resonators provide a powerful means of long-range manipulation of the qubits' spin and charge degrees of freedom. Quantum gates can be implemented by parametrically driving the qubits while their transition frequencies are detuned from the resonator frequency. Long-range two-qubit CZ gates have been proposed for the DQD spin qubit within the rotating-wave approximation (RWA). Rapid gates demand strong coupling, but RWA breaks down when coupling strengths become significant relative to system frequencies. Therefore, understanding the detrimental impact of time-dependent terms ignored by RWA is critical for high-fidelity operation. Here, we go beyond RWA to study CZ gate fidelity for both DQD spin and charge qubits. We propose a novel parametric drive on the charge qubit that produces fewer time-dependent terms and show that it outperforms its spin counterpart. We find that drive amplitude - a parameter dropped in RWA - is critical for optimizing fidelity and map out high-fidelity regimes. Our results demonstrate the necessity of going beyond RWA in understanding how long-range gates can be realized in DQD qubits, with charge qubits offering considerable advantages in high-fidelity operation.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 6 Pages, 3 Figures (Main text); 12 Pages, 1 Figure (Supplemental Material)

arXiv:2404.04421 [pdf, other]

PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations

Authors: Yang Zheng, Qingqing Zhao, Guandao Yang, Wang Yifan, Donglai Xiang, Florian Dubost, Dmitry Lagun, Thabo Beeler, Federico Tombari, Leonidas Guibas, Gordon Wetzstein

Abstract: Modeling and rendering photorealistic avatars is of crucial importance in many applications. Existing methods that build a 3D avatar from visual observations, however, struggle to reconstruct clothed humans. We introduce PhysAvatar, a novel framework that combines inverse rendering with inverse physics to automatically estimate the shape and appearance of a human from multi-view video data along with the physical parameters of the fabric of their clothes. For this purpose, we adopt a mesh-aligned 4D Gaussian technique for spatio-temporal mesh tracking as well as a physically based inverse renderer to estimate the intrinsic material properties. PhysAvatar integrates a physics simulator to estimate the physical parameters of the garments using gradient-based optimization in a principled manner. These novel capabilities enable PhysAvatar to create high-quality novel-view renderings of avatars dressed in loose-fitting clothes under motions and lighting conditions not seen in the training data. This marks a significant advancement towards modeling photorealistic digital humans using physically based inverse rendering with physics in the loop. Our project website is at: https://qingqing-zhao.github.io/PhysAvatar

Submitted 9 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: Project Page: https://qingqing-zhao.github.io/PhysAvatar

arXiv:2404.03576 [pdf, ps, other]

The Rise of Faint, Red AGN at $z>4$: A Sample of Little Red Dots in the JWST Extragalactic Legacy Fields

Authors: Dale D. Kocevski, Steven L. Finkelstein, Guillermo Barro, Anthony J. Taylor, Antonello Calabrò, Brivael Laloux, Johannes Buchner, Jonathan R. Trump, Gene C. K. Leung, Guang Yang, Mark Dickinson, Pablo G. Pérez-González, Fabio Pacucci, Kohei Inayoshi, Rachel S. Somerville, Elizabeth J. McGrath, Hollis B. Akins, Micaela B. Bagley, Laura Bisigello, Rebecca A. A. Bowler, Adam Carnall, Caitlin M. Casey, Yingjie Cheng, Nikko J. Cleri, Luca Costantin, et al. (32 additional authors not shown)

Abstract: We present a sample of 341 "little red dots" (LRDs) spanning the redshift range $z\sim2-11$ using data from the CEERS, PRIMER, JADES, UNCOVER and NGDEEP surveys. These sources are likely heavily-reddened AGN that trace a previously-hidden phase of dust-obscured black hole growth in the early Universe. Unlike past use of color indices to identify LRDs, we employ continuum slope fitting using shifting bandpasses to sample the same rest-frame emission blueward and redward of the Balmer break. This approach allows us to identify LRDs over a wider redshift range and is less susceptible to contamination from galaxies with strong breaks that otherwise lack a rising red continuum. The redshift distribution of our sample increases at $z<8$ and then undergoes a rapid decline at $z\sim4.5$, which may tie the emergence, and obscuration, of these sources to the inside-out growth that galaxies experience during this epoch. We find that LRDs are 2-3 dex more numerous than bright quasars at $z\sim5-7$, but their number density is only 0.6-1 dex higher than X-ray and UV selected AGN at these redshifts. Within our sample, we have identified the first X-ray detected LRDs at $z=3.1$ and $z=4.66$. An X-ray spectral analysis confirms that these AGN are moderately obscured with $\log\,(N_{\rm H}/{\rm cm}^{-2})$ of $23.3^{+0.4}_{-1.3}$ and $22.72^{+0.13}_{-0.16}$. Our analysis reveals that reddened AGN emission dominates their rest-optical light, while the rest-UV originates from their host galaxies.
We also present NIRSpec follow-up spectroscopy of 17 LRDs that show broad emission lines consistent with AGN activity. The confirmed AGN fraction of our sample is $71\%$ for sources with F444W$<26.5$. In addition, we find three LRDs with narrow blue-shifted Balmer absorption features in their spectra, suggesting an outflow of high-density, low ionization gas from near the central engine of these faint, red AGN.

Submitted 19 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: 23 pages, 17 figures, submitted to ApJ

arXiv:2404.02028 [pdf, other]

QUSL: Quantum Unsupervised Image Similarity Learning with Enhanced Performance

Authors: Lian-Hui Yu, Xiao-Yu Li, Geng Chen, Qin-Sheng Zhu, Guo-Wu Yang

Abstract: Leveraging quantum advantages to enhance machine learning capabilities has become a primary focus of research, particularly for complex tasks such as image similarity detection. To fully exploit the potential of quantum computing, it is essential to design quantum circuits tailored to the specific characteristics of the task at hand. In response to this challenge, we propose a novel quantum unsupervised similarity learning method, QUSL. Building upon the foundation of similarity detection triplets and generating positive samples through perturbations of anchor images, QUSL operates independently of classical oracles. By leveraging the performance of triplets and the characteristics of quantum circuits, QUSL systematically explores high-performance quantum circuit architectures customized for dataset features using metaheuristic algorithms, thereby achieving efficient quantum feature extraction with reduced circuit costs. Comprehensive numerical simulations and experiments on quantum computers demonstrate QUSL's remarkable performance compared to state-of-the-art quantum methods. QUSL achieves reductions exceeding 50% in critical quantum resource utilization while also realizing an enhancement of up to 19.5% in similarity detection correlation across the DISC21, COCO, and landscape datasets. This enables efficient quantum similarity modeling for large-scale unlabeled image data with reduced quantum resource utilization.

Submitted 23 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

arXiv:2404.01687 [pdf, other]

Search for a sub-eV sterile neutrino using Daya Bay's full dataset

Authors: F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, Y. Y. Ding , et al. (176 additional authors not shown)

Abstract: This Letter presents results of a search for the mixing of a sub-eV sterile neutrino with three active neutrinos based on the full data sample of the Daya Bay Reactor Neutrino Experiment, collected during 3158 days of detector operation, which contains $5.55 \times 10^{6}$ reactor $\overline{\nu}_e$ candidates identified as inverse beta-decay interactions followed by neutron-capture on gadolinium. The analysis benefits from a doubling of the statistics of our previous result and from improvements of several important systematic uncertainties. No significant oscillation due to mixing of a sub-eV sterile neutrino with active neutrinos was found. Exclusion limits are set by both Feldman-Cousins and CLs methods. Light sterile neutrino mixing with $\sin^2 2\theta_{14} \gtrsim 0.01$ can be excluded at 95% confidence level in the region of $0.01$ eV$^2 \lesssim |\Delta m^{2}_{41}| \lesssim 0.1$ eV$^2$. This result represents the world-leading constraints in the region of $2 \times 10^{-4}$ eV$^2 \lesssim |\Delta m^{2}_{41}| \lesssim 0.2$ eV$^2$.

Submitted 15 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures, 1 table

arXiv:2404.01223 [pdf, other]

Feature Splatting: Language-Driven Physics-Based Scene Synthesis and Editing

Authors: Ri-Zhao Qiu, Ge Yang, Weijia Zeng, Xiaolong Wang

Abstract: Scene representations using 3D Gaussian primitives have produced excellent results in modeling the appearance of static and dynamic 3D scenes. Many graphics applications, however, demand the ability to manipulate both the appearance and the physical properties of objects. We introduce Feature Splatting, an approach that unifies physics-based dynamic scene synthesis with rich semantics from vision language foundation models that are grounded by natural language. Our first contribution is a way to distill high-quality, object-centric vision-language features into 3D Gaussians, which enables semi-automatic scene decomposition using text queries. Our second contribution is a way to synthesize physics-based dynamics from an otherwise static scene using a particle-based simulator, in which material properties are assigned automatically via text queries. We ablate key techniques used in this pipeline to illustrate the challenges and opportunities in using feature-carrying 3D Gaussians as a unified format for appearance, geometry, material properties and semantics grounded on natural language. Project website: https://feature-splatting.github.io/

Submitted 1 April, 2024; originally announced April 2024.

Comments: Project website: https://feature-splatting.github.io/

arXiv:2404.01082 [pdf, other]

The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Li** Zhang, Weitian Chen, Yidong Zhao , et al. (25 additional authors not shown)

Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation platforms hinders the development of data-driven reconstruction algorithms. To address this issue, we organized the Cardiac MRI Reconstruction Challenge (CMRxRecon) in 2023, in collaboration with the 26th International Conference on MICCAI. CMRxRecon presented an extensive k-space dataset comprising cine and mapping raw data, accompanied by detailed annotations of cardiac anatomical structures. With overwhelming participation, the challenge attracted more than 285 teams and over 600 participants. Among them, 22 teams successfully submitted Docker containers for the testing phase, with 7 teams submitting for both the cine and mapping tasks. All teams used deep learning based approaches, indicating that deep learning has predominantly become a promising solution for the problem. The first-place winner of both tasks utilizes the E2E-VarNet architecture as its backbone. In contrast, U-Net is still the most popular backbone for both multi-coil and single-coil reconstructions.
This paper provides a comprehensive overview of the challenge design, presents a summary of the submitted results, reviews the employed methods, and offers an in-depth discussion that aims to inspire future advancements in cardiac MRI reconstruction models. The summary emphasizes the effective strategies observed in Cardiac MRI reconstruction, including backbone architecture, loss function, pre-processing techniques, physical modeling, and model complexity, thereby providing valuable insights for further developments in this field.

Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

Comments: 25 pages, 17 figures

arXiv:2404.00697 [pdf]

A Lane Usage Strategy for General Traffic Access on Bus Lanes under Mixed Traffic Environment

Authors: Haoran Li, Zhenzhou Yuan, Rui Yue, Guangchuan Yang, Chuang Zhu, Siyuan Chen

Abstract: The strategy of permitting general traffic to use the bus lane for improved utilization while ensuring bus priority has gained increasing attention, particularly with the support of vehicle-to-everything technology. In this study, we propose a novel lane usage strategy called Dynamic Spatial-Temporal Priority (DSTP) to ensure bus priority and optimize bus lane usage in a mixed traffic environment. DSTP leverages dynamic methods to identify available spatial-temporal resources in the lane, utilizing signal timing, road information, and vehicle data. A Right-of-Way assignment optimization model is then developed based on these resources to determine which vehicles can enter the bus lane. The model is dynamically enacted using a rolling horizon scheme to accommodate time-varying traffic conditions. Numerical studies have validated the advantages of DSTP, showing maintained bus priority, improved traffic efficiency, reduced fuel consumption, and lower CO2 emissions, especially during periods of high traffic demand and concentrated bus arrivals.

Submitted 31 March, 2024; originally announced April 2024.

Comments: 16 pages, 22 figures

arXiv:2404.00598 [pdf, other]

Robust Beamforming Design and Antenna Selection for Dynamic HRIS-aided Massive MIMO Systems

Authors: **tao Wang, Binggui Zhou, Chengzhi Ma, Shiqi Gong, Guanghua Yang, Shaodan Ma

Abstract: In this paper, a dynamic hybrid active-passive reconfigurable intelligent surface (HRIS) is proposed to further enhance the massive multiple-input-multiple-output (MIMO) system, since it supports the dynamic placement of active and passive elements. Specifically, considering the impact of the hardware impairments (HWIs), we investigate the channel-aware configuration of the receive antennas at the base station (BS) and the active/passive elements at the HRIS to improve the reliability of the system. To this end, we investigate the average mean-square-error (MSE) minimization problem for the HRIS-aided massive MIMO system by jointly optimizing the BS receive antenna selection matrix, the reflection phase coefficients, the reflection amplitude matrix, and the mode selection matrix of the HRIS under the power budget of the HRIS. To tackle the non-convexity and intractability of this problem, we first transform the binary and discrete variables into continuous ones, and then propose a penalty-based exact block coordinate descent (BCD) algorithm to solve these subproblems alternately. Numerical simulations demonstrate the great superiority of the proposed scheme over the conventional benchmark schemes.

Submitted 31 March, 2024; originally announced April 2024.

Comments: 5 pages, 2 figures

arXiv:2404.00097 [pdf, other]

doi 10.3847/1538-4357/ad27cc

Mapping the Growth of Supermassive Black Holes as a Function of Galaxy Stellar Mass and Redshift

Authors: Fan Zou, Zhibo Yu, W. N. Brandt, Hyungsuk Tak, Guang Yang, Qingling Ni

Abstract: The growth of supermassive black holes is strongly linked to their galaxies. It has been shown that the population mean black-hole accretion rate ($\overline{\mathrm{BHAR}}$) primarily correlates with the galaxy stellar mass ($M_\star$) and redshift for the general galaxy population. This work aims to provide the best measurements of $\overline{\mathrm{BHAR}}$ as a function of $M_\star$ and redshift over ranges of $10^{9.5}<M_\star<10^{12}~M_\odot$ and $z<4$. We compile an unprecedentedly large sample with eight thousand active galactic nuclei (AGNs) and 1.3 million normal galaxies from nine high-quality survey fields following a wedding-cake design. We further develop a semiparametric Bayesian method that can reasonably estimate $\overline{\mathrm{BHAR}}$ and the corresponding uncertainties, even for sparsely populated regions in the parameter space. $\overline{\mathrm{BHAR}}$ is constrained by X-ray surveys sampling the AGN accretion power and UV-to-infrared multi-wavelength surveys sampling the galaxy population. Our results can independently predict the X-ray luminosity function (XLF) from the galaxy stellar mass function (SMF), and the prediction is consistent with the observed XLF. We also try adding external constraints from the observed SMF and XLF. We further measure $\overline{\mathrm{BHAR}}$ for star-forming and quiescent galaxies and show that star-forming $\overline{\mathrm{BHAR}}$ is generally larger than or at least comparable to the quiescent $\overline{\mathrm{BHAR}}$.

Submitted 29 March, 2024; originally announced April 2024.

Comments: 25 pages, 12 figures, 1 table. Published in ApJ

Journal ref: ApJ, 964, 183 (2024)

arXiv:2403.19923 [pdf, other]

On the Preprocessing of Physics-informed Neural Networks: How to Better Utilize Data in Fluid Mechanics

Authors: Shengfeng Xu, Chang Yan, Zhenxu Sun, Renfang Huang, Dilong Guo, Guowei Yang

Abstract: Physics-Informed Neural Networks (PINNs) serve as a flexible alternative for tackling forward and inverse problems in differential equations, displaying impressive advancements in diverse areas of applied mathematics. Despite integrating both data and underlying physics to enrich the neural network's understanding, concerns regarding the effectiveness and practicality of PINNs persist. Over the past few years, extensive efforts have been made to enhance this evolving method by drawing inspiration from both machine learning algorithms and numerical methods. Despite notable progress in PINN algorithms, the important and fundamental field of data preprocessing remains unexplored, limiting the applications of PINNs, especially in solving inverse problems. Therefore, in this paper, a concise yet potent data preprocessing method focusing on data normalization is proposed. By applying a linear transformation to both the data and the corresponding equations concurrently, the normalized PINNs approach was evaluated on the task of reconstructing flow fields in three turbulent cases. The results, both qualitatively and quantitatively, illustrate that by adhering to the data preprocessing procedure, PINNs can robustly achieve higher prediction accuracy for all flow quantities through the entire training process, distinctly improving the utilization of limited training data. The proposed normalization method requires zero extra computational cost.
Though only verified in Navier-Stokes (NS) equations, this method holds potential for application to various other equations.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.16520 [pdf, other]

CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification

Authors: Guangqian Yang, Kangrui Du, Zhihan Yang, Ye Du, Yong** Zheng, Shujun Wang

Abstract: Alzheimer's disease (AD) is an incurable neurodegenerative condition leading to cognitive and functional deterioration. Given the lack of a cure, prompt and precise AD diagnosis is vital, a complex process dependent on multiple factors and multi-modal data. While successful efforts have been made to integrate multi-modal representation learning into medical datasets, scant attention has been given to 3D medical images. In this paper, we propose Contrastive Masked Vim Autoencoder (CMViM), the first efficient representation learning method tailored for 3D multi-modal data. Our proposed framework is built on a masked Vim autoencoder to learn a unified multi-modal representation and the long-range dependencies contained in 3D medical images. We also introduce an intra-modal contrastive learning module to enhance the capability of the multi-modal Vim encoder for modeling the discriminative features in the same modality, and an inter-modal contrastive learning module to alleviate misaligned representation among modalities. Our framework consists of two main steps: 1) incorporate the Vision Mamba (Vim) into the masked autoencoder to reconstruct 3D masked multi-modal data efficiently; 2) align the multi-modal representations with contrastive learning mechanisms from both intra-modal and inter-modal aspects. Our framework is pre-trained on the ADNI2 dataset and validated on the downstream task of AD classification. The proposed CMViM yields a 2.7% AUC performance improvement compared with other state-of-the-art methods.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 11 pages, 1 figure

arXiv:2403.14583 [pdf, other]

Co-Optimization of Environment and Policies for Decentralized Multi-Agent Navigation

Authors: Zhan Gao, Guang Yang, Amanda Prorok

Abstract: This work views the multi-agent system and its surrounding environment as a co-evolving system, where the behavior of one affects the other. The goal is to take both agent actions and environment configurations as decision variables, and optimize these two components in a coordinated manner to improve some measure of interest. Towards this end, we consider the problem of decentralized multi-agent navigation in cluttered environments. By introducing two sub-objectives of multi-agent navigation and environment optimization, we propose an $\textit{agent-environment co-optimization}$ problem and develop a $\textit{coordinated algorithm}$ that alternates between these sub-objectives to search for an optimal synthesis of agent actions and obstacle configurations in the environment, ultimately improving the navigation performance. Due to the challenge of explicitly modeling the relation between agents, environment and performance, we leverage policy gradient to formulate a model-free learning mechanism within the coordinated framework. A formal convergence analysis shows that our coordinated algorithm tracks the local minimum trajectory of an associated time-varying non-convex optimization problem. Extensive numerical results corroborate theoretical findings and show the benefits of co-optimization over baselines. Interestingly, the results also indicate that optimized environment configurations are able to offer structural guidance that is key to de-conflicting agents in motion.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.13873 [pdf, ps, other]

A quantum picture of light-suppressed photosynthetic charge transfer: Photo-blockade

Authors: Guang Yang, Gen Tatara

Abstract: We propose a dynamic mechanism for the reversible regulation of photochemistry in plants under varying light environments. We employ a three-level quantum model to take into account the correlations between charge donors and charge acceptors immediately before photoexcitation, and show that under steady and coherent driving of light, the efficiency of charge transfer is inversely proportional to the intensity of incident light, which can be suppressed so severely that it becomes a limiting factor on photosynthetic electron transport. These results are analyzed to gain insight into the light responses of photosynthetic parameters. We discuss the implications of thermal fluctuation in the light source used in photochemical experiments, and argue that in high light conditions, the quantum yields measured with an incandescent lamp may be higher than those measured with a laser, a manifestation of thermal fluctuation in lamp illumination. Our new picture renders a consistent interpretation of a wide range of experiments, including plastocyanin-dependent electron transport in photosystem I, biphasic redox kinetics of P700 and wavelength-dependent quantum yields, and provides a donor-side scheme for the onset of irreversible damage to photosystem II.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 13 pages, 6 figures

arXiv:2403.13398 [pdf, other]

A unified framework for bounding causal effects on the always-survivor and other populations

Authors: Aixian Chen, Xia Cui, Guangren Yang

Abstract: We investigate the bounding problem of causal effects in experimental studies in which the outcome is truncated by death, meaning that the subject dies before the outcome can be measured. Causal effects cannot be point identified without instruments and/or tight parametric assumptions but can be bounded under mild restrictions. Previous work on partial identification under the principal stratification framework has primarily focused on the `always-survivor' subpopulation. In this paper, we present a novel nonparametric unified framework to provide sharp bounds on causal effects on discrete and continuous square-integrable outcomes. These bounds are derived on the `always-survivor', `protected', and `harmed' subpopulations and on the entire population with/without assumptions of monotonicity and stochastic dominance. The main idea depends on rewriting the optimization problem in terms of the integrated tail probability expectation formula using a set of conditional probability distributions. The proposed procedure allows for settings with any type and number of covariates, and can be extended to incorporate average causal effects and complier average causal effects. Furthermore, we present several simulation studies conducted under various assumptions as well as the application of the proposed approach to a real dataset from the National Supported Work Demonstration.

Submitted 26 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.13231 [pdf, other]

Design, construction, and operation of a 1-ton Water-based Liquid scintillator detector at Brookhaven National Laboratory

Authors: X. Xiang, G. Yang, S. Andrade, M. Askins, D. M. Asner, A. Baldoni, D. Cowen, M. V. Diwan, S. Gokhale, S. Hans, J. Jerome, G. Lawley, S. Linden, G. D. Orebi Gann, C. Reyes, R. Rosero, N. Seberg, M. Smiley, N. Speece-Moyer, B. Walsh, J. J. Wang, M. Wilking, M. Yeh

Abstract: Water-based liquid scintillators (WbLS) are attractive neutrino detector materials because they allow us to tune the ratio of the Cherenkov and scintillation signals. Using WbLS, large-scale neutrino experiments can benefit from both directional reconstruction and enhanced low-energy efficiency. Furthermore, broadening the science capability of such materials by metal doping may be better suited for water-based liquid scintillators. We recently constructed and commissioned a 1-ton WbLS detector with good photosensor coverage and a capable data acquisition system. We intend to use this flexible detector system as a testbed for WbLS R&D. In this paper we give an overview of the 1-ton system and provide some early analysis results.

Submitted 13 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12414 [pdf, other]

Development of low-radon ultra-pure water for the Jiangmen Underground Neutrino Observatory

Authors: T. Y. Guan, Y. P. Zhang, B. Wang, C. Guo, J. C. Liu, Q. Tang, C. G. Yang, C. Li

Abstract: The Jiangmen Underground Neutrino Observatory (JUNO) is a state-of-the-art liquid scintillator-based neutrino physics experiment under construction in South China. To reduce the background from external radioactivities, a water Cherenkov detector composed of 35~kton ultra-pure water and 2,400 20-inch photomultiplier tubes is developed. Even after specialized treatment, ultra-pure water still contains trace levels of radioactive elements that can contribute to the detector background. Among these, $^{222}$Rn is particularly significant. To address this, an online radon removal system based on the JUNO prototype has been developed. By integrating micro-bubble generators to enhance the degasser's radon removal efficiency, the radon concentration in water can be reduced to the 1~mBq/m$^{3}$ level, meeting the stringent requirements of JUNO. Additionally, a highly sensitive online radon concentration measurement system capable of detecting concentrations of $\sim$1~mBq/m$^3$ has been developed to monitor the radon concentration in water. In this paper, the details regarding both systems will be presented.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 20 pages, 13 figures

arXiv:2403.11004 [pdf, other]

Forward Learning of Graph Neural Networks

Authors: Namyong Park, Xing Wang, Antoine Simoulin, Shuai Yang, Grey Yang, Ryan Rossi, Puja Trivedi, Nesreen Ahmed

Abstract: Graph neural networks (GNNs) have achieved remarkable success across a wide range of applications, such as recommendation, drug discovery, and question answering. Behind the success of GNNs lies the backpropagation (BP) algorithm, which is the de facto standard for training deep neural networks (NNs). However, despite its effectiveness, BP imposes several constraints, which are not only biologically implausible, but also limit the scalability, parallelism, and flexibility in learning NNs. Examples of such constraints include storage of neural activities computed in the forward pass for use in the subsequent backward pass, and the dependence of parameter updates on non-local signals. To address these limitations, the forward-forward algorithm (FF) was recently proposed as an alternative to BP in the image classification domain, which trains NNs by performing two forward passes over positive and negative data. Inspired by this advance, we propose ForwardGNN in this work, a new forward learning procedure for GNNs, which avoids the constraints imposed by BP via an effective layer-wise local forward training. ForwardGNN extends the original FF to deal with graph data and GNNs, and makes it possible to operate without generating negative inputs (hence no longer forward-forward). Further, ForwardGNN enables each layer to learn from both the bottom-up and top-down signals without relying on the backpropagation of errors. Extensive experiments on real-world datasets show the effectiveness and generality of the proposed forward graph learning framework.
We release our code at https://github.com/facebookresearch/forwardgnn.

Submitted 12 April, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

Comments: ICLR 2024

arXiv:2403.10860 [pdf, other]

Efficient Domain Adaptation for Endoscopic Visual Odometry

Authors: Junyang Wu, Yun Gu, Guang-Zhong Yang

Abstract: Visual odometry plays a crucial role in endoscopic imaging, yet the scarcity of realistic images with ground-truth poses presents a significant challenge. Domain adaptation therefore offers a promising approach to bridge the pre-operative planning domain with the intra-operative real domain for learning odometry information. However, existing methodologies suffer from long training times. In this work, an efficient neural style transfer framework for endoscopic visual odometry is proposed, which compresses the time from pre-operative planning to the testing phase to less than five minutes. For efficient training, this work focuses on training modules with only a limited number of real images, and we exploit pre-operative prior information to dramatically reduce training duration. Moreover, during the testing phase, we propose a novel Test Time Adaptation (TTA) method to mitigate the gap in lighting conditions between training and testing datasets. Experimental evaluations conducted on two public endoscope datasets show that our method achieves state-of-the-art accuracy in visual odometry tasks while offering the fastest training speeds. These results demonstrate significant promise for intra-operative surgery applications.

Submitted 16 March, 2024; originally announced March 2024.

arXiv:2403.09088 [pdf, ps, other]

The fundamental plane of blazars based on the black hole spin-mass energy

Authors: Xu Zhang, Dingrong Xiong, Quangui Gao, Guiqin Yang, Fangwu Lu, Weiwei Na, Longhua Qin

Abstract: We examine the fundamental plane of 91 blazars, including FSRQs and BL Lacs, with known radio luminosity ($L_R$), X-ray luminosity ($L_X$), and black hole mass ($M$) measurements to reflect the relationship between jet and accretion in blazars. The fundamental plane of blazars is $\log L_R = (0.273 \pm 0.059)\log L_X + (0.695 \pm 0.191)\log M + (25.457 \pm 2.728)$, and $\log L_R = (0.190 \pm 0.049)\log L_X + (0.475 \pm 0.157)\log M + (28.568 \pm 2.245)$ after considering the effect of the beam factor. Our results suggest that the jet of blazars is connected with accretion. We set the black hole spin energy as a new variable to correct the black hole mass and explore the effect of black hole spin on the fundamental relationship. We find that the fundamental plane of blazars is affected by the black hole spin, which is similar to previous work on AGNs. We additionally examine a new fundamental plane based on the black hole spin-mass energy ($M_{spin}$). The new fundamental plane ($\log L_R = (0.332 \pm 0.081)\log L_X + (0.502 \pm 0.091)\log M_{spin} + (22.606 \pm 3.346)$, with R-squared = 0.575) shows that $M_{spin}$ yields a better correlation coefficient than $M$ for the fundamental plane of blazars. These results support the view that the black hole spin should be considered an important factor in the study of the fundamental plane of blazars, and they may further our understanding of the Blandford-Znajek process in blazars.

Submitted 14 March, 2024; originally announced March 2024.

Comments: Accepted for publication in MNRAS

arXiv:2403.08157 [pdf]

Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks

Authors: Fuzhi Wu, Jiasong Wu, Youyong Kong, Chunfeng Yang, Guanyu Yang, Huazhong Shu, Guy Carrault, Lotfi Senhadji

Abstract: Deep learning and Convolutional Neural Networks (CNNs) have driven major transformations in diverse research areas. However, their limitations in handling low-frequency information present obstacles in certain tasks like interpreting global structures or managing smooth transition images. Despite the promising performance of transformer structures in numerous tasks, their intricate optimization complexities highlight the persistent need for refined CNN enhancements using limited resources. Responding to these complexities, we introduce a novel framework, the Multiscale Low-Frequency Memory (MLFM) Network, with the goal of harnessing the full potential of CNNs while keeping their complexity unchanged. The MLFM efficiently preserves low-frequency information, enhancing performance in targeted computer vision tasks. Central to our MLFM is the Low-Frequency Memory Unit (LFMU), which stores various low-frequency data and forms a parallel channel to the core network. A key advantage of MLFM is its seamless compatibility with various prevalent networks, requiring no alterations to their original core structure. Testing on ImageNet demonstrated substantial accuracy improvements in multiple 2D CNNs, including ResNet, MobileNet, EfficientNet, and ConvNeXt. Furthermore, we showcase MLFM's versatility beyond traditional image classification by successfully integrating it into image-to-image translation tasks, specifically in semantic segmentation networks like FCN and U-Net. In conclusion, our work signifies a pivotal stride in the journey of optimizing the efficacy and efficiency of CNNs with limited resources. This research builds upon the existing CNN foundations and paves the way for future advancements in computer vision. Our codes are available at https://github.com/AlphaWuSeu/MLFM.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 9 pages, 10 figures, 6 tables. AAAI 2024 conference

arXiv:2403.07563 [pdf, other]

Learning Generalizable Feature Fields for Mobile Manipulation

Authors: Ri-Zhao Qiu, Yafei Hu, Ge Yang, Yuchen Song, Yang Fu, Jianglong Ye, Jiteng Mu, Ruihan Yang, Nikolay Atanasov, Sebastian Scherer, Xiaolong Wang

Abstract: An open problem in mobile manipulation is how to represent objects and scenes in a unified manner, so that robots can use the representation both for navigating in the environment and for manipulating objects. The latter requires capturing intricate geometry while understanding fine-grained semantics, whereas the former involves capturing the complexity inherent to an expansive physical scale. In this work, we present GeFF (Generalizable Feature Fields), a scene-level generalizable neural feature field that acts as a unified representation for both navigation and manipulation and that performs in real time. To do so, we treat generative novel view synthesis as a pre-training task, and then align the resulting rich scene priors with natural language via CLIP feature distillation. We demonstrate the effectiveness of this approach by deploying GeFF on a quadrupedal robot equipped with a manipulator. We evaluate GeFF's ability to generalize to open-set objects, as well as its running time, when performing open-vocabulary mobile manipulation in dynamic scenes.

Submitted 12 March, 2024; originally announced March 2024.

Comments: Preprint. Project website is at: https://geff-b1.github.io/

arXiv:2403.06122 [pdf, other]

Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning

Authors: Woo-Jin Ahn, Geun-Yeong Yang, Hyun-Duck Choi, Myo-Taeg Lim

Abstract: Deep learning models for semantic segmentation often experience performance degradation when deployed to unseen target domains unidentified during the training phase. This is mainly due to variations in image texture (i.e., style) from different data sources. To tackle this challenge, existing domain generalized semantic segmentation (DGSS) methods attempt to remove style variations from the feature. However, these approaches struggle with the entanglement of style and content, which may lead to the unintentional removal of crucial content information, causing performance degradation. This study addresses this limitation by proposing BlindNet, a novel DGSS approach that blinds the style without external modules or datasets. The main idea behind our proposed approach is to alleviate the effect of style in the encoder whilst facilitating robust segmentation in the decoder. To achieve this, BlindNet comprises two key components: covariance alignment and semantic consistency contrastive learning. Specifically, the covariance alignment trains the encoder to uniformly recognize various styles and preserve the content information of the feature, rather than removing the style-sensitive factor. Meanwhile, semantic consistency contrastive learning enables the decoder to construct a discriminative class embedding space and disentangles features that are vulnerable to misclassification. In extensive experiments, our approach outperforms existing DGSS methods, exhibiting robustness and superior performance for semantic segmentation on unseen target domains.

Submitted 10 March, 2024; originally announced March 2024.

Comments: CVPR 2024

arXiv:2403.05485 [pdf]

A Paradigm Shift in Catheter Development: Thermally Drawn Polymeric Fibers for MR-Guided Cardiovascular Interventions

Authors: Mohamed E. M. K. Abdelaziz, Libaihe Tian, Thomas Lottner, Simon Reiss, Timo Heidt, Alexander Maier, Klaus Düring, Constantin von zur Mühlen, Michael Bock, Eric Yeatman, Guang-Zhong Yang, Burak Temelkuran

Abstract: Cardiovascular diseases (CVDs) and congenital heart diseases (CHD) pose significant global health challenges. Fluoroscopy-guided endovascular interventions, though effective, are accompanied by ionizing radiation concerns, especially in pediatric cases. Magnetic resonance imaging (MRI) emerges as a radiation-free alternative, offering superior soft tissue visualization and functional insights. However, the lack of compatible instruments remains a hurdle. We present two novel catheter systems, a tendon-driven steerable catheter and an active tracking Tiger-shaped catheter, fabricated using a unique fiber drawing technique. These catheters, showcasing mechanical properties similar to commercial counterparts, have undergone rigorous in-vitro and in-vivo testing, yielding promising outcomes. This innovative approach has the potential to streamline medical device development, thus enhancing patient care in MR-guided interventions.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05280 [pdf, other]

ContrastDiagnosis: Enhancing Interpretability in Lung Nodule Diagnosis Using Contrastive Learning

Authors: Chenglong Wang, Yinqiao Yi, Yida Wang, Chengxiu Zhang, Yun Liu, Kensaku Mori, Mei Yuan, Guang Yang

Abstract: With the ongoing development of deep learning, an increasing number of AI models have surpassed the performance levels of human clinical practitioners. However, the prevalence of AI diagnostic products in actual clinical practice remains significantly lower than desired. One crucial reason for this gap is the so-called 'black box' nature of AI models. Clinicians' distrust of black box models has directly hindered the clinical deployment of AI products. To address this challenge, we propose ContrastDiagnosis, a straightforward yet effective interpretable diagnosis framework. This framework is designed to introduce inherent transparency and provide extensive post-hoc explainability for deep learning models, making them more suitable for clinical medical diagnosis. ContrastDiagnosis incorporates a contrastive learning mechanism to provide a case-based reasoning diagnostic rationale, enhancing the model's transparency, and also offers post-hoc interpretability by highlighting similar areas. High diagnostic accuracy was achieved, with an AUC of 0.977, while maintaining high transparency and explainability.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.05236 [pdf, other]

Fault Recovery and Transient Stability of Grid-Forming Converters Equipped with Current Saturation

Authors: Ali Arjomandi-Nezhad, Yifei Guo, Bikash C. Pal, Guangya Yang

Abstract: When grid-forming (GFM) inverter-based resources (IBRs) experience large grid disturbances (e.g., short-circuit faults), the current limiter may be triggered and GFM IBRs enter the current saturation mode, inducing nonlinear dynamical behaviors and imposing great challenges to the post-disturbance transient angle stability. This paper presents a systematic study to reveal the fault recovery behaviors of a GFM IBR and identify the risk of instability. The impact of the angle of the magnitude-saturated current on the post-fault recovery and transient stability is also investigated. The selection of the angle of magnitude-saturated current significantly influences the post-fault behaviors while a few additional dynamical conditions that have a substantial impact are also identified. It is found that the system may follow multiple post-fault recovery trajectories depending on those conditions: 1) convergence to the normal stable equilibrium point (SEP), 2) convergence to the saturated stable equilibrium point (SSEP), and 3) divergence (instability). To examine the models' accuracy, several cases are simulated.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 10 pages, 22 figures

arXiv:2403.03809 [pdf, other]

Variational Bayesian Learning based Joint Localization and Channel Estimation with Distance-dependent Noise

Authors: Yunfei Li, Yiting Luo, Weiqiang Tan, Chunguo Li, Shaodan Ma, Guanghua Yang

Abstract: In the Industrial Internet of Things (IIoTs) and Ocean of Things (OoTs), the advent of massive intelligent services has imposed stringent requirements on both communication and localization, particularly emphasizing precise localization and channel information. This paper focuses on the challenge of jointly optimizing localization and communication in IoT networks. Departing from the conventional independent noise model used in localization and channel estimation problems, we consider a more realistic model incorporating distance-dependent noise variance, as revealed in recent theoretical analyses and experimental results. The distance-dependent noise introduces unknown noise power and a complex noise model, resulting in an exceptionally challenging non-convex and nonlinear optimization problem. In this study, we address a joint localization and channel estimation problem encompassing distance-dependent noise, unknown channel parameters, and uncertainties in sensor node locations. To surmount the intractable nonlinear and non-convex objective function inherent in the problem, we introduce a variational Bayesian learning-based framework. This framework enables the joint optimization of localization and channel parameters by leveraging an effective approximation to the true posterior distribution. Furthermore, the proposed joint learning algorithm provides an iterative closed-form solution and exhibits superior performance in terms of computational complexity compared to existing algorithms. Computer simulation results demonstrate that the proposed algorithm approaches the performance of the Bayesian Cramer-Rao bound (BCRB), achieves localization performance comparable to the ML-GMP algorithm, and outperforms the other two comparison algorithms.

Submitted 6 March, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

arXiv:2403.03677 [pdf, other]

Automatic Bi-modal Question Title Generation for Stack Overflow with Prompt Learning

Authors: Shaoyu Yang, Xiang Chen, Ke Liu, Guang Yang, Chi Yu

Abstract: When drafting question posts for Stack Overflow, developers may not accurately summarize the core problems in the question titles, which can prevent these questions from getting timely help. Therefore, improving the quality of question titles has attracted wide attention from researchers. An initial study aimed to automatically generate the titles by only analyzing the code snippets in the question body. However, this study ignored the helpful information in the corresponding problem descriptions. Therefore, we propose an approach, SOTitle+, that considers bi-modal information (i.e., the code snippets and the problem descriptions) in the question body. We then formalize the title generation for different programming languages as separate but related tasks and utilize multi-task learning to solve these tasks. Later we fine-tune the pre-trained language model CodeT5 to automatically generate the titles. Unfortunately, the inconsistent inputs and optimization objectives between the pre-training task and our investigated task may make it hard for fine-tuning to fully exploit the knowledge of the pre-trained model. To solve this issue, SOTitle+ further prompt-tunes CodeT5 with hybrid prompts (i.e., a mixture of hard and soft prompts). To verify the effectiveness of SOTitle+, we construct a large-scale high-quality corpus from recent data dumps shared by Stack Overflow. Our corpus includes 179,119 high-quality question posts for six popular programming languages. Experimental results show that SOTitle+ can significantly outperform four state-of-the-art baselines in both automatic evaluation and human evaluation. Our work indicates that considering bi-modal information and prompt learning in Stack Overflow title generation is a promising exploration direction.

Submitted 6 March, 2024; originally announced March 2024.

Comments: Accepted by Empirical Software Engineering 2024 (EMSE)

arXiv:2403.01246 [pdf, other]

Dual Graph Attention based Disentanglement Multiple Instance Learning for Brain Age Estimation

Authors: Fanzhe Yan, Gang Yang, Yu Li, Aiping Liu, Xun Chen

Abstract: Deep learning techniques have demonstrated great potential for accurately estimating brain age by analyzing Magnetic Resonance Imaging (MRI) data from healthy individuals. However, current methods for brain age estimation often directly utilize whole input images, overlooking two important considerations: 1) the heterogeneous nature of brain aging, where different brain regions may degenerate at different rates, and 2) the existence of age-independent redundancies in brain structure. To overcome these limitations, we propose a Dual Graph Attention based Disentanglement Multi-instance Learning (DGA-DMIL) framework for improving brain age estimation. Specifically, the 3D MRI data, treated as a bag of instances, is fed into a 2D convolutional neural network backbone, to capture the unique aging patterns in MRI. A dual graph attention aggregator is then proposed to learn the backbone features by exploiting the intra- and inter-instance relationships. Furthermore, a disentanglement branch is introduced to separate age-related features from age-independent structural representations to ameliorate the interference of redundant information on age prediction. To verify the effectiveness of the proposed framework, we evaluate it on two datasets, UK Biobank and ADNI, containing a total of 35,388 healthy individuals. Our proposed model demonstrates exceptional accuracy in estimating brain age, achieving a remarkable mean absolute error of 2.12 years in the UK Biobank. The results establish our approach as state-of-the-art compared to other competing brain age estimation models. In addition, the instance contribution scores identify the varied importance of brain areas for aging prediction, which provides deeper insights into the understanding of brain aging.

Submitted 2 March, 2024; originally announced March 2024.

Comments: 12 pages, 9 figures

arXiv:2403.01093 [pdf, other]

Variational Bayesian Learning Based Localization and Channel Reconstruction in RIS-aided Systems

Authors: Yunfei Li, Yiting Luo, Xianda Wu, Zheng Shi, Shaodan Ma, Guanghua Yang

Abstract: The emerging immersive and autonomous services have posed stringent requirements on both communications and localization. Considering the great potential of reconfigurable intelligent surfaces (RIS), this paper focuses on joint channel estimation and localization for RIS-aided wireless systems. As opposed to existing works that treat channel estimation and localization independently, this paper exploits the intrinsic coupling and nonlinear relationships between the channel parameters and user location to enhance both localization and channel reconstruction. Noting the non-convex, nonlinear objective function and the sparser angle pattern, a variational Bayesian learning-based framework is developed to jointly estimate the channel parameters and user location by leveraging an effective approximation of the posterior distribution. The proposed framework is capable of unifying near-field and far-field scenarios owing to its exploitation of the sparsity of the angular domain. Since the joint channel and location estimation problem has a closed-form solution in each iteration, our proposed iterative algorithm performs better than the conventional particle swarm optimization (PSO) and maximum likelihood (ML) based ones in terms of computational complexity. Simulations demonstrate that the proposed algorithm almost reaches the Bayesian Cramer-Rao bound (BCRB) and achieves superior estimation accuracy compared to the PSO and ML algorithms.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.18975 [pdf, other]

Theoretically Achieving Continuous Representation of Oriented Bounding Boxes

Authors: Zi-Kai Xiao, Guo-Ye Yang, Xue Yang, Tai-Jiang Mu, Junchi Yan, Shi-min Hu

Abstract: Considerable efforts have been devoted to Oriented Object Detection (OOD). However, one lasting issue regarding the discontinuity in Oriented Bounding Box (OBB) representation remains unresolved, which is an inherent bottleneck for extant OOD methods. This paper endeavors to completely solve this issue in a theoretically guaranteed manner and puts an end to the ad-hoc efforts in this direction. Prior studies typically can only address one of the two cases of discontinuity, rotation and aspect ratio, and often inadvertently introduce decoding discontinuity, e.g., Decoding Incompleteness (DI) and Decoding Ambiguity (DA), as discussed in the literature. Specifically, we propose a novel representation method called Continuous OBB (COBB), which can be readily integrated into existing detectors, e.g., Faster-RCNN, as a plugin. It can theoretically ensure continuity in bounding box regression, which, to the best of our knowledge, has not been achieved in the literature for rectangle-based object representation. For fairness and transparency of experiments, we have developed a modularized benchmark based on the open-source deep learning framework Jittor's detection toolbox JDet for OOD evaluation. On the popular DOTA dataset, by integrating Faster-RCNN as the same baseline model, our new method outperforms the peer method Gliding Vertex by 1.13% mAP50 (relative improvement 1.54%), and 2.46% mAP75 (relative improvement 5.91%), without any tricks.

Submitted 16 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: 17 pages, 12 tables, 8 figures. Accepted by CVPR'24. Code: https://github.com/514flowey/JDet-COBB

arXiv:2402.18451 [pdf, other]

MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation

Authors: Jiahao Huang, Liutao Yang, Fanwen Wang, Yang Nan, Angelica I. Aviles-Rivero, Carola-Bibiane Schönlieb, Daoqiang Zhang, Guang Yang

Abstract: The recent Mamba model has shown remarkable adaptability for visual representation learning, including in medical imaging tasks. This study introduces MambaMIR, a Mamba-based model for medical image reconstruction, as well as its Generative Adversarial Network-based variant, MambaMIR-GAN. Our proposed MambaMIR inherits several advantages, such as linear complexity, global receptive fields, and dynamic weights, from the original Mamba model. The innovative arbitrary-mask mechanism effectively adapts Mamba to our image reconstruction task, providing randomness for subsequent Monte Carlo-based uncertainty estimation. Experiments conducted on various medical image reconstruction tasks, including fast MRI and SVCT, which cover anatomical regions such as the knee, chest, and abdomen, have demonstrated that MambaMIR and MambaMIR-GAN achieve comparable or superior reconstruction results relative to state-of-the-art methods. Additionally, the estimated uncertainty maps offer further insights into the reliability of the reconstruction quality. The code is publicly available at https://github.com/ayanglab/MambaMIR.

Submitted 25 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

arXiv:2402.16870 [pdf, other]

Pioneering Deterministic Scheduling and Network Structure Optimization for Time-Critical Computing Tasks in Industrial IoT

Authors: Yujiao Hu, Yining Zhu, Huayu Zhang, Yan Pan, Qingmin Jia, Renchao Xie, Gang Yang, F. Richard Yu

Abstract: The Industrial Internet of Things (IIoT) has become a critical technology to accelerate the process of digital and intelligent transformation of industries. As the cooperative relationship between smart devices in IIoT becomes more complex, getting deterministic responses of IIoT periodic time-critical computing tasks becomes a crucial and nontrivial problem. However, few current works in cloud/edge/fog computing focus on this problem. This paper is a pioneer in exploring the deterministic scheduling and network structural optimization problems for IIoT periodic time-critical computing tasks. We first formulate the two problems and derive theorems to help quickly identify computation and network resource sharing conflicts. Based on this, we propose a deterministic scheduling algorithm, IIoTBroker, which realizes a deterministic response for each IIoT task by optimizing the fine-grained allocation of computation and network resources, and a network optimization algorithm, IIoTDeployer, which provides a cost-effective structural upgrade solution for existing IIoT networks. Our simulation results show that the methods are cost-friendly and scalable, and that they guarantee deterministic responses at low computation cost.

Submitted 23 January, 2024; originally announced February 2024.

Comments: Under Review

arXiv:2402.16796 [pdf, other]

Expressive Whole-Body Control for Humanoid Robots

Authors: Xuxin Cheng, Yandong Ji, Junming Chen, Ruihan Yang, Ge Yang, Xiaolong Wang

Abstract: Can we enable humanoid robots to generate rich, diverse, and expressive motions in the real world? We propose to learn a whole-body control policy on a human-sized robot to mimic human motions as realistically as possible. To train such a policy, we leverage the large-scale human motion capture data from the graphics community in a Reinforcement Learning framework. However, directly performing imitation learning with the motion capture dataset would not work on the real humanoid robot, given the large gap in degrees of freedom and physical capabilities. Our method Expressive Whole-Body Control (Exbody) tackles this problem by encouraging the upper humanoid body to imitate a reference motion, while relaxing the imitation constraint on its two legs and only requiring them to follow a given velocity robustly. With training in simulation and Sim2Real transfer, our policy can control a humanoid robot to walk in different styles, shake hands with humans, and even dance with a human in the real world. We conduct extensive studies and comparisons on diverse motions in both simulation and the real world to show the effectiveness of our approach.

Submitted 5 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: Website: https://expressive-humanoid.github.io

arXiv:2402.16674 [pdf, other]

ConSept: Continual Semantic Segmentation via Adapter-based Vision Transformer

Authors: Bowen Dong, Guanglei Yang, Wangmeng Zuo, Lei Zhang

Abstract: In this paper, we delve into the realm of vision transformers for continual semantic segmentation, a problem that has not been sufficiently explored in previous literature. Empirical investigations on the adaptation of existing frameworks to vanilla ViT reveal that incorporating visual adapters into ViTs or fine-tuning ViTs with distillation terms is advantageous for enhancing the segmentation capability of novel classes. These findings motivate us to propose Continual semantic Segmentation via Adapter-based ViT, namely ConSept. Within the simplified architecture of ViT with linear segmentation head, ConSept integrates lightweight attention-based adapters into vanilla ViTs. Capitalizing on the feature adaptation abilities of these adapters, ConSept not only retains superior segmentation ability for old classes, but also attains promising segmentation quality for novel classes. To further harness the intrinsic anti-catastrophic forgetting ability of ConSept and concurrently enhance the segmentation capabilities for both old and new classes, we propose two key strategies: distillation with a deterministic old-classes boundary for improved anti-catastrophic forgetting, and dual dice losses to regularize segmentation maps, thereby improving overall segmentation performance. Extensive experiments show the effectiveness of ConSept on multiple continual semantic segmentation benchmarks under overlapped or disjoint settings. Code will be publicly available at https://github.com/DongSky/ConSept.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.15939 [pdf]

Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

Authors: Zi Wang, Min Xiao, Yirong Zhou, Chengyan Wang, Naiming Wu, Yi Li, Yiwen Gong, Shufu Chang, Yinyin Chen, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Di Guo, Guang Yang, Xiaobo Qu

Abstract: Dynamic magnetic resonance imaging (MRI) plays an indispensable role in cardiac diagnosis. To enable fast imaging, the k-space data can be undersampled, but the image reconstruction poses a great challenge of high-dimensional processing. This challenge necessitates extensive training data in many deep learning reconstruction methods. This work proposes a novel and efficient approach, leveraging a dimension-reduced separable learning scheme that excels even with highly limited training data. We further integrate it with spatiotemporal priors to develop a Deep Separable Spatiotemporal Learning network (DeepSSL), which unrolls an iteration process of a reconstruction model with both temporal low-rankness and spatial sparsity. Intermediate outputs are visualized to provide insights into the network's behavior and enhance its interpretability. Extensive results on cardiac cine datasets show that the proposed DeepSSL is superior to the state-of-the-art methods visually and quantitatively, while reducing the demand for training cases by up to 75%. Its preliminary adaptability to cardiac patients has been verified through a blind reader study by experienced radiologists and cardiologists. Additionally, DeepSSL benefits the downstream task of cardiac segmentation, achieving higher accuracy, and shows robustness in prospective real-time cardiac MRI.

Submitted 24 February, 2024; originally announced February 2024.

Comments: 10 pages, 11 figures, 3 tables

arXiv:2402.15801 [pdf]

Topological and superconducting properties of two-dimensional C6-2x(BN)x biphenylene network: a first-principles investigation

Authors: Guang F. Yang, Hong X. Song, Dan Wang, Hao Wang, Hua Y. Geng

Abstract: First-principles calculations have been used to investigate the electronic and topological properties of the two-dimensional C6-2x(BN)x biphenylene network, a graphene-like structure composed not only of hexagonal rings but also of octagonal and square rings. Nontrivial topological properties have been found in two of them, with stoichiometries of C4BN and C2(BN)2. The former, C4BN, is predicted to be a type-II Dirac semimetal with a superconducting critical temperature Tc = 0.38 K, which is similar to the pure carbon biphenylene network (C-BPN). The latter shows a novel isolated edge state between the conduction and valence bands. By regulation of strains and virtual-crystal approximation calculations, we found that the annihilation of two pairs of Dirac points (DPs) in the non-high symmetric region (non-HSR) causes the two corresponding edge states to stick together and generate this isolated edge state. In addition, we found that one pair of DPs arises from the shift of DPs in the C-BPN, while another new pair of DPs emerges around the Time Reversal Invariant Momenta (TRIM) point X due to the doping of boron and nitrogen. We constructed a tight-binding (TB) model to reveal the mechanism of forming the isolated edge state from the C-BPN to C2(BN)2. This study not only demonstrates the existence and mechanism of forming the isolated edge state in semimetals, but also provides an example in which the DPs can move away from the high-symmetry region.

Submitted 24 February, 2024; originally announced February 2024.

Comments: 32 pages, 10 figures, with supplementary materials

Showing 51–100 of 1,451 results for author: Yang, G