Search | arXiv e-print repository

arXiv:2407.01945 [pdf, other]

Indoor 3D Reconstruction with an Unknown Camera-Projector Pair

Authors: Zhaoshuai Qi, Yifeng Hao, Rui Hu, Wenyou Chang, Jiaqi Yang, Yanning Zhang

Abstract: Structured light-based method with a camera-projector pair (CPP) plays a vital role in indoor 3D reconstruction, especially for scenes with weak textures. Previous methods usually assume known intrinsics, which are pre-calibrated from known objects, or self-calibrated from multi-view observations. It is still challenging to reliably recover CPP intrinsics from only two views without any known objects. In this paper, we provide a simple yet reliable solution. We demonstrate that, for the first time, sufficient constraints on CPP intrinsics can be derived from an unknown cuboid corner (C2), e.g. a room's corner, which is a common structure in indoor scenes. In addition, with only known camera principal point, the complex multi-variable estimation of all CPP intrinsics can be simplified to a simple univariable optimization problem, leading to reliable calibration and thus direct 3D reconstruction with unknown CPP. Extensive results have demonstrated the superiority of the proposed method over both traditional and learning-based counterparts. Furthermore, the proposed method also demonstrates impressive potential to solve similar tasks without active lighting, such as sparse-view structure from motion.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.20076 [pdf, other]

EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model

Authors: Yuxuan Zhang, Tianheng Cheng, Rui Hu, Lei Liu, Heng Liu, Longjin Ran, Xiaoxin Chen, Wenyu Liu, Xinggang Wang

Abstract: Segment Anything Model (SAM) has attracted widespread attention for its superior interactive segmentation capabilities with visual prompts while lacking further exploration of text prompts. In this paper, we empirically investigate what text prompt encoders (e.g., CLIP or LLM) are good for adapting SAM for referring expression segmentation and introduce the Early Vision-language Fusion-based SAM (EVF-SAM). EVF-SAM is a simple yet effective referring segmentation method which exploits multimodal prompts (i.e., image and text) and comprises a pre-trained vision-language model to generate referring prompts and a SAM model for segmentation. Surprisingly, we observe that: (1) multimodal prompts and (2) vision-language models with early fusion (e.g., BEIT-3) are beneficial for prompting SAM for accurate referring segmentation. Our experiments show that the proposed EVF-SAM based on BEIT-3 can obtain state-of-the-art performance on RefCOCO/+/g for referring expression segmentation and demonstrate the superiority of prompting SAM with early vision-language fusion. In addition, the proposed EVF-SAM with 1.32B parameters achieves remarkably higher performance while reducing nearly 82% of parameters compared to previous SAM methods based on large multimodal models.

Submitted 3 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

Comments: Preprint. Code and models are available at: https://github.com/hustvl/EVF-SAM

arXiv:2406.18045 [pdf, other]

PharmaGPT: Domain-Specific Large Language Models for Bio-Pharmaceutical and Chemistry

Authors: Linqing Chen, Weilei Wang, Zilong Bai, Peng Xu, Yan Fang, Jie Fang, Wentao Wu, Lizhi Zhou, Ruiji Zhang, Yubin Xia, Chaobo Xu, Ran Hu, Licong Xu, Qijun Cai, Haoran Hua, Jing Sun, Jin Liu, Tian Qiu, Haowen Liu, Meng Hu, Xiuwen Li, Fei Gao, Yufu Wang, Lin Tie, Chaochao Wang, et al. (11 additional authors not shown)

Abstract: Large language models (LLMs) have revolutionized Natural Language Processing (NLP) by minimizing the need for complex feature engineering. However, the application of LLMs in specialized domains like biopharmaceuticals and chemistry remains largely unexplored. These fields are characterized by intricate terminologies, specialized knowledge, and a high demand for precision, areas where general purpose LLMs often fall short. In this study, we introduce PharmGPT, a suite of multilingual LLMs with 13 billion and 70 billion parameters, specifically trained on a comprehensive corpus of hundreds of billions of tokens tailored to the Bio-Pharmaceutical and Chemical sectors. Our evaluation shows that PharmGPT matches or surpasses existing general models on key benchmarks, such as NAPLEX, demonstrating its exceptional capability in domain-specific tasks. This advancement establishes a new benchmark for LLMs in the Bio-Pharmaceutical and Chemical fields, addressing the existing gap in specialized language modeling. Furthermore, this suggests a promising path for enhanced research and development in these specialized areas, paving the way for more precise and effective applications of NLP in specialized domains.

Submitted 3 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.17050 [pdf, other]

Pressure-induced exciton formation and superconductivity in platinum-based mineral Sperrylite

Authors: Limin Wang, Rongwei Hu, Yash Anand, Shanta R. Saha, Jason R. Jeffries, Johnpierre Paglione

Abstract: We report a comprehensive study of Sperrylite (PtAs2), the main platinum source in natural minerals, as a function of applied pressures up to 150 GPa. While no structural phase transition was detected from pressure-dependent X-ray measurements, the unit cell volume shrinks monotonically with pressure following the third-order Birch-Murnaghan equation of state. The mildly semiconducting behavior found in pure synthesized crystals at ambient pressures becomes more insulating upon increasing applied pressure before metalizing at higher pressures, giving way to the appearance of an abrupt decrease in resistance near 3 K at pressures above 92 GPa consistent with the onset of a superconducting phase. The pressure evolution of the calculated electronic band structure reveals the same physical trend as our transport measurements, with a non-monotonic evolution explained by a hole band that is pushed below the Fermi energy and an electron band that approaches it as a function of pressure, both reaching a touching point suggestive of an excitonic state. A topological Lifshitz transition of the electronic structure and an increase in the density of states may naturally explain the onset of superconductivity in this material.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 7 pages, 7 figures
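
For reference, the third-order Birch-Murnaghan equation of state named in the abstract is usually written as below. This is the standard textbook form, not something quoted from the paper: the symbols $V_0$ (zero-pressure unit cell volume), $B_0$ (zero-pressure bulk modulus) and $B_0'$ (its pressure derivative) are the conventional fit parameters, and the abstract does not report their fitted values.

```latex
P(V) = \frac{3 B_0}{2}
\left[ \left(\frac{V_0}{V}\right)^{7/3} - \left(\frac{V_0}{V}\right)^{5/3} \right]
\left\{ 1 + \frac{3}{4}\left(B_0' - 4\right)
\left[ \left(\frac{V_0}{V}\right)^{2/3} - 1 \right] \right\}
```

Setting $B_0' = 4$ removes the term in braces and recovers the second-order form.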

arXiv:2406.12878 [pdf, other]

Beam test results of the prototype of the multi wire drift chamber for the CSR external-target experiment

Authors: Zhi Qin, Zhoubo He, Zhe Cao, Tao Chen, Zhi Deng, Limin Duan, Dong Guo, Rongjiang Hu, Jie Kong, Canwen Liu, Peng Ma, Xianglun Wei, Shihai Wen, Xiangjie Wen, Junwei Yan, Herun Yang, Zuoqiao Yang, Yuhong Yu, Zhigang Xiao

Abstract: The half-size prototype of the multi wire drift chamber (MWDC) for the cooling storage ring (CSR) external-target experiment (CEE) was assembled and tested in 350 MeV/u Kr+Fe reactions on the heavy ion research facility in Lanzhou (HIRFL). The prototype consists of 6 sense layers, where the sense wires are stretched in three directions X, U and V, meeting $0^\circ$, $30^\circ$ and $-30^\circ$ with respect to the vertical axis, respectively. The sensitive area of the prototype is $76 {\rm cm} \times 76 {\rm cm}$. The amplified and shaped signals from the anode wires are digitized in a serial capacity array. Being operated with 1500 V high voltage on the anode wires, the efficiency for each layer is beyond 95\%. The tracking residual is about $301 \pm 2\ \mathrm{\mu m}$. The performance meets the requirements of CEE.

Submitted 15 May, 2024; originally announced June 2024.

arXiv:2406.11145 [pdf, other]

Federated Face Forgery Detection Learning with Personalized Representation

Authors: Decheng Liu, Zhan Dang, Chunlei Peng, Nannan Wang, Ruimin Hu, Xinbo Gao

Abstract: Deep generator technology can produce high-quality fake videos that are indistinguishable from real ones, posing a serious social threat. Traditional forgery detection methods train directly on centralized data and do not consider information sharing in non-public video data scenarios or data privacy. Naturally, the federated learning strategy can be applied for privacy protection, which aggregates model parameters of clients but not original data. However, simple federated learning can't achieve satisfactory performance because of poor generalization capabilities for the real hybrid-domain forgery dataset. To solve the problem, the paper proposes a novel federated face forgery detection learning with personalized representation. The designed Personalized Forgery Representation Learning aims to learn the personalized representation of each client to improve the detection performance of individual client models. In addition, a personalized federated learning training strategy is utilized to update the parameters of the distributed detection model. Here collaborative training is conducted on multiple distributed client devices, and shared representations of these client models are uploaded to the server side for aggregation. Experiments on several public face forgery detection datasets demonstrate the superior performance of the proposed algorithm compared with state-of-the-art methods. The code is available at https://github.com/GANG370/PFR-Forgery.

Submitted 16 June, 2024; originally announced June 2024.

Comments: The code is publicly available
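
The aggregation step described in the abstract (shared representations uploaded to a server, personalized parts kept on each client) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the aggregation rule, so a sample-weighted average in the style of FedAvg is assumed, and the names `aggregate_shared`, `federated_round`, and the `shared` / `personal` / `num_samples` fields are hypothetical.

```python
def aggregate_shared(clients):
    """Sample-weighted average of the clients' shared parameters (assumed FedAvg-style rule)."""
    total = sum(c["num_samples"] for c in clients)
    dim = len(clients[0]["shared"])
    aggregated = [0.0] * dim
    for c in clients:
        weight = c["num_samples"] / total
        for i, value in enumerate(c["shared"]):
            aggregated[i] += weight * value
    return aggregated


def federated_round(clients):
    """One communication round: aggregate shared parameters, broadcast them back.

    Each client's "personal" parameters never leave the client and are not modified here.
    """
    aggregated = aggregate_shared(clients)
    for c in clients:
        c["shared"] = list(aggregated)
    return aggregated


if __name__ == "__main__":
    clients = [
        {"shared": [1.0, 2.0], "personal": [0.1], "num_samples": 1},
        {"shared": [3.0, 6.0], "personal": [0.9], "num_samples": 3},
    ]
    print(federated_round(clients))
```

Only the `shared` vectors are averaged, so a client with more local data pulls the global representation further toward its own, while each `personal` vector stays as the client trained it.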

arXiv:2406.10933 [pdf, other]

Improving Adversarial Robustness via Decoupled Visual Representation Masking

Authors: Decheng Liu, Tao Chen, Chunlei Peng, Nannan Wang, Ruimin Hu, Xinbo Gao

Abstract: Deep neural networks are proven to be vulnerable to fine-designed adversarial examples, and adversarial defense algorithms draw more and more attention nowadays. Pre-processing based defense is a major strategy, and learning robust feature representations has also been proven an effective way to boost generalization. However, existing defense works do not consider different depth-level visual features in the training process. In this paper, we first highlight two novel properties of robust features from the feature distribution perspective: 1) Diversity. The robust feature of intra-class samples can maintain appropriate diversity; 2) Discriminability. The robust feature of inter-class samples should ensure adequate separation. We find that state-of-the-art defense methods aim to address both of these mentioned issues well. It motivates us to increase intra-class variance and decrease inter-class discrepancy simultaneously in adversarial training. Specifically, we propose a simple but effective defense based on decoupled visual representation masking. The designed Decoupled Visual Feature Masking (DFM) block can adaptively disentangle visual discriminative features and non-visual features with diverse mask strategies, while the suitable discarding information can disrupt adversarial noise to improve robustness. Our work provides a generic and easy-to-plugin block unit for any former adversarial training algorithm to achieve better protection integrally. Extensive experimental results prove the proposed method can achieve superior performance compared with state-of-the-art defense approaches. The code is publicly available at https://github.com/chenboluo/Adversarial-defense.

Submitted 16 June, 2024; originally announced June 2024.

Comments: The code is publicly available

arXiv:2406.10125 [pdf, other]

MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report

Authors: Zhongyu Yang, Mai Liu, Jinluo Xie, Yueming Zhang, Chen Shen, Wei Shao, Jichao Jiao, Tengfei Xing, Runbo Hu, Pengfei Xu

Abstract: Autonomous driving without high-definition (HD) maps demands a higher level of active scene understanding. In this competition, the organizers provided the multi-perspective camera images and standard-definition (SD) maps to explore the boundaries of scene reasoning capabilities. We found that most existing algorithms construct Bird's Eye View (BEV) features from these multi-perspective images and use multi-task heads to delineate road centerlines, boundary lines, pedestrian crossings, and other areas. However, these algorithms perform poorly at the far end of roads and struggle when the primary subject in the image is occluded. Therefore, in this competition, we not only used multi-perspective images as input but also incorporated SD maps to address this issue. We employed map encoder pre-training to enhance the network's geometric encoding capabilities and utilized YOLOX to improve traffic element detection precision. Additionally, for area detection, we innovatively introduced LDTR and auxiliary tasks to achieve higher precision. As a result, our final OLUS score is 0.58.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.09523 [pdf, other]

Finite-Agent Stochastic Differential Games on Large Graphs: I. The Linear-Quadratic Case

Authors: Ruimeng Hu, Jihao Long, Haosheng Zhou

Abstract: In this paper, we study finite-agent linear-quadratic games on graphs. Specifically, we propose a comprehensive framework that extends the existing literature by incorporating heterogeneous and interpretable player interactions. Compared to previous works, our model offers a more realistic depiction of strategic decision-making processes. For general graphs, we establish the convergence of fictitious play, a widely-used iterative solution method for determining the Nash equilibrium of our proposed game model. Notably, under appropriate conditions, this convergence holds true irrespective of the number of players involved. For vertex-transitive graphs, we develop a semi-explicit characterization of the Nash equilibrium. Through rigorous analysis, we demonstrate the well-posedness of this characterization under certain conditions. We present numerical experiments that validate our theoretical results and provide insights into the intricate relationship between various game dynamics and the underlying graph structure.

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.08836 [pdf, ps, other]

Strong asymptotic convergence of a slowly damped inertial primal-dual dynamical system controlled by a Tikhonov regularization term

Authors: Ting-Ting Zhu, Rong Hu, Ya-Ping Fang

Abstract: We propose a slowly damped inertial primal-dual dynamical system controlled by a Tikhonov regularization term, where the inertial term is introduced only for the primal variable, for the linearly constrained convex optimization problem in a Hilbert space. Under mild conditions on the underlying parameters, by a Lyapunov analysis approach, we prove the strong asymptotic convergence of the trajectory of the proposed dynamic to the minimal norm element of the primal-dual solution set of the problem, along with convergence rate results for the primal-dual gap, the objective residual and the feasibility violation. We perform some numerical experiments to illustrate the theoretical findings.

Submitted 20 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.06563 [pdf, other]

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Authors: Tianwen Wei, Bo Zhu, Liang Zhao, Cheng Cheng, Biye Li, Weiwei Lü, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Liang Zeng, Xiaokun Wang, Yutuan Ma, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

Abstract: In this technical report, we introduce the training methodologies implemented in the development of Skywork-MoE, a high-performance mixture-of-experts (MoE) large language model (LLM) with 146 billion parameters and 16 experts. It is initialized from the pre-existing dense checkpoints of our Skywork-13B model. We explore the comparative effectiveness of upcycling versus training from scratch initializations. Our findings suggest that the choice between these two approaches should consider both the performance of the existing dense checkpoints and the MoE training budget. We highlight two innovative techniques: gating logit normalization, which improves expert diversification, and adaptive auxiliary loss coefficients, allowing for layer-specific adjustment of auxiliary loss coefficients. Our experimental results validate the effectiveness of these methods. Leveraging these techniques and insights, we trained our upcycled Skywork-MoE on a condensed subset of our SkyPile corpus. The evaluation results demonstrate that our model delivers strong performance across a wide range of benchmarks.

Submitted 2 June, 2024; originally announced June 2024.
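
To make the phrase "gating logit normalization" concrete, here is a small sketch of one plausible form of it: the router's per-expert logits are standardized to zero mean and unit variance, multiplied by a scale factor, and only then passed through softmax. The abstract does not spell out the formula, so this form is an assumption, and the names `normalized_gate`, `scale`, and `eps` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_gate(logits, scale=1.0, eps=1e-6):
    """Standardize the gating logits, rescale them, then apply softmax.

    Assumed form: gate = softmax(scale * (z - mean(z)) / (std(z) + eps)).
    A larger `scale` gives a sharper distribution over experts.
    """
    n = len(logits)
    mean = sum(logits) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in logits) / n)
    standardized = [scale * (x - mean) / (std + eps) for x in logits]
    return softmax(standardized)


if __name__ == "__main__":
    raw = [0.10, 0.20, 0.15, 0.05]  # nearly flat router logits for 4 experts
    print(softmax(raw))                      # close to uniform
    print(normalized_gate(raw, scale=2.0))   # noticeably more peaked
```

Because the logits are standardized first, the sharpness of the gate depends on `scale` rather than on the raw magnitude of the router outputs, which is one way a router could be kept from producing near-uniform expert weights.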

arXiv:2406.01955 [pdf, other]

Chemical mapping of temperate sub-Neptune atmospheres: Constraining the deep-interior H2O/H2 using the atmospheric CO2/CH4

Authors: Jeehyun Yang, Renyu Hu

Abstract: Understanding the envelope composition of sub-Neptune-type exoplanets is challenging due to the inherent degeneracy in their interior composition scenarios. Particularly, the H2O/H2 ratio, which can also be expressed as the O/H ratio, in the planetary envelope provides crucial insights into the origin of these exoplanets relative to the ice line during formation. Using self-consistent radiative transfer modeling and a rate-based automatic chemical network generator combined with 1D photochemical kinetic-transport atmospheric modeling, we investigate atmospheres of temperate sub-Neptunes, ranging from H2-dominated to H2O-dominated scenarios with Teq = 250-400 K, using K2-18 b (Teq = 255 K), LP 791-18 c (Teq = 324 K), and TOI-270 d (Teq = 354 K) as examples. Our models indicate that the atmospheric CO2/CH4 ratio can be used to infer the deep-interior H2O/H2 ratio. Applying to recent JWST observations, our findings suggest K2-18 b likely has an interior highly enriched in water (approximately 50% H2O), exceeding the amount of water in a 100x solar metallicity scenario and suggesting a formation history that involved substantial accretion of ices. In contrast, TOI-270 d has an interior composition of approximately 25% H2O, which is comparable to the conventional metallicity framework with a metallicity higher than 100x solar metallicity. Furthermore, our models identify carbonyl sulfide (OCS) and sulfur dioxide (SO2) as strong indicators of at least a 10% water-rich envelope in temperate sub-Neptunes. These results provide a method to delineate the internal composition and formation mechanisms of sub-Neptunes with Teq < ~500 K via atmospheric characterization through transmission spectroscopy.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 15 pages, 5 figures, submitted to ApJL

arXiv:2406.01069 [pdf, other]

UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment

Authors: Hantao Zhou, Longxiang Tang, Rui Yang, Guanyi Qin, Yan Zhang, Runze Hu, Xiu Li

Abstract: Image Quality Assessment (IQA) and Image Aesthetic Assessment (IAA) aim to simulate human subjective perception of image visual quality and aesthetic appeal. Existing methods typically address these tasks independently due to distinct learning objectives. However, they neglect the underlying interconnectedness of both tasks, which hinders the learning of task-agnostic shared representations for human subjective perception. To confront this challenge, we propose Unified vision-language pre-training of Quality and Aesthetics (UniQA), to learn general perceptions of two tasks, thereby benefiting them simultaneously. Addressing the absence of text in the IQA datasets and the presence of textual noise in the IAA datasets, (1) we utilize multimodal large language models (MLLMs) to generate high-quality text descriptions; (2) the generated text for IAA serves as metadata to purify noisy IAA data. To effectively adapt the pre-trained UniQA to downstream tasks, we further propose a lightweight adapter that utilizes versatile cues to fully exploit the extensive knowledge of the pre-trained model. Extensive experiments demonstrate that our approach attains a new state-of-the-art performance on both IQA and IAA tasks, while concurrently showcasing exceptional zero-shot and few-label image assessment capabilities. The source code will be available at https://github.com/zht8506/UniQA.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.01007 [pdf, other]

Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay

Authors: Daya Bay collaboration, F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, J. Cheng, Y. -C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, et al. (177 additional authors not shown)

Abstract: This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive region, the relative $\overline{\nu}_{e}$ rates and energy spectra variation among the near and far detectors gives $\sin^2 2\theta_{13} = 0.0759_{-0.0049}^{+0.0050}$ and $\Delta m^2_{32} = (2.72^{+0.14}_{-0.15})\times10^{-3}$ eV$^2$ assuming the normal neutrino mass ordering, and $\Delta m^2_{32} = (-2.83^{+0.15}_{-0.14})\times10^{-3}$ eV$^2$ for the inverted neutrino mass ordering. This estimate of $\sin^2 2\theta_{13}$ is consistent with and essentially independent from the one obtained using the capture-on-gadolinium sample at Daya Bay. The combination of these two results yields $\sin^2 2\theta_{13} = 0.0833\pm0.0022$, which represents an 8% relative improvement in precision regarding the Daya Bay full 3158-day capture-on-gadolinium result.

Submitted 3 June, 2024; originally announced June 2024.
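
As background for the terms "oscillation amplitude" and "frequency" in the abstract, the standard three-flavor survival probability for reactor antineutrinos is sketched below. It is the textbook expression rather than a formula quoted from the Letter, and the baseline $L$, antineutrino energy $E$, and the solar parameters $\theta_{12}$ and $\Delta m^2_{21}$ are symbols introduced here for illustration.

```latex
P_{\overline{\nu}_e \to \overline{\nu}_e} = 1
 - \cos^4\theta_{13}\,\sin^2 2\theta_{12}\,\sin^2\Delta_{21}
 - \sin^2 2\theta_{13}\left(\cos^2\theta_{12}\,\sin^2\Delta_{31}
 + \sin^2\theta_{12}\,\sin^2\Delta_{32}\right),
\qquad
\Delta_{ij} = 1.267\,\Delta m^2_{ij}\,[\mathrm{eV}^2]\,\frac{L\,[\mathrm{m}]}{E\,[\mathrm{MeV}]}
```

In this form $\sin^2 2\theta_{13}$ sets the depth of the short-baseline disappearance (the amplitude), while $\Delta m^2_{32}$, together with the closely related $\Delta m^2_{31}$, sets how quickly the deficit varies with $L/E$ (the frequency).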

arXiv:2406.00794 [pdf, other]

doi 10.1038/s41550-024-02271-2

Detection of an Earth-sized exoplanet orbiting the nearby ultracool dwarf star SPECULOOS-3

Authors: Michaël Gillon, Peter P. Pedersen, Benjamin V. Rackham, Georgina Dransfield, Elsa Ducrot, Khalid Barkaoui, Artem Y. Burdanov, Urs Schroffenegger, Yilen Gómez Maqueo Chew, Susan M. Lederer, Roi Alonso, Adam J. Burgasser, Steve B. Howell, Norio Narita, Julien de Wit, Brice-Olivier Demory, Didier Queloz, Amaury H. M. J. Triaud, Laetitia Delrez, Emmanuël Jehin, Matthew J. Hooton, Lionel J. Garcia, Clàudia Jano Muñoz, Catriona A. Murray, Francisco J. Pozuelos, et al. (59 additional authors not shown)

Abstract: Located at the bottom of the main sequence, ultracool dwarf stars are widespread in the solar neighbourhood. Nevertheless, their extremely low luminosity has left their planetary population largely unexplored, and only one of them, TRAPPIST-1, has so far been found to host a transiting planetary system. In this context, we present the SPECULOOS project's detection of an Earth-sized planet in a 17 h orbit around an ultracool dwarf of M6.5 spectral type located 16.8 pc away. The planet's high irradiation (16 times that of Earth) combined with the infrared luminosity and Jupiter-like size of its host star make it one of the most promising rocky exoplanet targets for detailed emission spectroscopy characterization with JWST. Indeed, our sensitivity study shows that just ten secondary eclipse observations with the Mid-InfraRed Instrument/Low-Resolution Spectrometer on board JWST should provide strong constraints on its atmospheric composition and/or surface mineralogy.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2406.00605 [pdf, other]

LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models

Authors: Liang Zhao, Tianwen Wei, Liang Zeng, Cheng Cheng, Liu Yang, Peng Cheng, Lijie Wang, Chenxia Li, Xuejie Wu, Bo Zhu, Yimeng Gan, Rui Hu, Shuicheng Yan, Han Fang, Yahui Zhou

Abstract: We introduce LongSkywork, a long-context Large Language Model (LLM) capable of processing up to 200,000 tokens. We provide a training recipe for efficiently extending context length of LLMs. We identify that the critical element in enhancing long-context processing capability is to incorporate a long-context SFT stage following the standard SFT stage. A mere 200 iterations can convert the standard SFT model into a long-context model. To reduce the effort in collecting and annotating data for long-context language modeling, we develop two novel methods for creating synthetic data. These methods are applied during the continual pretraining phase as well as the Supervised Fine-Tuning (SFT) phase, greatly enhancing the training efficiency of our long-context LLMs. Our findings suggest that synthetic long-context SFT data can surpass the performance of data curated by humans to some extent. LongSkywork achieves outstanding performance on a variety of long-context benchmarks. In the Needle test, a benchmark for long-context information retrieval, our models achieved perfect accuracy across multiple context spans. Moreover, in realistic application scenarios, LongSkywork-13B demonstrates performance on par with Claude2.1, the leading long-context model, underscoring the effectiveness of our proposed methods.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.19740 [pdf, other]

PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations

Authors: Jiatong Li, Renjun Hu, Kunzhe Huang, Yan Zhuang, Qi Liu, Mengxiao Zhu, Xing Shi, Wei Lin

Abstract: Expert-designed close-ended benchmarks serve as vital tools in assessing the knowledge capacity of large language models (LLMs). Despite their widespread use, concerns have mounted regarding their reliability due to limited test scenarios and an unavoidable risk of data contamination. To rectify this, we present PertEval, a toolkit devised for in-depth probing of LLMs' knowledge capacity through knowledge-invariant perturbations. These perturbations employ human-like restatement techniques to generate on-the-fly test samples from static benchmarks, meticulously retaining knowledge-critical content while altering irrelevant details. Our toolkit further includes a suite of transition analyses that compare performance on raw vs. perturbed test sets to precisely assess LLMs' genuine knowledge capacity. Six state-of-the-art LLMs are re-evaluated using PertEval. Results reveal significantly inflated performance of the LLMs on raw benchmarks, including an absolute 21% overestimation for GPT-4. Additionally, through a nuanced response pattern analysis, we discover that PertEval retains LLMs' uncertainty to specious knowledge, potentially being resolved through rote memorization and leading to inflated performance. We also find that the detailed transition analyses by PertEval could illuminate weaknesses in existing LLMs' knowledge mastery and guide the development of refinement.
Given these insights, we posit that PertEval can act as an essential tool that, when applied alongside any close-ended benchmark, unveils the true knowledge capacity of LLMs, marking a significant step toward more trustworthy LLM evaluation.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 23 pages, 12 figures, 10 tables

arXiv:2405.19433 [pdf, other]

Beyond Agreement: Diagnosing the Rationale Alignment of Automated Essay Scoring Methods based on Linguistically-informed Counterfactuals

Authors: Yupei Wang, Renfen Hu, Zhe Zhao

Abstract: While current automated essay scoring (AES) methods show high agreement with human raters, their scoring mechanisms are not fully explored. Our proposed method, using counterfactual intervention assisted by Large Language Models (LLMs), reveals that when scoring essays, BERT-like models primarily focus on sentence-level features, while LLMs are attuned to conventions, language complexity, as well as organization, indicating a more comprehensive alignment with scoring rubrics. Moreover, LLMs can discern counterfactual interventions during feedback. Our approach improves understanding of neural AES methods and can also apply to other domains seeking transparency in model-driven decisions. The codes and data will be released at GitHub.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.13325 [pdf, other]

DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying

Authors: Guanghui Wang, Dexi Liu, Jian-Yun Nie, Qizhi Wan, Rong Hu, Xi** Liu, Wanlong Liu, Jiaming Liu

Abstract: Recent advancements in event argument extraction (EAE) involve incorporating useful auxiliary information into models during training and inference, such as retrieved instances and event templates. These methods face two challenges: (1) the retrieval results may be irrelevant and (2) templates are developed independently for each event without considering their possible relationship. In this work, we propose DEGAP to address these challenges through simple yet effective components: dual prefixes, i.e. learnable prompt vectors, where the instance-oriented prefix and template-oriented prefix are trained to learn information from different event instances and templates. Additionally, we propose an event-guided adaptive gating mechanism, which can adaptively leverage possible connections between different events and thus capture relevant information from the prefix. Finally, these event-guided prefixes provide relevant information as cues to the EAE model without retrieval. Extensive experiments demonstrate that our method achieves new state-of-the-art performance on four datasets (ACE05, RAMS, WIKIEVENTS, and MLEE). Further analysis shows the impact of different components.

Submitted 15 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.08541 [pdf, other]

A Determination of the Local Gravitational Acceleration for the Tsinghua Tabletop Kibble Balance

Authors: Weibo Liu, Nanjia Li, Yongchao Ma, Ruo Hu, Shuqing Wu, Wei Zhao, Songling Huang, Shisong Li

Abstract: The Kibble balance requires a measurement of the local gravitational acceleration, $g$, with a typical relative measurement uncertainty of $10^{-9}$. In this paper, the determination of $g$ for the Tsinghua tabletop Kibble balance is presented. A polynomial fitting method is proposed for blind transfers of the absolute gravitational acceleration using relative gravimeters, showing agreement with the value obtained by the tide correction within a few parts in $10^{9}$. Horizontal and vertical gravity gradients are extracted by mapping the gravity distribution at different heights. The self-attraction effect of major components in the experiment, as well as some time-varying systematic effects, are modeled. The final determination of the gravitational acceleration at the mass position, with an uncertainty of 5.4 $\mu$Gal ($k=2$), is achieved for the Tsinghua tabletop Kibble balance experiment.

Submitted 20 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: 11 figures, submitted to IEEE Trans. Instrum. Meas

arXiv:2405.08130 [pdf, other]

Collisions of Burgers Bores with Nonlinear Waves

Authors: Albert. Dombret, Darryl D. Holm, Ruiao Hu, Oliver D. Street, Hanchun Wang

Abstract: This paper treats nonlinear wave current interactions in their simplest form, as an overtaking collision. In one spatial dimension, the paper investigates the collision interaction formulated as an initial value problem of a Burgers bore overtaking solutions of two types of nonlinear wave equations, Korteweg de Vries (KdV) and nonlinear Schrodinger (NLS). The bore wave state arising after the overtaking Burgers-KdV collision in numerical simulations is found to depend qualitatively on the balance between nonlinearity and dispersion in the KdV equation. The Burgers-KdV system is also made stochastic by following the stochastic advection by Lie transport approach (SALT).

Submitted 13 May, 2024; originally announced May 2024.

Comments: 16 pages, 4 figures, 1st version

arXiv:2405.06332 [pdf, ps, other]

Solving maximally comonotone inclusion problems via an implicit Newton-like inertial dynamical system and its discretization

Authors: Z. Z. Tan, R. Hu, Y. P. Fang

Abstract: This paper deals with an implicit Newton-like inertial dynamical system governed by a maximally comonotone inclusion problem in a Hilbert space. Under suitable conditions, we establish not only pointwise estimates and integral estimates for the velocity and the value of the associated Yosida regularization operator along the trajectory of the system, but also the weak convergence of the trajectory to a zero of the maximally comonotone operator. Moreover, a new inertial algorithm is developed via a time discretization of the proposed system. Our analysis reveals that the resulting discrete algorithm exhibits fast convergence properties matching the ones of the continuous time counterpart. Finally, the theoretical results are illustrated by numerical experiments.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.05481 [pdf, other]

Achieving millisecond coherence fluxonium through overlap Josephson junctions

Authors: Fei Wang, Kannan Lu, Huijuan Zhan, Lu Ma, Feng Wu, Hantao Sun, Hao Deng, Yang Bai, Feng Bao, Xu Chang, Ran Gao, Xun Gao, Guicheng Gong, Lijuan Hu, Ruizi Hu, Honghong Ji, Xizheng Ma, Liyong Mao, Zhijun Song, Chengchun Tang, Hongcheng Wang, Tenghui Wang, Ziang Wang, Tian Xia, Hongxin Xu , et al. (10 additional authors not shown)

Abstract: Fluxonium qubits are recognized for their high coherence times and high operation fidelities, attributed to their unique design incorporating over 100 Josephson junctions per superconducting loop. However, this complexity poses significant fabrication challenges, particularly in achieving high yield and junction uniformity with traditional methods. Here, we introduce an overlap process for Josephson junction fabrication that achieves nearly 100% yield and maintains uniformity across a 2-inch wafer with less than 5% variation for the phase slip junction and less than 2% for the junction array. Our compact junction array design facilitates fluxonium qubits with energy relaxation times exceeding 1 millisecond at the flux frustration point, demonstrating consistency with state-of-the-art dielectric loss tangents and flux noise across multiple devices. This work suggests the scalability of high coherence fluxonium processors using CMOS-compatible processes, marking a significant step towards practical quantum computing.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.04744 [pdf]

doi 10.1038/s41586-024-07432-x

A secondary atmosphere on the rocky exoplanet 55 Cancri e

Authors: Renyu Hu, Aaron Bello-Arufe, Michael Zhang, Kimberly Paragas, Mantas Zilinskas, Christiaan van Buchem, Michael Bess, Jayshil Patel, Yuichi Ito, Mario Damiano, Markus Scheucher, Apurva V. Oza, Heather A. Knutson, Yamila Miguel, Diana Dragomir, Alexis Brandeker, Brice-Olivier Demory

Abstract: Characterizing rocky exoplanets is a central endeavor of astronomy, and yet the search for atmospheres on rocky exoplanets has hitherto resulted in either tight upper limits on the atmospheric mass or inconclusive results. The 1.95-REarth and 8.8-MEarth planet 55 Cnc e, with a predominantly rocky composition and an equilibrium temperature of ~2000 K, may have a volatile envelope (containing molecules made from a combination of C, H, O, N, S, and P elements) that accounts for up to a few percent of its radius. The planet has been observed extensively with transmission spectroscopy, and its thermal emission has been measured in broad photometric bands. These observations disfavor a primordial H2/He-dominated atmosphere but cannot conclusively determine whether the planet has a secondary atmosphere. Here we report a thermal emission spectrum of the planet obtained by JWST's NIRCam and MIRI instruments from 4 to 12 μm. The measurements rule out the scenario where the planet is a lava world shrouded by a tenuous atmosphere made of vaporized rock, and indicate a bona fide volatile atmosphere likely rich in CO2 or CO. This atmosphere can be outgassed from and sustained by a magma ocean.

Submitted 7 May, 2024; originally announced May 2024.

Comments: Published online in Nature on May 8, 2024. https://www.nature.com/articles/s41586-024-07432-x. Authors' preprint

arXiv:2405.03485 [pdf, other]

doi 10.1145/3641519.3657422

LGTM: Local-to-Global Text-Driven Human Motion Diffusion Model

Authors: Haowen Sun, Ruikun Zheng, Haibin Huang, Chongyang Ma, Hui Huang, Ruizhen Hu

Abstract: In this paper, we introduce LGTM, a novel Local-to-Global pipeline for Text-to-Motion generation. LGTM utilizes a diffusion-based architecture and aims to address the challenge of accurately translating textual descriptions into semantically coherent human motion in computer animation. Specifically, traditional methods often struggle with semantic discrepancies, particularly in aligning specific motions to the correct body parts. To address this issue, we propose a two-stage pipeline: it first employs large language models (LLMs) to decompose global motion descriptions into part-specific narratives, which are then processed by independent body-part motion encoders to ensure precise local semantic alignment. Finally, an attention-based full-body optimizer refines the motion generation results and guarantees the overall coherence. Our experiments demonstrate that LGTM gains significant improvements in generating locally accurate, semantically-aligned human motion, marking a notable advancement in text-to-motion applications. Code and data for this paper are available at https://github.com/L-Sun/LGTM

Submitted 6 May, 2024; originally announced May 2024.

Comments: 9 pages,7 figures, SIGGRAPH 2024

arXiv:2405.03405 [pdf, ps, other]

Hall effect on the joint cascades of magnetic energy and helicity in helical magnetohydrodynamic turbulence

Authors: Running Hu, **-Han Xie, Xinliang Li, Chang** Yu, Yuan Hu, Jianchun Wang, Shiyi Chen

Abstract: Helical magnetohydrodynamic turbulence with Hall effects is ubiquitous in heliophysics and plasma physics, such as star formation and solar activities, and its intrinsic mechanisms are still not clearly explained. Direct numerical simulations reveal that when the forcing scale is comparable to the ion inertial scale, Hall effects induce remarkable cross helicity. It then suppresses the inverse cascade efficiency, leading to the accumulation of large-scale magnetic energy and helicity. The process is accompanied by the breaking of current sheets via filaments along magnetic fields. Using the Ulysses data, the numerical findings are separately confirmed. These results suggest a novel mechanism wherein small-scale Hall effects could strongly affect large-scale magnetic fields through cross helicity.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 5 figures, 6 pages

arXiv:2405.03221 [pdf, other]

Spatial and Surface Correspondence Field for Interaction Transfer

Authors: Zeyu Huang, Honghao Xu, Haibin Huang, Chongyang Ma, Hui Huang, Ruizhen Hu

Abstract: In this paper, we introduce a new method for the task of interaction transfer. Given an example interaction between a source object and an agent, our method can automatically infer both surface and spatial relationships for the agent and target objects within the same category, yielding more accurate and valid transfers. Specifically, our method characterizes the example interaction using a combined spatial and surface representation. We correspond the agent points and object points related to the representation to the target object space using a learned spatial and surface correspondence field, which represents objects as deformed and rotated signed distance fields. With the corresponded points, an optimization is performed under the constraints of our spatial and surface interaction representation and additional regularization. Experiments conducted on human-chair and hand-mug interaction transfer tasks show that our approach can handle larger geometry and topology variations between source and target shapes, significantly outperforming state-of-the-art methods.

Submitted 6 May, 2024; originally announced May 2024.

Comments: Accepted to SIGGRAPH 2024, project page at https://vcc.tech/research/2024/InterTransfer

arXiv:2405.02152 [pdf, ps, other]

On the Three-dimensional Nernst-Planck-Boussinesq System

Authors: Elie Abdo, Ruimeng Hu, Quyuan Lin

Abstract: In this paper, we analyze a three-dimensional Nernst-Planck-Boussinesq (NPB) system that describes ionic electrodiffusion in an incompressible viscous fluid. This new model incorporates variational temperature and is forced by buoyancy force stemming from temperature and salinity fluctuations, enhancing its generality and realism. The electromigration term in the NPB system displays a complex nonlinear structure influenced by the reciprocal of the temperature that distinguishes its mathematical aspects from other electrodiffusion models studied in the literature. We address the global existence of weak solutions to the NPB system on the three-dimensional torus for large initial data. In addition, we study the long-time dynamics of these weak solutions and the associated relative entropies and establish their exponential decay in time to steady states.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 28 pages

arXiv:2405.01258 [pdf, other]

Towards Consistent Object Detection via LiDAR-Camera Synergy

Authors: Kai Luo, Hao Wu, Kefu Yi, Kailun Yang, Wei Hao, Rongdong Hu

Abstract: As human-machine interaction continues to evolve, the capacity for environmental perception is becoming increasingly crucial. Integrating the two most common types of sensory data, images and point clouds, can enhance detection accuracy. However, no model currently exists that can simultaneously detect an object's position in both point clouds and images and ascertain their corresponding relationship. This information is invaluable for human-machine interactions, offering new possibilities for their enhancement. In light of this, this paper introduces an end-to-end Consistency Object Detection (COD) algorithm framework that requires only a single forward inference to simultaneously obtain an object's position in both point clouds and images and establish their correlation. Furthermore, to assess the accuracy of the object correlation between point clouds and images, this paper proposes a new evaluation metric, Consistency Precision (CP). To verify the effectiveness of the proposed framework, an extensive set of experiments has been conducted on the KITTI and DAIR-V2X datasets. The study also explored how the proposed consistency detection method performs on images when the calibration parameters between images and point clouds are disturbed, compared to existing post-processing methods. The experimental results demonstrate that the proposed method exhibits excellent detection performance and robustness, achieving end-to-end consistency detection. The source code will be made publicly available at https://github.com/xifen523/COD.

Submitted 2 May, 2024; originally announced May 2024.

Comments: The source code will be made publicly available at https://github.com/xifen523/COD

arXiv:2404.17569 [pdf, other]

MaPa: Text-driven Photorealistic Material Painting for 3D Shapes

Authors: Shangzhan Zhang, Sida Peng, Tao Xu, Yuanbo Yang, Tianrun Chen, Nan Xue, Yujun Shen, Hujun Bao, Ruizhen Hu, Xiaowei Zhou

Abstract: This paper aims to generate materials for 3D meshes from text descriptions. Unlike existing methods that synthesize texture maps, we propose to generate segment-wise procedural material graphs as the appearance representation, which supports high-quality rendering and provides substantial flexibility in editing. Instead of relying on extensive paired data, i.e., 3D meshes with material graphs and corresponding text descriptions, to train a material graph generative model, we propose to leverage the pre-trained 2D diffusion model as a bridge to connect the text and material graphs. Specifically, our approach decomposes a shape into a set of segments and designs a segment-controlled diffusion model to synthesize 2D images that are aligned with mesh parts. Based on generated images, we initialize parameters of material graphs and fine-tune them through the differentiable rendering module to produce materials in accordance with the textual description. Extensive experiments demonstrate the superior performance of our framework in photorealism, resolution, and editability over existing methods. Project page: https://zju3dv.github.io/MaPa

Submitted 25 June, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

Comments: SIGGRAPH 2024. Project page: https://zju3dv.github.io/MaPa

arXiv:2404.15596 [pdf, other]

VulEval: Towards Repository-Level Evaluation of Software Vulnerability Detection

Authors: Xin-Cheng Wen, Xinchen Wang, Yujia Chen, Ruida Hu, David Lo, Cuiyun Gao

Abstract: Deep Learning (DL)-based methods have proven to be effective for software vulnerability detection, with a potential for substantial productivity enhancements for detecting vulnerabilities. Current methods mainly focus on detecting single functions (i.e., intra-procedural vulnerabilities), ignoring the more complex inter-procedural vulnerability detection scenarios in practice. For example, developers routinely engage with program analysis to detect vulnerabilities that span multiple functions within repositories. In addition, the widely-used benchmark datasets generally contain only intra-procedural vulnerabilities, leaving the assessment of inter-procedural vulnerability detection capabilities unexplored. To mitigate the issues, we propose a repository-level evaluation system, named VulEval, aiming at evaluating the detection performance of inter- and intra-procedural vulnerabilities simultaneously. Specifically, VulEval consists of three interconnected evaluation tasks: (1) Function-Level Vulnerability Detection, aiming at detecting intra-procedural vulnerability given a code snippet; (2) Vulnerability-Related Dependency Prediction, aiming at retrieving the most relevant dependencies from call graphs for providing developers with explanations about the vulnerabilities; and (3) Repository-Level Vulnerability Detection, aiming at detecting inter-procedural vulnerabilities by combining with the dependencies identified in the second task.
VulEval also consists of a large-scale dataset, with a total of 4,196 CVE entries, 232,239 functions, and corresponding 4,699 repository-level source code in C/C++ programming languages. Our analysis highlights the current progress and future directions for software vulnerability detection.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 12 pages

arXiv:2404.14949 [pdf, other]

Multi-Modal Prompt Learning on Blind Image Quality Assessment

Authors: Wensheng Pan, Timin Gao, Yan Zhang, Runze Hu, Xiawu Zheng, Enwei Zhang, Yuting Gao, Yutao Liu, Yunhang Shen, Ke Li, Shengchuan Zhang, Liujuan Cao, Rongrong Ji

Abstract: Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly. Currently, leveraging semantic information to enhance IQA is a crucial research direction. Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness. However, the generalist nature of these pre-trained Vision-Language (VL) models often renders them suboptimal for IQA-specific tasks. Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings. Existing prompt-based VL models overly focus on incremental semantic information from text, neglecting the rich insights available from visual data analysis. This imbalance limits their performance improvements in IQA tasks. This paper introduces an innovative multi-modal prompt-based methodology for IQA. Our approach employs carefully crafted prompts that synergistically mine incremental semantic information from both visual and linguistic data. Specifically, in the visual branch, we introduce a multi-layer prompt structure to enhance the VL model's adaptability. In the text branch, we deploy a dual-prompt scheme that steers the model to recognize and differentiate between scene category and distortion type, thereby refining the model's capacity to assess image quality.
Our experimental findings underscore the effectiveness of our method over existing Blind Image Quality Assessment (BIQA) approaches. Notably, it demonstrates competitive performance across various datasets. Our method achieves Spearman Rank Correlation Coefficient (SRCC) values of 0.961 (surpassing 0.946 in CSIQ) and 0.941 (exceeding 0.930 in KADID), illustrating its robustness and accuracy in diverse contexts.

Submitted 18 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.14853 [pdf, ps, other]

Fast convergence rates and trajectory convergence of a Tikhonov regularized inertial primal-dual dynamical system with time scaling and vanishing damping

Authors: Ting-Ting Zhu, Rong Hu, Ya-** Fang

Abstract: A Tikhonov regularized inertial primal-dual dynamical system with time scaling and vanishing damping is proposed for solving a linearly constrained convex optimization problem in Hilbert spaces. The system under consideration consists of two coupled second order differential equations and its convergence properties depend upon the decaying speed of the product of the time scaling parameter and the Tikhonov regularization parameter (named the rescaled regularization parameter) to zero. When the rescaled regularization parameter converges rapidly to zero, the system enjoys fast convergence rates of the primal-dual gap, the feasibility violation, the objective residual, and the gradient norm of the objective function along the trajectory, and the weak convergence of the trajectory to a primal-dual solution of the linearly constrained convex optimization problem. When the rescaled regularization parameter converges slowly to zero, the generated primal trajectory converges strongly to the minimal norm solution of the problem under suitable conditions. Finally, numerical experiments are performed to illustrate the theoretical findings.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.13279 [pdf, other]

Backdoor Attacks and Defenses on Semantic-Symbol Reconstruction in Semantic Communications

Authors: Yuan Zhou, Rose Qingyang Hu, Yi Qian

Abstract: Semantic communication is of crucial importance for the next-generation wireless communication networks. The existing works have developed semantic communication frameworks based on deep learning. However, systems powered by deep learning are vulnerable to threats such as backdoor attacks and adversarial attacks. This paper delves into backdoor attacks targeting deep learning-enabled semantic communication systems. Since current works on backdoor attacks are not tailored for semantic communication scenarios, a new backdoor attack paradigm on semantic symbols (BASS) is introduced, based on which the corresponding defense measures are designed. Specifically, a training framework is proposed to prevent BASS. Additionally, reverse engineering-based and pruning-based defense strategies are designed to protect against backdoor attacks in semantic communication. Simulation results demonstrate the effectiveness of both the proposed attack paradigm and the defense strategies.

Submitted 20 April, 2024; originally announced April 2024.

Comments: This paper has been accepted by IEEE ICC 2024

arXiv:2404.11967 [pdf, other]

Multi-Agent Relative Investment Games in a Jump Diffusion Market with Deep Reinforcement Learning Algorithm

Authors: Liwei Lu, Ruimeng Hu, Xu Yang, Yi Zhu

Abstract: This paper focuses on multi-agent stochastic differential games for jump-diffusion systems. On one hand, we study the multi-agent game for optimal investment in a jump-diffusion market. We derive constant Nash equilibria and provide sufficient conditions for their existence and uniqueness for exponential, power, and logarithmic utilities, respectively. On the other hand, we introduce a computational framework based on the actor-critic method in deep reinforcement learning to solve the stochastic control problem with jumps. We extend this algorithm to address the multi-agent game with jumps and utilize parallel computing to enhance computational efficiency. We present numerical examples of the Merton problem with jumps, linear quadratic regulators, and the optimal investment game under various settings to demonstrate the accuracy, efficiency, and robustness of the proposed method. In particular, neural network solutions numerically converge to the derived constant Nash equilibrium for the multi-agent game.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.11035 [pdf, other]

Approximate Wireless Communication for Lossy Gradient Updates in IoT Federated Learning

Authors: Xiang Ma, Haijian Sun, Rose Qingyang Hu, Yi Qian

Abstract: Federated learning (FL) has emerged as a distributed machine learning (ML) technique that can protect local data privacy for participating clients and improve system efficiency. Instead of sharing raw data, FL exchanges intermediate learning parameters, such as gradients, among clients. This article presents an efficient wireless communication approach tailored for FL parameter transmission, especially for Internet of Things (IoT) devices, to facilitate model aggregation. Our study considers practical wireless channels that can lead to random bit errors, which can substantially affect FL performance. Motivated by empirical gradient value distribution, we introduce a novel received bit masking method that confines received gradient values within prescribed limits. Moreover, given the intrinsic error resilience of ML gradients, our approach enables the delivery of approximate gradient values with errors without resorting to extensive error correction coding or retransmission. This strategy reduces computational overhead at both the transmitter and the receiver and minimizes communication latency. Consequently, our scheme is particularly well-suited for resource-constrained IoT devices. Additionally, we explore the inherent protection of the most significant bits (MSBs) through gray coding in high-order modulation.
Our simulations demonstrate that our proposed scheme can effectively mitigate the impact of random bit errors on FL performance, achieving similar learning objectives with only 50% of the air time required by existing methods involving error correction and retransmission.

Submitted 16 April, 2024; originally announced April 2024.

Comments: submitted to IEEE journals for publication

arXiv:2404.10596 [pdf, other]

Formation of GW230529 from Isolated Binary Evolution and Its Electromagnetic Counterparts

Authors: Jin-Ping Zhu, Rui-Chong Hu, Yacheng Kang, Bing Zhang, Hui Tong, Lijing Shao, Ying Qin

Abstract: In this Letter, we explore the formation of the mass-gap black hole-neutron star (mgBHNS) merger detected in the gravitational wave (GW) event GW230529 from the isolated binary evolution channel, and study potential signatures of its electromagnetic counterparts. By adopting the 'delayed' supernova prescription and reasonable model realizations, our population synthesis simulation results can simultaneously match the rate densities of mgBHNS and total BHNS mergers inferred from the population analyses, along with the population distribution of the BH mass in BHNS mergers reported by the LIGO-Virgo-KAGRA Collaboration. Because GW230529 contributes significantly to the inferred mgBHNS rate densities, we suggest that GW230529 can be explained through the isolated binary evolution channel. Considering the AP4 (DD2) equation of state, the probability that GW230529 undergoes tidal disruption is $12.8\%$ ($63.2\%$). If GW230529 is a disrupted event, its kilonova peak apparent magnitude is predicted to be $\sim23-24\,{\rm{mag}}$, and hence it can be detected by the present survey projects and LSST. Since GW230529 could be an off-axis event, as inferred from the GW observation, its associated gamma-ray burst (GRB) might be too dim to be observed by $γ$-ray detectors, which would explain the lack of GRB observations. Our study suggests the existence of mgBHNS mergers formed through the isolated binary evolution channel due to the discovery of GW230529, indicating that BHNS mergers are still likely to be multimessenger sources that emit GWs, GRBs, and kilonovae.
Although mgBHNS mergers account for $\sim50\%$ of the cosmological BHNS population, we find that $\gtrsim90\%$ of disrupted BHNS mergers are expected to originate from mgBHNS mergers.

Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: Submitted to ApJL, 15 pages, 5 figures, 4 tables, comments are welcome!

arXiv:2404.10342 [pdf, other]

Referring Flexible Image Restoration

Authors: Runwei Guan, Rongsheng Hu, Zhuhao Zhou, Tianlang Xue, Ka Lok Man, Jeremy Smith, Eng Gee Lim, Weiping Ding, Yutao Yue

Abstract: In reality, images often exhibit multiple degradations, such as rain and fog at night (triple degradations). However, in many cases, individuals may not want to remove all degradations, for instance, a blurry lens revealing a beautiful snowy landscape (double degradations). In such scenarios, people may only desire to deblur. These situations and requirements shed light on a new challenge in image restoration, where a model must perceive and remove specific degradation types specified by human commands in images with multiple degradations. We term this task Referring Flexible Image Restoration (RFIR). To address this, we first construct a large-scale synthetic dataset called RFIR, comprising 153,423 samples with the degraded image, text prompt for specific degradation removal and restored image. RFIR consists of five basic degradation types: blur, rain, haze, low light and snow while six main sub-categories are included for varying degrees of degradation removal. To tackle the challenge, we propose a novel transformer-based multi-task model named TransRFIR, which simultaneously perceives degradation types in the degraded image and removes specific degradation upon text prompt. TransRFIR is based on two devised attention modules, Multi-Head Agent Self-Attention (MHASA) and Multi-Head Agent Cross Attention (MHACA), where MHASA and MHACA introduce the agent token and reach the linear complexity, achieving lower computation cost than vanilla self-attention and cross-attention and obtaining competitive performances.
Our TransRFIR achieves state-of-the-art performances compared with other counterparts and is proven as an effective architecture for image restoration. We release our project at https://github.com/GuanRunwei/FIR-CP.

Submitted 16 April, 2024; originally announced April 2024.

Comments: 15 pages, 19 figures

arXiv:2404.10332 [pdf, other]

Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning

Authors: Rui Hu, Yahan Tu, Jitao Sang

Abstract: Despite achieving outstanding performance on various cross-modal tasks, current large vision-language models (LVLMs) still suffer from hallucination issues, manifesting as inconsistencies between their generated responses and the corresponding images. Prior research has indicated that the low quality of instruction data, particularly the skewed balance between positive and negative samples, is a significant contributor to model hallucinations. Recently, researchers have proposed high-quality instruction datasets, such as LRV-Instruction, to mitigate model hallucination. Nonetheless, our investigation reveals that hallucinatory concepts from different LVLMs exhibit specificity, i.e. the distribution of hallucinatory concepts varies significantly across models. Existing datasets did not consider the hallucination specificity of different models in their design processes, thereby diminishing their efficacy in mitigating model hallucination. In this paper, we propose a targeted instruction data generation framework named DFTG that is tailored to the hallucination specificity of different models. Concretely, DFTG consists of two stages: hallucination diagnosis, which extracts the necessary information from the model's responses and images for hallucination diagnosis; and targeted data generation, which generates targeted instruction data based on diagnostic results.
The experimental results on hallucination benchmarks demonstrate that the targeted instruction data generated by our method are more effective in mitigating hallucinations compared to previous datasets.

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2404.09150 [pdf, other]

Learning Cross-hand Policies for High-DOF Reaching and Grasping

Authors: Qijin She, Shishun Zhang, Yunfan Ye, Min Liu, Ruizhen Hu, Kai Xu

Abstract: Reaching-and-grasping is a fundamental skill for robotic manipulation, but existing methods usually train models on a specific gripper and cannot be reused on another gripper without retraining. In this paper, we propose a novel method that can learn a unified policy model that can be easily transferred to different dexterous grippers. Our method consists of two stages: a gripper-agnostic policy model that predicts the displacements of predefined key points on the gripper, and a gripper-specific adaptation model that translates these displacements into adjustments for controlling the grippers' joints. The gripper state and interactions with objects are captured at the finger level using robust geometric representations, integrated with a transformer-based network to address variations in gripper morphology and geometry. In the experimental part, we evaluate our method on several dexterous grippers and objects of diverse shapes, and the result shows that our method significantly outperforms the baseline methods. Pioneering the transfer of grasp policies across different dexterous grippers, our method effectively demonstrates its potential for learning generalizable and transferable manipulation skills for various robotic hands.

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2404.08907 [pdf, ps, other]

Causal analysis of inner and outer motions in near-wall turbulent flow

Authors: Jingxuan Zhang, Zhengping Zhu, Ricardo Vinuesa, Ruifeng Hu

Abstract: In this work, we study the causality of near-wall inner and outer turbulent motions. Here we define the inner motions as the self-sustained near-wall cycle and the outer motions as those living in the logarithmic layer exhibiting a footprint on the near-wall region. We perform causal analysis using two different methods: one is the transfer entropy, based on the information theory, and the other one is the Liang-Kleeman information-flow theory. The causal-analysis methods are applied to several scenarios, including a linear and a non-linear problem, a low-dimensional model of the near-wall cycle of turbulence, as well as the interaction between inner and outer turbulent motions in a channel at a friction Reynolds number of $Re_τ=1000$. We find that both methods can well predict the causal links in the linear problem, and the information flow can identify more of the nonlinear problem. Despite richer causalities revealed by the transfer entropy for turbulent-flow problems, both methods can successfully identify the streak-vortex regeneration mechanism that majorly sustains the near-wall turbulence. It is also indicated that both bottom-up and top-down influences of inner and outer motions may coexist in addition to the multiscale self-sustaining mechanism. Lastly, we mention that the computation of the information flow is much more efficient than the transfer entropy. The present study suggests that the information flow can have great potential in causal inference for turbulent-flow problems besides the transfer entropy.

Submitted 13 April, 2024; originally announced April 2024.

Comments: 13th International Symposium on Turbulence and Shear Flow Phenomena (TSFP13), Montreal, Canada, June 25-28, 2024

arXiv:2404.08725 [pdf, other]

Development of a data overflow protection system for Super-Kamiokande to maximize data from nearby supernovae

Authors: M. Mori, K. Abe, Y. Hayato, K. Hiraide, K. Hosokawa, K. Ieki, M. Ikeda, J. Kameda, Y. Kanemura, R. Kaneshima, Y. Kashiwagi, Y. Kataoka, S. Miki, S. Mine, M. Miura, S. Moriyama, Y. Nakano, M. Nakahata, S. Nakayama, Y. Noguchi, K. Okamoto, K. Sato, H. Sekiya, H. Shiba, K. Shimizu , et al. (230 additional authors not shown)

Abstract: Neutrinos from very nearby supernovae, such as Betelgeuse, are expected to generate more than ten million events over 10 s in Super-Kamiokande (SK). At such large event rates, the buffers of the SK analog-to-digital conversion board (QBEE) will overflow, causing random loss of data that is critical for understanding the dynamics of the supernova explosion mechanism. In order to solve this problem, two new DAQ modules were developed to aid in the observation of very nearby supernovae. The first of these, the SN module, is designed to save only the number of hit PMTs during a supernova burst and the second, the Veto module, prescales the high rate neutrino events to prevent the QBEE from overflowing based on information from the SN module. In the event of a very nearby supernova, these modules allow SK to reconstruct the time evolution of the neutrino event rate from beginning to end using both QBEE and SN module data. This paper presents the development and testing of these modules together with an analysis of supernova-like data generated with a flashing laser diode. We demonstrate that the Veto module successfully prevents DAQ overflows for Betelgeuse-like supernovae as well as the long-term stability of the new modules. During normal running the Veto module is found to issue DAQ vetos a few times per month resulting in a total dead time less than 1 ms, and does not influence ordinary operations.
Additionally, using simulation data we find that supernovae closer than 800 pc will trigger the Veto module, resulting in a prescaling of the observed neutrino data.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 28 pages, 18 figures. Submitted to PTEP

arXiv:2404.06528 [pdf, ps, other]

Deterministic and Stochastic Geometric Mechanics for Hall MHD

Authors: Darryl D. Holm, Ruiao Hu, Oliver D. Street

Abstract: We derive new models of stochastic Hall magnetohydrodynamics (MHD) by using a symmetry-reduced stochastic Euler-Poincaré variational principle. The new stochastic Hall MHD theory has potential applications for uncertainty quantification and data assimilation in space plasma (space weather) and solar physics. The stochastic geometric mechanics approach we take here produces coordinate-free results which may then be applied in a variety of spatial configurations.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 24 pages, 0 figures, 1st version

arXiv:2404.05734 [pdf, other]

An Online Algorithm for Solving Feedback Optimal Control Problems with Partial Observations

Authors: Siming Liang, Ruoyu Hu, Feng Bao, Richard Archibald, Guannan Zhang

Abstract: This paper presents a novel methodology to tackle feedback optimal control problems in scenarios where the exact state of the controlled process is unknown. It integrates data assimilation techniques and optimal control solvers to manage partial observation of the state process, a common occurrence in practical scenarios. Traditional stochastic optimal control methods assume full state observation, which is often not feasible in real-world applications. Our approach underscores the significance of utilizing observational data to inform control policy design. Specifically, we introduce a kernel learning backward stochastic differential equation (SDE) filter to enhance data assimilation efficiency and propose a sample-wise stochastic optimization method within the stochastic maximum principle framework. Numerical experiments validate the efficacy and accuracy of our algorithm, showcasing its high efficiency in solving feedback optimal control problems with partial observation.

Submitted 21 March, 2024; originally announced April 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2201.10600

arXiv:2404.03712

doi 10.4204/EPTCS.401

Proceedings 15th Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software

Authors: Diana Costa, Raymond Hu

Abstract: This volume contains the proceedings of PLACES 2024, the 15th edition of the Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software. The PLACES workshop series offers a forum for researchers from different fields to exchange new ideas about the challenges of modern and future programming, where concurrency and distribution are the norm rather than a marginal concern. PLACES 2024 was held on 6 April 2024 in Luxembourg City, Luxembourg. The programme included keynote talks by Mariangiola Dezani-Ciancaglini and Peter Müller, presentations of five research papers, and three talks about preliminary or already-published work that could foster interesting discussion during the workshop. These proceedings contain the five accepted research papers, the abstracts of the keynote talks, and a list of the other contributions.

Submitted 4 April, 2024; originally announced April 2024.

Journal ref: EPTCS 401, 2024

arXiv:2404.01687 [pdf, other]

Search for a sub-eV sterile neutrino using Daya Bay's full dataset

Authors: F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, Y. Y. Ding , et al. (176 additional authors not shown)

Abstract: This Letter presents results of a search for the mixing of a sub-eV sterile neutrino with three active neutrinos based on the full data sample of the Daya Bay Reactor Neutrino Experiment, collected during 3158 days of detector operation, which contains $5.55 \times 10^{6}$ reactor $\bar{\nu}_e$ candidates identified as inverse beta-decay interactions followed by neutron-capture on gadolinium. The analysis benefits from a doubling of the statistics of our previous result and from improvements of several important systematic uncertainties. No significant oscillation due to mixing of a sub-eV sterile neutrino with active neutrinos was found. Exclusion limits are set by both Feldman-Cousins and CLs methods. Light sterile neutrino mixing with $\sin^2 2θ_{14} \gtrsim 0.01$ can be excluded at 95\% confidence level in the region of $0.01$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.1 $ eV$^2$. This result represents the world-leading constraints in the region of $2 \times 10^{-4}$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.2 $ eV$^2$.

Submitted 15 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures, 1 table

arXiv:2403.16830 [pdf, other]

Exploring Communication Technologies, Standards, and Challenges in Electrified Vehicle Charging

Authors: Xiang Ma, Yuan Zhou, Hanwen Zhang, Qun Wang, Haijian Sun, Hongjie Wang, Rose Qingyang Hu

Abstract: As public awareness of environmental protection continues to grow, the trend of integrating more electric vehicles (EVs) into the transportation sector is rising. Unlike conventional internal combustion engine (ICE) vehicles, EVs can minimize carbon emissions and potentially achieve autonomous driving. However, several obstacles hinder the widespread adoption of EVs, such as their constrained driving range and the extended time required for charging. One alternative solution to address these challenges is implementing dynamic wireless power transfer (DWPT), charging EVs in motion on the road. Moreover, charging stations with static wireless power transfer (SWPT) infrastructure can replace existing gas stations, enabling users to charge EVs in parking lots or at home. This paper surveys the communication infrastructure for static and dynamic wireless charging in electric vehicles. It encompasses all communication aspects involved in the wireless charging process. The architecture and communication requirements for static and dynamic wireless charging are presented separately. Additionally, a comprehensive comparison of existing communication standards is provided. The communication with the grid is also explored in detail. The survey gives attention to security and privacy issues arising during communications. In summary, the paper addresses the challenges and outlines upcoming trends in communication for EV wireless charging.

Submitted 25 March, 2024; originally announced March 2024.

Comments: submitted to IET Communication as a survey paper

arXiv:2403.16572 [pdf, ps, other]

Weighted Composition Operator on Fock Space

Authors: Rui Hu

Abstract: In this thesis, we establish a necessary and sufficient condition for a weighted composition operator to commute with a self-adjoint weighted composition operator on the Fock space, then obtain a sufficient condition for these commuting weighted composition operators to be normal.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.16477 [pdf, other]

Safeguarding Next Generation Multiple Access Using Physical Layer Security Techniques: A Tutorial

Authors: Lu Lv, Dongyang Xu, Rose Qingyang Hu, Yinghui Ye, Long Yang, Xianfu Lei, Xianbin Wang, Dong In Kim, Arumugam Nallanathan

Abstract: Driven by the ever-increasing requirements of ultra-high spectral efficiency, ultra-low latency, and massive connectivity, the forefront of wireless research calls for the design of advanced next generation multiple access schemes to facilitate provisioning of these stringent demands. This inspires the embrace of non-orthogonal multiple access (NOMA) in future wireless communication networks. Nevertheless, the support of massive access via NOMA leads to additional security threats, due to the open nature of the air interface, the broadcast characteristic of radio propagation as well as intertwined relationship among paired NOMA users. To address this specific challenge, the superimposed transmission of NOMA can be explored as new opportunities for security aware design, for example, multiuser interference inherent in NOMA can be constructively engineered to benefit communication secrecy and privacy. The purpose of this tutorial is to provide a comprehensive overview on the state-of-the-art physical layer security techniques that guarantee wireless security and privacy for NOMA networks, along with the opportunities, technical challenges, and future research trends.

Submitted 21 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: Invited paper by Proceedings of the IEEE

arXiv:2403.15612 [pdf, other]

InterFusion: Text-Driven Generation of 3D Human-Object Interaction

Authors: Sisi Dai, Wenhao Li, Haowen Sun, Haibin Huang, Chongyang Ma, Hui Huang, Kai Xu, Ruizhen Hu

Abstract: In this study, we tackle the complex task of generating 3D human-object interactions (HOI) from textual descriptions in a zero-shot text-to-3D manner. We identify and address two key challenges: the unsatisfactory outcomes of direct text-to-3D methods in HOI, largely due to the lack of paired text-interaction data, and the inherent difficulties in simultaneously generating multiple concepts with complex spatial relationships. To effectively address these issues, we present InterFusion, a two-stage framework specifically designed for HOI generation. InterFusion involves human pose estimations derived from text as geometric priors, which simplifies the text-to-3D conversion process and introduces additional constraints for accurate object generation. At the first stage, InterFusion extracts 3D human poses from a synthesized image dataset depicting a wide range of interactions, subsequently mapping these poses to interaction descriptions. The second stage of InterFusion capitalizes on the latest developments in text-to-3D generation, enabling the production of realistic and high-quality 3D HOI scenes. This is achieved through a local-global optimization process, where the generation of human body and object is optimized separately, and jointly refined with a global optimization of the entire scene, ensuring a seamless and contextually coherent integration. Our experimental results affirm that InterFusion significantly outperforms existing state-of-the-art methods in 3D HOI generation.

Submitted 22 March, 2024; originally announced March 2024.

Showing 1–50 of 649 results for author: Hu, R