Search | arXiv e-print repository

Surviving in the Hot Neptune Desert: The Discovery of the Ultra-Hot Neptune TOI-3261b

Authors: Emma Nabbie, Chelsea X. Huang, Jennifer A. Burt, David J. Armstrong, Eric E. Mamajek, Vardan Adibekyan, Sérgio G. Sousa, Eric D. Lopez, Daniel P. Thorngren, Jorge Fernández, Gongjie Li, James S. Jenkins, Jose I. Vines, João Gomes da Silva, Robert A. Wittenmyer, Daniel Bayliss, César Briceño, Karen A. Collins, Xavier Dumusque, Keith D. Horne, Marcelo F. Keniger, Nicholas Law, Jorge Lillo-Box, Shang-Fei Liu, Andrew W. Mann, et al. (23 additional authors not shown)

Abstract: The recent discoveries of Neptune-sized ultra-short period planets (USPs) challenge existing planet formation theories. It is unclear whether these residents of the Hot Neptune Desert have similar origins to smaller, rocky USPs, or if this discrete population is evidence of a different formation pathway altogether. We report the discovery of TOI-3261b, an ultra-hot Neptune with an orbital period $P$ = 0.88 days. The host star is a $V = 13.2$ magnitude, slightly super-solar metallicity ([Fe/H] $\simeq$ 0.15), inactive K1.5 main sequence star at $d = 300$ pc. Using data from the Transiting Exoplanet Survey Satellite and the Las Cumbres Observatory Global Telescope, we find that TOI-3261b has a radius of $3.82_{-0.35}^{+0.42}$ $R_{\oplus}$. Moreover, radial velocities from ESPRESSO and HARPS reveal a mass of $30.3_{-2.4}^{+2.2}$ $M_{\oplus}$, more than twice the median mass of Neptune-sized planets on longer orbits. We investigate multiple mechanisms of mass loss that can reproduce the current-day properties of TOI-3261b, simulating the evolution of the planet via tidal stripping and photoevaporation. Thermal evolution models suggest that TOI-3261b should retain an envelope potentially enriched with volatiles constituting $\sim$5% of its total mass. This is the second highest envelope mass fraction among ultra-hot Neptunes discovered to date, making TOI-3261b an ideal candidate for atmospheric follow-up observations.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 20 pages, 11 figures, accepted to AJ
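
The abstract above quotes a radius and a mass for TOI-3261b but not a bulk density. As a rough cross-check, the density follows from scaling Earth's mean density by $M/R^3$ in Earth units. The sketch below uses only the median values from the abstract and ignores the quoted uncertainties; the function name `bulk_density` and the constant `EARTH_DENSITY_G_CM3` are illustrative, not taken from the paper.

```python
# Back-of-the-envelope bulk density from mass and radius in Earth units.
# Assumption: Earth's mean density is taken as 5.514 g/cm^3.
EARTH_DENSITY_G_CM3 = 5.514


def bulk_density(mass_earth: float, radius_earth: float) -> float:
    """Return the bulk density in g/cm^3 for a planet given in Earth units."""
    if radius_earth <= 0:
        raise ValueError("radius must be positive")
    # Density scales as M / R^3 relative to Earth.
    return EARTH_DENSITY_G_CM3 * mass_earth / radius_earth**3


if __name__ == "__main__":
    # Median values quoted in the abstract: 30.3 Earth masses, 3.82 Earth radii.
    rho = bulk_density(30.3, 3.82)
    print(f"TOI-3261b bulk density (median values only): {rho:.2f} g/cm^3")
```

With the median values this comes out at roughly 3 g/cm^3, well above Neptune's mean density of about 1.64 g/cm^3, which fits the abstract's description of an unusually massive Neptune-sized planet.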

arXiv:2407.00056 [pdf, other]

MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion

Authors: Jiaxin Deng, Shiyao Wang, Yuchen Wang, Jiansong Qi, Liqin Zhao, Guorui Zhou, Gaofeng Meng

Abstract: Live streaming services are becoming increasingly popular due to real-time interactions and entertainment. Viewers can chat and send comments or virtual gifts to express their preferences for the streamers. Accurately modeling the gifting interaction not only enhances users' experience but also increases streamers' revenue. Previous studies on live streaming gifting prediction treat this task as a conventional recommendation problem, and model users' preferences using categorical data and observed historical behaviors. However, it is challenging to precisely describe the real-time content changes in live streaming using limited categorical information. Moreover, due to the sparsity of gifting behaviors, capturing the preferences and intentions of users is quite difficult. In this work, we propose MMBee based on real-time Multi-Modal Fusion and Behaviour Expansion to address these issues. Specifically, we first present a Multi-modal Fusion Module with Learnable Query (MFQ) to perceive the dynamic content of streaming segments and process complex multi-modal interactions, including images, text comments and speech. To alleviate the sparsity issue of gifting behaviors, we present a novel Graph-guided Interest Expansion (GIE) approach that learns both user and streamer representations on large-scale gifting graphs with multi-modal attributes. Comprehensive experiment results show that MMBee achieves significant performance improvements on both public datasets and Kuaishou real-world streaming datasets and the effectiveness has been further validated through online A/B experiments. MMBee has been deployed and is serving hundreds of millions of users at Kuaishou.

Submitted 15 June, 2024; originally announced July 2024.

Comments: Accepted at KDD 2024

arXiv:2406.16520 [pdf]

Gigantic-oxidative atomically layered epitaxy for designed complex oxides

Authors: Guangdi Zhou, Haoliang Huang, Fengzhe Wang, Heng Wang, Qishuo Yang, Zihao Nie, Wei Lv, Cui Ding, Yueying Li, Danfeng Li, Yujie Sun, Junhao Lin, Guang-Ming Zhang, Qi-Kun Xue, Zhuoyu Chen

Abstract: In designing material functionality within the intricate realm of transition metal oxides, lattice structure and d-orbital occupancy are two principal determinants of the correlated physical properties, such as superconductivity. However, the modulation of these two factors is inherently limited by the need to balance thermodynamic stability, kinetic mobility, and synthesis precision, particularly for oxidation-demanding phases. We introduce a methodology, namely the gigantic-oxidative atomically layered epitaxy (GOAL-Epitaxy), enhancing oxidation power 3-4 orders of magnitude beyond oxide molecular beam epitaxy (OMBE) and pulsed laser deposition (PLD), while ensuring atomic-layer-by-layer growth of designed complex structures. Consequently, thermodynamic stability is markedly augmented at elevated temperatures, improving growth kinetics. We demonstrate the accurate synthesis of complex nickelates and cuprates, especially an artificially designed structure as a parent of high-temperature superconductivity, in which alternating single and double NiO2 layers possess distinct nominal d-orbital occupancy. The GOAL-Epitaxy enables material discovery within the vastly broadened growth parameter space.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.14025 [pdf]

Direct Observation of Dendrites Nucleation in Li Metal Battery by Machine Learning Accelerated Molecular Simulations under Realistic Electrochemical Conditions

Authors: Taiping Hu, Haichao Huang, Guobing Zhou, Xinyan Wang, Zheng Cheng, Fangjia Fu, Xiaoxu Wang, Fuzhi Dai, Kuang Yu, Shenzhen Xu

Abstract: Uncontrollable dendrites growth during electrochemical cycles leads to low Coulombic efficiency and critical safety issues in Li metal batteries. Hence, a comprehensive understanding of the dendrite formation mechanism is essential for further enhancing the performance of Li metal batteries. Machine learning accelerated molecular dynamics (MD) simulations can provide atomic-scale resolution for various key processes at an ab-initio level accuracy. However, traditional MD simulation tools hardly capture Li electrochemical depositions, due to lack of an electrochemical constant potential (ConstP) condition. In this work, we propose a ConstP approach that combines a machine learning force field with the charge equilibration method to reveal the dynamic process of Li dendrites nucleation at Li metal anode surfaces. Our results show that both dead Li cluster formation and inhomogeneous Li electro-depositions can induce Li dendrites nucleation. We further reveal that the local aggregation of Li atoms in amorphous inorganic components of solid electrolyte interphase is the key factor triggering the nucleation process. Overall, our simulations provide microscopic insights for Li dendrites formations in Li metal anodes. More importantly, we present an efficient and accurate simulation method for modeling realistic ConstP conditions, which holds considerable potential for broader applications in modeling of complex electrochemical interfaces.

Submitted 3 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.12186 [pdf, ps, other]

Unlocking the Potential of Early Epochs: Uncertainty-aware CT Metal Artifact Reduction

Authors: Xinquan Yang, Guanqun Zhou, Wei Sun, Youjian Zhang, Zhongya Wang, Jiahui He, Zhicheng Zhang

Abstract: In computed tomography (CT), the presence of metallic implants in patients often leads to disruptive artifacts in the reconstructed images, hindering accurate diagnosis. Recently, a large number of supervised deep learning-based approaches have been proposed for metal artifact reduction (MAR). However, these methods neglect the influence of initial training weights. In this paper, we have discovered that the uncertainty image computed from the restoration result of initial training weights can effectively highlight high-frequency regions, including metal artifacts. This observation can be leveraged to assist the MAR network in removing metal artifacts. Therefore, we propose an uncertainty constraint (UC) loss that utilizes the uncertainty image as an adaptive weight to guide the MAR network to focus on the metal artifact region, leading to improved restoration. The proposed UC loss is designed to be a plug-and-play method, compatible with any MAR framework, and easily adoptable. To validate the effectiveness of the UC loss, we conduct extensive experiments on the publicly available Deeplesion and CLINIC-metal datasets. Experimental results demonstrate that the UC loss further optimizes the network training process and significantly improves the removal of metal artifacts.

Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.09021 [pdf, other]

doi 10.1145/3637528.3671514

Contextual Distillation Model for Diversified Recommendation

Authors: Fan Li, Xu Si, Shisong Tang, Dingmin Wang, Kunyan Han, Bing Han, Guorui Zhou, Yang Song, Hechang Chen

Abstract: The diversity of recommendation is equally crucial as accuracy in improving user experience. Existing studies, e.g., Determinantal Point Process (DPP) and Maximal Marginal Relevance (MMR), employ a greedy paradigm to iteratively select items that optimize both accuracy and diversity. However, prior methods typically exhibit quadratic complexity, limiting their applications to the re-ranking stage and are not applicable to other recommendation stages with a larger pool of candidate items, such as the pre-ranking and ranking stages. In this paper, we propose Contextual Distillation Model (CDM), an efficient recommendation model that addresses diversification, suitable for the deployment in all stages of industrial recommendation pipelines. Specifically, CDM utilizes the candidate items in the same user request as context to enhance the diversification of the results. We propose a contrastive context encoder that employs attention mechanisms to model both positive and negative contexts. For the training of CDM, we compare each target item with its context embedding and utilize the knowledge distillation framework to learn the win probability of each target item under the MMR algorithm, where the teacher is derived from MMR outputs. During inference, ranking is performed through a linear combination of the recommendation and student model scores, ensuring both diversity and efficiency. We perform offline evaluations on two industrial datasets and conduct online A/B test of CDM on the short-video platform KuaiShou. The considerable enhancements observed in both recommendation quality and diversity, as shown by metrics, provide strong evidence for the effectiveness of CDM.

Submitted 13 June, 2024; originally announced June 2024.

Comments: accepted by KDD 2024
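
The greedy Maximal Marginal Relevance (MMR) paradigm that the abstract above contrasts CDM with can be sketched in a few lines. This is a minimal illustration of standard greedy MMR re-ranking, not the paper's CDM model or its teacher setup; the function name `mmr_select`, the trade-off weight `lam`, and the toy scores are assumptions made for the example. Each pick compares every remaining candidate against every item already selected, which is the source of the quadratic cost the abstract mentions.

```python
from typing import List, Sequence


def mmr_select(
    relevance: Sequence[float],
    similarity: Sequence[Sequence[float]],
    k: int,
    lam: float = 0.7,
) -> List[int]:
    """Greedily pick up to k item indices by Maximal Marginal Relevance.

    relevance[i] is the accuracy score of item i; similarity[i][j] is the
    pairwise similarity between items i and j; lam trades relevance (1.0)
    against diversity (0.0).
    """
    selected: List[int] = []
    candidates = list(range(len(relevance)))

    while candidates and len(selected) < k:

        def mmr_score(i: int) -> float:
            # Penalize similarity to the closest already-selected item.
            max_sim = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * max_sim

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)

    return selected


if __name__ == "__main__":
    # Toy data: items 0 and 1 are near-duplicates, item 2 is different.
    relevance = [0.9, 0.85, 0.3]
    similarity = [
        [1.0, 0.95, 0.1],
        [0.95, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    print(mmr_select(relevance, similarity, k=2, lam=0.5))
    print(mmr_select(relevance, similarity, k=2, lam=1.0))
```

With `lam=1.0` the selection reduces to sorting by relevance alone; lowering `lam` lets the less relevant but dissimilar item displace the near-duplicate.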

arXiv:2406.05234 [pdf, other]

TESS Hunt for Young and Maturing Exoplanets (THYME) X: a two-planet system in the 210 Myr MELANGE-5 Association

Authors: Pa Chia Thao, Andrew W. Mann, Madyson G. Barber, Adam L. Kraus, Benjamin M. Tofflemire, Jonathan L. Bush, Mackenna L. Wood, Karen A. Collins, Andrew Vanderburg, Samuel N. Quinn, George Zhou, Elisabeth R. Newton, Carl Ziegler, Nicholas Law, Khalid Barkaoui, Francisco J. Pozuelos, Mathilde Timmermans, Michaël Gillon, Emmanuël Jehin, Richard P. Schwarz, Tianjun Gan, Avi Shporer, Keith Horne, Ramotholo Sefako, Olga Suarez, et al. (13 additional authors not shown)

Abstract: Young (<500 Myr) planets are critical to studying how planets form and evolve. Among these young planetary systems, multi-planet configurations are particularly useful as they provide a means to control for variables within a system. Here, we report the discovery and characterization of a young planetary system, TOI-1224. We show that the planet-host resides within a young population we denote as MELANGE-5. By employing a range of age-dating methods -- isochrone fitting, lithium abundance analysis, gyrochronology, and Gaia excess variability -- we estimate the age of MELANGE-5 to be 210$\pm$27 Myr. MELANGE-5 is situated in close proximity to previously identified younger (80-110 Myr) associations, Crius 221 and Theia 424/Volans-Carina, motivating further work to map out the group boundaries. In addition to a planet candidate detected by the TESS pipeline and alerted as a TESS Object of Interest, TOI-1224 b, we identify a second planet, TOI-1224 c, using custom search tools optimized for young stars (Notch and LOCoR). We find the planets are 2.10$\pm$0.09 $R_\oplus$ and 2.88$\pm$0.10 $R_\oplus$ and orbit their host star every 4.18 and 17.95 days, respectively. With their bright ($K$=9.1 mag), small ($R_{*}$=0.44 $R_{\odot}$), and cool ($T_{eff}$=3326 K) host star, these planets represent excellent candidates for atmospheric characterization with JWST.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Accepted for publication in The Astronomical Journal; 33 pages, 17 figures, 9 tables

arXiv:2406.00276 [pdf]

Non-destructive Degradation Pattern Decoupling for Ultra-early Battery Prototype Verification Using Physics-informed Machine Learning

Authors: Shengyu Tao, Mengtian Zhang, Zixi Zhao, Haoyang Li, Ruifei Ma, Yunhong Che, Xin Sun, Lin Su, Xiangyu Chen, Zihao Zhou, Heng Chang, Tingwei Cao, Xiao Xiao, Yaojun Liu, Wenjun Yu, Zhongling Xu, Yang Li, Han Hao, Xuan Zhang, Xiaosong Hu, Guangmin Zhou

Abstract: Manufacturing complexities and uncertainties have impeded the transition from material prototypes to commercial batteries, making prototype verification critical to quality assessment. A fundamental challenge involves deciphering intertwined chemical processes to characterize degradation patterns and their quantitative relationship with battery performance. Here we show that a physics-informed machine learning approach can quantify and visualize temporally resolved losses concerning thermodynamics and kinetics only using electric signals. Our method enables non-destructive degradation pattern characterization, expediting temperature-adaptable predictions of entire lifetime trajectories, rather than end-of-life points. The verification speed is 25 times faster yet maintaining 95.1% accuracy across temperatures. Such advances facilitate more sustainable management of defective prototypes before massive production, establishing a 19.76 billion USD scrap material recycling market by 2060 in China. By incorporating stepwise charge acceptance as a measure of the initial manufacturing variability of normally identical batteries, we can immediately identify long-term degradation variations. We attribute the predictive power to interpreting machine learning insights using material-agnostic featurization taxonomy for degradation pattern decoupling. Our findings offer new possibilities for dynamic system analysis, such as battery prototype degradation, demonstrating that complex pattern evolutions can be accurately predicted in a non-destructive and data-driven fashion by integrating physics-informed machine learning.

Submitted 31 May, 2024; originally announced June 2024.

ACM Class: J.2; G.3

arXiv:2405.19610 [pdf, other]

Factor Augmented Tensor-on-Tensor Neural Networks

Authors: Guanhao Zhou, Yuefeng Han, Xiufan Yu

Abstract: This paper studies the prediction task of tensor-on-tensor regression in which both covariates and responses are multi-dimensional arrays (a.k.a., tensors) across time with arbitrary tensor order and data dimension. Existing methods either focused on linear models without accounting for possibly nonlinear relationships between covariates and responses, or directly employed black-box deep learning algorithms that failed to utilize the inherent tensor structure. In this work, we propose a Factor Augmented Tensor-on-Tensor Neural Network (FATTNN) that integrates tensor factor models into deep neural networks. We begin with summarizing and extracting useful predictive information (represented by the ``factor tensor'') from the complex structured tensor covariates, and then proceed with the prediction task using the estimated factor tensor as input of a temporal convolutional neural network. The proposed methods effectively handle nonlinearity between complex data structures, and improve over traditional statistical models and conventional deep learning approaches in both prediction accuracy and computational cost. By leveraging tensor factor models, our proposed methods exploit the underlying latent factor structure to enhance the prediction, and in the meantime, drastically reduce the data dimensionality that speeds up the computation. The empirical performances of our proposed methods are demonstrated via simulation studies and real-world applications to three public datasets. Numerical results show that our proposed algorithms achieve substantial increases in prediction accuracy and significant reductions in computational time compared to benchmark methods.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.17899 [pdf]

doi 10.1002/anie.202300186

Near IR bandgap semiconductive 2D conjugated metal-organic framework with rhombic lattice and high mobility

Authors: Lukas Sporrer, Guojun Zhou, Mingchao Wang, Vasileios Balos, Sergio Revuelta, Kamil Jastrzembski, Markus Loeffler, Petko Petkov, Thomas Heine, Agnieszka Kuc, Enrique Canovas, Zhehao Huang, Xinliang Feng, Renhao Dong

Abstract: Two-dimensional conjugated metal-organic frameworks (2D c-MOFs) are emerging as a unique class of 2D electronic materials. However, intrinsically semiconducting 2D c-MOFs with gaps in the Vis-NIR and high charge carrier mobility have been rare. Most of the reported semiconducting 2D c-MOFs are metallic (i.e. gapless), which limits their use in applications where larger band gaps are needed for logic devices. Herein, we design a new D2h-geometric ligand, 2,3,6,7,11,12,15,16-octahydroxyphenanthro(9,10b)triphenylene (OHPTP), and synthesize the first example of a 2D c-MOF single crystal (OHPTP-Cu) with a rhombohedral pore geometry after coordination with copper. The continuous rotation electron diffraction (cRED) analysis unveils the orthorhombic crystal structure at the atomic level with a unique AB layer stacking. The resultant Cu2(OHPTP) is a p-type semiconductor with an indirect band gap of about 0.50 eV and exhibits high electrical conductivity of 0.10 S cm-1 and high charge carrier mobility of 10.0 cm2V-1s-1. Density-functional theory calculations underline the predominant role of the out-of-plane charge transport in this semiquinone-based 2D c-MOFs.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 11 pages 5 figures

Journal ref: Angew. Chem. Int. Ed. 2023, 62, e202300186

arXiv:2405.15742 [pdf]

Correlated Charge Density Wave Insulators in Chirally Twisted Triple Bilayer Graphene

Authors: Wenxuan Wang, Gengdong Zhou, Wenlu Lin, Zuo Feng, Yijie Wang, Miao Liang, Zaizhe Zhang, Min Wu, Le Liu, Kenji Watanabe, Takashi Taniguchi, Wei Yang, Guangyu Zhang, Kaihui Liu, Jinhua Gao, Yang Liu, X. C. Xie, Zhida Song, Xiaobo Lu

Abstract: Electrons residing in flat-band system can play a vital role in triggering spectacular phenomenology due to relatively large interactions and spontaneous breaking of different degeneracies. In this work we demonstrate chirally twisted triple bilayer graphene, a new moiré structure formed by three pieces of helically stacked Bernal bilayer graphene, as a highly tunable flat-band system. In addition to the correlated insulators showing at integer moiré fillings, commonly attributed to interaction induced symmetry broken isospin flavors in graphene, we observe abundant insulating states at half-integer moiré fillings, suggesting a longer-range interaction and the formation of charge density wave insulators which spontaneously break the moiré translation symmetry. With weak out-of-plane magnetic field applied, as observed half-integer filling states are enhanced and more quarter-integer filling states appear, pointing towards further quadrupling moiré unit cells. The insulating states at fractional fillings combined with Hartree-Fock calculations demonstrate the observation of a new type of correlated charge density wave insulators in graphene and points to a new accessible twist manner engineering correlated moiré electronics.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.15297 [pdf]

doi 10.1103/PhysRevB.109.184112

High-field magnetoelectric coupling and successive magnetic transitions in Mn-doped polar antiferromagnet Ni3TeO6

Authors: J. H. Zhang, L. Lin, C. Dong, Y. T. Chang, J. F. Wang, C. L. Lu, P. Z. Chen, W. J. Zhai, G. Z. Zhou, L. Huang, Y. S. Tang, S. H. Zheng, M. F. Liu, X. H. Zhou, Z. B. Yan, J. -M. Liu

Abstract: Among the 3d transition metal ions doped polar Ni3TeO6, Mn-doped Ni3TeO6 has stimulated great interest due to its high magnetic ordering temperature and complex magnetic phases, but the mechanism of magnetoelectric (ME) coupling is far from understood. Herein we report our systematic investigation of the chemical control of magnetism, metamagnetic transition, and ME properties of Ni3-xMnxTeO6 single crystals in high magnetic field (H) up to 52 T. We present a previously unreported weak ferromagnetic behavior appeared in the ab plane below 9.5 K in addition to the incommensurate helical and commensurate collinear antiferromagnetic states. In the low-field region, a spin-flop type metamagnetic transition without any hysteresis occurs at Hc1 for H // c, while another metamagnetic transition accompanied with a change in electric polarization is observed at Hc2 in the high-field region both for H // c and H // ab above 30 K, which can be attributed to the sudden rotation of magnetic moments at Ni2 sites. The ME measurements reveal that a first-order ME effect is observed in the low-T and low-H regions, while a second-order ME coupling term appears above 30 K in the magnetic field range of Hc1 < H < Hc2 for H // c and H < Hc2 for H // ab, both becoming significant with increasing temperature. Eventually, they are dominated by the second-order ME effect near the antiferromagnetic transition temperature. The present work demonstrates that Ni3-xMnxTeO6 is an exotic magnetoelectric material compared with Ni3TeO6 and its derivatives, thereby providing insights to better understand the magnetism and ME coupling in Ni3TeO6 and its derivatives.

Submitted 29 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: 30 pages with 8 figures

Journal ref: Phys. Rev. B 109, 184112 (2024)

arXiv:2405.14983 [pdf, other]

The Solar Origin of an Intense Geomagnetic Storm on 2023 December 1st: Successive Slipping and Eruption of Multiple Magnetic Flux Ropes

Authors: Zheng Sun, Ting Li, Yijun Hou, Hui Tian, Ziqi Wu, Ke Li, Yining Zhang, Zhentong Li, Xianyong Bai, Li Feng, Chuan Li, Zhenyong Hou, Qiao Song, Jingsong Wang, Guiping Zhou

Abstract: The solar eruption that occurred on 2023 November 28 (SOL2023-11-28) triggered an intense geomagnetic storm on Earth on 2023 December 1. The associated Earth's auroras manifested at the most southern latitudes in the northern hemisphere observed in the past two decades. In order to explore the profound geoeffectiveness of this event, we conducted a comprehensive analysis of its solar origin to offer potential factors contributing to its impact. Magnetic flux ropes (MFRs) are twisted magnetic structures recognized as significant contributors to coronal mass ejections (CMEs), thereby impacting space weather greatly. In this event, we identified multiple MFRs in the solar active region and observed distinct slipping processes of the three MFRs: MFR1, MFR2, and MFR3. All three MFRs exhibit slipping motions at a speed of 40--137 km s$^{-1}$, extending beyond their original locations. Notably, the slipping of MFR2 extends to $\sim$30 Mm and initiates the eruption of MFR3. Ultimately, MFR1's eruption results in an M3.4-class flare and a CME, while MFR2 and MFR3 collectively produce an M9.8-class flare and another halo CME. This study shows the slipping process in a multi-MFR system, showing how one MFR's slipping can trigger the eruption of another MFR. We propose that the CME--CME interactions caused by multiple MFR eruptions may contribute to the significant geoeffectiveness.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14824 [pdf, other]

Camera Relocalization in Shadow-free Neural Radiance Fields

Authors: Shiyao Xu, Caiyun Liu, Yuantao Chen, Zhenxin Zhu, Zike Yan, Yongliang Shi, Hao Zhao, Guyue Zhou

Abstract: Camera relocalization is a crucial problem in computer vision and robotics. Recent advancements in neural radiance fields (NeRFs) have shown promise in synthesizing photo-realistic images. Several works have utilized NeRFs for refining camera poses, but they do not account for lighting changes that can affect scene appearance and shadow regions, causing a degraded pose optimization process. In this paper, we propose a two-staged pipeline that normalizes images with varying lighting and shadow conditions to improve camera relocalization. We implement our scene representation upon a hash-encoded NeRF which significantly boosts up the pose optimization process. To account for the noisy image gradient computing problem in grid-based NeRFs, we further propose a re-devised truncated dynamic low-pass filter (TDLF) and a numerical gradient averaging technique to smoothen the process. Experimental results on several datasets with varying lighting conditions demonstrate that our method achieves state-of-the-art results in camera relocalization under varying lighting conditions. Code and data will be made publicly available.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Accepted by ICRA 2024. 8 pages, 5 figures, 3 tables. Codes and dataset: https://github.com/hnrna/ShadowfreeNeRF-CameraReloc

arXiv:2405.12595 [pdf, other]

Correlated insulators and charge density wave states in chirally twisted triple bilayer graphene

Authors: Geng-Dong Zhou, Yi-Jie Wang, Wen-Xuan Wang, Xiao-Bo Lu, Zhi-Da Song

Abstract: Motivated by recent experimental observations of displacement-field-tuned correlated insulators at integer and half-integer fillings in chirally twisted triple bilayer graphene (CTTBG), we study the single-particle and interacting physics of CTTBG. We find that there are two inequivalent stacking orders, i.e., ABABBC and ABABAB, and both exhibit flat bands with nontrivial topology. We then use the Hartree-Fock approximation to calculate the rich phase diagram of CTTBG at all integer and half-integer fillings in both stacking orders and under the vertical displacement field. Under a small displacement field, the ground states are flavor-polarized states for the ABABBC stacking order and intervalley coherent states for the ABABAB stacking order at all integer and half-integer fillings. A larger displacement field will turn them into layer-polarized states. At half-integer fillings, the ground states also exhibit charge density wave (CDW) order. For ABABAB stacking, the ground states are always $2\times1$ stripe states over a range of displacement fields. For ABABBC stacking, the ground states are also $2\times1$ stripe states under a small displacement field, and a larger displacement field may favor further translation-symmetry breaking, depending on the filling and the direction of the displacement field. We demonstrate that the CDW states observed in the experiment can originate from the strong Coulomb interaction of the flat-band electrons.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.12217 [pdf, other]

Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning

Authors: Guanglin Zhou, Zhongyi Han, Shiming Chen, Biwei Huang, Liming Zhu, Salman Khan, Xin Gao, Lina Yao

Abstract: Recent studies indicate that large multimodal models (LMMs) are highly robust against natural distribution shifts, often surpassing previous baselines. Despite this, domain-specific adaptation is still necessary, particularly in specialized areas like healthcare. Due to the impracticality of fine-tuning LMMs given their vast parameter space, this work investigates in-context learning (ICL) as an effective alternative for enhancing LMMs' adaptability. We find that the success of ICL heavily relies on the choice of demonstration, mirroring challenges seen in large language models but introducing unique complexities for LMMs facing distribution shifts. Our study addresses this by evaluating an unsupervised ICL method, TopKNearestPR, which selects in-context examples through a nearest example search based on feature similarity. We uncover that its effectiveness is limited by the deficiencies of pre-trained vision encoders under distribution shift scenarios. To address these challenges, we propose InvariantSelectPR, a novel method leveraging Class-conditioned Contrastive Invariance (CCI) for more robust demonstration selection. Specifically, CCI enhances pre-trained vision encoders by improving their discriminative capabilities across different classes and ensuring invariance to domain-specific variations. This enhancement allows the encoders to effectively identify and retrieve the most informative examples, which are then used to guide LMMs in adapting to new query samples under varying distributions. Our experiments show that InvariantSelectPR substantially improves the adaptability of LMMs, achieving significant performance gains on benchmark datasets, with a 34.2%$\uparrow$ accuracy increase in 7-shot on Camelyon17 and 16.9%$\uparrow$ increase in 7-shot on HAM10000 compared to the baseline zero-shot performance.
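
The abstract describes TopKNearestPR only as a nearest example search based on feature similarity. As a rough illustration of that general idea (not the authors' implementation), the sketch below ranks a pool of feature vectors by cosine similarity to a query and returns the indices of the top k. The function names, the choice of cosine similarity, and the toy vectors are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_nearest(query, pool, k):
    """Return indices of the k pool vectors most similar to the query.

    Hypothetical helper: the similarity measure used in the paper is not
    stated in the abstract; cosine similarity is assumed here.
    """
    scored = [(cosine_similarity(query, feat), idx) for idx, feat in enumerate(pool)]
    # Sort by similarity only, highest first; ties keep pool order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy feature vectors standing in for vision-encoder embeddings.
pool = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.1]
selected = top_k_nearest(query, pool, k=2)
print(selected)
```

In a retrieval-based ICL setup, the selected indices would point at the labelled examples placed in the prompt as demonstrations; the quality of the selection depends entirely on how well the encoder's features separate classes, which is the limitation the abstract attributes to pre-trained vision encoders under distribution shift.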

Submitted 20 May, 2024; originally announced May 2024.

Comments: 17 pages, 7 figures, 7 tables

arXiv:2405.11769 [pdf, other]

Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction

Authors: Eric Alcaide, Zhifeng Gao, Guolin Ke, Yaqi Li, Linfeng Zhang, Hang Zheng, Gengmo Zhou

Abstract: In recent years, machine learning (ML) methods have emerged as promising alternatives for molecular docking, offering the potential for high accuracy without incurring prohibitive computational costs. However, recent studies have indicated that these ML models may overfit to quantitative metrics while neglecting the physical constraints inherent in the problem. In this work, we present Uni-Mol Docking V2, which demonstrates a remarkable improvement in performance, accurately predicting the binding poses of 77+% of ligands in the PoseBusters benchmark with an RMSD value of less than 2.0 Å, and 75+% passing all quality checks. This represents a significant increase from the 62% achieved by the previous Uni-Mol Docking model. Notably, our Uni-Mol Docking approach generates chemically accurate predictions, circumventing issues such as chirality inversions and steric clashes that have plagued previous ML models. Furthermore, we observe enhanced performance in terms of high-quality predictions (RMSD values of less than 1.0 Å and 1.5 Å) and physical soundness when Uni-Mol Docking is combined with more physics-based methods like Uni-Dock. Our results represent a significant advancement in the application of artificial intelligence for scientific research, adopting a holistic approach to ligand docking that is well-suited for industrial applications in virtual screening and drug design. The code, data and service for Uni-Mol Docking are publicly available for use and further development at https://github.com/dptech-corp/Uni-Mol.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.09776 [pdf]

doi 10.1103/PhysRevB.109.184106

Magnetic structure and magnetoelectric coupling in antiferromagnet Co5(TeO3)4Cl2

Authors: B. Yu, L. Huang, J. S. Li, L. Lin, V. Ovidiu Garlea, Q. Zhang, T. Zou, J. C. Zhang, J. Peng, Y. S. Tang, G. Z. Zhou, J. H. Zhang, S. H. Zheng, M. F. Liu, Z. B. Yan, X. H. Zhou, S. Dong, J. G. Wan, J. -M. Liu

Abstract: The van der Waals (vdW) layered multiferroics, which host simultaneous ferroelectric and magnetic orders, have attracted attention not only for their potential use in nanoelectric devices and spintronics, but also for the alternative opportunities they offer for emergent physical phenomena. To date, vdW layered multiferroic materials are still very rare. In this work, we have investigated the magnetic structure and magnetoelectric (ME) effects in Co5(TeO3)4Cl2, a promising new multiferroic compound with antiferromagnetic (AFM) Néel point TN = 18 K. Neutron powder diffraction reveals a non-coplanar AFM state with the preferred Néel vector along the c-axis, while a spin re-orientation occurring between 8 K and 15 K is identified, which results from the distinct temperature dependence of the moments of the non-equivalent Co sites in Co5(TeO3)4Cl2. Moreover, Co5(TeO3)4Cl2 is found to be one of the best vdW multiferroics studied so far in terms of multiferroic performance. The measured linear ME coefficient oscillates with the angle between the magnetic field and the electric field, and its maximal value is as large as 45 ps/m. It is suggested that Co5(TeO3)4Cl2 is a valuable platform for exploring emergent multiferroicity in vdW layered compounds.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 31 pages, 9 figures

Journal ref: Phys. Rev. B 109, 184106 (2024)

arXiv:2405.08423 [pdf, other]

NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

Authors: Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

Abstract: Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies focused on increasing the number and size of feature maps and introducing complex and computationally intensive structures, resulting in models with high computational complexity. Here, we propose a simple yet efficient stereo image SR model called NAFRSSR, which is modified from the previous state-of-the-art model NAFSSR by introducing recursive connections and lightweighting the constituent modules. Our NAFRSSR model is composed of nonlinear activation free and group convolution-based blocks (NAFGCBlocks) and depth-separated stereo cross attention modules (DSSCAMs). The NAFGCBlock improves feature extraction and reduces the number of parameters by removing the simple channel attention mechanism from NAFBlock and using group convolution. The DSSCAM enhances feature fusion and reduces the number of parameters by replacing the 1x1 pointwise convolution in SCAM with a weight-shared 3x3 depthwise convolution. In addition, we propose to incorporate a trainable edge detection operator into NAFRSSR to further improve the model performance. Four variants of NAFRSSR with different sizes, namely, NAFRSSR-Mobile (NAFRSSR-M), NAFRSSR-Tiny (NAFRSSR-T), NAFRSSR-Super (NAFRSSR-S) and NAFRSSR-Base (NAFRSSR-B) are designed, and they all exhibit fewer parameters, higher PSNR/SSIM, and faster speed than the previous state-of-the-art models. In particular, to the best of our knowledge, NAFRSSR-M is the lightest (0.28M parameters) and fastest (50 ms inference time) model achieving an average PSNR/SSIM as high as 24.657 dB/0.7622 on the benchmark datasets. Codes and models will be released at https://github.com/JNUChenYiHong/NAFRSSR.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.07367 [pdf, other]

TOI-2447 b / NGTS-29 b: a 69-day Saturn around a Solar analogue

Authors: Samuel Gill, Daniel Bayliss, Solène Ulmer-Moll, Peter J. Wheatley, Rafael Brahm, David R. Anderson, David Armstrong, Ioannis Apergis, Douglas R. Alves, Matthew R. Burleigh, R. P. Butler, François Bouchy, Matthew P. Battley, Edward M. Bryant, Allyson Bieryla, Jeffrey D. Crane, Karen A. Collins, Sarah L. Casewell, Ilaria Carleo, Alastair B. Claringbold, Paul A. Dalba, Diana Dragomir, Philipp Eigmüller, Jan Eberhardt, Michael Fausnaugh , et al. (41 additional authors not shown)

Abstract: Discovering transiting exoplanets with relatively long orbital periods ($>$10 days) is crucial to facilitate the study of cool exoplanet atmospheres ($T_{\rm eq} < 700$ K) and to understand exoplanet formation and inward migration further out than typical transiting exoplanets. In order to discover these longer period transiting exoplanets, long-term photometric and radial velocity campaigns are required. We report the discovery of TOI-2447 b ($=$ NGTS-29b), a Saturn-mass transiting exoplanet orbiting a bright (T=10.0) Solar-type star (T$_{\rm eff}$=5730 K). TOI-2447 b was identified as a transiting exoplanet candidate from a single transit event of 1.3% depth and 7.29 h duration in $TESS$ Sector 31 and a prior transit event from 2017 in NGTS data. Four further transit events were observed with NGTS photometry, which revealed an orbital period of P=69.34 days. The transit events establish a radius for TOI-2447 b of $0.865 \pm 0.010\rm R_{\rm J}$, while radial velocity measurements give a mass of $0.386 \pm 0.025 \rm M_{\rm J}$. The equilibrium temperature of the planet is $414$ K, making it much cooler than the majority of $TESS$ planet discoveries. We also detect a transit signal in NGTS data not caused by TOI-2447 b, along with transit timing variations and evidence for a $\sim$150 day signal in radial velocity measurements. It is likely that the system hosts additional planets, but further photometry and radial velocity campaigns will be needed to determine their parameters with confidence. TOI-2447 b/NGTS-29b joins a small but growing population of cool giants that will provide crucial insights into giant planet composition and formation mechanisms.

Submitted 12 May, 2024; originally announced May 2024.

Comments: 16 pages, 12 figures. Accepted for publication in MNRAS

arXiv:2405.06524 [pdf, other]

Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts

Authors: Wenyu Huang, Guancheng Zhou, Mirella Lapata, Pavlos Vougiouklis, Sebastien Montella, Jeff Z. Pan

Abstract: Although Large Language Models (LLMs) are effective in performing various NLP tasks, they still struggle to handle tasks that require extensive, real-world knowledge, especially when dealing with long-tail facts (facts related to long-tail entities). This limitation highlights the need to supplement LLMs with non-parametric knowledge. To address this issue, we analysed the effects of different types of non-parametric knowledge, including textual passages and knowledge graphs (KGs). Since LLMs have probably seen the majority of factual question-answering datasets already, to facilitate our analysis, we proposed a fully automatic pipeline for creating a benchmark that requires knowledge of long-tail facts for answering the involved questions. Using this pipeline, we introduce the LTGen benchmark. We evaluate state-of-the-art LLMs in different knowledge settings using the proposed benchmark. Our experiments show that LLMs alone struggle with answering these questions, especially when the long-tail level is high or rich knowledge is required. Nonetheless, the performance of the same models improved significantly when they were prompted with non-parametric knowledge. We observed that, in most cases, prompting LLMs with KG triples surpasses passage-based prompting using a state-of-the-art retriever. In addition, while prompting LLMs with both KG triples and documents does not consistently improve knowledge coverage, it can dramatically reduce hallucinations in the generated content.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.05957 [pdf, other]

OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning

Authors: Dan Qiao, Yi Su, Pinzheng Wang, **g Ye, Wen**g Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, **fu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang

Abstract: Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived from multi-stage compression and continual pre-training from the original 15B OpenBA model. OpenBA-V2 utilizes more data, more flexible training objectives, and techniques such as layer pruning, neural pruning, and vocabulary pruning to achieve a compression rate of 77.3% with minimal performance loss. OpenBA-V2 demonstrates competitive performance compared to other open-source models of similar size, achieving results close to or on par with the 15B OpenBA model in downstream tasks such as common sense reasoning and Named Entity Recognition (NER). OpenBA-V2 illustrates that LLMs can be compressed into smaller ones with minimal performance loss by employing advanced training objectives and data strategies, which may help deploy LLMs in resource-limited scenarios.
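
The quoted 77.3% figure is consistent with the two model sizes given in the abstract, assuming the compression rate is defined as the fraction of parameters removed (the abstract does not state the definition, so this is an assumption):

```latex
\text{compression rate} \;=\; 1 - \frac{N_{\text{compressed}}}{N_{\text{original}}}
\;=\; 1 - \frac{3.4\,\mathrm{B}}{15\,\mathrm{B}}
\;\approx\; 1 - 0.2267 \;=\; 0.7733 \;\approx\; 77.3\%
```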

Submitted 9 May, 2024; originally announced May 2024.

arXiv:2405.04840 [pdf, other]

Federated Adaptation for Foundation Model-based Recommendations

Authors: Chunxu Zhang, Guodong Long, Hongkuan Guo, Xiao Fang, Yang Song, Zhaojie Liu, Guorui Zhou, Zijian Zhang, Yang Liu, Bo Yang

Abstract: With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models for recommendations has become a new paradigm to improve existing recommendation systems. It remains an open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while preserving privacy. This paper proposes a novel federated adaptation mechanism to enhance the foundation model-based recommendation system in a privacy-preserving manner. Specifically, each client will learn a lightweight personalized adapter using its private data. The adapter then collaborates with pre-trained foundation models to provide recommendation service efficiently in a fine-grained manner. Importantly, users' private behavioral data remains secure as it is not shared with the server. This data localization-based privacy preservation is embodied via the federated learning framework. The model can ensure that shared knowledge is incorporated into all adapters while simultaneously preserving each user's personal preferences. Experimental results on four benchmark datasets demonstrate our method's superior performance. Implementation code is available to ease reproducibility.

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted as a regular paper of IJCAI'24

arXiv:2405.03727 [pdf, other]

Large Language Models Synergize with Automated Machine Learning

Authors: **glue Xu, Jialong Li, Zhen Liu, Nagar Anthel Venkatesh Suryanarayanan, Guoyuan Zhou, Jia Guo, Hitoshi Iba, Kenji Tei

Abstract: Recently, program synthesis driven by large language models (LLMs) has become increasingly popular. However, program synthesis for machine learning (ML) tasks still poses significant challenges. This paper explores a novel form of program synthesis, targeting ML programs, by combining LLMs and automated machine learning (autoML). Specifically, our goal is to fully automate the generation and optimization of the code of the entire ML workflow, from data preparation to modeling and post-processing, utilizing only textual descriptions of the ML tasks. To manage the length and diversity of ML programs, we propose to break each ML program into smaller, manageable parts. Each part is generated separately by the LLM, with careful consideration of their compatibilities. To ensure compatibilities, we design a testing technique for ML programs. Unlike traditional program synthesis, which typically relies on binary evaluations (i.e., correct or incorrect), evaluating ML programs necessitates more than just binary judgments. Therefore, we further assess ML programs numerically and select the optimal programs from a range of candidates using autoML methods. In experiments across various ML tasks, our method outperforms existing methods in 10 out of 12 tasks for generating ML programs. In addition, autoML significantly improves the performance of the generated ML programs. In experiments, given the textual task description, our method, Text-to-ML, generates the complete and optimized ML program in a fully autonomous process.

Submitted 11 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.02880 [pdf, other]

Blending Distributed NeRFs with Tri-stage Robust Pose Optimization

Authors: Baijun Ye, Caiyun Liu, Xiaoyu Ye, Yuantao Chen, Yuhai Wang, Zike Yan, Yongliang Shi, Hao Zhao, Guyue Zhou

Abstract: Due to the limited model capacity, leveraging distributed Neural Radiance Fields (NeRFs) for modeling extensive urban environments has become a necessity. However, current distributed NeRF registration approaches encounter aliasing artifacts, arising from discrepancies in rendering resolutions and suboptimal pose precision. These factors collectively deteriorate the fidelity of pose estimation within NeRF frameworks, resulting in occlusion artifacts during the NeRF blending stage. In this paper, we present a distributed NeRF system with tri-stage pose optimization. In the first stage, precise poses of images are achieved by bundle adjusting Mip-NeRF 360 with a coarse-to-fine strategy. In the second stage, we invert Mip-NeRF 360, coupled with the truncated dynamic low-pass filter, to obtain robust and precise poses, termed Frame2Model optimization. On top of this, we obtain a coarse transformation between NeRFs in different coordinate systems. In the third stage, we fine-tune the transformation between NeRFs by Model2Model pose optimization. After obtaining precise transformation parameters, we proceed to implement NeRF blending, showcasing superior performance metrics in both real-world and simulation scenarios. Codes and data will be publicly available at https://github.com/boilcy/Distributed-NeRF.

Submitted 5 May, 2024; originally announced May 2024.

arXiv:2404.18192 [pdf, other]

Block-Map-Based Localization in Large-Scale Environment

Authors: Yixiao Feng, Zhou Jiang, Yongliang Shi, Yunlong Feng, Xiangyu Chen, Hao Zhao, Guyue Zhou

Abstract: Accurate localization is an essential technology for the flexible navigation of robots in large-scale environments. Both SLAM-based and map-based localization increase the computing load as the map size grows, which affects downstream tasks such as robot navigation and services. To this end, we propose a localization system based on Block Maps (BMs) to reduce the computational load caused by maintaining large-scale maps. Firstly, we introduce a method for generating block maps and the corresponding switching strategies, ensuring that the robot can estimate the state in large-scale environments by loading local map information. Secondly, global localization according to Branch-and-Bound Search (BBS) in the 3D map is introduced to provide the initial pose. Finally, a graph-based optimization method is adopted with a dynamic sliding window that determines which factors are marginalized depending on whether the robot stays within a BM or is switching to another one, which maintains the accuracy and efficiency of pose tracking. Comparison experiments are performed on publicly available large-scale datasets. Results show that the proposed method can track the robot pose even when the map scale exceeds 6 kilometers, while efficient and accurate localization is still guaranteed on NCLT and M2DGR.

Submitted 28 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures, 4 tables, published to ICRA 2024

arXiv:2404.16831 [pdf, other]

The Third Monocular Depth Estimation Challenge

Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, Yi** Bao, Xiao Liu, Dohyeong Kim, **seong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, **qiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora , et al. (16 additional authors not shown)

Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 submissions outperforming the baseline on the test set; 10 of them submitted a report describing their approach, highlighting the widespread use of foundational models such as Depth Anything at the core of their methods. The challenge winners drastically improved 3D F-Score performance, from 17.51% to 23.72%.

Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: To appear in CVPRW2024

arXiv:2404.15807 [pdf, other]

One Subgraph for All: Efficient Reasoning on Opening Subgraphs for Inductive Knowledge Graph Completion

Authors: Zhiwen Xie, Yi Zhang, Guangyou Zhou, ** Liu, Xinhui Tu, Jimmy Xiangji Huang

Abstract: Knowledge Graph Completion (KGC) has garnered massive research interest recently, and most existing methods are designed following a transductive setting where all entities are observed during training. Despite the great progress on transductive KGC, these methods struggle to conduct reasoning on emerging KGs involving unseen entities. Thus, inductive KGC, which aims to deduce missing links among unseen entities, has become a new trend. Many existing studies recast inductive KGC as a graph classification problem by extracting enclosing subgraphs surrounding each candidate triple. Unfortunately, they still face certain challenges, such as the high time cost caused by the repeated extraction of enclosing subgraphs, and the deficiency of entity-independent feature learning. To address these issues, we propose a global-local anchor representation (GLAR) learning method for inductive KGC. Unlike previous methods that utilize enclosing subgraphs, we extract a shared opening subgraph for all candidates and perform reasoning on it, enabling the model to perform reasoning more efficiently. Moreover, we design some transferable global and local anchors to learn rich entity-independent features for emerging entities. Finally, a global-local graph reasoning model is applied on the opening subgraph to rank all candidates. Extensive experiments show that our GLAR outperforms most existing state-of-the-art methods.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.13946 [pdf, other]

Dual Model Replacement: invisible Multi-target Backdoor Attack based on Federal Learning

Authors: Rong Wang, Guichen Zhou, Mingjun Gao, Yunpeng Xiao

Abstract: In recent years, the neural network backdoor hidden in the parameters of the federated learning model has been shown to pose serious security risks. Considering the characteristics of trigger generation, data poisoning and model training in backdoor attacks, this paper designs a backdoor attack method based on federated learning. Firstly, aiming at the concealment of the backdoor trigger, a TrojanGan steganography model with an encoder-decoder structure is designed. The model can encode specific attack information as invisible noise and attach it to the image as a backdoor trigger, which improves the concealment and data transformations of the backdoor trigger. Secondly, aiming at the problem of a single backdoor trigger mode, an image poisoning attack method called combination trigger attack is proposed. This method realizes multi-backdoor triggering by multiplexing combined triggers and improves the robustness of backdoor attacks. Finally, aiming at the problem that the local training mechanism decreases the success rate of the backdoor attack, a dual model replacement backdoor attack algorithm based on federated learning is designed. This method can improve the success rate of the backdoor attack while maintaining the performance of the federated learning aggregation model. Experiments show that the attack strategy in this paper can not only achieve high backdoor concealment and diversification of trigger forms under federated learning, but also achieve a good attack success rate in multi-target attacks.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.13425 [pdf, other]

AdvLoRA: Adversarial Low-Rank Adaptation of Vision-Language Models

Authors: Yuheng Ji, Yue Liu, Zhicheng Zhang, Zhao Zhang, Yuting Zhao, Gang Zhou, Xingwei Zhang, Xinwang Liu, Xiaolong Zheng

Abstract: Vision-Language Models (VLMs) are a significant technique for Artificial General Intelligence (AGI). With the fast growth of AGI, security has become one of the most important challenges for VLMs. In this paper, through extensive experiments, we demonstrate the vulnerability of the conventional adaptation methods for VLMs, which may bring significant security risks. In addition, as the size of the VLMs increases, performing conventional adversarial adaptation techniques on VLMs results in high computational costs. To solve these problems, we propose a parameter-efficient Adversarial adaptation method named AdvLoRA by Low-Rank Adaptation. First, we investigate and reveal the intrinsic low-rank property during the adversarial adaptation for VLMs. Different from LoRA, we improve the efficiency and robustness of adversarial adaptation by designing a novel reparameterizing method based on parameter clustering and parameter alignment. In addition, an adaptive parameter update strategy is proposed to further improve the robustness. With these settings, our proposed AdvLoRA alleviates the model security and high resource waste problems. Extensive experiments demonstrate the effectiveness and efficiency of AdvLoRA.

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.10165 [pdf, ps, other]

Algebraic Morse theory via Homological Perturbation Lemma with two applications

Authors: Jun Chen, Yuming Liu, Guodong Zhou

Abstract: As a generalization of the classical killing-contractible-complexes lemma, we present algebraic Morse theory via homological perturbation lemma, in a form more general than existing presentations in the literature. Two-sided Anick resolutions due to E. Sköldberg are generalised to algebras given by quivers with relations and a minimality criterion is provided as well. Two applications of algebraic Morse theory are presented. It is shown that the Chinese algebra of rank $n\geq 1$ is homologically smooth and of global dimension $\frac{n(n+1)}{2}$, and the minimal two-sided projective resolution of a Koszul algebra is constructed.

Submitted 15 April, 2024; originally announced April 2024.

MSC Class: 18G35; 16E05; 16E10

arXiv:2404.09514 [pdf, ps, other]

Eruption of a million-Kelvin warm magnetic flux rope on the Sun

Authors: Le** Li, Hongqiang Song, Hardi Peter, Lakshmi Pradeep Chitta, Xin Cheng, Zhentong Li, Gui** Zhou

Abstract: Solar magnetic flux rope (MFR) plays a central role in the physics of coronal mass ejections (CMEs). It mainly includes a cold filament at typical chromospheric temperatures (10000 K) and a hot channel at high coronal temperatures (10 MK). The warm MFR at quiescent coronal temperatures of a million Kelvin is, however, rarely reported. In this study, using multiwavelength images from the Atmospheric Imaging Assembly (AIA) on board the Solar Dynamics Observatory (SDO) and the Extreme Ultraviolet Imager (EUVI) on board the Solar Terrestrial Relations Observatory-A (STEREO-A), we present an eruption of a warm channel that represents an MFR with quiescent coronal temperatures (0.6-2.5 MK). On 2022 May 8, we observed the failed eruption of a hot channel, with an average temperature and emission measure (EM) of 10 MK and $1.1\times10^{28}$ cm$^{-5}$, using AIA high-temperature images in active region (AR) 13007. This failed eruption was associated with a C8.2 flare, with no CME. Subsequently, we observed a warm channel that appeared in AIA and EUVI low-temperature images, rather than AIA high-temperature images. It then erupted, and transformed toward a semi-circular shape. An associated C2.1 flare, along with the signatures of magnetic reconnection in AIA high-temperature images, was identified. Additionally, we observed a CME associated with this event. Compared with the hot channel, the warm channel is cooler and rarer, with an average temperature and EM of 1.7 (1.6) MK and $2.0\times10^{26}$ ($2.3\times10^{26}$) cm$^{-5}$. All the results suggest an unambiguous observation of the million-Kelvin warm MFR that erupted as a CME, and fill a gap in the temperature domain of coronal MFRs.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 18 pages, 7 figures, 2 tables, accepted for publication in ApJ

arXiv:2404.08446 [pdf]

Growth of two-inch free-standing heteroepitaxial diamond on Ir/YSZ/Si (001) substrates via laser-patterned templates

Authors: Pengfei Qu, Peng **, Guangdi Zhou, Zhen Wang, Zhanguo Wang

Abstract: In this paper, 2-inch free-standing diamonds were prepared by using heteroepitaxy on composite Ir/YSZ/Si (001) substrates. To release stress, patterned templates were fabricated using laser etching after the initial growth of 50-nm diamond. Then, the subsequent growth was completed on a patterned template. The full widths at half maximum of the diamond (400) and (311) X-ray rocking curves were 313.5 and 359.3 arcsec, respectively. Strong band-edge emission in the cathodoluminescence spectrum of the resulting diamond revealed excellent crystalline quality. Furthermore, 2D mapping of Raman spectra was conducted on a $2\,\mathrm{mm} \times 2\,\mathrm{mm}$ area located at the center of the 2-inch sample with a thickness of $400\,\mathrm{\mu m}$. The result showed an average peak width of $2.85 \pm 0.36\,\mathrm{cm}^{-1}$ and residual stress of $-0.03 \pm 0.37\,\mathrm{GPa}$. The dislocation density, determined by counting etching pits generated from $\mathrm{H_2/O_2}$ plasma etching, was estimated to be around $2.2 \times 10^7\,\mathrm{cm}^{-2}$. These results show that the laser-patterned method can effectively release stress during the growth of large-size diamonds, offering a simpler and more cost-effective alternative to the traditional photolithography-patterned scheme.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 13 pages, 5 figures

arXiv:2404.08171 [pdf, ps, other]

The Rank-1 Completion Problem for Cubic Tensors

Authors: **ling Zhou, Jiawang Nie, Zheng Peng, Guangming Zhou

Abstract: This paper studies the rank-$1$ tensor completion problem for cubic order tensors. First of all, we show that this problem is equivalent to a special rank-$1$ matrix recovery problem. We propose both nuclear norm relaxation and moment relaxation methods for solving the resulting rank-$1$ matrix recovery problem. The nuclear norm relaxation sometimes gets a rank-$1$ tensor completion, while sometimes it does not. When it fails, we apply the moment hierarchy of semidefinite programming relaxations to solve the rank-$1$ matrix recovery problem. The moment hierarchy can always get a rank-$1$ tensor completion, or detect its nonexistence. In particular, when the tensor is strongly rank-$1$ completable, we show that the problem is equivalent to a rank-$1$ matrix completion problem and it can be solved by an iterative formula. Therefore, much larger problems can be solved efficiently for strongly rank-$1$ completable tensors. Numerical experiments demonstrate the efficiency of the proposed methods.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 23 pages

arXiv:2404.06078 [pdf, other]

End-to-end training of Multimodal Model and ranking Model

Authors: Xiuqi Deng, Lu Xu, Xiyao Li, **kai Yu, Erpeng Xue, Zhongyuan Wang, Di Zhang, Zhaojie Liu, Guorui Zhou, Yang Song, Na Mou, Shen Jiang, Han Li

Abstract: Traditional recommender systems heavily rely on ID features, which often encounter challenges related to cold-start and generalization. Modeling pre-extracted content features can mitigate these issues, but is still a suboptimal solution due to the discrepancies between training tasks and model parameters. End-to-end training presents a promising solution for these problems, yet most of the existing works mainly focus on retrieval models, leaving the multimodal techniques under-utilized. In this paper, we propose an industrial multimodal recommendation framework named EM3: End-to-end training of Multimodal Model and ranking Model, which sufficiently utilizes multimodal information and allows personalized ranking tasks to directly train the core modules in the multimodal model to obtain more task-oriented content features, without overburdening resource consumption. First, we propose Fusion-Q-Former, which consists of transformers and a set of trainable queries, to fuse different modalities and generate fixed-length and robust multimodal embeddings. Second, in our sequential modeling for user content interest, we utilize the Low-Rank Adaptation technique to alleviate the conflict between huge resource consumption and long sequence length. Third, we propose a novel Content-ID-Contrastive learning task to complement the advantages of content and ID by aligning them with each other, obtaining more task-oriented content embeddings and more generalized ID embeddings. In experiments, we implement EM3 on different ranking models in two scenarios, achieving significant improvements in both offline evaluation and online A/B tests, verifying the generalizability of our method. Ablation studies and visualization are also performed. Furthermore, we conduct experiments on two public datasets to show that our proposed method outperforms the state-of-the-art methods.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 9 pages, 8 figures

arXiv:2404.04579 [pdf, other]

TeleAware Robot: Designing Awareness-augmented Telepresence Robot for Remote Collaborative Locomotion

Authors: Ruyi Li, Yaxin Zhu, Min Liu, Yihang Zeng, Shanning Zhuang, Jiayi Fu, Yi Lu, Guyue Zhou, Can Liu, Jiangtao Gong

Abstract: Telepresence robots can be used to support users to navigate an environment remotely and share the visiting experience with their social partners. Although such systems allow users to see and hear the remote environment and communicate with their partners via live video feed, this does not provide enough awareness of the environment and their remote partner's activities. In this paper, we introduce an awareness framework for collaborative locomotion in scenarios of onsite and remote users visiting a place together. From an observational study of small groups of people visiting exhibitions, we derived four design goals for enhancing the environmental and social awareness between social partners, and developed a set of awareness-enhancing techniques to add to a standard telepresence robot - named TeleAware robot. Through a controlled experiment simulating a guided exhibition visiting task, TeleAware robot showed the ability to lower the workload, facilitate closer social proximity, and improve mutual awareness and social presence compared with the standard one. We discuss the impact of mobility and roles of local and remote users, and provide insights for the future design of awareness-enhancing telepresence robot systems that facilitate collaborative locomotion.

Submitted 6 April, 2024; originally announced April 2024.

Comments: 33 pages, 12 figures

MSC Class: H.5.2

Journal ref: IMUWT 2024

arXiv:2404.04167 [pdf, other]

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

Authors: Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng, Xinchen Luo, Guorui Zhou, Binhang Yuan, Wenhu Chen, Jie Fu, Ge Zhang

Abstract: In this study, we introduce CT-LLM, a 2B large language model (LLM) that illustrates a pivotal shift towards prioritizing the Chinese language in developing LLMs. Uniquely initiated from scratch, CT-LLM diverges from the conventional methodology by primarily incorporating Chinese textual data, utilizing an extensive corpus of 1,200 billion tokens, including 800 billion Chinese tokens, 300 billion English tokens, and 100 billion code tokens. This strategic composition facilitates the model's exceptional proficiency in understanding and processing Chinese, a capability further enhanced through alignment techniques. Demonstrating remarkable performance on the CHC-Bench, CT-LLM excels in Chinese language tasks, and showcases its adeptness in English through SFT. This research challenges the prevailing paradigm of training LLMs predominantly on English corpora and then adapting them to other languages, broadening the horizons for LLM training methodologies. By open-sourcing the full process of training a Chinese LLM, including a detailed data processing procedure with the obtained Massive Appropriate Pretraining Chinese Corpus (MAP-CC), a well-chosen multidisciplinary Chinese Hard Case Benchmark (CHC-Bench), and the 2B-size Chinese Tiny LLM (CT-LLM), we aim to foster further exploration and innovation in both academia and industry, paving the way for more inclusive and versatile language models.

Submitted 9 April, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

arXiv:2404.03634 [pdf, other]

PreAfford: Universal Affordance-Based Pre-Grasping for Diverse Objects and Environments

Authors: Kairui Ding, Boyuan Chen, Ruihai Wu, Yuyang Li, Zongzheng Zhang, Huan-ang Gao, Siqi Li, Guyue Zhou, Yixin Zhu, Hao Dong, Hao Zhao

Abstract: Robotic manipulation with two-finger grippers is challenged by objects lacking distinct graspable features. Traditional pre-grasping methods, which typically involve repositioning objects or utilizing external aids like table edges, are limited in their adaptability across different object categories and environments. To overcome these limitations, we introduce PreAfford, a novel pre-grasping planning framework that incorporates a point-level affordance representation and a relay training approach. Our method significantly improves adaptability, allowing effective manipulation across a wide range of environments and object types. When evaluated on the ShapeNet-v2 dataset, PreAfford not only enhances grasping success rates by 69% but also demonstrates its practicality through successful real-world experiments. These improvements highlight PreAfford's potential to redefine standards for robotic handling of complex manipulation tasks in diverse settings.

Submitted 4 July, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: Project Page: https://air-discover.github.io/PreAfford/

arXiv:2404.00366 [pdf]

doi 10.1016/j.oceaneng.2024.117741

Efficient Multi-branch Segmentation Network for Situation Awareness in Autonomous Navigation

Authors: Guan-Cheng Zhou, Chen Cheng, Yan-zhou Chen

Abstract: Real-time and high-precision situational awareness technology is critical for autonomous navigation of unmanned surface vehicles (USVs). In particular, robust and fast obstacle semantic segmentation methods are essential. However, distinguishing between the sea and the sky is challenging due to the differences between port and maritime environments. In this study, we built a dataset that captured perspectives from USVs and unmanned aerial vehicles in a maritime port environment and analysed the data features. Statistical analysis revealed a high correlation between the distribution of the sea and sky and row positional information. Based on this finding, a three-branch semantic segmentation network with a row position encoding module (RPEM) was proposed to improve the prediction accuracy between the sea and the sky. The proposed RPEM highlights the effect of row coordinates on feature extraction. Compared to the baseline, the three-branch network with RPEM significantly improved the ability to distinguish between the sea and the sky without significantly reducing the computational speed.

Submitted 30 March, 2024; originally announced April 2024.

Journal ref: Ocean Engineering 302 (2024) 117741

arXiv:2403.18170 [pdf, ps, other]

Formal deformations, cohomology theory and $L_\infty[1]$-structures for differential Lie algebras of arbitrary weight

Authors: Weiguo Lyu, Zihao Qi, Jian Yang, Guodong Zhou

Abstract: Generalising a previous work of Jiang and Sheng, a cohomology theory for differential Lie algebras of arbitrary weight is introduced. The underlying $L_\infty[1]$-structure on the cochain complex is also determined via a generalised version of higher derived brackets. The equivalence between $L_\infty[1]$-structures for absolute and relative differential Lie algebras is established. Formal deformations and abelian extensions are interpreted by using lower degree cohomology groups. We also introduce homotopy differential Lie algebras. In a forthcoming paper, we will show that the operad of homotopy (relative) differential Lie algebras is the minimal model of the operad of (relative) differential Lie algebras.

Submitted 26 March, 2024; originally announced March 2024.

MSC Class: 16E40; 16S80; 12H05; 12H10; 16W25; 16S70

arXiv:2403.16535 [pdf, other]

Arm-Constrained Curriculum Learning for Loco-Manipulation of the Wheel-Legged Robot

Authors: Zifan Wang, Yufei Jia, Lu Shi, Haoyu Wang, Haizhou Zhao, Xueyang Li, **ni Zhou, Jun Ma, Guyue Zhou

Abstract: Incorporating a robotic manipulator into a wheel-legged robot enhances its agility and expands its potential for practical applications. However, the presence of potential instability and uncertainties presents additional challenges for control objectives. In this paper, we introduce an arm-constrained curriculum learning architecture to tackle the issues introduced by adding the manipulator. Firstly, we develop an arm-constrained reinforcement learning algorithm to ensure safety and stability in control performance. Additionally, to address discrepancies in reward settings between the arm and the base, we propose a reward-aware curriculum learning method. The policy is first trained in Isaac Gym and then transferred to the physical robot to perform dynamic grasping tasks, including the door-opening task, the fan-twitching task, and the relay-baton-picking and following task. The results demonstrate that our proposed approach effectively controls the arm-equipped wheel-legged robot to master dynamic grasping skills, allowing it to chase and catch a moving object while in motion. Please refer to our website (https://acodedog.github.io/wheel-legged-loco-manipulation) for the code and supplemental videos.

Submitted 28 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.14674 [pdf]

Packaging Up Media Mix Modeling: An Introduction to Robyn's Open-Source Approach

Authors: Gufeng Zhou, Igor Skokan, Julian Runge

Abstract: While attribution of user behavior across apps and websites had led to unseen levels of determinism in digital advertising measurement, privacy-centric changes to the digital data landscape are bringing probabilistic techniques such as marketing and media mix modeling en vogue again. Many small and midsize advertisers lack the scale and resources to invest in advanced proprietary modeling efforts that would usually require specific expertise and a team of several data scientists. To facilitate broad successful adoption of media mix modeling for digital advertising measurement, marketing data scientists at Meta started the open-source computational package Robyn. This article presents architectural components and choices in Robyn and discusses how Robyn aims to be packaged against biases and for organizational acceptance. As an open-source package with wide adoption and a highly active community, Robyn undergoes continual development. In this vein, what is described in this article should not be seen as conclusive solutions but as an outline of pathways that the Robyn community has embarked on. The article aims to provide a structured introduction to these pathways as a basis for feedback from marketing data scientists, to ensure Robyn's ongoing development aligns with users' needs.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.12787 [pdf, other]

DDSB: An Unsupervised and Training-free Method for Phase Detection in Echocardiography

Authors: Zhenyu Bu, Yang Liu, Jiayu Huo, **g**g Peng, Kaini Wang, Guangquan Zhou, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin

Abstract: Accurate identification of End-Diastolic (ED) and End-Systolic (ES) frames is key for cardiac function assessment through echocardiography. However, traditional methods face several limitations: they require extensive amounts of data, extensive annotations by medical experts, significant training resources, and often lack robustness. Addressing these challenges, we propose an unsupervised and training-free method. Our approach leverages unsupervised segmentation to enhance fault tolerance against segmentation inaccuracies. By identifying anchor points and analyzing directional deformation, we effectively reduce dependence on the accuracy of initial segmentation images and enhance fault tolerance, all while improving robustness. Tested on Echo-dynamic and CAMUS datasets, our method achieves comparable accuracy to learning-based models without their associated drawbacks. The code is available at https://github.com/MRUIL/DDSB

Submitted 19 March, 2024; originally announced March 2024.

arXiv:2403.12386 [pdf]

Pipelined Biomedical Event Extraction Rivaling Joint Learning

Authors: Pengchao Wu, Xuefeng Li, **ghang Gu, Longhua Qian, Guodong Zhou

Abstract: Biomedical event extraction is an information extraction task to obtain events from biomedical text, whose targets include the type, the trigger, and the respective arguments involved in an event. Traditional biomedical event extraction usually adopts a pipelined approach, which contains trigger identification, argument role recognition, and finally event construction either using specific rules or by machine learning. In this paper, we propose an n-ary relation extraction method based on the BERT pre-training model to construct Binding events, in order to capture the semantic information about an event's context and its participants. The experimental results show that our method achieves promising results on the GE11 and GE13 corpora of the BioNLP shared task with F1 scores of 63.14% and 59.40%, respectively. This demonstrates that, by significantly improving the performance on Binding events, the overall performance of the pipelined event extraction approach rivals or even exceeds that of current joint learning methods.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.11061 [pdf, other]

Beamforming Design for Double-Active-RIS-aided Communication Systems with Inter-Excitation

Authors: Boshi Wang, Cunhua Pan, Hong Ren, Zhiyuan Yu, Yang Zhang, Mengyu Liu, Gui Zhou

Abstract: In this paper, we investigate a double-active-reconfigurable intelligent surface (RIS)-aided downlink wireless communication system, where a multi-antenna base station (BS) serves multiple single-antenna users with both double reflection and single reflection links. Due to the signal amplification capability of active RISs, the mutual influence between active RISs, which is termed the "inter-excitation" effect, cannot be ignored. We therefore develop a feedback-type model to characterize the signal containing the inter-excitation effect. Based on the signal model, we formulate a weighted sum rate (WSR) maximization problem by jointly optimizing the beamforming matrix at the BS and the reflecting coefficient matrices at the two active RISs, subject to power constraints at the BS and active RISs, as well as the maximum amplification gain constraints of the active RISs. To solve this non-convex problem, we first transform the problem into a more tractable form using the fractional programming (FP) method. Then, by introducing auxiliary variables, the problem can be converted into an equivalent form that can be solved by using a low-complexity penalty dual decomposition (PDD) algorithm. Finally, simulation results indicate that it is crucial to consider the inter-excitation effect between active RISs in beamforming design for double-active-RIS-aided communication systems. Additionally, the proposed design outperforms benchmark schemes with a single active RIS and with double passive RISs in terms of achievable rate.

Submitted 16 March, 2024; originally announced March 2024.

arXiv:2403.10319 [pdf, other]

NetBench: A Large-Scale and Comprehensive Network Traffic Benchmark Dataset for Foundation Models

Authors: Chen Qian, Xiaochang Li, Qineng Wang, Gang Zhou, Huajie Shao

Abstract: In computer networking, network traffic refers to the amount of data transmitted in the form of packets between internetworked computers or Cyber-Physical Systems. Monitoring and analyzing network traffic is crucial for ensuring the performance, security, and reliability of a network. However, a significant challenge in network traffic analysis is to process diverse data packets including both ciphertext and plaintext. While many methods have been adopted to analyze network traffic, they often rely on different datasets for performance evaluation. This inconsistency results in substantial manual data processing efforts and unfair comparisons. Moreover, some data processing methods may cause data leakage due to improper separation of training and testing data. To address these issues, we introduce NetBench, a large-scale and comprehensive benchmark dataset for assessing machine learning models, especially foundation models, in both network traffic classification and generation tasks. NetBench is built upon seven publicly available datasets and encompasses a broad spectrum of 20 tasks, including 15 classification tasks and 5 generation tasks. Furthermore, we evaluate eight State-Of-The-Art (SOTA) classification models (including two foundation models) and two generative models using our benchmark. The results show that foundation models significantly outperform the traditional deep learning methods in traffic classification. We believe NetBench will facilitate fair comparisons among various approaches and advance the development of foundation models for network traffic. Our benchmark is available at https://github.com/WM-JayLab/NetBench.

Submitted 18 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.09330 [pdf, ps, other]

Radar Rainbow Beams For Wideband mmWave Communication: Beam Training And Tracking

Authors: Gui Zhou, Moritz Garkisch, Zhendong Peng, Cunhua Pan, Robert Schober

Abstract: We propose a novel integrated sensing and communication (ISAC) system that leverages sensing to assist communication, ensuring fast initial access, seamless user tracking, and uninterrupted communication for millimeter wave (mmWave) wideband systems. True-time-delayers (TTDs) are utilized to generate frequency-dependent radar rainbow beams by controlling the beam squint effect. These beams cover users across the entire angular space simultaneously for fast beam training using just one orthogonal frequency-division multiplexing (OFDM) symbol. Three detection and estimation schemes are proposed based on radar rainbow beams for estimation of the users' angles, distances, and velocities, which are then exploited for communication beamformer design. The first proposed scheme utilizes a single-antenna radar receiver and one set of rainbow beams, but may cause a Doppler ambiguity. To tackle this limitation, two additional schemes are introduced, utilizing two sets of rainbow beams and a multi-antenna receiver, respectively. Furthermore, the proposed detection and estimation schemes are extended to realize user tracking by choosing different subsets of OFDM subcarriers. This approach eliminates the need to switch phase shifters and TTDs, which are typically necessary in existing tracking technologies, thereby reducing the demands on the control circuitry. Simulation results reveal the effectiveness of the proposed rainbow beam-based training and tracking methods for mobile users. Notably, the scheme employing a multi-antenna radar receiver can accurately estimate the channel parameters and can support communication rates comparable to those achieved with perfect channel information.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 32 pages

arXiv:2403.09319 [pdf, ps, other]

Anomalous quantum scattering and transport of electrons with Mexican-hat dispersion induced by electrical potential

Authors: Jiating Yao, Benliang Zhou, Xiaoying Zhou, Xianbo Xiao, Guanghui Zhou

Abstract: We theoretically study the quantum scattering and transport of electrons with Mexican-hat dispersion through both step and rectangular potential barriers by using the transfer matrix method. Owing to the torus-like iso-energy lines of the Mexican-hat dispersion, we observe the presence of double reflections and double transmissions in both barrier scenarios, i.e., the normal reflection (NR), retro-reflection (RR), normal transmission (NT), and specular transmission (ST). For the step potential with electrons incident from the large wavevector, the transmission is primarily governed by NT with nearly negligible ST, while the reflection is dominated by RR (NR) within (outside) the critical angle. Additionally, for electrons incident from the small wavevector, the NT can be reduced to zero by adjusting the barrier, resulting in a significant enhancement of ST and RR. For the rectangular barrier, the transmission and reflection spectra resemble those of the step barrier, but there are two kinds of resonant tunneling which can lead to perfect NT or ST. There exists a negative differential conductance (NDC) effect in the conductance spectrum. The conductance and the peak-to-valley ratio of the NDC effect can be effectively controlled by adjusting the height and width of the barrier as well as the incident energy. Our results provide a deeper understanding of the electron states governed by the Mexican-hat dispersion.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 8 pages, 5 figures

arXiv:2403.07027 [pdf, ps, other]

FWin transformer for dengue prediction under climate and ocean influence

Authors: Nhat Thanh Tran, Jack Xin, Guofa Zhou

Abstract: Dengue fever is one of the deadliest mosquito-borne tropical infectious diseases. A detailed long-range forecast model is vital for controlling the spread of the disease and planning mitigation efforts. In this study, we examine methods used to forecast dengue cases for long-range predictions. The dataset consists of local climate and weather data for Singapore, together with global climate indicators, from 2000 to 2019. We utilize newly developed deep neural networks to learn the intricate relationship between the features. The baseline models in this study are in the class of recent transformers for long sequence forecasting tasks. We found that a Fourier mixed window attention (FWin) based transformer performed the best in terms of both the mean square error and the maximum absolute error on long-range dengue forecasts of up to 60 weeks.

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2403.06818 [pdf, other]

User Tracking and Direction Estimation Codebook Design for IRS-Assisted mmWave Communication

Authors: Moritz Garkisch, Sebastian Lotter, Gui Zhou, Vahid Jamali, Robert Schober

Abstract: Future communication systems are envisioned to employ intelligent reflecting surfaces (IRSs) and the millimeter wave (mmWave) frequency band to provide reliable high-rate services. For mobile users, the time-varying channel state information (CSI) requires adequate adjustment of the reflection pattern of the IRS. We propose a novel codebook-based user tracking (UT) algorithm for IRS-assisted mmWave communication, allowing suitable reconfiguration of the IRS unit cell phase shifts, resulting in a high reflection gain. The presented algorithm acquires the direction information of the user based on a peak likelihood-based direction estimation. Using the direction information, the user's trajectory is extrapolated to proactively update the adopted codeword and adjust the IRS phase shift configuration accordingly. Furthermore, we conduct a theoretical analysis of the direction estimation error and utilize the obtained insights to design a codebook specifically optimized for direction estimation. Our numerical results reveal a lower direction estimation error of the proposed UT algorithm when employing our designed codebook compared to codebooks from the literature. Furthermore, the average achieved signal-to-noise ratio (SNR) as well as the average effective rate of the proposed UT algorithm are analyzed. The proposed UT algorithm requires only a low overhead for direction and channel estimation and avoids outdated IRS phase shifts. Furthermore, it is shown to outperform two benchmark schemes based on direct phase shift optimization and hierarchical codebook search, respectively, via computer simulations.

Submitted 11 March, 2024; originally announced March 2024.

Showing 1–50 of 739 results for author: ZHou, G