Search | arXiv e-print repository

arXiv:2406.19043 [pdf]

CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Yajing Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, Jing Qin, Xiahai Zhuang, Claudia Prieto, et al. (7 additional authors not shown)

Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover high-quality, clinically interpretable images from undersampled measurements. However, the lack of publicly available cardiac MRI k-space dataset in terms of both quantity and diversity has severely hindered substantial technological progress, particularly for data-driven artificial intelligence. Here, we provide a standardized, diverse, and high-quality CMRxRecon2024 dataset to facilitate the technical development, fair evaluation, and clinical transfer of cardiac MRI reconstruction approaches, towards promoting the universal frameworks that enable fast and robust reconstructions across different cardiac MRI protocols in clinical practice. To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI workflows. Besides, an open platform with tutorials, benchmarks, and data processing tools is provided to facilitate data usage, advanced method development, and fair performance evaluation.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 19 pages, 3 figures, 2 tables

arXiv:2406.18280 [pdf, other]

Exploring quantum weight enumerators from the $n$-qubit parallelized SWAP test

Authors: Fei Shi, Kaiyi Guo, Xiande Zhang, Qi Zhao

Abstract: Quantum weight enumerators play a crucial role in quantum error-correcting codes and multipartite entanglement. They can be used to investigate the existence of quantum error-correcting codes and $k$-uniform states. In this work, we build the connection between quantum weight enumerators and the $n$-qubit parallelized SWAP test. We discover that each shadow enumerator corresponds precisely to a probability in the $n$-qubit parallelized SWAP test, providing a computable and operational meaning for the shadow enumerators. Due to the non-negativity of probabilities, we obtain an elegant proof for the shadow inequalities. Concurrently, we can also calculate the Shor-Laflamme enumerators and the Rains unitary enumerators from the $n$-qubit parallelized SWAP test. For applications, we employ the $n$-qubit parallelized SWAP test to determine the distances of quantum error-correcting codes, and the $k$-uniformity of pure states. Our results indicate that quantum weight enumerators can be efficiently estimated on quantum computers, and opening a path to calculate the distances of quantum error-correcting codes.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.18237 [pdf, other]

PlaMo: Plan and Move in Rich 3D Physical Environments

Authors: Assaf Hallak, Gal Dalal, Chen Tessler, Kelly Guo, Shie Mannor, Gal Chechik

Abstract: Controlling humanoids in complex physically simulated worlds is a long-standing challenge with numerous applications in gaming, simulation, and visual content creation. In our setup, given a rich and complex 3D scene, the user provides a list of instructions composed of target locations and locomotion types. To solve this task we present PlaMo, a scene-aware path planner and a robust physics-based controller. The path planner produces a sequence of motion paths, considering the various limitations the scene imposes on the motion, such as location, height, and speed. Complementing the planner, our control policy generates rich and realistic physical motion adhering to the plan. We demonstrate how the combination of both modules enables traversing complex landscapes in diverse forms while responding to real-time changes in the environment. Video: https://youtu.be/wWlqSQlRZ9M.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.16933 [pdf, other]

SGSM: A Foundation-model-like Semi-generalist Sensing Model

Authors: Tianjian Yang, Hao Zhou, Shuo Liu, Kaiwen Guo, Yiwen Hou, Haohua Du, Zhi Liu, Xiang-Yang Li

Abstract: The significance of intelligent sensing systems is growing in the realm of smart services. These systems extract relevant signal features and generate informative representations for particular tasks. However, building the feature extraction component for such systems requires extensive domain-specific expertise or data. The exceptionally rapid development of foundation models is likely to usher in newfound abilities in such intelligent sensing. We propose a new scheme for sensing model, which we refer to as semi-generalist sensing model (SGSM). SGSM is able to semiautomatically solve various tasks using relatively less task-specific labeled data compared to traditional systems. Built through the analysis of the common theoretical model, SGSM can depict different modalities, such as the acoustic and Wi-Fi signal. Experimental results on such two heterogeneous sensors illustrate that SGSM functions across a wide range of scenarios, thereby establishing its broad applicability. In some cases, SGSM even achieves better performance than sensor-specific specialized solutions. Wi-Fi evaluations indicate a 20% accuracy improvement when applying SGSM to an existing sensing model.

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.13565 [pdf, other]

Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization

Authors: Zijie Lou, Gang Cao, Kun Guo, Haochen Zhu, Lifang Yu

Abstract: Image forgery localization, which aims to segment tampered regions in an image, is a fundamental yet challenging digital forensic task. While some deep learning-based forensic methods have achieved impressive results, they directly learn pixel-to-label mappings without fully exploiting the relationship between pixels in the feature space. To address such deficiency, we propose a Multi-view Pixel-wise Contrastive algorithm (MPC) for image forgery localization. Specifically, we first pre-train the backbone network with the supervised contrastive loss to model pixel relationships from the perspectives of within-image, cross-scale and cross-modality. That is aimed at increasing intra-class compactness and inter-class separability. Then the localization head is fine-tuned using the cross-entropy loss, resulting in a better pixel localizer. The MPC is trained on three different scale training datasets to make a comprehensive and fair comparison with existing image forgery localization algorithms. Extensive experiments on the small, medium and large scale training datasets show that the proposed MPC achieves higher generalization performance and robustness against post-processing than the state-of-the-arts. Code will be available at https://github.com/multimediaFor/MPC.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.11937 [pdf, other]

Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter

Authors: M. Aamir, B. Acar, G. Adamov, T. Adams, C. Adloff, S. Afanasiev, C. Agrawal, C. Agrawal, A. Ahmad, H. A. Ahmed, S. Akbar, N. Akchurin, B. Akgul, B. Akgun, R. O. Akpinar, E. Aktas, A. AlKadhim, V. Alexakhin, J. Alimena, J. Alison, A. Alpana, W. Alshehri, P. Alvarez Dominguez, M. Alyari, C. Amendola, et al. (550 additional authors not shown)

Abstract: A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles readout by SiPMs and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and it makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated.

Submitted 30 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: Prepared for submission to JINST

arXiv:2405.10570 [pdf]

Simultaneous Deep Learning of Myocardium Segmentation and T2 Quantification for Acute Myocardial Infarction MRI

Authors: Yirong Zhou, Chengyan Wang, Mengtian Lu, Kunyuan Guo, Zi Wang, Dan Ruan, Rui Guo, Peijun Zhao, Jianhua Wang, Naiming Wu, Jianzhong Lin, Yinyin Chen, Hang Jin, Lianxin Xie, Lilan Wu, Liuhong Zhu, Jianjun Zhou, Congbo Cai, He Wang, Xiaobo Qu

Abstract: In cardiac Magnetic Resonance Imaging (MRI) analysis, simultaneous myocardial segmentation and T2 quantification are crucial for assessing myocardial pathologies. Existing methods often address these tasks separately, limiting their synergistic potential. To address this, we propose SQNet, a dual-task network integrating Transformer and Convolutional Neural Network (CNN) components. SQNet features a T2-refine fusion decoder for quantitative analysis, leveraging global features from the Transformer, and a segmentation decoder with multiple local region supervision for enhanced accuracy. A tight coupling module aligns and fuses CNN and Transformer branch features, enabling SQNet to focus on myocardium regions. Evaluation on healthy controls (HC) and acute myocardial infarction patients (AMI) demonstrates superior segmentation dice scores (89.3/89.2) compared to state-of-the-art methods (87.7/87.9). T2 quantification yields strong linear correlations (Pearson coefficients: 0.84/0.93) with label values for HC/AMI, indicating accurate mapping. Radiologist evaluations confirm SQNet's superior image quality scores (4.60/4.58 for segmentation, 4.32/4.42 for T2 quantification) over state-of-the-art methods (4.50/4.44 for segmentation, 3.59/4.37 for T2 quantification). SQNet thus offers accurate simultaneous segmentation and quantification, enhancing cardiac disease diagnosis, such as AMI.

Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: 10 pages, 8 figures, 6 tables

arXiv:2405.09280 [pdf, other]

1/3 and other magnetization plateaus in a quasi-one-dimensional Ising magnet $\mathbf{TbTi_3Bi_4}$ with zigzag spin chain

Authors: Kaizhen Guo, Zeyu Ma, Hongxiong Liu, Ziyang Wu, Junfeng Wang, Youguo Shi, Yuan Li, Shuang Jia

Abstract: We report the magnetic properties of newly synthesized, single crystals of $\mathrm{TbTi_3Bi_4}$ whose crystal structure is highlighted by the stacking of terbium-based zigzag chains and titanium-based kagome lattices. This compound demonstrates extreme easy-axis magnetic anisotropy due to the crystalline-electric-field effect which aligns the $\mathrm{Tb^{3+}}$ moments along the zigzag chain direction. As the result of the strong single-ion anisotropy and multiple magnetic interactions, $\mathrm{TbTi_3Bi_4}$ behaves as a quasi-one-dimensional Ising magnet with a remarkable antiferromagnetic ordering at $T_\mathrm{N}$ = 20.4 K. When a magnetic field is applied along the direction of the zigzag chain, multiple meta-magnetic transitions occur between 1/3 and other magnetization plateaus. We have created a field-temperature phase diagram and mapped out the complex magnetic structures resulting from frustration.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 10 pages, 10 figures

arXiv:2405.08438 [pdf, other]

Magnetic fluctuation and dominant superconducting pairing symmetry near the tunable Van Hove singularity

Authors: Xiaohan Kong, Boyang Wen, Kaiyi Guo, Ying Liang, Tianxing Ma

Abstract: We have investigated the magnetism and pairing correlations of the triangular lattice based on the Hubbard model using the determinant quantum Monte Carlo method and the constrained path Monte Carlo. The results show that the presence of the next-nearest-neighbor hopping integral $t^{\prime}$ introduces an additional energy scale to the system, and through $t^{\prime}$, one can regulate the shape of the density of states and thus the position of the van Hove singularity point. Increasing inverse temperature $\beta$ and on-site interaction $U$ favor the formation of ferromagnetic correlation in a rather large filling region, and the calculations for different lattice sizes show that the range of the ferromagnetic correlations is smaller than the smallest lattice simulated at the investigated temperatures. We study the different pairing correlations of the triangular lattice near several typical fillings and show that the $f$-wave pairing dominates the system in the filling region near the van Hove singularity point with a high density of states, where the ferromagnetic correlation is also enhanced. When the filling is close to half-filling, the pairing susceptibility with $f$ wave is suppressed and the pairing susceptibility of $f_n$ wave is enhanced, however, both the effective pairing interaction with $f$ wave and $f_n$ wave are negative, which indicates that neither $f$-wave nor $f_n$-wave superconductivity may exist. Finally, we find that the pairing channel of different symmetry in the system maybe closely related to the magnetic properties. Ferromagnetic fluctuation favors the formation of $f$-wave pairing, while antiferromagnetic fluctuation tends to promote $f_n$-wave pairing.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 7 pages and 9 figures. Accepted for publication as a Regular Article in Physical Review B

arXiv:2405.08125 [pdf, other]

AI-Cybersecurity Education Through Designing AI-based Cyberharassment Detection Lab

Authors: Ebuka Okpala, Nishant Vishwamitra, Keyan Guo, Song Liao, Long Cheng, Hongxin Hu, Yongkai Wu, Xiaohong Yuan, Jeannette Wade, Sajad Khorsandroo

Abstract: Cyberharassment is a critical, socially relevant cybersecurity problem because of the adverse effects it can have on targeted groups or individuals. While progress has been made in understanding cyber-harassment, its detection, attacks on artificial intelligence (AI) based cyberharassment systems, and the social problems in cyberharassment detectors, little has been done in designing experiential learning educational materials that engage students in this emerging social cybersecurity in the era of AI. Experiential learning opportunities are usually provided through capstone projects and engineering design courses in STEM programs such as computer science. While capstone projects are an excellent example of experiential learning, given the interdisciplinary nature of this emerging social cybersecurity problem, it can be challenging to use them to engage non-computing students without prior knowledge of AI. Because of this, we were motivated to develop a hands-on lab platform that provided experiential learning experiences to non-computing students with little or no background knowledge in AI and discussed the lessons learned in developing this lab. In this lab used by social science students at North Carolina A&T State University across two semesters (spring and fall) in 2022, students are given a detailed lab manual and are to complete a set of well-detailed tasks. Through this process, students learn AI concepts and the application of AI for cyberharassment detection. Using pre- and post-surveys, we asked students to rate their knowledge or skills in AI and their understanding of the concepts learned. The results revealed that the students moderately understood the concepts of AI and cyberharassment.

Submitted 16 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

Comments: 10 pages

arXiv:2405.05928 [pdf]

Moderating Embodied Cyber Threats Using Generative AI

Authors: Keyan Guo, Freeman Guo, Hongxin Hu

Abstract: The advancement in computing and hardware, like spatial computing and VR headsets (e.g., Apple's Vision Pro) [1], has boosted the popularity of social VR platforms (VRChat, Rec Room, Meta Horizon Worlds) [2, 3, 4]. Unlike traditional digital interactions, social VR allows for more immersive experiences, with avatars that mimic users' real-time movements and enable physical-like interactions. However, the immersive nature of social VR may introduce intensified and more physicalized cyber threats, which we define as "embodied cyber threats", including trash-talking, virtual "groping", and such virtual harassment and assault. These new cyber threats are more realistic and invasive due to direct, virtual interactions, underscoring the urgent need for comprehensive understanding and practical strategies to enhance safety and security in virtual environments.

Submitted 23 April, 2024; originally announced May 2024.

Comments: This is an accepted position statement of CHI 2024 Workshop (Novel Approaches for Understanding and Mitigating Emerging New Harms in Immersive and Embodied Virtual Spaces: A Workshop at CHI 2024)

arXiv:2405.00482 [pdf, other]

PackVFL: Efficient HE Packing for Vertical Federated Learning

Authors: Liu Yang, Shuowei Cai, Di Chai, Junxue Zhang, Han Tian, Yilun Jin, Kun Guo, Kai Chen, Qiang Yang

Abstract: As an essential tool of secure distributed machine learning, vertical federated learning (VFL) based on homomorphic encryption (HE) suffers from severe efficiency problems due to data inflation and time-consuming operations. To this core, we propose PackVFL, an efficient VFL framework based on packed HE (PackedHE), to accelerate the existing HE-based VFL algorithms. PackVFL packs multiple cleartexts into one ciphertext and supports single-instruction-multiple-data (SIMD)-style parallelism. We focus on designing a high-performant matrix multiplication (MatMult) method since it takes up most of the ciphertext computation time in HE-based VFL. Besides, devising the MatMult method is also challenging for PackedHE because a slight difference in the packing way could predominantly affect its computation and communication costs. Without domain-specific design, directly applying SOTA MatMult methods is hard to achieve optimal. Therefore, we make a three-fold design: 1) we systematically explore the current design space of MatMult and quantify the complexity of existing approaches to provide guidance; 2) we propose a hybrid MatMult method according to the unique characteristics of VFL; 3) we adaptively apply our hybrid method in representative VFL algorithms, leveraging distinctive algorithmic properties to further improve efficiency. As the batch size, feature dimension and model size of VFL scale up to large sizes, PackVFL consistently delivers enhanced performance. Empirically, PackVFL propels existing VFL algorithms to new heights, achieving up to a 51.52X end-to-end speedup. This represents a substantial 34.51X greater speedup compared to the direct application of SOTA MatMult methods.

Submitted 1 May, 2024; originally announced May 2024.

Comments: 12 pages excluding references

arXiv:2404.07149 [pdf, other]

Tianyu: search for the second solar system and explore the dynamic universe

Authors: Fabo Feng, Yicheng Rui, Zhimao Du, Qing Lin, Congcong Zhang, Dan Zhou, Kaiming Cui, Masahiro Ogihara, Ming Yang, Jie Lin, Yongzhi Cai, Taozhi Yang, Xiaoying Pang, Mingjie Jian, Wenxiong Li, Hengxiao Guo, Xian Shi, Jianchun Shi, Jianyang Li, Kangrou Guo, Song Yao, Aming Chen, Peng Jia, Xianyu Tan, James S. Jenkins, et al. (10 additional authors not shown)

Abstract: Giant planets like Jupiter and Saturn, play important roles in the formation and habitability of Earth-like planets. The detection of solar system analogs that have multiple cold giant planets is essential for our understanding of planet habitability and planet formation. Although transit surveys such as Kepler and TESS have discovered thousands of exoplanets, these missions are not sensitive to long period planets due to their limited observation baseline. The Tianyu project, comprising two 1-meter telescopes (Tianyu-I and II), is designed to detect transiting cold giant planets in order to find solar system analogs. Featuring a large field of view and equipped with a high-speed CMOS camera, Tianyu-I will perform a high-precision photometric survey of about 100 million stars, measuring light curves at hour-long cadence. The candidates found by Tianyu-I will be confirmed by Tianyu-II and other surveys and follow-up facilities through multi-band photometry, spectroscopy, and high resolution imaging. Tianyu telescopes will be situated at an elevation about 4000 meters in Lenghu, China. With a photometric precision of 1% for stars with V < 18 mag, Tianyu is expected to find more than 300 transiting exoplanets, including about 12 cold giant planets, over five years. A five-year survey of Tianyu would discover 1-2 solar system analogs. Moreover, Tianyu is also designed for non-exoplanetary exploration, incorporating multiple survey modes covering timescales from sub-seconds to months, with a particular emphasis on events occurring within the sub-second to hour range. It excels in observing areas such as infant supernovae, rare variable stars and binaries, tidal disruption events, Be stars, cometary activities, and interstellar objects. These discoveries not only enhance our comprehension of the universe but also offer compelling opportunities for public engagement in scientific exploration.

Submitted 10 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: 48 pages, 16 figures, accepted by Acta Astronomica Sinica

arXiv:2404.07033 [pdf, other]

doi 10.3847/1538-4357/ad3deb

Relative Occurrence Rate Between Hot and Cold Jupiters as an Indicator to Probe Planet Migration

Authors: Tianjun Gan, Kangrou Guo, Beibei Liu, Sharon X. Wang, Shude Mao, Johannes Buchner, Benjamin J. Fulton

Abstract: We propose a second-order statistic parameter $\varepsilon$, the relative occurrence rate between hot and cold Jupiters ($\varepsilon=\eta_{\rm HJ}/\eta_{\rm CJ}$), to probe the migration of gas giants. Since the planet occurrence rate is the combined outcome of the formation and migration processes, a joint analysis of hot and cold Jupiter frequency may shed light on the dynamical evolution of giant planet systems. We first investigate the behavior of $\varepsilon$ as the stellar mass changes observationally. Based on the occurrence rate measurements of hot Jupiters ($\eta_{\rm HJ}$) from the TESS survey and cold Jupiters ($\eta_{\rm CJ}$) from the CLS survey, we find a tentative trend (97% confidence) that $\varepsilon$ drops when the stellar mass rises from $0.8$ to $1.4\ M_\odot$, which can be explained by different giant planet growth and disk migration timescales around different stars. We carry out planetesimal and pebble accretion simulations, both of which could reproduce the results of $\eta_{\rm HJ}$, $\eta_{\rm CJ}$ and $\varepsilon$. Our findings indicate that the classical core accretion + disk migration model can explain the observed decreasing trend of $\varepsilon$. We propose two ways to increase the significance of the trend and verify the anti-correlation. Future works are required to better constrain $\varepsilon$, especially for M dwarfs and for more massive stars.

Submitted 12 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: Accepted for publication in ApJ, 14 pages, 6 figures, 3 tables

arXiv:2404.02236 [pdf, other]

Selected Open Problems in Continuous-Time Quantum Walks

Authors: Gabriel Coutinho, Krystal Guo

Abstract: Quantum walks on graphs are fundamental to quantum computing and have led to many interesting open problems in algebraic graph theory. This review article highlights three key classes of open problems in this domain; perfect state transfer, instantaneous uniform mixing, and average mixing matrices. In highlighting these open problems, our aim is to stimulate further research and exploration in this rapidly evolving field.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 16 pages, 4 figures

MSC Class: 05C50; 05E30

arXiv:2404.01082 [pdf, other]

The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023

Authors: Jun Lyu, Chen Qin, Shuo Wang, Fanwen Wang, Yan Li, Zi Wang, Kunyuan Guo, Cheng Ouyang, Michael Tänzer, Meng Liu, Longyu Sun, Mengting Sun, Qin Li, Zhang Shi, Sha Hua, Hao Li, Zhensen Chen, Zhenlin Zhang, Bingyu Xin, Dimitris N. Metaxas, George Yiasemis, Jonas Teuwen, Liping Zhang, Weitian Chen, Yidong Zhao, et al. (25 additional authors not shown)

Abstract: Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation platform hinder the development of data-driven reconstruction algorithms. To address this issue, we organized the Cardiac MRI Reconstruction Challenge (CMRxRecon) in 2023, in collaboration with the 26th International Conference on MICCAI. CMRxRecon presented an extensive k-space dataset comprising cine and mapping raw data, accompanied by detailed annotations of cardiac anatomical structures. With overwhelming participation, the challenge attracted more than 285 teams and over 600 participants. Among them, 22 teams successfully submitted Docker containers for the testing phase, with 7 teams submitted for both cine and mapping tasks. All teams use deep learning based approaches, indicating that deep learning has predominately become a promising solution for the problem. The first-place winner of both tasks utilizes the E2E-VarNet architecture as backbones. In contrast, U-Net is still the most popular backbone for both multi-coil and single-coil reconstructions. This paper provides a comprehensive overview of the challenge design, presents a summary of the submitted results, reviews the employed methods, and offers an in-depth discussion that aims to inspire future advancements in cardiac MRI reconstruction models. The summary emphasizes the effective strategies observed in Cardiac MRI reconstruction, including backbone architecture, loss function, pre-processing techniques, physical modeling, and model complexity, thereby providing valuable insights for further developments in this field.

Submitted 16 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

Comments: 25 pages, 17 figures

arXiv:2403.18957 [pdf, other]

Moderating Illicit Online Image Promotion for Unsafe User-Generated Content Games Using Large Vision-Language Models

Authors: Keyan Guo, Ayush Utkarsh, Wenbo Ding, Isabelle Ondracek, Ziming Zhao, Guo Freeman, Nishant Vishwamitra, Hongxin Hu

Abstract: Online user-generated content games (UGCGs) are increasingly popular among children and adolescents for social interaction and more creative online entertainment. However, they pose a heightened risk of exposure to explicit content, raising growing concerns for the online safety of children and adolescents. Despite these concerns, few studies have addressed the issue of illicit image-based promotions of unsafe UGCGs on social media, which can inadvertently attract young users. This challenge arises from the difficulty of obtaining comprehensive training data for UGCG images and the unique nature of these images, which differ from traditional unsafe content. In this work, we take the first step towards studying the threat of illicit promotions of unsafe UGCGs. We collect a real-world dataset comprising 2,924 images that display diverse sexually explicit and violent content used to promote UGCGs by their game creators. Our in-depth studies reveal a new understanding of this problem and the urgent need for automatically flagging illicit UGCG promotions. We additionally create a cutting-edge system, UGCG-Guard, designed to aid social media platforms in effectively identifying images used for illicit UGCG promotions. This system leverages recently introduced large vision-language models (VLMs) and employs a novel conditional prompting strategy for zero-shot domain adaptation, along with chain-of-thought (CoT) reasoning for contextual identification. UGCG-Guard achieves outstanding results, with an accuracy rate of 94% in detecting these images used for the illicit promotion of such games in real-world scenarios.

Submitted 27 March, 2024; originally announced March 2024.

Comments: To Appear in the 33rd USENIX Security Symposium, August 14-16, 2024

arXiv:2403.17601 [pdf, other]

LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation

Authors: Ke Guo, Zhenwei Miao, Wei **g, Weiwei Liu, Weizi Li, Dayang Hao, Jia Pan

Abstract: Microscopic traffic simulation plays a crucial role in transportation engineering by providing insights into individual vehicle behavior and overall traffic flow. However, creating a realistic simulator that accurately replicates human driving behaviors in various traffic conditions presents significant challenges. Traditional simulators relying on heuristic models often fail to deliver accurate simulations due to the complexity of real-world traffic environments. Due to the covariate shift issue, existing imitation learning-based simulators often fail to generate stable long-term simulations. In this paper, we propose a novel approach called learner-aware supervised imitation learning to address the covariate shift problem in multi-agent imitation learning. By leveraging a variational autoencoder simultaneously modeling the expert and learner state distribution, our approach augments expert states such that the augmented state is aware of learner state distribution. Our method, applied to urban traffic simulation, demonstrates significant improvements over existing state-of-the-art baselines in both short-term microscopic and long-term macroscopic realism when evaluated on the real-world dataset pNEUMA.

Submitted 23 May, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024. arXiv admin note: text overlap with arXiv:2306.06401

arXiv:2402.16586 [pdf, other]

Improving the JPEG-resistance of Adversarial Attacks on Face Recognition by Interpolation Smoothing

Authors: Kefu Guo, Fengfan Zhou, Hefei Ling, ** Li, Hui Liu

Abstract: JPEG compression can significantly impair the performance of adversarial face examples, which previous adversarial attacks on face recognition (FR) have not adequately addressed. Considering this challenge, we propose a novel adversarial attack on FR that aims to improve the resistance of adversarial examples against JPEG compression. Specifically, during the iterative process of generating adversarial face examples, we interpolate the adversarial face examples into a smaller size. Then we utilize these interpolated adversarial face examples to create the adversarial examples in the next iteration. Subsequently, we restore the adversarial face examples to their original size by interpolating. Throughout the entire process, our proposed method can smooth the adversarial perturbations, effectively mitigating the presence of high-frequency signals in the crafted adversarial face examples that are typically eliminated by JPEG compression. Our experimental results demonstrate the effectiveness of our proposed method in improving the JPEG-resistance of adversarial face examples.

Submitted 26 February, 2024; originally announced February 2024.

arXiv:2402.13148 [pdf, other]

Defending Jailbreak Prompts via In-Context Adversarial Game

Authors: Yujun Zhou, Yufei Han, Haomin Zhuang, Taicheng Guo, Kehan Guo, Zhenwen Liang, Hongyan Bao, Xiangliang Zhang

Abstract: Large Language Models (LLMs) demonstrate remarkable capabilities across diverse applications. However, concerns regarding their security, particularly the vulnerability to jailbreak attacks, persist. Drawing inspiration from adversarial training in deep learning and LLM agent learning processes, we introduce the In-Context Adversarial Game (ICAG) for defending against jailbreaks without the need for fine-tuning. ICAG leverages agent learning to conduct an adversarial game, aiming to dynamically extend knowledge to defend against jailbreaks. Unlike traditional methods that rely on static datasets, ICAG employs an iterative process to enhance both the defense and attack agents. This continuous improvement process strengthens defenses against newly generated jailbreak prompts. Our empirical studies affirm ICAG's efficacy, where LLMs safeguarded by ICAG exhibit significantly reduced jailbreak success rates across various attack scenarios. Moreover, ICAG demonstrates remarkable transferability to other LLMs, indicating its potential as a versatile defense mechanism.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.08888 [pdf, other]

Quantum Light Generation based on GaN Microring towards Fully On-chip Source

Authors: Hong Zeng, Zhao-Qin He, Yun-Ru Fan, Yue Luo, Chen Lyu, **-Peng Wu, Yun-Bo Li, Sheng Liu, Dong Wang, De-Chao Zhang, Juan-Juan Zeng, Guang-Wei Deng, You Wang, Hai-Zhi Song, Zhen Wang, Li-Xing You, Kai Guo, Chang-Zheng Sun, Yi Luo, Guang-Can Guo, Qiang Zhou

Abstract: Integrated quantum light source is increasingly desirable in large-scale quantum information processing. Despite recent remarkable advances, new material platform is constantly being explored for the fully on-chip integration of quantum light generation, active and passive manipulation, and detection. Here, for the first time, we demonstrate a gallium nitride (GaN) microring based quantum light generation in the telecom C-band, which has potential towards the monolithic integration of quantum light source. In our demonstration, the GaN microring has a free spectral range of 330 GHz and a near-zero anomalous dispersion region of over 100 nm. The generation of energy-time entangled photon pair is demonstrated with a typical raw two-photon interference visibility of 95.5$\pm$6.5%, which is further configured to generate heralded single photon with a typical heralded second-order auto-correlation $g^{(2)}_{H}(0)$ of 0.045$\pm$0.001. Our results pave the way for developing chip-scale quantum photonic circuit.

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.08228 [pdf, other]

Investigating Out-of-Distribution Generalization of GNNs: An Architecture Perspective

Authors: Kai Guo, Hongzhi Wen, Wei **, Yaming Guo, Jiliang Tang, Yi Chang

Abstract: Graph neural networks (GNNs) have exhibited remarkable performance under the assumption that test data comes from the same distribution of training data. However, in real-world scenarios, this assumption may not always be valid. Consequently, there is a growing focus on exploring the Out-of-Distribution (OOD) problem in the context of graphs. Most existing efforts have primarily concentrated on improving graph OOD generalization from two model-agnostic perspectives: data-driven methods and strategy-based learning. However, there has been limited attention dedicated to investigating the impact of well-known GNN model architectures on graph OOD generalization, which is orthogonal to existing research. In this work, we provide the first comprehensive investigation of OOD generalization on graphs from an architecture perspective, by examining the common building blocks of modern GNNs. Through extensive experiments, we reveal that both the graph self-attention mechanism and the decoupled architecture contribute positively to graph OOD generalization. In contrast, we observe that the linear classification layer tends to compromise graph OOD generalization capability. Furthermore, we provide in-depth theoretical insights and discussions to underpin these discoveries. These insights have empowered us to develop a novel GNN backbone model, DGAT, designed to harness the robust properties of both graph self-attention mechanism and the decoupled architecture. Extensive experimental results demonstrate the effectiveness of our model under graph OOD, exhibiting substantial and consistent enhancements across various training strategies.

Submitted 14 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.05138 [pdf, other]

SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark

Authors: Zhenwen Liang, Kehan Guo, Gang Liu, Taicheng Guo, Yujun Zhou, Tianyu Yang, Jiajun Jiao, Renjie Pi, Jipeng Zhang, Xiangliang Zhang

Abstract: The paper introduces SceMQA, a novel benchmark for scientific multimodal question answering at the college entrance level. It addresses a critical educational phase often overlooked in existing benchmarks, spanning high school to pre-college levels. SceMQA focuses on core science subjects including Mathematics, Physics, Chemistry, and Biology. It features a blend of multiple-choice and free-response formats, ensuring a comprehensive evaluation of AI models' abilities. Additionally, our benchmark provides specific knowledge points for each problem and detailed explanations for each answer. SceMQA also uniquely presents problems with identical contexts but varied questions to facilitate a more thorough and accurate assessment of reasoning capabilities. In the experiment, we evaluate both open-source and closed-source state-of-the-art Multimodal Large Language Models (MLLMs) across various experimental settings. The results show that further research and development are needed in developing more capable MLLMs, as highlighted by only 50% to 60% accuracy achieved by the strongest models. Our benchmark and analysis will be available at https://scemqa.github.io/

Submitted 6 February, 2024; originally announced February 2024.

Comments: Work in progress

arXiv:2401.17812 [pdf, other]

Deterministic Computing Power Networking: Architecture, Technologies and Prospects

Authors: Qingmin Jia, Yujiao Hu, Xiaomao Zhou, Qianpiao Ma, Kai Guo, Huayu Zhang, Renchao Xie, Tao Huang, Yunjie Liu

Abstract: With the development of new Internet services such as computation-intensive and delay-sensitive tasks, the traditional "Best Effort" network transmission mode has been greatly challenged. The network system is urgently required to provide end-to-end transmission determinacy and computing determinacy for new applications to ensure the safe and efficient operation of services. Based on the research of the convergence of computing and networking, a new network paradigm named deterministic computing power networking (Det-CPN) is proposed. In this article, we first introduce the research advances in computing power networking. Then, the motivations and scenarios of Det-CPN are analyzed. Following that, we present the system architecture, technological capabilities, workflow, and key technologies for Det-CPN. Finally, the challenges and future trends of Det-CPN are analyzed and discussed.

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2401.11190 [pdf, other]

doi 10.1103/PhysRevD.109.103031

Probing orbits of stellar mass objects deep in galactic nuclei with quasi-periodic eruptions

Authors: Cong Zhou, Lei Huang, Kangrou Guo, Ya-** Li, Zhen Pan

Abstract: Quasi-periodic eruptions (QPEs) are intense repeating soft X-ray bursts with recurrence times about a few to ten hours from nearby galactic nuclei. The origin of QPEs is still unclear. In this work, we investigated the extreme mass ratio inspiral (EMRI) + accretion disk model, where the disk is formed from a previous tidal disruption event (TDE). In this EMRI+TDE disk model, the QPEs are the result of collisions between a TDE disk and a stellar mass object (a stellar mass black hole or a main sequence star) orbiting around a supermassive black hole (SMBH) in galactic nuclei. If this interpretation is correct, QPEs will be invaluable in probing the orbits of stellar mass objects in the vicinity of SMBHs, and further inferring the formation of EMRIs which are one of the primary targets of spaceborne gravitational wave missions. Taking GSN 069 as an example, we find the EMRI wherein is of low eccentricity ($e<0.1$ at 3-$σ$ confidence level) and semi-major axis about $O(10^2)$ gravitational radii of the central SMBH, which is consistent with the prediction of the wet EMRI formation channel, while incompatible with alternatives.

Submitted 21 May, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

Comments: 23 pages, 14 figures

Journal ref: Phys. Rev. D 109, 103031 (2024)

arXiv:2401.09188 [pdf, ps, other]

Hankel matrices acting on the Dirichlet space

Authors: Guanlong Bao, Kunyu Guo, Fangmei Sun, Zipeng Wang

Abstract: The characterization of the boundedness of operators induced by Hankel matrices on analytic function spaces can be traced back to the work of Z. Nehari and H. Widom on the Hardy space, and has been extensively studied on many other analytic function spaces recently. However, this question remains open in the context of the Dirichlet space [20]. By Carleson measures, the Widom type condition and the reproducing kernel thesis, this paper provides a comprehensive solution to this question. As a beneficial product, characterizations of the boundedness and compactness of operators induced by Cesàro type matrices on the Dirichlet space are given. In addition, we also show that a random Dirichlet function almost surely induces a compact Hankel type operator on the Dirichlet space.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.09175 [pdf, other]

doi 10.1145/3487553

QAnswer: Towards Question Answering Search over Websites

Authors: Kunpeng Guo, Clement Defretiere, Dennis Diefenbach, Christophe Gravier, Antoine Gourru

Abstract: Question Answering (QA) is increasingly used by search engines to provide results to their end-users, yet very few websites currently use QA technologies for their search functionality. To illustrate the potential of QA technologies for the website search practitioner, we demonstrate web searches that combine QA over knowledge graphs and QA over free text -- each being usually tackled separately. We also discuss the different benefits and drawbacks of both approaches for web site searches. We use the case studies made of websites hosted by the Wikimedia Foundation (namely Wikipedia and Wikidata). Differently from a search engine (e.g. Google, Bing, etc), the data are indexed integrally, i.e. we do not index only a subset, and they are indexed exclusively, i.e. we index only data available on the corresponding website.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.09168 [pdf, other]

doi 10.1109/ICTAI59109.2023.00032

Fine-tuning Strategies for Domain Specific Question Answering under Low Annotation Budget Constraints

Authors: Kunpeng Guo, Dennis Diefenbach, Antoine Gourru, Christophe Gravier

Abstract: The progress introduced by pre-trained language models and their fine-tuning has resulted in significant improvements in most downstream NLP tasks. The unsupervised training of a language model combined with further target task fine-tuning has become the standard QA fine-tuning procedure. In this work, we demonstrate that this strategy is sub-optimal for fine-tuning QA models, especially under a low QA annotation budget, which is a usual setting in practice due to the extractive QA labeling cost. We draw our conclusions by conducting an exhaustive analysis of the performance of the alternatives of the sequential fine-tuning strategy on different QA datasets. Based on the experiments performed, we observed that the best strategy to fine-tune the QA model in low-budget settings is taking a pre-trained language model (PLM) and then fine-tuning PLM with a dataset composed of the target dataset and SQuAD dataset. With zero extra annotation effort, the best strategy outperforms the standard strategy by 2.28% to 6.48%. Our experiments provide one of the first investigations on how to best fine-tune a QA system under a low budget and are therefore of the utmost practical interest to the QA practitioners.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.07812 [pdf, other]

doi 10.1145/3543507

Wikidata as a seed for Web Extraction

Authors: Kunpeng Guo, Dennis Diefenbach, Antoine Gourru, Christophe Gravier

Abstract: Wikidata has grown to a knowledge graph with an impressive size. To date, it contains more than 17 billion triples collecting information about people, places, films, stars, publications, proteins, and many more. On the other side, most of the information on the Web is not published in highly structured data repositories like Wikidata, but rather as unstructured and semi-structured content, more concretely in HTML pages containing text and tables. Finding, monitoring, and organizing this data in a knowledge graph requires considerable work from human editors. The volume and complexity of the data make this task difficult and time-consuming. In this work, we present a framework that is able to identify and extract new facts that are published under multiple Web domains so that they can be proposed for validation by Wikidata editors. The framework relies on question-answering technologies. We take inspiration from ideas that are used to extract facts from textual collections and adapt them to extract facts from Web pages. To achieve this, we demonstrate that language models can be adapted to extract facts not only from textual collections but also from Web pages. By exploiting the information already contained in Wikidata, the proposed framework can be trained without the need for any additional learning signals and can extract new facts for a wide range of properties and domains. Following this path, Wikidata can be used as a seed to extract facts on the Web. Our experiments show that we can achieve a mean F1-score of 84.07. Moreover, our estimations show that we can potentially extract millions of facts that can be proposed for human validation. The goal is to help editors in their daily tasks and contribute to the completion of the Wikidata knowledge graph.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.06453 [pdf, other]

Causally Aware Generative Adversarial Networks for Light Pollution Control

Authors: Yuyao Zhang, Ke Guo, Xiao Zhou

Abstract: Artificial light plays an integral role in modern cities, significantly enhancing human productivity and the efficiency of civilization. However, excessive illumination can lead to light pollution, posing non-negligible threats to economic burdens, ecosystems, and human health. Despite its critical importance, the exploration of its causes remains relatively limited within the field of artificial intelligence, leaving an incomplete understanding of the factors contributing to light pollution and sustainable illumination planning distant. To address this gap, we introduce a novel framework named Causally Aware Generative Adversarial Networks (CAGAN). This innovative approach aims to uncover the fundamental drivers of light pollution within cities and offer intelligent solutions for optimal illumination resource allocation in the context of sustainable urban development. We commence by examining light pollution across 33,593 residential areas in seven global metropolises. Our findings reveal substantial influences on light pollution levels from various building types, notably grasslands, commercial centers and residential buildings as significant contributors. These discovered causal relationships are seamlessly integrated into the generative modeling framework, guiding the process of generating light pollution maps for diverse residential areas. Extensive experiments showcase CAGAN's potential to inform and guide the implementation of effective strategies to mitigate light pollution. Our code and data are publicly available at https://github.com/zhangyuuao/Light_Pollution_CAGAN.

Submitted 12 January, 2024; originally announced January 2024.

Comments: 9 pages, 9 figures, accepted by AAAI 2024, AI for Social Impact (Special Track)

arXiv:2401.05334 [pdf, other]

URHand: Universal Relightable Hands

Authors: Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito

Abstract: Existing photorealistic relightable hand models require extensive identity-specific observations in different views, poses, and illuminations, and face challenges in generalizing to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows few-shot personalization using images captured with a mobile phone, and is ready to be photorealistically rendered under novel illuminations. To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities. The key challenge is scaling the cross-identity training while maintaining personalized fidelity and sharp details without compromising generalization under natural illuminations. To this end, we propose a spatially varying linear lighting model as the neural renderer that takes physics-inspired shading as input feature. By removing non-linear activations and bias, our specifically designed lighting model explicitly keeps the linearity of light transport. This enables single-stage training from light-stage data while generalizing to real-time rendering under arbitrary continuous illuminations across diverse identities. In addition, we introduce the joint learning of a physically based model and our neural relighting model, which further improves fidelity and generalization. Extensive experiments show that our approach achieves superior performance over existing methods in terms of both quality and generalizability. We also demonstrate quick personalization of URHand from a short phone scan of an unseen identity.

Submitted 10 January, 2024; originally announced January 2024.

Comments: Project Page https://frozenburning.github.io/projects/urhand/

arXiv:2401.03346 [pdf, ps, other]

An Investigation of Large Language Models for Real-World Hate Speech Detection

Authors: Keyan Guo, Alexander Hu, Jaden Mu, Ziheng Shi, Ziming Zhao, Nishant Vishwamitra, Hongxin Hu

Abstract: Hate speech has emerged as a major problem plaguing our social spaces today. While there have been significant efforts to address this problem, existing methods are still significantly limited in effectively detecting hate speech online. A major limitation of existing methods is that hate speech detection is a highly contextual problem, and these methods cannot fully capture the context of hate speech to make accurate predictions. Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks. LLMs have undergone extensive training using vast amounts of natural language data, enabling them to grasp intricate contextual details. Hence, they could be used as knowledge bases for context-aware hate speech detection. However, a fundamental problem with using LLMs to detect hate speech is that there are no studies on effectively prompting LLMs for context-aware hate speech detection. In this study, we conduct a large-scale study of hate speech detection, employing five established hate speech datasets. We discover that LLMs not only match but often surpass the performance of current benchmark machine learning models in identifying hate speech. By proposing four diverse prompting strategies that optimize the use of LLMs in detecting hate speech, our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech by fully utilizing the knowledge base in LLMs, significantly outperforming existing techniques. Furthermore, although LLMs can provide a rich knowledge base for the contextual detection of hate speech, suitable prompting strategies play a crucial role in effectively leveraging this knowledge base for efficient detection.

Submitted 6 January, 2024; originally announced January 2024.

Comments: Accepted for publication at the 22nd International Conference on Machine Learning and Applications (ICMLA 2023)

arXiv:2401.02329 [pdf, other]

Not all Minorities are Equal: Empty-Class-Aware Distillation for Heterogeneous Federated Learning

Authors: Kuangpu Guo, Yuhe Ding, Jian Liang, Ran He, Zilei Wang, Tieniu Tan

Abstract: Data heterogeneity, characterized by disparities in local data distribution across clients, poses a significant challenge in federated learning. Substantial efforts have been devoted to addressing the heterogeneity in local label distribution. As minority classes suffer from worse accuracy due to overfitting on local imbalanced data, prior methods often incorporate class-balanced learning techniques during local training. Despite the improved mean accuracy across all classes, we observe that empty classes (categories absent from a client's data distribution) are still not well recognized. This paper introduces FedED, a novel approach in heterogeneous federated learning that integrates both empty-class distillation and logit suppression simultaneously. Specifically, empty-class distillation leverages knowledge distillation during local training on each client to retain essential information related to empty classes from the global model. Moreover, logit suppression directly penalizes network logits for non-label classes, effectively addressing misclassifications in minority classes that may be biased toward majority classes. Extensive experiments validate the efficacy of FedED, surpassing previous state-of-the-art methods across diverse datasets with varying degrees of label distribution shift.

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2312.15099 [pdf, other]

Moderating New Waves of Online Hate with Chain-of-Thought Reasoning in Large Language Models

Authors: Nishant Vishwamitra, Keyan Guo, Farhan Tajwar Romit, Isabelle Ondracek, Long Cheng, Ziming Zhao, Hongxin Hu

Abstract: Online hate is an escalating problem that negatively impacts the lives of Internet users, and is also subject to rapid changes due to evolving events, resulting in new waves of online hate that pose a critical threat. Detecting and mitigating these new waves present two key challenges: it demands reasoning-based complex decision-making to determine the presence of hateful content, and the limited availability of training samples hinders updating the detection model. To address this critical issue, we present a novel framework called HATEGUARD for effectively moderating new waves of online hate. HATEGUARD employs a reasoning-based approach that leverages the recently introduced chain-of-thought (CoT) prompting technique, harnessing the capabilities of large language models (LLMs). HATEGUARD further achieves prompt-based zero-shot detection by automatically generating and updating detection prompts with new derogatory terms and targets in new wave samples to effectively address new waves of online hate. To demonstrate the effectiveness of our approach, we compile a new dataset consisting of tweets related to three recently witnessed new waves: the 2022 Russian invasion of Ukraine, the 2021 insurrection of the US Capitol, and the COVID-19 pandemic. Our studies reveal crucial longitudinal patterns in these new waves concerning the evolution of events and the pressing need for techniques to rapidly update existing moderation tools to counteract them. Comparative evaluations against state-of-the-art tools illustrate the superiority of our framework, showcasing a substantial 22.22% to 83.33% improvement in detecting the three new waves of online hate. Our work highlights the severe threat posed by the emergence of new waves of online hate and represents a paradigm shift in addressing this threat practically.

Submitted 10 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: To Appear in the 45th IEEE Symposium on Security and Privacy, May 20-23, 2024

arXiv:2312.09817 [pdf, other]

Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

Authors: Mohsin Hasan, Guojun Zhang, Kaiyang Guo, Xi Chen, Pascal Poupart

Abstract: Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client's dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting the need for well-calibrated models that represent the uncertainty of predictions. The closest FL techniques to achieving such goals are the Bayesian FL methods which collect parameter samples from local posteriors, and aggregate them to approximate the global posterior. To improve scalability for larger models, one common Bayesian approach is to approximate the global predictive posterior by multiplying local predictive posteriors. In this work, we demonstrate that this method gives systematically overconfident predictions, and we remedy this by proposing $\beta$-Predictive Bayes, a Bayesian FL algorithm that interpolates between a mixture and product of the predictive posteriors, using a tunable parameter $\beta$. This parameter is tuned to improve the global ensemble's calibration, before it is distilled to a single model. Our method is evaluated on a variety of regression and classification datasets to demonstrate its superiority in calibration to other baselines, even as data heterogeneity increases. Code available at https://github.com/hasanmohsin/betaPredBayesFL

Submitted 9 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: 7 pages, 2 figures. To appear at AAAI 2024

arXiv:2311.10289 [pdf, ps, other]

Singular Trudinger--Moser inequality involving $L^{p}$ norm in bounded domain

Authors: Kaiwen Guo, Yanjun Liu

Abstract: In this paper, we use the method of blow-up analysis and capacity estimate to derive the singular Trudinger--Moser inequality involving the $N$-Finsler--Laplacian and $L^{p}$ norm. Precisely, for any $p>1$, $0\leq\gamma<\gamma_{1}:=\inf\limits_{u\in W^{1,N}_{0}(\Omega)\backslash\{0\}}\frac{\int_{\Omega}F^{N}(\nabla u)dx}{\|u\|_p^N}$ and $0\leq\beta<N$, we have $$ \sup_{u\in W_{0}^{1,N}(\Omega),\;\int_{\Omega}F^{N}(\nabla u)dx-\gamma\|u\|_p^N\leq1}\int_{\Omega}\frac{e^{\lambda_{N}(1-\frac{\beta}{N})\lvert u\rvert^{\frac{N}{N-1}}}}{F^{o}(x)^{\beta}}\;\mathrm{d}x<+\infty, $$ where $\lambda_{N}=N^{\frac{N}{N-1}}\kappa_{N}^{\frac{1}{N-1}}$ and $\kappa_{N}$ is the volume of a unit Wulff ball in $\mathbb{R}^N$. Moreover, extremal functions for the inequality are also obtained. When $F=\lvert\cdot\rvert$ and $p=N$, we can obtain the singular version of the Tintarev type inequality from the above inequality: for any $0\leq\alpha<\alpha_{1}(\Omega):=\inf\limits_{u\in W^{1,N}_{0}(\Omega)\backslash\{0\}}\frac{\int_{\Omega}|\nabla u|^Ndx}{\|u\|_N^N}$ and $0\leq\beta<N$, it holds that $$ \sup_{u\in W_{0}^{1,N}(\Omega),\;\int_{\Omega}\lvert\nabla u\rvert^{N}\;\mathrm{d}x-\alpha\|u\|_{N}^{N}\leq1}\int_{\Omega}\frac{e^{\alpha_{N}(1-\frac{\beta}{N})\lvert u\rvert^{\frac{N}{N-1}}}}{\lvert x\rvert^{\beta}}\;\mathrm{d}x<+\infty, $$ where $\alpha_{N}:=N^{\frac{N}{N-1}}\omega_{N}^{\frac{1}{N-1}}$ and $\omega_{N}$ is the volume of the unit ball in $\mathbb{R}^{N}$. Our results extend many well-known Trudinger--Moser type inequalities to a more general setting.

Submitted 16 November, 2023; originally announced November 2023.

arXiv:2311.09417 [pdf]

Preliminary Design of CSNS-II Linac SRF LLRF

Authors: Zhexin Xie, Kai Guo, Zhencheng Mu, Xinpeng Ma, Nan Gan, Maliang Wan, Bo Wang, Linyan Rong, Hui Zhang, Hexin Wang

Abstract: The target power of the China Spallation Neutron Source (CSNS) will be upgraded from 300 kW to 500 kW (CSNS-II), and the energy gain of the H-Linac will increase from 80 MeV to 300 MeV using about 50 superconducting cavities. LLRF is an important device for controlling the amplitude and phase of the SRF cavity field to within 0.6% and 0.6 deg. The parameters and requirements for the CSNS-II Linac LLRF are presented here. The preliminary design work, and the progress and results of algorithm verification at C-ADS Injector-I, are introduced.

Submitted 16 November, 2023; originally announced November 2023.

Comments: Talk presented at LLRF Workshop 2023 (LLRF2023, arXiv:2310.03199)

Report number: LLRF2023/11

arXiv:2311.07186 [pdf]

doi 10.1021/acsphotonics.3c01163

Nonlinear dielectric geometric-phase metasurface with simultaneous structure and lattice symmetry design

Authors: Bingyi Liu, René Geromel, Zhaoxian Su, Kai Guo, Yongtian Wang, Zhongyi Guo, Lingling Huang, Thomas Zentgraf

Abstract: In this work, we utilize thin dielectric meta-atoms placed on a silver substrate to efficiently enhance and manipulate the third harmonic generation. We theoretically and experimentally reveal that when the structural symmetry of the meta-atom is incompatible with the lattice symmetry of an array, some generalized nonlinear geometric phases appear, which offers new possibilities for harmonic generation control beyond the accessible symmetries governed by the selection rule. The underlying mechanism is attributed to the modified rotation of the effective principal axis of a dense meta-atom array, where the strong coupling among the units gives rise to a generalized linear geometric phase modulation on the pump light. Therefore, nonlinear geometric phases carried by the third-harmonic emissions are the natural result of the wave-mixing process among the modes excited at the fundamental frequency. This mechanism further points out a new strategy to predict the nonlinear geometric phases delivered by the nanostructures according to their linear responses. Our design is simple and efficient, and offers alternatives for the nonlinear meta-devices that are capable of flexible photon generation and manipulation.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2310.11295 [pdf, other]

CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation

Authors: Zhaojie Chu, Kailing Guo, Xiaofen Xing, Yilin Lan, Bolun Cai, Xiangmin Xu

Abstract: Speech-driven 3D facial animation is a challenging cross-modal task that has attracted growing research interest. During speaking activities, the mouth displays strong motions, while the other facial regions typically demonstrate comparatively weak activity levels. Existing approaches often simplify the process by directly mapping single-level speech features to the entire facial animation, which overlooks the differences in facial activity intensity, leading to overly smoothed facial movements. In this study, we propose a novel framework, CorrTalk, which effectively establishes the temporal correlation between hierarchical speech features and facial activities of different intensities across distinct regions. A novel facial activity intensity metric is defined to distinguish between strong and weak facial activity, obtained by computing the short-time Fourier transform of facial vertex displacements. Based on the variances in facial activity, we propose a dual-branch decoding framework to synchronously synthesize strong and weak facial activity, which guarantees wider intensity facial animation synthesis. Furthermore, a weighted hierarchical feature encoder is proposed to establish temporal correlation between hierarchical speech features and facial activity at different intensities, which ensures lip-sync and plausible facial expressions. Extensive qualitative and quantitative experiments as well as a user study indicate that our CorrTalk outperforms existing state-of-the-art methods. The source code and supplementary video are publicly available at: https://zjchu.github.io/projects/CorrTalk/

Submitted 17 October, 2023; originally announced October 2023.

arXiv:2310.05917 [pdf, other]

doi 10.1145/3610548.3618136

Drivable Avatar Clothing: Faithful Full-Body Telepresence with Dynamic Clothing Driven by Sparse RGB-D Input

Authors: Donglai Xiang, Fabian Prada, Zhe Cao, Kaiwen Guo, Chenglei Wu, Jessica Hodgins, Timur Bagautdinov

Abstract: Clothing is an important part of human appearance but challenging to model in photorealistic avatars. In this work we present avatars with dynamically moving loose clothing that can be faithfully driven by sparse RGB-D inputs as well as body and face motion. We propose a Neural Iterative Closest Point (N-ICP) algorithm that can efficiently track the coarse garment shape given sparse depth input. Given the coarse tracking results, the input RGB-D images are then remapped to texel-aligned features, which are fed into the drivable avatar models to faithfully reconstruct appearance details. We evaluate our method against recent image-driven synthesis baselines, and conduct a comprehensive analysis of the N-ICP algorithm. We demonstrate that our method can generalize to a novel testing environment, while preserving the ability to produce high-fidelity and faithful clothing dynamics and appearance.

Submitted 11 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

Comments: SIGGRAPH Asia 2023 Conference Paper. Project website: https://xiangdonglai.github.io/www-sa23-drivable-clothing/

arXiv:2310.04674 [pdf, other]

Modeling non-uniform uncertainty in Reaction Prediction via Boosting and Dropout

Authors: Taicheng Guo, Changsheng Ma, Xiuying Chen, Bozhao Nan, Kehan Guo, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, Xiangliang Zhang

Abstract: Reaction prediction has been recognized as a critical task in synthetic chemistry, where the goal is to predict the outcome of a reaction based on the given reactants. With the widespread adoption of generative models, the Variational Autoencoder (VAE) framework has typically been employed to tackle challenges in reaction prediction, where the reactants are encoded as a condition for the decoder, which then generates the product. Despite their effectiveness, these conditional VAE (CVAE) models still fail to adequately account for the inherent uncertainty in reaction prediction, which primarily stems from the stochastic reaction process. The principal limitations are twofold. Firstly, in these CVAE models, the prior is independent of the reactants, leading to a default wide and assumed uniform distribution variance of the generated product. Secondly, reactants with analogous molecular representations are presumed to undergo similar electronic transition processes, thereby producing similar products. This hinders the ability to model diverse reaction mechanisms effectively. Since the variance in outcomes is inherently non-uniform, we are thus motivated to develop a framework that generates reaction products with non-uniform uncertainty. Firstly, we eliminate the latent variable in previous CVAE models to mitigate uncontrollable noise. Instead, we introduce randomness into product generation via boosting to ensemble diverse models and cover the range of potential outcomes, and through dropout to secure models with minor variations. Additionally, we design a ranking method to union the predictions from boosting and dropout, prioritizing the most plausible products. Experimental results on the largest reaction prediction benchmark USPTO-MIT show the superior performance of our proposed method in modeling the non-uniform uncertainty compared to baselines.

Submitted 6 October, 2023; originally announced October 2023.

arXiv:2310.02776 [pdf, other]

Dynamic Shuffle: An Efficient Channel Mixture Method

Authors: Kaijun Gong, Zhuowen Yin, Yushu Li, Kailing Guo, Xiangmin Xu

Abstract: The redundancy of convolutional neural networks depends not only on weights but also on inputs. Shuffling is an efficient operation for mixing channel information, but the shuffle order is usually pre-defined. To reduce the data-dependent redundancy, we devise a dynamic shuffle module to generate data-dependent permutation matrices for shuffling. Since the dimension of the permutation matrix is proportional to the square of the number of input channels, to make the generation process efficient, we divide the channels into groups and generate two shared small permutation matrices for each group, and utilize the Kronecker product and cross-group shuffle to obtain the final permutation matrices. To make the generation process learnable, based on theoretical analysis, softmax, orthogonal regularization, and binarization are employed to asymptotically approximate the permutation matrix. Dynamic shuffle adaptively mixes channel information with negligible extra computation and memory occupancy. Experiment results on image classification benchmark datasets CIFAR-10, CIFAR-100, Tiny ImageNet and ImageNet have shown that our method significantly increases ShuffleNets' performance. Adding the dynamically generated matrix to a learnable static matrix, we further propose static-dynamic-shuffle and show that it can serve as a lightweight replacement of ordinary pointwise convolution.

Submitted 4 October, 2023; originally announced October 2023.
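
Note on the Dynamic Shuffle entry above: the abstract says large permutation matrices are assembled from small ones via a Kronecker product. The following is a minimal sketch of that single property, not the authors' implementation. It checks in plain Python that the Kronecker product of two small permutation matrices is itself a permutation matrix of the product size. The helper names `kron` and `is_permutation` and the example matrices `p` and `q` are illustrative assumptions.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def is_permutation(m):
    """True if m is square, 0/1-valued, with exactly one 1 per row and column."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False
    if any(v not in (0, 1) for row in m for v in row):
        return False
    rows_ok = all(sum(row) == 1 for row in m)
    cols_ok = all(sum(m[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


# Two small permutation matrices (hypothetical examples).
p = [[0, 1],
     [1, 0]]
q = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

# A 6 x 6 permutation described by only 2*2 + 3*3 = 13 entries
# instead of 6*6 = 36.
big = kron(p, q)
```

The point of the construction is the parameter count: the small factors grow with the square of the group size rather than the square of the full channel count.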

arXiv:2309.14157 [pdf, other]

LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch

Authors: Pucheng Zhai, Kailing Guo, Fang Liu, Xiaofen Xing, Xiangmin Xu

Abstract: Structured pruning is a commonly used convolutional neural network (CNN) compression approach. Pruning rate setting is a fundamental problem in structured pruning. Most existing works introduce too many additional learnable parameters to assign different pruning rates across different layers in a CNN, or cannot control the compression rate explicitly. Since an overly narrow network blocks information flow for training, automatic pruning rate setting cannot explore a high pruning rate for a specific layer. To overcome these limitations, we propose a novel framework named Layer Adaptive Progressive Pruning (LAPP), which gradually compresses the network during initial training of a few epochs from scratch. In particular, LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and FLOPs constraints for the network. Guided by both task loss and FLOPs constraints, the learnable thresholds are dynamically and gradually updated to accommodate changes of importance scores during training. Therefore, the pruning strategy can gradually prune the network and automatically determine the appropriate pruning rates for each layer. What's more, in order to maintain the expressive power of the pruned layer, before training starts, we introduce an additional lightweight bypass for each convolutional layer to be pruned, which only adds relatively few additional burdens. Our method demonstrates superior performance gains over previous compression methods on various datasets and backbone architectures. For example, on CIFAR-10, our method compresses ResNet-20 to 40.3% without accuracy drop. 55.6% of FLOPs of ResNet-18 are reduced with 0.21% top-1 accuracy increase and 0.40% top-5 accuracy increase on ImageNet.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 12 pages, 8 tables, 3 figures

arXiv:2309.13563 [pdf, other]

Multivariate Prototype Representation for Domain-Generalized Incremental Learning

Authors: Can Peng, Piotr Koniusz, Kaiyu Guo, Brian C. Lovell, Peyman Moghadam

Abstract: Deep learning models suffer from catastrophic forgetting when being fine-tuned with samples of new classes. This issue becomes even more pronounced when faced with the domain shift between training and testing data. In this paper, we study the critical and less explored Domain-Generalized Class-Incremental Learning (DGCIL). We design a DGCIL approach that remembers old classes, adapts to new classes, and can reliably classify objects from unseen domains. Specifically, our loss formulation maintains classification boundaries and suppresses the domain-specific information of each class. With no old exemplars stored, we use knowledge distillation and estimate old class prototype drift as incremental training advances. Our prototype representations are based on multivariate Normal distributions whose means and covariances are constantly adapted to changing model features to represent old classes well by adapting to the feature space drift. For old classes, we sample pseudo-features from the adapted Normal distributions with the help of Cholesky decomposition. In contrast to previous pseudo-feature sampling strategies that rely solely on average mean prototypes, our method excels at capturing varying semantic information. Experiments on several benchmarks validate our claims.

Submitted 24 September, 2023; originally announced September 2023.
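
Note on the entry above: the abstract mentions sampling pseudo-features from multivariate Normal distributions with the help of Cholesky decomposition. The following is a minimal sketch of that standard sampling technique, not the paper's code. If the covariance factors as L times its transpose, then mean + L z with z drawn from a standard Normal has the desired mean and covariance. The function names, the 2-D `mean` and `cov` values, and the fixed seed are illustrative assumptions.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L @ L.T == cov (cov symmetric positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def sample(mean, low, rng):
    """Draw one sample x = mean + L z, where z has independent N(0, 1) entries."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(low[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


mean = [1.0, -2.0]               # hypothetical class prototype mean
cov = [[4.0, 2.0], [2.0, 3.0]]   # hypothetical class covariance
low = cholesky(cov)
rng = random.Random(0)           # seeded only to make the sketch repeatable
pseudo_features = [sample(mean, low, rng) for _ in range(5)]
```

Because each sample is drawn from the full covariance rather than copied from the mean, the pseudo-features spread along the class's correlated directions, which is the contrast with mean-only prototypes that the abstract draws.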

arXiv:2309.12639 [pdf, other]

CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation

Authors: Xiaoheng Jiang, Kaiyi Guo, Yang Lu, Feng Yan, Hao Liu, Jiale Cao, Mingliang Xu, Dacheng Tao

Abstract: Surface defect inspection is of great importance for industrial manufacture and production. Though defect inspection methods based on deep learning have made significant progress, there are still some challenges for these methods, such as indistinguishable weak defects and defect-like interference in the background. To address these issues, we propose a transformer network with multi-stage CNN (Convolutional Neural Network) feature injection for surface defect segmentation, which is a UNet-like structure named CINFormer. CINFormer presents a simple yet effective feature integration mechanism that injects the multi-level CNN features of the input image into different stages of the transformer network in the encoder. This retains both the CNN's strength in capturing detailed features and the transformer's strength in suppressing background noise, which facilitates accurate defect detection. In addition, CINFormer presents a Top-K self-attention module to focus on tokens with more important information about the defects, so as to further reduce the impact of the redundant background. Extensive experiments conducted on the surface defect datasets DAGM 2007, Magnetic tile, and NEU show that the proposed CINFormer achieves state-of-the-art performance in defect detection.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.10836 [pdf, other]

CMRxRecon: An open cardiac MRI dataset for the competition of accelerated image reconstruction

Authors: Chengyan Wang, Jun Lyu, Shuo Wang, Chen Qin, Kunyuan Guo, Xinyu Zhang, Xiaotong Yu, Yan Li, Fanwen Wang, Jianhua Jin, Zhang Shi, Ziqiang Xu, Yapeng Tian, Sha Hua, Zhensen Chen, Meng Liu, Mengting Sun, Xutong Kuang, Kang Wang, Haoran Wang, Hao Li, Yinghua Chu, Guang Yang, Wenjia Bai, Xiahai Zhuang, et al. (3 additional authors not shown)

Abstract: Cardiac magnetic resonance imaging (CMR) has emerged as a valuable diagnostic tool for cardiac diseases. However, a limitation of CMR is its slow imaging speed, which causes patient discomfort and introduces artifacts in the images. There has been growing interest in deep learning-based CMR imaging algorithms that can reconstruct high-quality images from highly under-sampled k-space data. However, the development of deep learning methods requires large training datasets, which have not been publicly available for CMR. To address this gap, we released a dataset that includes multi-contrast, multi-view, multi-slice and multi-coil CMR imaging data from 300 subjects. Imaging studies include cardiac cine and mapping sequences. Manual segmentations of the myocardium and chambers of all the subjects are also provided within the dataset. Scripts of state-of-the-art reconstruction algorithms were also provided as a point of reference. Our aim is to facilitate the advancement of state-of-the-art CMR image reconstruction by introducing standardized evaluation criteria and making the dataset freely accessible to the research community. Researchers can access the dataset at https://www.synapse.org/#!Synapse:syn51471091/wiki/.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 14 pages, 8 figures

arXiv:2309.09306 [pdf, other]

Effective Image Tampering Localization via Enhanced Transformer and Co-attention Fusion

Authors: Kun Guo, Haochen Zhu, Gang Cao

Abstract: Powerful manipulation techniques have made digital image forgeries easy to create and widespread, without leaving visual anomalies. The blind localization of tampered regions has therefore become important for image forensics. In this paper, we propose an effective image tampering localization network (EITLNet) based on a two-branch enhanced transformer encoder with attention-based feature fusion. Specifically, a feature enhancement module is designed to enhance the feature representation ability of the transformer encoder. The features extracted from RGB and noise streams are fused effectively by the coordinate attention-based fusion module at multiple scales. Extensive experimental results verify that the proposed scheme achieves state-of-the-art generalization ability and robustness on various benchmark datasets. Code will be public at https://github.com/multimediaFor/EITLNet.

Submitted 17 September, 2023; originally announced September 2023.

arXiv:2309.01056 [pdf, other]

Diagnosing the role of observable distribution shift in scientific replications

Authors: Ying Jin, Kevin Guo, Dominik Rothenhäusler

Abstract: Many researchers have identified distribution shift as a likely contributor to the reproducibility crisis in behavioral and biomedical sciences. The idea is that if treatment effects vary across individual characteristics and experimental contexts, then studies conducted in different populations will estimate different average effects. This paper uses "generalizability" methods to quantify how much of the effect size discrepancy between an original study and its replication can be explained by distribution shift on observed unit-level characteristics. More specifically, we decompose this discrepancy into "components" attributable to sampling variability (including publication bias), observable distribution shifts, and residual factors. We compute this decomposition for several directly-replicated behavioral science experiments and find little evidence that observable distribution shifts contribute appreciably to non-replicability. In some cases, this is because there is too much statistical noise. In other cases, there is strong evidence that controlling for additional moderators is necessary for reliable replication.

Submitted 2 September, 2023; originally announced September 2023.

arXiv:2308.16798 [pdf, other]

doi 10.1016/j.icarus.2023.115757

The stability of unevenly spaced planetary systems

Authors: Sheng Yang, Liangyu Wu, Zekai Zheng, Masahiro Ogihara, Kangrou Guo, Wenzhan Ouyang, Yaxing He

Abstract: Studying the orbital stability of multi-planet systems is essential to understand planet formation, estimate the stable time of an observed planetary system, and advance population synthesis models. Although previous studies have primarily focused on ideal systems characterized by uniform orbital separations, in reality a diverse range of orbital separations exists among planets within the same system. This study focuses on investigating the dynamical stability of systems with non-uniform separation. We considered a system with 10 planets with masses of $10^{-7}$ solar masses around a central star with a mass of $1$ solar mass. We performed more than 100,000 runs of N-body simulations with different parameters. Results demonstrate that reducing merely one pair of planetary spacing leads to an order of magnitude shorter orbital crossing times that could be formulated based on the Keplerian periods of the closest separation pair. Furthermore, the first collisions are found to be closely associated with the first encounter pair that is likely to be the closest separation pair initially. We conclude that when estimating the orbital crossing time and colliding pairs in a realistic situation, updating the formula derived for evenly spaced systems would be necessary.

Submitted 31 August, 2023; originally announced August 2023.

Comments: 6 pages, 3 figures, accepted for publication in Icarus

Journal ref: Icarus, Volume 406, 2023, 115757

arXiv:2308.14347 [pdf, other]

doi 10.3847/1538-4357/acf31d

Formation of inner planets in the presence of a Cold Jupiter: orbital evolution and relative velocities of planetesimals

Authors: Kangrou Guo, Eiichiro Kokubo

Abstract: We investigate the orbital evolution of planetesimals in the inner disk in the presence of nebula gas and a (proto-) cold Jupiter. By varying the mass, eccentricity, and semi-major axis of the planet, we study the dependence of the relative velocities of the planetesimals on these parameters. For classic small planetesimals ($10^{16}-10^{20} $g) whose mutual gravitational interaction is negligible, gas drag introduces a size-dependent alignment of orbits and keeps the relative velocity low for similar-size bodies, while preventing orbital alignment for different-size planetesimals. Regardless of the location and the mass ratio of the planetesimals, increasing the mass and eccentricity or decreasing the orbital distance of the planet always leads to higher relative velocities of planetesimals. However, for massive planetesimals, the interplay of viscous stirring, gas damping, and secular perturbation results in lower velocity dispersion of equal-size planetesimals when the planet is more massive or when it is located on a closer or more eccentric orbit. The random velocities of such planetesimals remain almost unperturbed when the planet is located beyond Jupiter's current orbit, or when it is less massive or less eccentric than Jupiter. Unlike small planetesimals, such large planetesimals can grow in a runaway fashion as in the unperturbed case. Our results imply that the presence of a cold Jupiter does not impede the formation of inner rocky planets through planetesimal accretion, provided that the planetesimals are initially large.

Submitted 28 August, 2023; originally announced August 2023.

Comments: Accepted for publication in ApJ. 17 pages, 11 figures

Journal ref: ApJ 955 109 (2023)

Showing 1–50 of 241 results for author: Guo, K