-
CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI
Authors:
Zi Wang,
Fanwen Wang,
Chen Qin,
Jun Lyu,
Ouyang Cheng,
Shuo Wang,
Yan Li,
Mengyao Yu,
Haoyu Zhang,
Kunyuan Guo,
Zhang Shi,
Qirong Li,
Ziqiang Xu,
Ya**g Zhang,
Hao Li,
Sha Hua,
Binghua Chen,
Longyu Sun,
Mengting Sun,
Qin Li,
Ying-Hua Chu,
Wenjia Bai,
**g Qin,
Xiahai Zhuang,
Claudia Prieto
, et al. (7 additional authors not shown)
Abstract:
Cardiac magnetic resonance imaging (MRI) has emerged as a clinical gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is expected to deliver time-efficient and patient-friendly imaging, which in turn requires advanced image reconstruction approaches to recover high-quality, clinically interpretable images from undersampled measurements. However, the lack of publicly available cardiac MRI k-space datasets, in terms of both quantity and diversity, has severely hindered substantial technological progress, particularly for data-driven artificial intelligence. Here, we provide a standardized, diverse, and high-quality CMRxRecon2024 dataset to facilitate the technical development, fair evaluation, and clinical transfer of cardiac MRI reconstruction approaches, with the aim of promoting universal frameworks that enable fast and robust reconstructions across different cardiac MRI protocols in clinical practice. To the best of our knowledge, the CMRxRecon2024 dataset is the largest and most diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI workflows. In addition, an open platform with tutorials, benchmarks, and data processing tools is provided to facilitate data usage, advanced method development, and fair performance evaluation.
Submitted 27 June, 2024;
originally announced June 2024.
-
Exploring quantum weight enumerators from the $n$-qubit parallelized SWAP test
Authors:
Fei Shi,
Kaiyi Guo,
Xiande Zhang,
Qi Zhao
Abstract:
Quantum weight enumerators play a crucial role in quantum error-correcting codes and multipartite entanglement. They can be used to investigate the existence of quantum error-correcting codes and $k$-uniform states. In this work, we build the connection between quantum weight enumerators and the $n$-qubit parallelized SWAP test. We discover that each shadow enumerator corresponds precisely to a probability in the $n$-qubit parallelized SWAP test, providing a computable and operational meaning for the shadow enumerators. Due to the non-negativity of probabilities, we obtain an elegant proof for the shadow inequalities. We can also calculate the Shor-Laflamme enumerators and the Rains unitary enumerators from the $n$-qubit parallelized SWAP test. For applications, we employ the $n$-qubit parallelized SWAP test to determine the distances of quantum error-correcting codes and the $k$-uniformity of pure states. Our results indicate that quantum weight enumerators can be efficiently estimated on quantum computers, opening a path to calculating the distances of quantum error-correcting codes.
Submitted 26 June, 2024;
originally announced June 2024.
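As background for the entry above, here is a minimal sketch of the textbook two-state SWAP test, not the $n$-qubit parallelized variant the paper studies. For pure states, the ancilla is measured as 0 with probability 1/2 + |<psi|phi>|^2 / 2, which is a probability and therefore non-negative. The function names and the example states are assumptions chosen for illustration.

```python
import math


def inner_product(psi, phi):
    """Return <psi|phi> for two state vectors given as lists of complex amplitudes."""
    if len(psi) != len(phi):
        raise ValueError("state vectors must have the same dimension")
    return sum(a.conjugate() * b for a, b in zip(psi, phi))


def swap_test_p0(psi, phi):
    """Probability that the ancilla of a standard SWAP test reads 0.

    Assumes both inputs are normalized pure states.
    """
    overlap = abs(inner_product(psi, phi)) ** 2
    return 0.5 + 0.5 * overlap


# Illustrative single-qubit states (hypothetical inputs, not from the paper).
zero = [1 + 0j, 0j]
one = [0j, 1 + 0j]
plus = [complex(1 / math.sqrt(2)), complex(1 / math.sqrt(2))]

p_same = swap_test_p0(zero, zero)   # identical states
p_orth = swap_test_p0(zero, one)    # orthogonal states
p_half = swap_test_p0(zero, plus)   # overlap |<0|+>|^2 = 1/2
```

Identical states give the maximum probability and orthogonal states give one half, so the measured frequency of outcome 0 estimates the squared overlap.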
-
PlaMo: Plan and Move in Rich 3D Physical Environments
Authors:
Assaf Hallak,
Gal Dalal,
Chen Tessler,
Kelly Guo,
Shie Mannor,
Gal Chechik
Abstract:
Controlling humanoids in complex physically simulated worlds is a long-standing challenge with numerous applications in gaming, simulation, and visual content creation. In our setup, given a rich and complex 3D scene, the user provides a list of instructions composed of target locations and locomotion types. To solve this task we present PlaMo, a scene-aware path planner and a robust physics-based controller. The path planner produces a sequence of motion paths, considering the various limitations the scene imposes on the motion, such as location, height, and speed. Complementing the planner, our control policy generates rich and realistic physical motion adhering to the plan. We demonstrate how the combination of both modules enables traversing complex landscapes in diverse forms while responding to real-time changes in the environment. Video: https://youtu.be/wWlqSQlRZ9M .
Submitted 26 June, 2024;
originally announced June 2024.
-
SGSM: A Foundation-model-like Semi-generalist Sensing Model
Authors:
Tianjian Yang,
Hao Zhou,
Shuo Liu,
Kaiwen Guo,
Yiwen Hou,
Haohua Du,
Zhi Liu,
Xiang-Yang Li
Abstract:
The significance of intelligent sensing systems is growing in the realm of smart services. These systems extract relevant signal features and generate informative representations for particular tasks. However, building the feature extraction component for such systems requires extensive domain-specific expertise or data. The exceptionally rapid development of foundation models is likely to usher in newfound abilities in such intelligent sensing. We propose a new scheme for sensing models, which we refer to as the semi-generalist sensing model (SGSM). SGSM is able to semiautomatically solve various tasks using less task-specific labeled data than traditional systems. Built through the analysis of the common theoretical model, SGSM can depict different modalities, such as acoustic and Wi-Fi signals. Experimental results on these two heterogeneous sensors illustrate that SGSM functions across a wide range of scenarios, thereby establishing its broad applicability. In some cases, SGSM even achieves better performance than sensor-specific specialized solutions. Wi-Fi evaluations indicate a 20\% accuracy improvement when applying SGSM to an existing sensing model.
Submitted 15 June, 2024;
originally announced June 2024.
-
Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization
Authors:
Zijie Lou,
Gang Cao,
Kun Guo,
Haochen Zhu,
Lifang Yu
Abstract:
Image forgery localization, which aims to segment tampered regions in an image, is a fundamental yet challenging digital forensic task. While some deep learning-based forensic methods have achieved impressive results, they directly learn pixel-to-label mappings without fully exploiting the relationship between pixels in the feature space. To address this deficiency, we propose a Multi-view Pixel-wise Contrastive algorithm (MPC) for image forgery localization. Specifically, we first pre-train the backbone network with the supervised contrastive loss to model pixel relationships from the within-image, cross-scale and cross-modality perspectives. This aims to increase intra-class compactness and inter-class separability. Then the localization head is fine-tuned using the cross-entropy loss, resulting in a better pixel localizer. The MPC is trained on training datasets of three different scales to make a comprehensive and fair comparison with existing image forgery localization algorithms. Extensive experiments on the small, medium and large scale training datasets show that the proposed MPC achieves higher generalization performance and robustness against post-processing than state-of-the-art methods. Code will be available at https://github.com/multimediaFor/MPC.
Submitted 19 June, 2024;
originally announced June 2024.
-
Using graph neural networks to reconstruct charged pion showers in the CMS High Granularity Calorimeter
Authors:
M. Aamir,
B. Acar,
G. Adamov,
T. Adams,
C. Adloff,
S. Afanasiev,
C. Agrawal,
C. Agrawal,
A. Ahmad,
H. A. Ahmed,
S. Akbar,
N. Akchurin,
B. Akgul,
B. Akgun,
R. O. Akpinar,
E. Aktas,
A. AlKadhim,
V. Alexakhin,
J. Alimena,
J. Alison,
A. Alpana,
W. Alshehri,
P. Alvarez Dominguez,
M. Alyari,
C. Amendola
, et al. (550 additional authors not shown)
Abstract:
A novel method to reconstruct the energy of hadronic showers in the CMS High Granularity Calorimeter (HGCAL) is presented. The HGCAL is a sampling calorimeter with very fine transverse and longitudinal granularity. The active media are silicon sensors and scintillator tiles read out by SiPMs, and the absorbers are a combination of lead and Cu/CuW in the electromagnetic section, and steel in the hadronic section. The shower reconstruction method is based on graph neural networks and makes use of a dynamic reduction network architecture. It is shown that the algorithm is able to capture and mitigate the main effects that normally hinder the reconstruction of hadronic showers using classical reconstruction methods, by compensating for fluctuations in the multiplicity, energy, and spatial distributions of the shower's constituents. The performance of the algorithm is evaluated using test beam data collected in 2018 with a prototype of the CMS HGCAL accompanied by a section of the CALICE AHCAL prototype. The capability of the method to mitigate the impact of energy leakage from the calorimeter is also demonstrated.
Submitted 30 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Simultaneous Deep Learning of Myocardium Segmentation and T2 Quantification for Acute Myocardial Infarction MRI
Authors:
Yirong Zhou,
Chengyan Wang,
Mengtian Lu,
Kunyuan Guo,
Zi Wang,
Dan Ruan,
Rui Guo,
Peijun Zhao,
Jianhua Wang,
Naiming Wu,
Jianzhong Lin,
Yinyin Chen,
Hang **,
Lianxin Xie,
Lilan Wu,
Liuhong Zhu,
Jianjun Zhou,
Congbo Cai,
He Wang,
Xiaobo Qu
Abstract:
In cardiac Magnetic Resonance Imaging (MRI) analysis, simultaneous myocardial segmentation and T2 quantification are crucial for assessing myocardial pathologies. Existing methods often address these tasks separately, limiting their synergistic potential. To address this, we propose SQNet, a dual-task network integrating Transformer and Convolutional Neural Network (CNN) components. SQNet features a T2-refine fusion decoder for quantitative analysis, leveraging global features from the Transformer, and a segmentation decoder with multiple local region supervision for enhanced accuracy. A tight coupling module aligns and fuses CNN and Transformer branch features, enabling SQNet to focus on myocardium regions. Evaluation on healthy controls (HC) and acute myocardial infarction patients (AMI) demonstrates superior segmentation dice scores (89.3/89.2) compared to state-of-the-art methods (87.7/87.9). T2 quantification yields strong linear correlations (Pearson coefficients: 0.84/0.93) with label values for HC/AMI, indicating accurate mapping. Radiologist evaluations confirm SQNet's superior image quality scores (4.60/4.58 for segmentation, 4.32/4.42 for T2 quantification) over state-of-the-art methods (4.50/4.44 for segmentation, 3.59/4.37 for T2 quantification). SQNet thus offers accurate simultaneous segmentation and quantification, enhancing cardiac disease diagnosis, such as AMI.
Submitted 29 May, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
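The segmentation figures quoted in the abstract above are Dice scores. A minimal sketch of the standard Dice coefficient on binary masks follows. The flat-list mask representation, the function name, and the toy masks are assumptions for illustration, and this is not the SQNet evaluation code.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values of equal length.
    Two empty masks are treated as a perfect match (a common convention).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x2 masks flattened row by row (hypothetical values).
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
score = dice_score(predicted, reference)  # one shared pixel, three foreground pixels in total
```

Multiplying the result by 100 gives the percentage form in which such scores are often reported.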
-
1/3 and other magnetization plateaus in a quasi-one-dimensional Ising magnet $\mathbf{TbTi_3Bi_4}$ with zigzag spin chain
Authors:
Kaizhen Guo,
Zeyu Ma,
Hongxiong Liu,
Ziyang Wu,
Junfeng Wang,
Youguo Shi,
Yuan Li,
Shuang Jia
Abstract:
We report the magnetic properties of newly synthesized single crystals of $\mathrm{TbTi_3Bi_4}$, whose crystal structure is characterized by the stacking of terbium-based zigzag chains and titanium-based kagome lattices. This compound demonstrates extreme easy-axis magnetic anisotropy due to the crystalline-electric-field effect, which aligns the $\mathrm{Tb^{3+}}$ moments along the zigzag chain direction. As a result of the strong single-ion anisotropy and multiple magnetic interactions, $\mathrm{TbTi_3Bi_4}$ behaves as a quasi-one-dimensional Ising magnet with a remarkable antiferromagnetic ordering at $T_\mathrm{N}$ = 20.4 K. When a magnetic field is applied along the direction of the zigzag chain, multiple meta-magnetic transitions occur between 1/3 and other magnetization plateaus. We have created a field-temperature phase diagram and mapped out the complex magnetic structures resulting from frustration.
Submitted 15 May, 2024;
originally announced May 2024.
-
Magnetic fluctuation and dominant superconducting pairing symmetry near the tunable Van Hove singularity
Authors:
Xiaohan Kong,
Boyang Wen,
Kaiyi Guo,
Ying Liang,
Tianxing Ma
Abstract:
We have investigated the magnetism and pairing correlations of the triangular lattice based on the Hubbard model using the determinant quantum Monte Carlo method and the constrained path Monte Carlo method. The results show that the presence of the next-nearest-neighbor hopping integral $t^{\prime}$ introduces an additional energy scale to the system, and through $t^{\prime}$, one can regulate the shape of the density of states and thus the position of the van Hove singularity point. Increasing inverse temperature $\beta$ and on-site interaction $U$ favor the formation of ferromagnetic correlation in a rather large filling region, and the calculations for different lattice sizes show that the range of the ferromagnetic correlations is smaller than the smallest lattice simulated at the investigated temperatures. We study the different pairing correlations of the triangular lattice near several typical fillings and show that the $f$-wave pairing dominates the system in the filling region near the van Hove singularity point with a high density of states, where the ferromagnetic correlation is also enhanced. When the filling is close to half-filling, the $f$-wave pairing susceptibility is suppressed and the $f_n$-wave pairing susceptibility is enhanced; however, the effective pairing interactions for both $f$ wave and $f_n$ wave are negative, which indicates that neither $f$-wave nor $f_n$-wave superconductivity may exist. Finally, we find that the pairing channels of different symmetry in the system may be closely related to the magnetic properties. Ferromagnetic fluctuation favors the formation of $f$-wave pairing, while antiferromagnetic fluctuation tends to promote $f_n$-wave pairing.
Submitted 14 May, 2024;
originally announced May 2024.
-
AI-Cybersecurity Education Through Designing AI-based Cyberharassment Detection Lab
Authors:
Ebuka Okpala,
Nishant Vishwamitra,
Keyan Guo,
Song Liao,
Long Cheng,
Hongxin Hu,
Yongkai Wu,
Xiaohong Yuan,
Jeannette Wade,
Sajad Khorsandroo
Abstract:
Cyberharassment is a critical, socially relevant cybersecurity problem because of the adverse effects it can have on targeted groups or individuals. While progress has been made in understanding cyberharassment, its detection, attacks on artificial intelligence (AI) based cyberharassment systems, and the social problems in cyberharassment detectors, little has been done in designing experiential learning educational materials that engage students in this emerging social cybersecurity in the era of AI. Experiential learning opportunities are usually provided through capstone projects and engineering design courses in STEM programs such as computer science. While capstone projects are an excellent example of experiential learning, given the interdisciplinary nature of this emerging social cybersecurity problem, it can be challenging to use them to engage non-computing students without prior knowledge of AI. Because of this, we were motivated to develop a hands-on lab platform that provides experiential learning experiences to non-computing students with little or no background knowledge in AI, and we discuss the lessons learned in developing this lab. In this lab, used by social science students at North Carolina A&T State University across two semesters (spring and fall) in 2022, students are given a detailed lab manual and are to complete a set of well-detailed tasks. Through this process, students learn AI concepts and the application of AI for cyberharassment detection. Using pre- and post-surveys, we asked students to rate their knowledge or skills in AI and their understanding of the concepts learned. The results revealed that the students moderately understood the concepts of AI and cyberharassment.
Submitted 16 May, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Moderating Embodied Cyber Threats Using Generative AI
Authors:
Keyan Guo,
Freeman Guo,
Hongxin Hu
Abstract:
The advancement in computing and hardware, like spatial computing and VR headsets (e.g., Apple's Vision Pro) [1], has boosted the popularity of social VR platforms (VRChat, Rec Room, Meta Horizon Worlds) [2, 3, 4]. Unlike traditional digital interactions, social VR allows for more immersive experiences, with avatars that mimic users' real-time movements and enable physical-like interactions. However, the immersive nature of social VR may introduce intensified and more physicalized cyber threats, which we define as "embodied cyber threats", including trash-talking, virtual "groping", and similar forms of virtual harassment and assault. These new cyber threats are more realistic and invasive due to direct, virtual interactions, underscoring the urgent need for comprehensive understanding and practical strategies to enhance safety and security in virtual environments.
Submitted 23 April, 2024;
originally announced May 2024.
-
PackVFL: Efficient HE Packing for Vertical Federated Learning
Authors:
Liu Yang,
Shuowei Cai,
Di Chai,
Junxue Zhang,
Han Tian,
Yilun **,
Kun Guo,
Kai Chen,
Qiang Yang
Abstract:
As an essential tool of secure distributed machine learning, vertical federated learning (VFL) based on homomorphic encryption (HE) suffers from severe efficiency problems due to data inflation and time-consuming operations. To this end, we propose PackVFL, an efficient VFL framework based on packed HE (PackedHE), to accelerate the existing HE-based VFL algorithms. PackVFL packs multiple cleartexts into one ciphertext and supports single-instruction-multiple-data (SIMD)-style parallelism. We focus on designing a high-performance matrix multiplication (MatMult) method since it takes up most of the ciphertext computation time in HE-based VFL. Devising the MatMult method is also challenging for PackedHE because a slight difference in the packing scheme can substantially affect its computation and communication costs. Without domain-specific design, directly applying SOTA MatMult methods rarely achieves optimal performance.
Therefore, we make a three-fold design: 1) we systematically explore the current design space of MatMult and quantify the complexity of existing approaches to provide guidance; 2) we propose a hybrid MatMult method according to the unique characteristics of VFL; 3) we adaptively apply our hybrid method in representative VFL algorithms, leveraging distinctive algorithmic properties to further improve efficiency. As the batch size, feature dimension and model size of VFL scale up to large sizes, PackVFL consistently delivers enhanced performance. Empirically, PackVFL propels existing VFL algorithms to new heights, achieving up to a 51.52X end-to-end speedup. This represents a substantial 34.51X greater speedup compared to the direct application of SOTA MatMult methods.
Submitted 1 May, 2024;
originally announced May 2024.
-
Tianyu: search for the second solar system and explore the dynamic universe
Authors:
Fabo Feng,
Yicheng Rui,
Zhimao Du,
Qing Lin,
Congcong Zhang,
Dan Zhou,
Kaiming Cui,
Masahiro Ogihara,
Ming Yang,
Jie Lin,
Yongzhi Cai,
Taozhi Yang,
Xiaoying Pang,
Mingjie Jian,
Wenxiong Li,
Hengxiao Guo,
Xian Shi,
Jianchun Shi,
Jianyang Li,
Kangrou Guo,
Song Yao,
Aming Chen,
Peng Jia,
Xianyu Tan,
James S. Jenkins
, et al. (10 additional authors not shown)
Abstract:
Giant planets like Jupiter and Saturn play important roles in the formation and habitability of Earth-like planets. The detection of solar system analogs that have multiple cold giant planets is essential for our understanding of planet habitability and planet formation. Although transit surveys such as Kepler and TESS have discovered thousands of exoplanets, these missions are not sensitive to long-period planets due to their limited observation baseline. The Tianyu project, comprising two 1-meter telescopes (Tianyu-I and II), is designed to detect transiting cold giant planets in order to find solar system analogs. Featuring a large field of view and equipped with a high-speed CMOS camera, Tianyu-I will perform a high-precision photometric survey of about 100 million stars, measuring light curves at hour-long cadence. The candidates found by Tianyu-I will be confirmed by Tianyu-II and other surveys and follow-up facilities through multi-band photometry, spectroscopy, and high-resolution imaging. The Tianyu telescopes will be situated at an elevation of about 4000 meters in Lenghu, China. With a photometric precision of 1% for stars with V < 18 mag, Tianyu is expected to find more than 300 transiting exoplanets, including about 12 cold giant planets, over five years. A five-year survey of Tianyu would discover 1-2 solar system analogs. Moreover, Tianyu is also designed for non-exoplanetary exploration, incorporating multiple survey modes covering timescales from sub-seconds to months, with a particular emphasis on events occurring within the sub-second to hour range. It excels in observing targets such as infant supernovae, rare variable stars and binaries, tidal disruption events, Be stars, cometary activities, and interstellar objects. These discoveries not only enhance our comprehension of the universe but also offer compelling opportunities for public engagement in scientific exploration.
Submitted 10 April, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
-
Relative Occurrence Rate Between Hot and Cold Jupiters as an Indicator to Probe Planet Migration
Authors:
Tianjun Gan,
Kangrou Guo,
Beibei Liu,
Sharon X. Wang,
Shude Mao,
Johannes Buchner,
Benjamin J. Fulton
Abstract:
We propose a second-order statistic parameter $\varepsilon$, the relative occurrence rate between hot and cold Jupiters ($\varepsilon=\eta_{\rm HJ}/\eta_{\rm CJ}$), to probe the migration of gas giants. Since the planet occurrence rate is the combined outcome of the formation and migration processes, a joint analysis of hot and cold Jupiter frequency may shed light on the dynamical evolution of giant planet systems. We first investigate observationally how $\varepsilon$ behaves as the stellar mass changes. Based on the occurrence rate measurements of hot Jupiters ($\eta_{\rm HJ}$) from the TESS survey and cold Jupiters ($\eta_{\rm CJ}$) from the CLS survey, we find a tentative trend (97% confidence) that $\varepsilon$ drops when the stellar mass rises from $0.8$ to $1.4\ M_\odot$, which can be explained by different giant planet growth and disk migration timescales around different stars. We carry out planetesimal and pebble accretion simulations, both of which could reproduce the results of $\eta_{\rm HJ}$, $\eta_{\rm CJ}$ and $\varepsilon$. Our findings indicate that the classical core accretion + disk migration model can explain the observed decreasing trend of $\varepsilon$. We propose two ways to increase the significance of the trend and verify the anti-correlation. Future work is required to better constrain $\varepsilon$, especially for M dwarfs and for more massive stars.
Submitted 12 April, 2024; v1 submitted 10 April, 2024;
originally announced April 2024.
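The parameter defined in the abstract above is a plain ratio of two occurrence rates. A minimal sketch of that definition follows. The function name and the input values are hypothetical placeholders, not measurements from the paper, and no uncertainty propagation is attempted.

```python
def relative_occurrence_rate(eta_hj, eta_cj):
    """Return epsilon = eta_HJ / eta_CJ, the hot-to-cold Jupiter occurrence ratio.

    Both rates are fractions of stars hosting such a planet.
    """
    if eta_hj < 0:
        raise ValueError("eta_hj must be non-negative")
    if eta_cj <= 0:
        raise ValueError("eta_cj must be positive")
    return eta_hj / eta_cj


# Hypothetical occurrence rates for two stellar-mass bins (illustrative only).
eps_low_mass = relative_occurrence_rate(eta_hj=0.005, eta_cj=0.10)
eps_high_mass = relative_occurrence_rate(eta_hj=0.003, eta_cj=0.15)

# With these made-up inputs the ratio is smaller in the higher-mass bin.
ratio_decreases = eps_high_mass < eps_low_mass
```

Because the hot Jupiter rate sits in the numerator, a smaller ratio means hot Jupiters are rarer relative to cold Jupiters in that bin.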
-
Selected Open Problems in Continuous-Time Quantum Walks
Authors:
Gabriel Coutinho,
Krystal Guo
Abstract:
Quantum walks on graphs are fundamental to quantum computing and have led to many interesting open problems in algebraic graph theory. This review article highlights three key classes of open problems in this domain: perfect state transfer, instantaneous uniform mixing, and average mixing matrices. In highlighting these open problems, our aim is to stimulate further research and exploration in this rapidly evolving field.
Submitted 2 April, 2024;
originally announced April 2024.
-
The state-of-the-art in Cardiac MRI Reconstruction: Results of the CMRxRecon Challenge in MICCAI 2023
Authors:
Jun Lyu,
Chen Qin,
Shuo Wang,
Fanwen Wang,
Yan Li,
Zi Wang,
Kunyuan Guo,
Cheng Ouyang,
Michael Tänzer,
Meng Liu,
Longyu Sun,
Mengting Sun,
Qin Li,
Zhang Shi,
Sha Hua,
Hao Li,
Zhensen Chen,
Zhenlin Zhang,
Bingyu Xin,
Dimitris N. Metaxas,
George Yiasemis,
Jonas Teuwen,
Li** Zhang,
Weitian Chen,
Yidong Zhao
, et al. (25 additional authors not shown)
Abstract:
Cardiac MRI, crucial for evaluating heart structure and function, faces limitations like slow imaging and motion artifacts. Undersampling reconstruction, especially data-driven algorithms, has emerged as a promising solution to accelerate scans and enhance imaging performance using highly under-sampled data. Nevertheless, the scarcity of publicly available cardiac k-space datasets and evaluation platforms hinders the development of data-driven reconstruction algorithms. To address this issue, we organized the Cardiac MRI Reconstruction Challenge (CMRxRecon) in 2023, in collaboration with the 26th International Conference on MICCAI. CMRxRecon presented an extensive k-space dataset comprising cine and mapping raw data, accompanied by detailed annotations of cardiac anatomical structures. With overwhelming participation, the challenge attracted more than 285 teams and over 600 participants. Among them, 22 teams successfully submitted Docker containers for the testing phase, with 7 teams submitting for both cine and mapping tasks. All teams use deep learning based approaches, indicating that deep learning has become the predominant approach to the problem. The first-place winner of both tasks utilizes the E2E-VarNet architecture as its backbone. In contrast, U-Net is still the most popular backbone for both multi-coil and single-coil reconstructions. This paper provides a comprehensive overview of the challenge design, presents a summary of the submitted results, reviews the employed methods, and offers an in-depth discussion that aims to inspire future advancements in cardiac MRI reconstruction models. The summary emphasizes the effective strategies observed in cardiac MRI reconstruction, including backbone architecture, loss function, pre-processing techniques, physical modeling, and model complexity, thereby providing valuable insights for further developments in this field.
Submitted 16 April, 2024; v1 submitted 1 April, 2024;
originally announced April 2024.
-
Moderating Illicit Online Image Promotion for Unsafe User-Generated Content Games Using Large Vision-Language Models
Authors:
Keyan Guo,
Ayush Utkarsh,
Wenbo Ding,
Isabelle Ondracek,
Ziming Zhao,
Guo Freeman,
Nishant Vishwamitra,
Hongxin Hu
Abstract:
Online user-generated content games (UGCGs) are increasingly popular among children and adolescents for social interaction and more creative online entertainment. However, they pose a heightened risk of exposure to explicit content, raising growing concerns for the online safety of children and adolescents. Despite these concerns, few studies have addressed the issue of illicit image-based promotions of unsafe UGCGs on social media, which can inadvertently attract young users. This challenge arises from the difficulty of obtaining comprehensive training data for UGCG images and the unique nature of these images, which differ from traditional unsafe content. In this work, we take the first step towards studying the threat of illicit promotions of unsafe UGCGs. We collect a real-world dataset comprising 2,924 images that display diverse sexually explicit and violent content used to promote UGCGs by their game creators. Our in-depth studies reveal a new understanding of this problem and the urgent need for automatically flagging illicit UGCG promotions. We additionally create a cutting-edge system, UGCG-Guard, designed to aid social media platforms in effectively identifying images used for illicit UGCG promotions. This system leverages recently introduced large vision-language models (VLMs) and employs a novel conditional prompting strategy for zero-shot domain adaptation, along with chain-of-thought (CoT) reasoning for contextual identification. UGCG-Guard achieves outstanding results, with an accuracy rate of 94% in detecting these images used for the illicit promotion of such games in real-world scenarios.
Submitted 27 March, 2024;
originally announced March 2024.
-
LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation
Authors:
Ke Guo,
Zhenwei Miao,
Wei **g,
Weiwei Liu,
Weizi Li,
Dayang Hao,
Jia Pan
Abstract:
Microscopic traffic simulation plays a crucial role in transportation engineering by providing insights into individual vehicle behavior and overall traffic flow. However, creating a realistic simulator that accurately replicates human driving behaviors in various traffic conditions presents significant challenges. Traditional simulators relying on heuristic models often fail to deliver accurate simulations due to the complexity of real-world traffic environments. Due to the covariate shift issue, existing imitation learning-based simulators often fail to generate stable long-term simulations. In this paper, we propose a novel approach called learner-aware supervised imitation learning to address the covariate shift problem in multi-agent imitation learning. By leveraging a variational autoencoder simultaneously modeling the expert and learner state distribution, our approach augments expert states such that the augmented state is aware of learner state distribution. Our method, applied to urban traffic simulation, demonstrates significant improvements over existing state-of-the-art baselines in both short-term microscopic and long-term macroscopic realism when evaluated on the real-world dataset pNEUMA.
Submitted 23 May, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Improving the JPEG-resistance of Adversarial Attacks on Face Recognition by Interpolation Smoothing
Authors:
Kefu Guo,
Fengfan Zhou,
Hefei Ling,
** Li,
Hui Liu
Abstract:
JPEG compression can significantly impair the performance of adversarial face examples, which previous adversarial attacks on face recognition (FR) have not adequately addressed. Considering this challenge, we propose a novel adversarial attack on FR that aims to improve the resistance of adversarial examples against JPEG compression. Specifically, during the iterative process of generating adversarial face examples, we interpolate the adversarial face examples into a smaller size. Then we utilize these interpolated adversarial face examples to create the adversarial examples in the next iteration. Subsequently, we restore the adversarial face examples to their original size by interpolating. Throughout the entire process, our proposed method can smooth the adversarial perturbations, effectively mitigating the presence of high-frequency signals in the crafted adversarial face examples that are typically eliminated by JPEG compression. Our experimental results demonstrate the effectiveness of our proposed method in improving the JPEG-resistance of adversarial face examples.
Submitted 26 February, 2024;
originally announced February 2024.
-
Defending Jailbreak Prompts via In-Context Adversarial Game
Authors:
Yujun Zhou,
Yufei Han,
Haomin Zhuang,
Taicheng Guo,
Kehan Guo,
Zhenwen Liang,
Hongyan Bao,
Xiangliang Zhang
Abstract:
Large Language Models (LLMs) demonstrate remarkable capabilities across diverse applications. However, concerns regarding their security, particularly the vulnerability to jailbreak attacks, persist. Drawing inspiration from adversarial training in deep learning and LLM agent learning processes, we introduce the In-Context Adversarial Game (ICAG) for defending against jailbreaks without the need for fine-tuning. ICAG leverages agent learning to conduct an adversarial game, aiming to dynamically extend knowledge to defend against jailbreaks. Unlike traditional methods that rely on static datasets, ICAG employs an iterative process to enhance both the defense and attack agents. This continuous improvement process strengthens defenses against newly generated jailbreak prompts. Our empirical studies affirm ICAG's efficacy, where LLMs safeguarded by ICAG exhibit significantly reduced jailbreak success rates across various attack scenarios. Moreover, ICAG demonstrates remarkable transferability to other LLMs, indicating its potential as a versatile defense mechanism.
Submitted 20 February, 2024;
originally announced February 2024.
-
Quantum Light Generation based on GaN Microring towards Fully On-chip Source
Authors:
Hong Zeng,
Zhao-Qin He,
Yun-Ru Fan,
Yue Luo,
Chen Lyu,
**-Peng Wu,
Yun-Bo Li,
Sheng Liu,
Dong Wang,
De-Chao Zhang,
Juan-Juan Zeng,
Guang-Wei Deng,
You Wang,
Hai-Zhi Song,
Zhen Wang,
Li-Xing You,
Kai Guo,
Chang-Zheng Sun,
Yi Luo,
Guang-Can Guo,
Qiang Zhou
Abstract:
Integrated quantum light sources are increasingly desirable in large-scale quantum information processing. Despite recent remarkable advances, new material platforms are constantly being explored for the fully on-chip integration of quantum light generation, active and passive manipulation, and detection. Here, for the first time, we demonstrate gallium nitride (GaN) microring based quantum light generation in the telecom C-band, which has potential for the monolithic integration of quantum light sources. In our demonstration, the GaN microring has a free spectral range of 330 GHz and a near-zero anomalous dispersion region of over 100 nm. The generation of energy-time entangled photon pairs is demonstrated with a typical raw two-photon interference visibility of 95.5$\pm$6.5%, which is further configured to generate heralded single photons with a typical heralded second-order auto-correlation $g^{(2)}_{H}(0)$ of 0.045$\pm$0.001. Our results pave the way for developing chip-scale quantum photonic circuits.
Submitted 13 February, 2024;
originally announced February 2024.
-
Investigating Out-of-Distribution Generalization of GNNs: An Architecture Perspective
Authors:
Kai Guo,
Hongzhi Wen,
Wei **,
Yaming Guo,
Jiliang Tang,
Yi Chang
Abstract:
Graph neural networks (GNNs) have exhibited remarkable performance under the assumption that test data comes from the same distribution as the training data. However, in real-world scenarios, this assumption may not always be valid. Consequently, there is a growing focus on exploring the Out-of-Distribution (OOD) problem in the context of graphs. Most existing efforts have primarily concentrated on improving graph OOD generalization from two model-agnostic perspectives: data-driven methods and strategy-based learning. However, there has been limited attention dedicated to investigating the impact of well-known GNN model architectures on graph OOD generalization, which is orthogonal to existing research. In this work, we provide the first comprehensive investigation of OOD generalization on graphs from an architecture perspective, by examining the common building blocks of modern GNNs. Through extensive experiments, we reveal that both the graph self-attention mechanism and the decoupled architecture contribute positively to graph OOD generalization. In contrast, we observe that the linear classification layer tends to compromise graph OOD generalization capability. Furthermore, we provide in-depth theoretical insights and discussions to underpin these discoveries. These insights have empowered us to develop a novel GNN backbone model, DGAT, designed to harness the robust properties of both the graph self-attention mechanism and the decoupled architecture. Extensive experimental results demonstrate the effectiveness of our model under graph OOD, exhibiting substantial and consistent enhancements across various training strategies.
Submitted 14 February, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark
Authors:
Zhenwen Liang,
Kehan Guo,
Gang Liu,
Taicheng Guo,
Yujun Zhou,
Tianyu Yang,
Jiajun Jiao,
Renjie Pi,
Jipeng Zhang,
Xiangliang Zhang
Abstract:
The paper introduces SceMQA, a novel benchmark for scientific multimodal question answering at the college entrance level. It addresses a critical educational phase often overlooked in existing benchmarks, spanning high school to pre-college levels. SceMQA focuses on core science subjects including Mathematics, Physics, Chemistry, and Biology. It features a blend of multiple-choice and free-response formats, ensuring a comprehensive evaluation of AI models' abilities. Additionally, our benchmark provides specific knowledge points for each problem and detailed explanations for each answer. SceMQA also uniquely presents problems with identical contexts but varied questions to facilitate a more thorough and accurate assessment of reasoning capabilities. In the experiment, we evaluate both open-source and closed-source state-of-the-art Multimodal Large Language Models (MLLMs) across various experimental settings. The results show that further research and development are needed to build more capable MLLMs, as highlighted by the accuracy of only 50% to 60% achieved by the strongest models. Our benchmark and analysis will be available at https://scemqa.github.io/
Submitted 6 February, 2024;
originally announced February 2024.
-
Deterministic Computing Power Networking: Architecture, Technologies and Prospects
Authors:
Qingmin Jia,
Yujiao Hu,
Xiaomao Zhou,
Qianpiao Ma,
Kai Guo,
Huayu Zhang,
Renchao Xie,
Tao Huang,
Yunjie Liu
Abstract:
With the development of new Internet services such as computation-intensive and delay-sensitive tasks, the traditional "Best Effort" network transmission mode has been greatly challenged. The network system is urgently required to provide end-to-end transmission determinacy and computing determinacy for new applications to ensure the safe and efficient operation of services. Based on research into the convergence of computing and networking, a new network paradigm named deterministic computing power networking (Det-CPN) is proposed. In this article, we first introduce research advances in computing power networking. We then analyze the motivations and scenarios of Det-CPN. Following that, we present the system architecture, technological capabilities, workflow, and key technologies of Det-CPN. Finally, the challenges and future trends of Det-CPN are analyzed and discussed.
Submitted 31 January, 2024;
originally announced January 2024.
-
Probing orbits of stellar mass objects deep in galactic nuclei with quasi-periodic eruptions
Authors:
Cong Zhou,
Lei Huang,
Kangrou Guo,
Ya-** Li,
Zhen Pan
Abstract:
Quasi-periodic eruptions (QPEs) are intense repeating soft X-ray bursts from nearby galactic nuclei, with recurrence times of a few to ten hours. The origin of QPEs is still unclear. In this work, we investigate the extreme mass ratio inspiral (EMRI) + accretion disk model, where the disk is formed from a previous tidal disruption event (TDE). In this EMRI+TDE disk model, the QPEs are the result of collisions between a TDE disk and a stellar mass object (a stellar mass black hole or a main sequence star) orbiting a supermassive black hole (SMBH) in a galactic nucleus. If this interpretation is correct, QPEs will be invaluable in probing the orbits of stellar mass objects in the vicinity of SMBHs, and further in inferring the formation of EMRIs, which are one of the primary targets of spaceborne gravitational wave missions. Taking GSN 069 as an example, we find that its EMRI has a low eccentricity ($e<0.1$ at 3-$σ$ confidence level) and a semi-major axis of about $O(10^2)$ gravitational radii of the central SMBH, which is consistent with the prediction of the wet EMRI formation channel, while incompatible with alternatives.
Submitted 21 May, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
Hankel matrices acting on the Dirichlet space
Authors:
Guanlong Bao,
Kunyu Guo,
Fangmei Sun,
Zipeng Wang
Abstract:
The characterization of the boundedness of operators induced by Hankel matrices on analytic function spaces can be traced back to the work of Z. Nehari and H. Widom on the Hardy space, and has recently been studied extensively on many other analytic function spaces. However, this question remains open in the context of the Dirichlet space [20]. Using Carleson measures, the Widom type condition, and the reproducing kernel thesis, this paper provides a comprehensive solution to this question. As a by-product, characterizations of the boundedness and compactness of operators induced by Cesàro type matrices on the Dirichlet space are given. In addition, we show that a random Dirichlet function almost surely induces a compact Hankel type operator on the Dirichlet space.
Submitted 17 January, 2024;
originally announced January 2024.
-
QAnswer: Towards Question Answering Search over Websites
Authors:
Kunpeng Guo,
Clement Defretiere,
Dennis Diefenbach,
Christophe Gravier,
Antoine Gourru
Abstract:
Question Answering (QA) is increasingly used by search engines to provide results to their end-users, yet very few websites currently use QA technologies for their search functionality. To illustrate the potential of QA technologies for the website search practitioner, we demonstrate web searches that combine QA over knowledge graphs and QA over free text, two tasks that are usually tackled separately. We also discuss the benefits and drawbacks of both approaches for website searches. Our case studies use websites hosted by the Wikimedia Foundation (namely Wikipedia and Wikidata). Unlike a search engine (e.g. Google, Bing, etc.), the data are indexed integrally, i.e. we do not index only a subset, and they are indexed exclusively, i.e. we index only data available on the corresponding website.
Submitted 17 January, 2024;
originally announced January 2024.
-
Fine-tuning Strategies for Domain Specific Question Answering under Low Annotation Budget Constraints
Authors:
Kunpeng Guo,
Dennis Diefenbach,
Antoine Gourru,
Christophe Gravier
Abstract:
The progress introduced by pre-trained language models and their fine-tuning has resulted in significant improvements in most downstream NLP tasks. The unsupervised training of a language model combined with further target task fine-tuning has become the standard QA fine-tuning procedure. In this work, we demonstrate that this strategy is sub-optimal for fine-tuning QA models, especially under a low QA annotation budget, which is a usual setting in practice due to the extractive QA labeling cost. We draw our conclusions by conducting an exhaustive analysis of the performance of alternatives to the sequential fine-tuning strategy on different QA datasets. Based on the experiments performed, we observe that the best strategy for fine-tuning a QA model in low-budget settings is to take a pre-trained language model (PLM) and fine-tune it on a dataset composed of the target dataset and the SQuAD dataset. With zero extra annotation effort, the best strategy outperforms the standard strategy by 2.28% to 6.48%. Our experiments provide one of the first investigations of how best to fine-tune a QA system under a low budget and are therefore of the utmost practical interest to QA practitioners.
Submitted 17 January, 2024;
originally announced January 2024.
-
Wikidata as a seed for Web Extraction
Authors:
Kunpeng Guo,
Dennis Diefenbach,
Antoine Gourru,
Christophe Gravier
Abstract:
Wikidata has grown into a knowledge graph of impressive size. To date, it contains more than 17 billion triples collecting information about people, places, films, stars, publications, proteins, and many more. On the other hand, most of the information on the Web is not published in highly structured data repositories like Wikidata, but rather as unstructured and semi-structured content, more concretely in HTML pages containing text and tables. Finding, monitoring, and organizing this data in a knowledge graph requires considerable work from human editors. The volume and complexity of the data make this task difficult and time-consuming. In this work, we present a framework that is able to identify and extract new facts that are published under multiple Web domains so that they can be proposed for validation by Wikidata editors. The framework relies on question-answering technologies. We take inspiration from ideas that are used to extract facts from textual collections and adapt them to extract facts from Web pages. To achieve this, we demonstrate that language models can be adapted to extract facts not only from textual collections but also from Web pages. By exploiting the information already contained in Wikidata, the proposed framework can be trained without the need for any additional learning signals and can extract new facts for a wide range of properties and domains. Following this path, Wikidata can be used as a seed to extract facts on the Web. Our experiments show that we can achieve a mean F1-score of 84.07. Moreover, our estimations show that we can potentially extract millions of facts that can be proposed for human validation. The goal is to help editors in their daily tasks and contribute to the completion of the Wikidata knowledge graph.
Submitted 15 January, 2024;
originally announced January 2024.
-
Causally Aware Generative Adversarial Networks for Light Pollution Control
Authors:
Yuyao Zhang,
Ke Guo,
Xiao Zhou
Abstract:
Artificial light plays an integral role in modern cities, significantly enhancing human productivity and the efficiency of civilization. However, excessive illumination can lead to light pollution, which imposes economic burdens and poses non-negligible threats to ecosystems and human health. Despite its critical importance, the exploration of its causes remains relatively limited within the field of artificial intelligence, leaving the factors contributing to light pollution incompletely understood and sustainable illumination planning a distant goal. To address this gap, we introduce a novel framework named Causally Aware Generative Adversarial Networks (CAGAN). This approach aims to uncover the fundamental drivers of light pollution within cities and offer intelligent solutions for optimal illumination resource allocation in the context of sustainable urban development. We commence by examining light pollution across 33,593 residential areas in seven global metropolises. Our findings reveal that various building types substantially influence light pollution levels, with grasslands, commercial centers, and residential buildings as notable contributors. These discovered causal relationships are integrated into the generative modeling framework, guiding the process of generating light pollution maps for diverse residential areas. Extensive experiments showcase CAGAN's potential to inform and guide the implementation of effective strategies to mitigate light pollution. Our code and data are publicly available at https://github.com/zhangyuuao/Light_Pollution_CAGAN.
Submitted 12 January, 2024;
originally announced January 2024.
-
URHand: Universal Relightable Hands
Authors:
Zhaoxi Chen,
Gyeongsik Moon,
Kaiwen Guo,
Chen Cao,
Stanislav Pidhorskyi,
Tomas Simon,
Rohan Joshi,
Yuan Dong,
Yichen Xu,
Bernardo Pires,
He Wen,
Lucas Evans,
Bo Peng,
Julia Buffalini,
Autumn Trimble,
Kevyn McPhail,
Melissa Schoeller,
Shoou-I Yu,
Javier Romero,
Michael Zollhöfer,
Yaser Sheikh,
Ziwei Liu,
Shunsuke Saito
Abstract:
Existing photorealistic relightable hand models require extensive identity-specific observations in different views, poses, and illuminations, and face challenges in generalizing to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows few-shot personalization using images captured with a mobile phone, and is ready to be photorealistically rendered under novel illuminations. To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities. The key challenge is scaling the cross-identity training while maintaining personalized fidelity and sharp details without compromising generalization under natural illuminations. To this end, we propose a spatially varying linear lighting model as the neural renderer that takes physics-inspired shading as input feature. By removing non-linear activations and bias, our specifically designed lighting model explicitly keeps the linearity of light transport. This enables single-stage training from light-stage data while generalizing to real-time rendering under arbitrary continuous illuminations across diverse identities. In addition, we introduce the joint learning of a physically based model and our neural relighting model, which further improves fidelity and generalization. Extensive experiments show that our approach achieves superior performance over existing methods in terms of both quality and generalizability. We also demonstrate quick personalization of URHand from a short phone scan of an unseen identity.
Submitted 10 January, 2024;
originally announced January 2024.
-
An Investigation of Large Language Models for Real-World Hate Speech Detection
Authors:
Keyan Guo,
Alexander Hu,
Jaden Mu,
Ziheng Shi,
Ziming Zhao,
Nishant Vishwamitra,
Hongxin Hu
Abstract:
Hate speech has emerged as a major problem plaguing our social spaces today. While there have been significant efforts to address this problem, existing methods are still significantly limited in effectively detecting hate speech online. A major limitation of existing methods is that hate speech detection is a highly contextual problem, and these methods cannot fully capture the context of hate speech to make accurate predictions. Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks. LLMs have undergone extensive training using vast amounts of natural language data, enabling them to grasp intricate contextual details. Hence, they could be used as knowledge bases for context-aware hate speech detection. However, a fundamental problem with using LLMs to detect hate speech is that there are no studies on effectively prompting LLMs for context-aware hate speech detection. In this study, we conduct a large-scale study of hate speech detection, employing five established hate speech datasets. We discover that LLMs not only match but often surpass the performance of current benchmark machine learning models in identifying hate speech. By proposing four diverse prompting strategies that optimize the use of LLMs in detecting hate speech, our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech by fully utilizing the knowledge base in LLMs, significantly outperforming existing techniques. Furthermore, although LLMs can provide a rich knowledge base for the contextual detection of hate speech, suitable prompting strategies play a crucial role in effectively leveraging this knowledge base for efficient detection.
Submitted 6 January, 2024;
originally announced January 2024.
-
Not all Minorities are Equal: Empty-Class-Aware Distillation for Heterogeneous Federated Learning
Authors:
Kuangpu Guo,
Yuhe Ding,
Jian Liang,
Ran He,
Zilei Wang,
Tieniu Tan
Abstract:
Data heterogeneity, characterized by disparities in local data distribution across clients, poses a significant challenge in federated learning. Substantial efforts have been devoted to addressing the heterogeneity in local label distribution. As minority classes suffer from worse accuracy due to overfitting on local imbalanced data, prior methods often incorporate class-balanced learning techniques during local training. Despite the improved mean accuracy across all classes, we observe that empty classes (categories absent from a client's data distribution) are still not well recognized. This paper introduces FedED, a novel approach in heterogeneous federated learning that integrates both empty-class distillation and logit suppression simultaneously. Specifically, empty-class distillation leverages knowledge distillation during local training on each client to retain essential information related to empty classes from the global model. Moreover, logit suppression directly penalizes network logits for non-label classes, effectively addressing misclassifications in minority classes that may be biased toward majority classes. Extensive experiments validate the efficacy of FedED, surpassing previous state-of-the-art methods across diverse datasets with varying degrees of label distribution shift.
Submitted 4 January, 2024;
originally announced January 2024.
-
Moderating New Waves of Online Hate with Chain-of-Thought Reasoning in Large Language Models
Authors:
Nishant Vishwamitra,
Keyan Guo,
Farhan Tajwar Romit,
Isabelle Ondracek,
Long Cheng,
Ziming Zhao,
Hongxin Hu
Abstract:
Online hate is an escalating problem that negatively impacts the lives of Internet users, and is also subject to rapid changes due to evolving events, resulting in new waves of online hate that pose a critical threat. Detecting and mitigating these new waves present two key challenges: it demands reasoning-based complex decision-making to determine the presence of hateful content, and the limited availability of training samples hinders updating the detection model. To address this critical issue, we present a novel framework called HATEGUARD for effectively moderating new waves of online hate. HATEGUARD employs a reasoning-based approach that leverages the recently introduced chain-of-thought (CoT) prompting technique, harnessing the capabilities of large language models (LLMs). HATEGUARD further achieves prompt-based zero-shot detection by automatically generating and updating detection prompts with new derogatory terms and targets in new wave samples to effectively address new waves of online hate. To demonstrate the effectiveness of our approach, we compile a new dataset consisting of tweets related to three recently witnessed new waves: the 2022 Russian invasion of Ukraine, the 2021 insurrection of the US Capitol, and the COVID-19 pandemic. Our studies reveal crucial longitudinal patterns in these new waves concerning the evolution of events and the pressing need for techniques to rapidly update existing moderation tools to counteract them. Comparative evaluations against state-of-the-art tools illustrate the superiority of our framework, showcasing a substantial 22.22% to 83.33% improvement in detecting the three new waves of online hate. Our work highlights the severe threat posed by the emergence of new waves of online hate and represents a paradigm shift in addressing this threat practically.
Submitted 10 May, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space
Authors:
Mohsin Hasan,
Guojun Zhang,
Kaiyang Guo,
Xi Chen,
Pascal Poupart
Abstract:
Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client's dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting the need for well-calibrated models that represent the uncertainty of predictions. The closest FL techniques to achieving such goals are the Bayesian FL methods which collect parameter samples from local posteriors, and aggregate them to approximate the global posterior. To improve scalability for larger models, one common Bayesian approach is to approximate the global predictive posterior by multiplying local predictive posteriors. In this work, we demonstrate that this method gives systematically overconfident predictions, and we remedy this by proposing $β$-Predictive Bayes, a Bayesian FL algorithm that interpolates between a mixture and product of the predictive posteriors, using a tunable parameter $β$. This parameter is tuned to improve the global ensemble's calibration, before it is distilled to a single model. Our method is evaluated on a variety of regression and classification datasets to demonstrate its superiority in calibration to other baselines, even as data heterogeneity increases. Code available at https://github.com/hasanmohsin/betaPredBayesFL
Submitted 9 January, 2024; v1 submitted 15 December, 2023;
originally announced December 2023.
-
Singular Trudinger--Moser inequality involving $L^{p}$ norm in bounded domain
Authors:
Kaiwen Guo,
Yanjun Liu
Abstract:
In this paper, we use the method of blow-up analysis and capacity estimate to derive the singular Trudinger--Moser inequality involving $N$-Finsler--Laplacian and $L^{p}$ norm. Precisely, for any $p>1$, $0\leqγ<γ_{1}:= \inf\limits_{u\in W^{1, N}_{0}(Ω)\backslash \{0\}}\frac{\int_ΩF^{N}(\nabla u)dx}{\| u\|_p^N}$ and $0\leqβ<N$, we have \begin{align} \sup_{u\in W_{0}^{1,N}(Ω),\;\int_ΩF^{N}(\nabla u)dx-γ\| u\|_p^N\leq1}\int_Ω\frac{e^{λ_{N}(1-\fracβ{N})\lvert u\rvert^{\frac{N}{N-1}}}}{F^{o}(x)^β}\;\mathrm{d}x<+\infty\notag, \end{align} where $λ_{N}=N^{\frac{N}{N-1}} κ_{N}^{\frac{1}{N-1}}$ and $κ_{N}$ is the volume of a unit Wulff ball in $\mathbb{R}^N$. Moreover, extremal functions for the inequality are also obtained. When $F=\lvert\cdot\rvert$ and $p=N$, we can obtain the singular version of the Tintarev type inequality from the above inequality, namely, for any $0\leqα<α_{1}(Ω):=\inf\limits_{u\in W^{1, N}_{0}(Ω)\backslash \{0\}}\frac{\int_Ω|\nabla u|^Ndx}{\| u\|_N^N}$ and $0\leqβ<N$, it holds $$ \sup_{u\in W_{0}^{1,N}(Ω),\;\int_Ω\lvert\nabla u\rvert^{N}\;\mathrm{d}x-α\|u\|_{N}^{N}\leq1}\int_Ω\frac{e^{α_{N}(1-\fracβ{N})\lvert u\rvert^{\frac{N}{N-1}}}}{\lvert x\rvert^β}\;\mathrm{d}x<+\infty, $$ where $α_{N}:=N^{\frac{N}{N-1}}ω_{N}^{\frac{1}{N-1}}$ and $ ω_{N}$ is the volume of the unit ball in $\mathbb{R}^{N}$. Our results extend many well-known Trudinger--Moser type inequalities to a more general setting.
Submitted 16 November, 2023;
originally announced November 2023.
-
Preliminary Design of CSNS-II Linac SRF LLRF
Authors:
Zhexin Xie,
Kai Guo,
Zhencheng Mu,
Xinpeng Ma,
Nan Gan,
Maliang Wan,
Bo Wang,
Linyan Rong,
Hui Zhang,
Hexin Wang
Abstract:
The target power of the China Spallation Neutron Source (CSNS) will be upgraded from 300 kW to 500 kW (CSNS-II), and the energy gain of the H-Linac will increase from 80 MeV to 300 MeV using about 50 superconducting cavities. The LLRF system is an important device for controlling the amplitude and phase of the SRF cavity field to within 0.6% and 0.6 deg, respectively. The parameters and requirements for the CSNS-II Linac LLRF are presented here. The preliminary design work, together with the progress and results of algorithm verification at C-ADS Injector-I, is introduced.
Submitted 16 November, 2023;
originally announced November 2023.
-
Nonlinear dielectric geometric-phase metasurface with simultaneous structure and lattice symmetry design
Authors:
Bingyi Liu,
René Geromel,
Zhaoxian Su,
Kai Guo,
Yongtian Wang,
Zhongyi Guo,
Lingling Huang,
Thomas Zentgraf
Abstract:
In this work, we utilize thin dielectric meta-atoms placed on a silver substrate to efficiently enhance and manipulate the third harmonic generation. We theoretically and experimentally reveal that when the structural symmetry of the meta-atom is incompatible with the lattice symmetry of an array, some generalized nonlinear geometric phases appear, which offers new possibilities for harmonic generation control beyond the accessible symmetries governed by the selection rule. The underlying mechanism is attributed to the modified rotation of the effective principal axis of a dense meta-atom array, where the strong coupling among the units gives rise to a generalized linear geometric phase modulation on the pump light. Therefore, nonlinear geometric phases carried by the third-harmonic emissions are the natural result of the wave-mixing process among the modes excited at the fundamental frequency. This mechanism further points out a new strategy to predict the nonlinear geometric phases delivered by the nanostructures according to their linear responses. Our design is simple and efficient, and offers alternatives for the nonlinear meta-devices that are capable of flexible photon generation and manipulation.
Submitted 13 November, 2023;
originally announced November 2023.
-
CorrTalk: Correlation Between Hierarchical Speech and Facial Activity Variances for 3D Animation
Authors:
Zhaojie Chu,
Kailing Guo,
Xiaofen Xing,
Yilin Lan,
Bolun Cai,
Xiangmin Xu
Abstract:
Speech-driven 3D facial animation is a challenging cross-modal task that has attracted growing research interest. During speaking activities, the mouth displays strong motions, while the other facial regions typically demonstrate comparatively weak activity levels. Existing approaches often simplify the process by directly mapping single-level speech features to the entire facial animation, which overlooks the differences in facial activity intensity and leads to overly smoothed facial movements. In this study, we propose a novel framework, CorrTalk, which effectively establishes the temporal correlation between hierarchical speech features and facial activities of different intensities across distinct regions. A novel facial activity intensity metric is defined to distinguish between strong and weak facial activity, obtained by computing the short-time Fourier transform of facial vertex displacements. Based on the variances in facial activity, we propose a dual-branch decoding framework to synchronously synthesize strong and weak facial activity, which guarantees wider intensity facial animation synthesis. Furthermore, a weighted hierarchical feature encoder is proposed to establish temporal correlation between hierarchical speech features and facial activity at different intensities, which ensures lip-sync and plausible facial expressions. Extensive qualitative and quantitative experiments as well as a user study indicate that our CorrTalk outperforms existing state-of-the-art methods. The source code and supplementary video are publicly available at: https://zjchu.github.io/projects/CorrTalk/
Submitted 17 October, 2023;
originally announced October 2023.
-
Drivable Avatar Clothing: Faithful Full-Body Telepresence with Dynamic Clothing Driven by Sparse RGB-D Input
Authors:
Donglai Xiang,
Fabian Prada,
Zhe Cao,
Kaiwen Guo,
Chenglei Wu,
Jessica Hodgins,
Timur Bagautdinov
Abstract:
Clothing is an important part of human appearance but challenging to model in photorealistic avatars. In this work we present avatars with dynamically moving loose clothing that can be faithfully driven by sparse RGB-D inputs as well as body and face motion. We propose a Neural Iterative Closest Point (N-ICP) algorithm that can efficiently track the coarse garment shape given sparse depth input. Given the coarse tracking results, the input RGB-D images are then remapped to texel-aligned features, which are fed into the drivable avatar models to faithfully reconstruct appearance details. We evaluate our method against recent image-driven synthesis baselines, and conduct a comprehensive analysis of the N-ICP algorithm. We demonstrate that our method can generalize to a novel testing environment, while preserving the ability to produce high-fidelity and faithful clothing dynamics and appearance.
Submitted 11 October, 2023; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Modeling non-uniform uncertainty in Reaction Prediction via Boosting and Dropout
Authors:
Taicheng Guo,
Changsheng Ma,
Xiuying Chen,
Bozhao Nan,
Kehan Guo,
Shichao Pei,
Nitesh V. Chawla,
Olaf Wiest,
Xiangliang Zhang
Abstract:
Reaction prediction has been recognized as a critical task in synthetic chemistry, where the goal is to predict the outcome of a reaction based on the given reactants. With the widespread adoption of generative models, the Variational Autoencoder (VAE) framework has typically been employed to tackle challenges in reaction prediction, where the reactants are encoded as a condition for the decoder, which then generates the product. Despite their effectiveness, these conditional VAE (CVAE) models still fail to adequately account for the inherent uncertainty in reaction prediction, which primarily stems from the stochastic reaction process. The principal limitations are twofold. Firstly, in these CVAE models, the prior is independent of the reactants, leading to a default wide and assumed uniform distribution variance of the generated product. Secondly, reactants with analogous molecular representations are presumed to undergo similar electronic transition processes, thereby producing similar products. This hinders the ability to model diverse reaction mechanisms effectively. Since the variance in outcomes is inherently non-uniform, we are thus motivated to develop a framework that generates reaction products with non-uniform uncertainty. Firstly, we eliminate the latent variable in previous CVAE models to mitigate uncontrollable noise. Instead, we introduce randomness into product generation via boosting to ensemble diverse models and cover the range of potential outcomes, and through dropout to secure models with minor variations. Additionally, we design a ranking method to merge the predictions from boosting and dropout, prioritizing the most plausible products. Experimental results on the largest reaction prediction benchmark USPTO-MIT show the superior performance of our proposed method in modeling the non-uniform uncertainty compared to baselines.
Submitted 6 October, 2023;
originally announced October 2023.
-
Dynamic Shuffle: An Efficient Channel Mixture Method
Authors:
Kaijun Gong,
Zhuowen Yin,
Yushu Li,
Kailing Guo,
Xiangmin Xu
Abstract:
The redundancy of convolutional neural networks depends not only on weights but also on inputs. Shuffling is an efficient operation for mixing channel information, but the shuffle order is usually pre-defined. To reduce the data-dependent redundancy, we devise a dynamic shuffle module to generate data-dependent permutation matrices for shuffling. Since the dimension of a permutation matrix is proportional to the square of the number of input channels, to make the generation process efficient, we divide the channels into groups and generate two shared small permutation matrices for each group, and utilize the Kronecker product and cross-group shuffle to obtain the final permutation matrices. To make the generation process learnable, based on theoretical analysis, softmax, orthogonal regularization, and binarization are employed to asymptotically approximate the permutation matrix. Dynamic shuffle adaptively mixes channel information with negligible extra computation and memory occupancy. Experimental results on the image classification benchmark datasets CIFAR-10, CIFAR-100, Tiny ImageNet and ImageNet show that our method significantly increases ShuffleNets' performance. By adding the dynamically generated matrix to a learnable static matrix, we further propose static-dynamic-shuffle and show that it can serve as a lightweight replacement for ordinary pointwise convolution.
Submitted 4 October, 2023;
originally announced October 2023.
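
The Dynamic Shuffle abstract above relies on the fact that the Kronecker product of two small permutation matrices is itself a (larger) permutation matrix. The following is a minimal sketch of that fact only, not the authors' module: the helper names `kron`, `is_permutation`, and `shuffle_channels` and the example matrices are hypothetical, and the learnable generation, binarization, and cross-group shuffle steps are omitted.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def is_permutation(m):
    """True if m is square with exactly one 1 per row and column, 0 elsewhere."""
    n = len(m)
    one_hot = [0] * (n - 1) + [1]
    rows_ok = all(len(row) == n and sorted(row) == one_hot for row in m)
    cols_ok = all(sorted(col) == one_hot for col in zip(*m))
    return rows_ok and cols_ok


def shuffle_channels(perm, channels):
    """Apply a permutation matrix: output[i] is the channel j with perm[i][j] == 1."""
    return [channels[row.index(1)] for row in perm]


# Two small (hypothetical) permutation matrices: sizes 2 and 3.
P1 = [[0, 1],
      [1, 0]]
P2 = [[0, 0, 1],
      [1, 0, 0],
      [0, 1, 0]]

# Their Kronecker product permutes 2 * 3 = 6 channels.
P = kron(P1, P2)
shuffled = shuffle_channels(P, list("abcdef"))
```

Storing the two small factors takes 4 + 9 entries instead of 36 for the full matrix, which illustrates why factoring the permutation keeps the generation cost low as the channel count grows.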
-
LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch
Authors:
Pucheng Zhai,
Kailing Guo,
Fang Liu,
Xiaofen Xing,
Xiangmin Xu
Abstract:
Structured pruning is a commonly used convolutional neural network (CNN) compression approach. Pruning rate setting is a fundamental problem in structured pruning. Most existing works introduce too many additional learnable parameters to assign different pruning rates across different layers in a CNN, or cannot control the compression rate explicitly. Since a network that is too narrow blocks information flow during training, automatic pruning rate setting cannot explore a high pruning rate for a specific layer. To overcome these limitations, we propose a novel framework named Layer Adaptive Progressive Pruning (LAPP), which gradually compresses the network during the first few epochs of training from scratch. In particular, LAPP designs an effective and efficient pruning strategy that introduces a learnable threshold for each layer and FLOPs constraints for the network. Guided by both task loss and FLOPs constraints, the learnable thresholds are dynamically and gradually updated to accommodate changes of importance scores during training. Therefore, the pruning strategy can gradually prune the network and automatically determine the appropriate pruning rates for each layer. Moreover, to maintain the expressive power of the pruned layer, before training starts we introduce an additional lightweight bypass for each convolutional layer to be pruned, which adds only a small overhead. Our method demonstrates superior performance gains over previous compression methods on various datasets and backbone architectures. For example, on CIFAR-10, our method compresses ResNet-20 to 40.3% without accuracy drop. On ImageNet, 55.6% of the FLOPs of ResNet-18 are removed with a 0.21% top-1 accuracy increase and a 0.40% top-5 accuracy increase.
Submitted 25 September, 2023;
originally announced September 2023.
-
Multivariate Prototype Representation for Domain-Generalized Incremental Learning
Authors:
Can Peng,
Piotr Koniusz,
Kaiyu Guo,
Brian C. Lovell,
Peyman Moghadam
Abstract:
Deep learning models suffer from catastrophic forgetting when being fine-tuned with samples of new classes. This issue becomes even more pronounced when faced with the domain shift between training and testing data. In this paper, we study the critical and less explored Domain-Generalized Class-Incremental Learning (DGCIL). We design a DGCIL approach that remembers old classes, adapts to new classes, and can reliably classify objects from unseen domains. Specifically, our loss formulation maintains classification boundaries and suppresses the domain-specific information of each class. With no old exemplars stored, we use knowledge distillation and estimate old class prototype drift as incremental training advances. Our prototype representations are based on multivariate Normal distributions whose means and covariances are constantly adapted to changing model features to represent old classes well by adapting to the feature space drift. For old classes, we sample pseudo-features from the adapted Normal distributions with the help of Cholesky decomposition. In contrast to previous pseudo-feature sampling strategies that rely solely on average mean prototypes, our method excels at capturing varying semantic information. Experiments on several benchmarks validate our claims.
Submitted 24 September, 2023;
originally announced September 2023.
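
The abstract above mentions sampling pseudo-features from multivariate Normal prototypes with the help of Cholesky decomposition. A minimal sketch of that standard sampling step follows, assuming a symmetric positive-definite covariance: with cov = L Lᵀ and z drawn from a standard Normal, mean + L z has the desired mean and covariance. The function names and the example prototype are hypothetical; how the paper adapts means and covariances over time is not shown.

```python
import math
import random


def cholesky(cov):
    """Lower-triangular L with L @ L^T == cov (cov must be symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L


def sample_pseudo_feature(mean, chol, rng):
    """Draw one sample x = mean + L z, where z is standard Normal noise."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


# Hypothetical 2-D class prototype (mean and covariance are made-up numbers).
proto_mean = [1.0, -1.0]
proto_cov = [[4.0, 2.0],
             [2.0, 3.0]]

chol = cholesky(proto_cov)
rng = random.Random(0)  # seeded for reproducibility
pseudo_features = [sample_pseudo_feature(proto_mean, chol, rng) for _ in range(5)]
```

Sampling through the Cholesky factor lets the pseudo-features reflect the full covariance of a class rather than only its mean, which matches the contrast the abstract draws with mean-only prototypes.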
-
CINFormer: Transformer network with multi-stage CNN feature injection for surface defect segmentation
Authors:
Xiaoheng Jiang,
Kaiyi Guo,
Yang Lu,
Feng Yan,
Hao Liu,
Jiale Cao,
Mingliang Xu,
Dacheng Tao
Abstract:
Surface defect inspection is of great importance for industrial manufacture and production. Though defect inspection methods based on deep learning have made significant progress, there are still some challenges for these methods, such as indistinguishable weak defects and defect-like interference in the background. To address these issues, we propose a transformer network with multi-stage CNN (Convolutional Neural Network) feature injection for surface defect segmentation, which is a UNet-like structure named CINFormer. CINFormer presents a simple yet effective feature integration mechanism that injects the multi-level CNN features of the input image into different stages of the transformer network in the encoder. This retains both the CNN's strength in capturing detailed features and the transformer's strength in suppressing background noise, which facilitates accurate defect detection. In addition, CINFormer presents a Top-K self-attention module to focus on tokens with more important information about the defects, so as to further reduce the impact of the redundant background. Extensive experiments conducted on the surface defect datasets DAGM 2007, Magnetic tile, and NEU show that the proposed CINFormer achieves state-of-the-art performance in defect detection.
Submitted 22 September, 2023;
originally announced September 2023.
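
The CINFormer abstract above describes a Top-K self-attention module that focuses on the most informative tokens. One common way to realize such a mechanism, shown here purely as an assumption and not as the paper's actual design, is to keep only the k largest attention scores for a query and renormalize them with a softmax, so all other tokens receive zero weight. The function name `topk_softmax` and the example scores are hypothetical.

```python
import math


def topk_softmax(scores, k):
    """Softmax over the k largest scores; every other position gets weight 0."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Indices of the k highest-scoring tokens for this query.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Subtract the max before exponentiating for numerical stability.
    top = max(scores[i] for i in keep)
    exps = {i: math.exp(scores[i] - top) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(scores))]


# Hypothetical attention scores of one query token against four key tokens.
weights = topk_softmax([2.0, 0.5, 1.0, -1.0], k=2)
```

With k equal to the number of tokens, this reduces to an ordinary softmax, so k acts as a knob for how aggressively low-scoring background tokens are discarded.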
-
CMRxRecon: An open cardiac MRI dataset for the competition of accelerated image reconstruction
Authors:
Chengyan Wang,
Jun Lyu,
Shuo Wang,
Chen Qin,
Kunyuan Guo,
Xinyu Zhang,
Xiaotong Yu,
Yan Li,
Fanwen Wang,
Jianhua **,
Zhang Shi,
Ziqiang Xu,
Yapeng Tian,
Sha Hua,
Zhensen Chen,
Meng Liu,
Mengting Sun,
Xutong Kuang,
Kang Wang,
Haoran Wang,
Hao Li,
Yinghua Chu,
Guang Yang,
Wenjia Bai,
Xiahai Zhuang
, et al. (3 additional authors not shown)
Abstract:
Cardiac magnetic resonance imaging (CMR) has emerged as a valuable diagnostic tool for cardiac diseases. However, a limitation of CMR is its slow imaging speed, which causes patient discomfort and introduces artifacts in the images. There has been growing interest in deep learning-based CMR imaging algorithms that can reconstruct high-quality images from highly under-sampled k-space data. However, the development of deep learning methods requires large training datasets, which have not been publicly available for CMR. To address this gap, we released a dataset that includes multi-contrast, multi-view, multi-slice and multi-coil CMR imaging data from 300 subjects. Imaging studies include cardiac cine and mapping sequences. Manual segmentations of the myocardium and chambers of all the subjects are also provided within the dataset. Scripts of state-of-the-art reconstruction algorithms were also provided as a point of reference. Our aim is to facilitate the advancement of state-of-the-art CMR image reconstruction by introducing standardized evaluation criteria and making the dataset freely accessible to the research community. Researchers can access the dataset at https://www.synapse.org/#!Synapse:syn51471091/wiki/.
Submitted 19 September, 2023;
originally announced September 2023.
-
Effective Image Tampering Localization via Enhanced Transformer and Co-attention Fusion
Authors:
Kun Guo,
Haochen Zhu,
Gang Cao
Abstract:
Powerful manipulation techniques have made digital image forgeries easy to create and widespread, without leaving visual anomalies. The blind localization of tampered regions has therefore become important for image forensics. In this paper, we propose an effective image tampering localization network (EITLNet) based on a two-branch enhanced transformer encoder with attention-based feature fusion. Specifically, a feature enhancement module is designed to enhance the feature representation ability of the transformer encoder. The features extracted from RGB and noise streams are fused effectively by the coordinate attention-based fusion module at multiple scales. Extensive experimental results verify that the proposed scheme achieves state-of-the-art generalization ability and robustness on various benchmark datasets. Code will be public at https://github.com/multimediaFor/EITLNet.
Submitted 17 September, 2023;
originally announced September 2023.
-
Diagnosing the role of observable distribution shift in scientific replications
Authors:
Ying **,
Kevin Guo,
Dominik Rothenhäusler
Abstract:
Many researchers have identified distribution shift as a likely contributor to the reproducibility crisis in behavioral and biomedical sciences. The idea is that if treatment effects vary across individual characteristics and experimental contexts, then studies conducted in different populations will estimate different average effects. This paper uses "generalizability" methods to quantify how much of the effect size discrepancy between an original study and its replication can be explained by distribution shift on observed unit-level characteristics. More specifically, we decompose this discrepancy into "components" attributable to sampling variability (including publication bias), observable distribution shifts, and residual factors. We compute this decomposition for several directly-replicated behavioral science experiments and find little evidence that observable distribution shifts contribute appreciably to non-replicability. In some cases, this is because there is too much statistical noise. In other cases, there is strong evidence that controlling for additional moderators is necessary for reliable replication.
Submitted 2 September, 2023;
originally announced September 2023.
-
The stability of unevenly spaced planetary systems
Authors:
Sheng Yang,
Liangyu Wu,
Zekai Zheng,
Masahiro Ogihara,
Kangrou Guo,
Wenzhan Ouyang,
Yaxing He
Abstract:
Studying the orbital stability of multi-planet systems is essential for understanding planet formation, estimating how long an observed planetary system remains stable, and advancing population synthesis models. Although previous studies have primarily focused on ideal systems characterized by uniform orbital separations, in reality a diverse range of orbital separations exists among planets within the same system. This study investigates the dynamical stability of systems with non-uniform separations. We considered a system of 10 planets, each with a mass of $10^{-7}$ solar masses, around a central star with a mass of $1$ solar mass. We performed more than 100,000 N-body simulation runs with different parameters. The results demonstrate that reducing the spacing of merely one pair of planets leads to orbital crossing times an order of magnitude shorter, which could be formulated based on the Keplerian periods of the closest-separation pair. Furthermore, the first collisions are found to be closely associated with the first encounter pair, which is likely to be the pair with the closest initial separation. We conclude that estimating the orbital crossing time and the colliding pairs in a realistic situation would require updating the formula derived for evenly spaced systems.
Submitted 31 August, 2023;
originally announced August 2023.
-
Formation of inner planets in the presence of a Cold Jupiter: orbital evolution and relative velocities of planetesimals
Authors:
Kangrou Guo,
Eiichiro Kokubo
Abstract:
We investigate the orbital evolution of planetesimals in the inner disk in the presence of nebula gas and a (proto-) cold Jupiter. By varying the mass, eccentricity, and semi-major axis of the planet, we study the dependence of the relative velocities of the planetesimals on these parameters. For classic small planetesimals ($10^{16}-10^{20}$ g) whose mutual gravitational interaction is negligible, gas drag introduces a size-dependent alignment of orbits and keeps the relative velocity low for similar-size bodies, while preventing orbital alignment for different-size planetesimals. Regardless of the location and the mass ratio of the planetesimals, increasing the mass and eccentricity or decreasing the orbital distance of the planet always leads to higher relative velocities of planetesimals. However, for massive planetesimals, the interplay of viscous stirring, gas damping, and secular perturbation results in lower velocity dispersion of equal-size planetesimals when the planet is more massive or when it is located on a closer or more eccentric orbit. The random velocities of such planetesimals remain almost unperturbed when the planet is located beyond Jupiter's current orbit, or when it is less massive or less eccentric than Jupiter. Unlike small planetesimals, such large planetesimals can grow in a runaway fashion as in the unperturbed case. Our results imply that the presence of a cold Jupiter does not impede the formation of inner rocky planets through planetesimal accretion, provided that the planetesimals are initially large.
Submitted 28 August, 2023;
originally announced August 2023.