Self-Convinced Prompting: Few-Shot Question Answering with Repeated Introspection

Authors: Haodi Zhang, Min Cai, Xinhe Zhang, Chen Jason Zhang, Rui Mao, Kaishun Wu

Abstract: While large language models (LLMs) such as ChatGPT and PaLM have demonstrated remarkable performance in various language understanding and generation tasks, their capabilities in complex reasoning and intricate knowledge utilization still fall short of human-level proficiency. Recent studies have established the effectiveness of prompts in steering LLMs towards generating desired outputs. Building on these insights, we introduce a novel framework that harnesses the potential of large-scale pre-trained language models to iteratively enhance the performance of the LLMs. Our framework incorporates three components: Normal CoT, a Convincer, and an Answerer. It processes the output of a typical few-shot chain-of-thought prompt, assesses the correctness of the response, scrutinizes the answer, refines the reasoning, and ultimately produces a new solution. Experimental results on 7 datasets of miscellaneous problems validate the efficacy of the Self-Convince framework, achieving substantial improvements compared to the baselines. This study contributes to the burgeoning body of research focused on integrating pre-trained language models with tailored prompts and iterative refinement processes to augment their performance in complex tasks.

Submitted 10 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

arXiv:2310.04610 [pdf, other]

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Authors: Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, Xiaoxia Wu, Jeff Rasley, Ammar Ahmad Awan, Connor Holmes, Martin Cai, Adam Ghanem, Zhongzhu Zhou, Yuxiong He, Pete Luferenko, Divya Kumar, Jonathan Weyn, Ruixiong Zhang, Sylwester Klocek, Volodymyr Vragov, Mohammed AlQuraishi, Gustaf Ahdritz, Christina Floristean, Cristina Negri, et al. (67 additional authors not shown)

Abstract: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present the DeepSpeed4Science initiative (deepspeed4science.ai), which aims to build unique capabilities through AI system technology innovations to help domain experts unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.

Submitted 11 October, 2023; v1 submitted 6 October, 2023; originally announced October 2023.

arXiv:2309.12530 [pdf, other]

A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance

Authors: Zeyi Huang, Andy Zhou, Zijian Lin, Mu Cai, Haohan Wang, Yong Jae Lee

Abstract: Domain generalization studies the problem of training a model with samples from several domains (or distributions) and then testing the model with samples from a new, unseen domain. In this paper, we propose a novel approach for domain generalization that leverages recent advances in large vision-language models, specifically a CLIP teacher model, to train a smaller model that generalizes to unseen domains. The key technical contribution is a new type of regularization that requires the student's learned image representations to be close to the teacher's learned text representations obtained from encoding the corresponding text descriptions of images. We introduce two designs of the loss function, absolute and relative distance, which provide specific guidance on how the training process of the student model should be regularized. We evaluate our proposed method, dubbed RISE (Regularized Invariance with Semantic Embeddings), on various benchmark datasets and show that it outperforms several state-of-the-art domain generalization methods. To our knowledge, our work is the first to leverage knowledge distillation using a large vision-language model for domain generalization. By incorporating text-based information, RISE improves the generalization capability of machine learning models.

Submitted 21 September, 2023; originally announced September 2023.

Comments: to appear at ICCV2023

arXiv:2309.10313 [pdf, other]

Investigating the Catastrophic Forgetting in Multimodal Large Language Models

Authors: Yuexiang Zhai, Shengbang Tong, Xiao Li, Mu Cai, Qing Qu, Yong Jae Lee, Yi Ma

Abstract: Following the success of GPT4, there has been a surge in interest in multimodal large language model (MLLM) research. This line of research focuses on developing general-purpose LLMs through fine-tuning pre-trained LLMs and vision models. However, catastrophic forgetting, a notorious phenomenon where the fine-tuned model fails to retain similar performance compared to the pre-trained model, still remains an inherent problem in multimodal LLMs (MLLM). In this paper, we introduce EMT: Evaluating MulTimodality for evaluating the catastrophic forgetting in MLLMs, by treating each MLLM as an image classifier. We first apply EMT to evaluate several open-source fine-tuned MLLMs and we discover that almost all evaluated MLLMs fail to retain the same performance levels as their vision encoders on standard image classification tasks. Moreover, we continue fine-tuning LLaVA, an MLLM, and utilize EMT to assess performance throughout the fine-tuning. Interestingly, our results suggest that early-stage fine-tuning on an image dataset improves performance across other image datasets, by enhancing the alignment of text and visual features. However, as fine-tuning proceeds, the MLLMs begin to hallucinate, resulting in a significant loss of generalizability, even when the image encoder remains frozen. Our results suggest that MLLMs have yet to demonstrate performance on par with their vision models on standard image classification tasks and the current MLLM fine-tuning procedure still has room for improvement.

Submitted 5 December, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

arXiv:2309.08813 [pdf, other]

Control Barrier Function for Linearizable Systems with High Relative Degrees from Signal Temporal Logics: A Reference Governor Approach

Authors: Kaier Liang, Mingyu Cai, Cristian-Ioan Vasile

Abstract: This paper considers the safety-critical navigation problem with Signal Temporal Logic (STL) tasks. We developed an explicit reference governor-guided control barrier function (ERG-guided CBF) method that enables the application of first-order CBFs to high-order linearizable systems. This method significantly reduces the conservativeness of the existing CBF approaches for high-order systems. Furthermore, our framework provides safety-critical guarantees in the sense of obstacle avoidance by constructing the margin of safety and updating the direction of safe evolution in the agent's state space. To improve control performance and enhance STL satisfaction, we employ efficient gradient-based methods for iteratively learning optimal parameters of the ERG-guided CBF. We validate the algorithm on both high-order linear and nonlinear systems. A video demonstration can be found at https://youtu.be/ZRmsA2FeFR4

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2309.04198 [pdf, other]

Don't Ignore Dual Logic Ability of LLMs while Privatizing: A Data-Intensive Analysis in Medical Domain

Authors: Yanrui Du, Sendong Zhao, Muzhen Cai, Ming Ma, Danyang Zhao, Jiawei Cao, Bing Qin

Abstract: Extensive studies have been devoted to privatizing general-domain Large Language Models (LLMs) as Domain-Specific LLMs via feeding specific-domain data. However, these privatization efforts often ignored a critical aspect: Dual Logic Ability, which is a core reasoning ability for LLMs. The dual logic ability of LLMs ensures that they can maintain a consistent stance when confronted with both positive and negative statements about the same fact. Our study focuses on how the dual logic ability of LLMs is affected during the privatization process in the medical domain. We conduct several experiments to analyze the dual logic ability of LLMs by examining the consistency of the stance in responses to paired questions about the same fact. In our experiments, interestingly, we observed a significant decrease in the dual logic ability of existing LLMs after privatization. Besides, our results indicate that incorporating general domain dual logic data into LLMs not only enhances LLMs' dual logic ability but also further improves their accuracy. These findings underscore the importance of prioritizing LLMs' dual logic ability during the privatization process. Our study establishes a benchmark for future research aimed at exploring LLMs' dual logic ability during the privatization process and offers valuable guidance for privatization efforts in real-world applications.

Submitted 23 February, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

arXiv:2309.04175 [pdf, other]

Knowledge-tuning Large Language Models with Structured Medical Knowledge Bases for Reliable Response Generation in Chinese

Authors: Haochun Wang, Sendong Zhao, Zewen Qiang, Zijian Li, Nuwa Xi, Yanrui Du, MuZhen Cai, Haoqiang Guo, Yuhan Chen, Haoming Xu, Bing Qin, Ting Liu

Abstract: Large Language Models (LLMs) have demonstrated remarkable success in diverse natural language processing (NLP) tasks in general domains. However, LLMs sometimes generate responses with hallucinations about medical facts due to limited domain knowledge. Such shortcomings pose potential risks in the utilization of LLMs within medical contexts. To address this challenge, we propose knowledge-tuning, which leverages structured medical knowledge bases for the LLMs to grasp domain knowledge efficiently and facilitate reliable response generation. We also release cMedKnowQA, a Chinese medical knowledge question-answering dataset constructed from medical knowledge bases to assess the medical knowledge proficiency of LLMs. Experimental results show that LLMs that are knowledge-tuned with cMedKnowQA can exhibit higher levels of accuracy in response generation compared with vanilla instruction-tuning, and offer a new reliable way for the domain adaptation of LLMs.

Submitted 8 September, 2023; originally announced September 2023.

Comments: 11 pages, 5 figures

arXiv:2309.04174 [pdf, other]

Manifold-based Verbalizer Space Re-embedding for Tuning-free Prompt-based Classification

Authors: Haochun Wang, Sendong Zhao, Chi Liu, Nuwa Xi, Muzhen Cai, Bing Qin, Ting Liu

Abstract: Prompt-based classification adapts tasks to a cloze question format utilizing the [MASK] token, and the filled tokens are then mapped to labels through pre-defined verbalizers. Recent studies have explored the use of verbalizer embeddings to reduce labor in this process. However, all existing studies require a tuning process for either the pre-trained models or additional trainable embeddings. Meanwhile, the distance between high-dimensional verbalizer embeddings should not be measured by Euclidean distance due to the potential for non-linear manifolds in the representation space. In this study, we propose a tuning-free manifold-based space re-embedding method called Locally Linear Embedding with Intra-class Neighborhood Constraint (LLE-INC) for verbalizer embeddings, which preserves local properties within the same class as guidance for classification. Experimental results indicate that even without tuning any parameters, our LLE-INC is on par with automated verbalizers with parameter tuning. With parameter updating, our approach further enhances prompt-based tuning by up to 3.2%. Furthermore, experiments with LLaMA-7B and LLaMA-13B indicate that LLE-INC is an efficient tuning-free classification approach for hyper-scale language models.

Submitted 29 January, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: Accepted by AAAI 2024, 11 pages, 3 figures

arXiv:2308.12033 [pdf, other]

PREFER: Prompt Ensemble Learning via Feedback-Reflect-Refine

Authors: Chenrui Zhang, Lin Liu, Jinpeng Wang, Chuyuan Wang, Xiao Sun, Hongyu Wang, Mingchen Cai

Abstract: As an effective tool for eliciting the power of Large Language Models (LLMs), prompting has recently demonstrated unprecedented abilities across a variety of complex tasks. To further improve the performance, prompt ensemble has attracted substantial interest for tackling the hallucination and instability of LLMs. However, existing methods usually adopt a two-stage paradigm, which requires a pre-prepared set of prompts with substantial manual effort, and is unable to perform directed optimization for different weak learners. In this paper, we propose a simple, universal, and automatic method named PREFER (Prompt Ensemble learning via Feedback-Reflect-Refine) to address the stated limitations. Specifically, given the fact that weak learners are supposed to focus on hard examples during boosting, PREFER builds a feedback mechanism for reflecting on the inadequacies of existing weak learners. Based on this, the LLM is required to automatically synthesize new prompts for iterative refinement. Moreover, to enhance stability of the prompt effect evaluation, we propose a novel prompt bagging method involving forward and backward thinking, which is superior to majority voting and is beneficial for both feedback and weight calculation in boosting. Extensive experiments demonstrate that our PREFER achieves state-of-the-art performance in multiple types of tasks by a significant margin. We have made our code publicly available.

Submitted 23 August, 2023; originally announced August 2023.

Comments: 8 pages, 4 figures

arXiv:2308.09370 [pdf, other]

TrOMR: Transformer-Based Polyphonic Optical Music Recognition

Authors: Yixuan Li, Huaping Liu, Qiang Jin, Miaomiao Cai, Peng Li

Abstract: Optical Music Recognition (OMR) is an important technology in music and has been researched for a long time. Previous approaches for OMR are usually based on CNN for image understanding and RNN for music symbol classification. In this paper, we propose a transformer-based approach with excellent global perceptual capability for end-to-end polyphonic OMR, called TrOMR. We also introduce a novel consistency loss function and a reasonable approach for data annotation to improve recognition accuracy for complex music scores. Extensive experiments demonstrate that TrOMR outperforms current OMR methods, especially in real-world scenarios. We also develop a TrOMR system and build a camera scene dataset for full-page music scores in real-world settings. The code and datasets will be made available for reproducibility.

Submitted 18 August, 2023; originally announced August 2023.

Journal ref: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:2308.01396 [pdf, other]

Cross-phase modulation in the two dimensional spectroscopy

Authors: Mao-Rui Cai, Xue Zhang, Zi-Qian Cheng, Teng-Fei Yan, Hui Dong

Abstract: Developing from the transient absorption (TA) spectroscopy, the two dimensional (2D) spectroscopy with pump-probe geometry has emerged as a versatile approach for alleviating the difficulty of implementing the 2D spectroscopy with other geometries. However, the presence of cross-phase modulation (XPM) in TA spectroscopy introduces significant spectral distortions, particularly when the pump and probe pulses overlap. We demonstrate that this phenomenon extends to the 2D spectroscopy with pump-probe geometry and that the XPM is induced by the interference of the two pump pulses. We present the oscillatory behavior of XPM in the 2D spectrum and its displacement with respect to the waiting time delay through both experimental measurements and numerical simulations. Additionally, we explore the influence of probe pulse chirp on XPM and discover that by compressing the chirp, the impact of XPM on the desired signal can be reduced.

Submitted 2 August, 2023; originally announced August 2023.

Comments: 11 pages, 7 figures

arXiv:2307.06862 [pdf]

doi 10.1364/JOCN.499530

Building a digital twin of EDFA: a grey-box modeling approach

Authors: Yichen Liu, Xiaomin Liu, Yihao Zhang, Meng Cai, Mengfan Fu, Xueying Zhong, Lilin Yi, Weisheng Hu, Qunbi Zhuge

Abstract: To enable intelligent and self-driving optical networks, high-accuracy physical layer models are required. The dynamic wavelength-dependent gain effects of non-constant-pump erbium-doped fiber amplifiers (EDFAs) remain a crucial problem in terms of modeling, as they determine the optical signal-to-noise ratio as well as the magnitude of fiber nonlinearities. Black-box data-driven models have been widely studied, but they require a large amount of data for training and suffer from poor generalizability. In this paper, we derive the gain spectra of EDFAs as a simple univariable linear function, and then based on it we propose a grey-box EDFA gain modeling scheme. Experimental results show that for both automatic gain control (AGC) and automatic power control (APC) EDFAs, our model built with 8 data samples can achieve better performance than the neural network (NN) based model built with 900 data samples, which means the required data size for modeling can be reduced by at least two orders of magnitude. Moreover, in the experiment the proposed model demonstrates superior generalizability to unseen scenarios since it is based on the underlying physics of EDFAs. The results indicate that building a customized digital twin of each EDFA in optical networks becomes feasible, which is essential especially for next generation multi-band network operations.

Submitted 13 July, 2023; originally announced July 2023.

arXiv:2307.02011 [pdf, other]

Precise WiFi Indoor Positioning using Deep Learning Algorithms

Authors: Minxue Cai, Zihuai Lin

Abstract: This study demonstrates a WiFi indoor positioning system using Deep Learning algorithms. A new method using a fitting function in MATLAB is utilized to compute the path loss coefficient and log-normal fading variance. To reduce the error, a new hybrid localization approach utilizing Received Signal Strength Indicator (RSSI) and Angle of Arrival (AoA) has been created. Three Deep Learning algorithms are utilized to decrease the adverse influence of noise and interference. This paper compares the performance of two models in three different indoor environments. The average error of our hybrid positioning model trained by CNN in the big classroom is less than 250 mm.

Submitted 4 July, 2023; originally announced July 2023.

arXiv:2307.01975 [pdf, ps, other]

Strong convergence rates for a full discretization of stochastic wave equation with nonlinear damping

Authors: Meng Cai, David Cohen, Xiaojie Wang

Abstract: The paper establishes the strong convergence rates of a spatio-temporal full discretization of the stochastic wave equation with nonlinear damping in dimension one and two. We discretize the SPDE by applying a spectral Galerkin method in space and a modified implicit exponential Euler scheme in time. The presence of the super-linearly growing damping in the underlying model brings challenges into the error analysis. To address these difficulties, we first achieve upper mean-square error bounds, and then obtain mean-square convergence rates of the considered numerical solution. This is done without requiring the moment bounds of the full approximations. The main result shows that, in dimension one, the scheme admits a convergence rate of order $\tfrac12$ in space and order $1$ in time. In dimension two, the error analysis is more subtle and can be done at the expense of an order reduction due to an infinitesimal factor. Numerical experiments are performed and confirm our theoretical findings.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 30 pages, 2 figures

arXiv:2306.06094 [pdf, other]

Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding

Authors: Mu Cai, Zeyi Huang, Yuheng Li, Haohan Wang, Yong Jae Lee

Abstract: Recently, large language models (LLMs) have made significant advancements in natural language understanding and generation. However, their potential in computer vision remains largely unexplored. In this paper, we introduce a new, exploratory approach that enables LLMs to process images using the Scalable Vector Graphics (SVG) format. By leveraging the XML-based textual descriptions of SVG representations instead of raster images, we aim to bridge the gap between the visual and textual modalities, allowing LLMs to directly understand and manipulate images without the need for parameterized visual components. Our method facilitates simple image classification, generation, and in-context learning using only LLM capabilities. We demonstrate the promise of our approach across discriminative and generative tasks, highlighting its (i) robustness against distribution shift, (ii) substantial improvements achieved by tapping into the in-context learning abilities of LLMs, and (iii) image understanding and generation capabilities with human guidance. Our code, data, and models can be found at https://github.com/mu-cai/svg-llm.

Submitted 9 June, 2023; originally announced June 2023.

arXiv:2306.03885 [pdf, other]

doi 10.1016/j.asoc.2023.111066

Three-way Imbalanced Learning based on Fuzzy Twin SVM

Authors: Wanting Cai, Mingjie Cai, Qingguo Li, Qiong Liu

Abstract: Three-way decision (3WD) is a powerful tool for granular computing to deal with uncertain data, commonly used in information systems, decision-making, and medical care. Three-way decision has been widely studied in traditional rough set models. However, it is rarely combined with the currently popular field of machine learning to expand its research. In this paper, three-way decision is connected with SVM, a standard binary classification model in machine learning, for solving imbalanced classification problems, where SVM needs improvement. A new three-way fuzzy membership function and a new fuzzy twin support vector machine with three-way membership (TWFTSVM) are proposed. The new three-way fuzzy membership function is defined to increase the certainty of uncertain data in both input space and feature space, which assigns higher fuzzy membership to minority samples compared with majority samples. To evaluate the effectiveness of the proposed model, comparative experiments are designed for forty-seven different datasets with varying imbalance ratios. In addition, datasets with different imbalance ratios are derived from the same dataset to further assess the proposed model's performance. The results show that the proposed model significantly outperforms other traditional SVM-based methods.

Submitted 19 May, 2023; originally announced June 2023.

arXiv:2306.02819 [pdf, other]

doi 10.18653/v1/2023.acl-long.258

Enhancing Language Representation with Constructional Information for Natural Language Understanding

Authors: Lvxiaowei Xu, Jianwang Wu, Jiawei Peng, Zhilin Gong, Ming Cai, Tianxiang Wang

Abstract: Natural language understanding (NLU) is an essential branch of natural language processing, which relies on representations generated by pre-trained language models (PLMs). However, PLMs primarily focus on acquiring lexico-semantic information, while they may be unable to adequately handle the meaning of constructions. To address this issue, we introduce construction grammar (CxG), which highlights the pairings of form and meaning, to enrich language representation. We adopt usage-based construction grammar as the basis of our work, which is highly compatible with statistical models such as PLMs. Then a HyCxG framework is proposed to enhance language representation through a three-stage solution. First, all constructions are extracted from sentences via a slot-constraints approach. As constructions can overlap with each other, bringing redundancy and imbalance, we formulate the conditional max coverage problem for selecting the discriminative constructions. Finally, we propose a relational hypergraph attention network to acquire representation from constructional information by capturing high-order word interactions among constructions. Extensive experiments demonstrate the superiority of the proposed model on a variety of NLU tasks.

Submitted 5 June, 2023; originally announced June 2023.

Comments: Long paper, accepted at the ACL 2023

arXiv:2305.14895 [pdf, other]

doi 10.1088/1674-4527/acd593

The Lobster Eye Imager for Astronomy Onboard the SATech-01 Satellite

Authors: Z. X. Ling, X. J. Sun, C. Zhang, S. L. Sun, G. Jin, S. N. Zhang, X. F. Zhang, J. B. Chang, F. S. Chen, Y. F. Chen, Z. W. Cheng, W. Fu, Y. X. Han, H. Li, J. F. Li, Y. Li, Z. D. Li, P. R. Liu, Y. H. Lv, X. H. Ma, Y. J. Tang, C. B. Wang, R. J. Xie, Y. L. Xue, A. L. Yan, et al. (101 additional authors not shown)

Abstract: The Lobster Eye Imager for Astronomy (LEIA), a pathfinder of the Wide-field X-ray Telescope of the Einstein Probe (EP) mission, was successfully launched onboard the SATech-01 satellite of the Chinese Academy of Sciences on 27 July 2022. In this paper, we introduce the design and on-ground test results of the LEIA instrument. Using state-of-the-art Micro-Pore Optics (MPO), a wide field-of-view (FoV) of 346 square degrees (18.6 degrees $\times$ 18.6 degrees) of the X-ray imager is realized. An optical assembly composed of 36 MPO chips is used to focus incident X-ray photons, and four large-format complementary metal-oxide semiconductor (CMOS) sensors, each of 6 cm $\times$ 6 cm, are used as the focal plane detectors. The instrument has an angular resolution of 4-8 arcmin (in FWHM) for the central focal spot of the point spread function, and an effective area of 2-3 cm$^2$ at 1 keV in essentially all the directions within the field of view. The detection passband is 0.5-4 keV in the soft X-rays and the sensitivity is (2-3) $\times 10^{-11}$ erg s$^{-1}$ cm$^{-2}$ (about 1 mini-Crab) at 1,000 second observation. The total weight of LEIA is 56 kg and the power is 85 W. The satellite, with a design lifetime of 2 years, operates in a Sun-synchronous orbit of 500 km with an orbital period of 95 minutes. LEIA is paving the way for future missions by verifying in flight the technologies of both novel focusing imaging optics and CMOS sensors for X-ray observation, and by optimizing the working setups of the instrumental parameters. In addition, LEIA is able to carry out scientific observations to find new transients and to monitor known sources in the soft X-ray band, albeit with limited useful observing time available.

Submitted 24 May, 2023; originally announced May 2023.

Comments: Accepted by RAA
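
The figures quoted in the LEIA abstract can be sanity-checked with a few lines of arithmetic. The sketch below is not from the paper: it assumes a circular orbit, a mean Earth radius of 6371 km, and the standard gravitational parameter of the Earth, and the function names are illustrative only.

```python
import math

# Assumed constants (not taken from the abstract).
EARTH_RADIUS_KM = 6371.0          # mean Earth radius
EARTH_MU_KM3_S2 = 398600.4418     # standard gravitational parameter of the Earth


def fov_square_degrees(side_deg: float) -> float:
    """Area of a square field of view, using the flat small-angle approximation."""
    return side_deg * side_deg


def circular_orbit_period_minutes(altitude_km: float) -> float:
    """Kepler's third law for a circular orbit at the given altitude."""
    semi_major_axis_km = EARTH_RADIUS_KM + altitude_km
    period_s = 2.0 * math.pi * math.sqrt(semi_major_axis_km ** 3 / EARTH_MU_KM3_S2)
    return period_s / 60.0


if __name__ == "__main__":
    print(f"FoV: {fov_square_degrees(18.6):.2f} square degrees")
    print(f"Period at 500 km: {circular_orbit_period_minutes(500.0):.1f} minutes")
```

Under these assumptions the field of view comes out at about 346 square degrees and the period at between 94 and 95 minutes, in line with the rounded values in the abstract.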

arXiv:2305.13688 [pdf]

Experimental observation of Kerr-Raman solitons in a normal-dispersion FP resonator

Authors: Tieying Li, Kan Wu, Xujia Zhang, Minglu Cai, Jianping Chen

Abstract: Different from the Kerr effect, stimulated Raman scattering (SRS) is a delayed response to molecular vibrations in materials. In microcavities, when driven in an anomalous group velocity dispersion (GVD) regime, SRS typically leads to self-frequency shift of solitons and generation of breather solitons, which have been verified both theoretically and experimentally. However, when driven in a normal GVD regime, recent theoretical work predicts that SRS can cause the locking of switching waves (SWs) and thus support bright moving localized structures (LSs), which we term Kerr-Raman solitons (KRSs). Limited by the design of suitable experimental parameters, experimental observation of the KRSs has not yet been achieved. Here, we provide numerical investigation and, to our knowledge, the first experimental observation of these SRS-enabled KRSs in a fiber Fabry-Perot (FP) resonator with ultra-low normal GVD. Such Kerr-Raman solitons exhibit localized temporal features with strong oscillations at ~13 THz local frequency on the top of a flat-top pulse. The corresponding spectrum is a low-noise and broadband Kerr comb with a typical platicon-like spectrum in the center and two Raman Stokes and anti-Stokes peaks located near 13 THz away from the center. With such an SRS-enabled broadband Kerr comb, we have achieved a KRS spectrum with a repetition rate of ~3.68 GHz and a -40 dB spectral width of 260 nm. The corresponding comb tooth count is >9000, covering the S+C+L telecommunication bands. Moreover, the formation process of such KRSs is also revealed, and it is found that the GVD plays a key role in its generation. Our work will help to advance the study of the dynamics of optical frequency combs under the influence of SRS, as well as providing a broadband coherent mode-locked optical source for wide applications.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2305.12114 [pdf, other]

doi 10.1016/j.ijar.2023.109075

GFDC: A Granule Fusion Density-Based Clustering with Evidential Reasoning

Authors: Mingjie Cai, Zhishan Wu, Qingguo Li, Feng Xu, Jie Zhou

Abstract: Currently, density-based clustering algorithms are widely applied because they can detect clusters with arbitrary shapes. However, they perform poorly in measuring global density, determining reasonable cluster centers or structures, assigning samples accurately and handling data with large density differences among clusters. To overcome their drawbacks, this paper proposes a granule fusion density-based clustering with evidential reasoning (GFDC). Both local and global densities of samples are measured by a sparse degree metric first. Then information granules are generated in high-density and low-density regions, assisting in processing clusters with significant density differences. Further, three novel granule fusion strategies are utilized to combine granules into stable cluster structures, helping to detect clusters with arbitrary shapes. Finally, by an assignment method developed from Dempster-Shafer theory, unstable samples are assigned. After using GFDC, a reasonable clustering result and some identified outliers can be obtained. The experimental results on extensive datasets demonstrate the effectiveness of GFDC.

Submitted 20 May, 2023; originally announced May 2023.

arXiv:2305.05367 [pdf]

Exploring assessment method of technological advancement based on literature cross-citation

Authors: Shengxuan Tang, Liming Zhang, Shuo Jiang, Ming Cai, Yao Xiao

Abstract: Assessing advancements of technology is essential for creating science and technology policies and making informed investments in the technology market. However, current methods primarily focus on the characteristics of the technologies themselves, making it difficult to accurately assess technologies across various fields and generations. To address this challenge, we propose a novel approach that uses bibliometrics, specifically literature citation networks, to measure changes in knowledge flow throughout the evolution of technology. This method can identify diverse trends in technology development and is an effective tool for evaluating technological advancements. We demonstrate its accuracy and applicability by applying it to mobile communication technology and comparing its quantitative results with other assessment methods. Our work provides critical support for assessing different technical routes and formulating technology policy.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 15 pages, 6 figures

arXiv:2305.02916 [pdf, other]

Enantiodetection via the 2D spectroscopy: extending the methodology to general experimental conditions

Authors: Mao-Rui Cai, Chong Ye, Yong Li, Hui Dong

Abstract: Developing effective methods to measure the enantiomeric excess of the chiral mixture is one of the major topics in chiral molecular research, yet remains challenging. An enantiodetection method via two-dimensional (2D) spectroscopy based on a four-level model of chiral molecules, containing a cyclic three-level system (CTLS), was recently proposed and demonstrated, yet with a strict condition of the one-photon resonance (where three driving fields are exactly resonantly coupled to the three electric-dipole transitions, respectively) in the CTLS and a narrowband probe pulse assumption. Here, we extend the 2D spectroscopy method to more general experimental conditions, with three-photon resonance (where the sum of the two smaller frequencies among the three driving fields equals the third one) and a broadband probe pulse. Our method remains effective for enantiodetection with the help of experimental techniques, such as the chop detection method, which is used to eliminate the influence of the other redundant levels existing in the real system of chiral molecules. Under these more general conditions, the enantiomeric excess of the chiral mixture is estimated by taking an easily available standard sample (usually the racemic mixture) as the reference.

Submitted 4 May, 2023; originally announced May 2023.

Comments: 9 pages, 4 figures

arXiv:2305.00561 [pdf, other]

Model-free Motion Planning of Autonomous Agents for Complex Tasks in Partially Observable Environments

Authors: Junchao Li, Mingyu Cai, Zhen Kan

Abstract: Motion planning of autonomous agents in partially known environments with incomplete information is a challenging problem, particularly for complex tasks. This paper proposes a model-free reinforcement learning approach to address this problem. We formulate motion planning as a probabilistic-labeled partially observable Markov decision process (PL-POMDP) problem and use linear temporal logic (LTL) to express the complex task. The LTL formula is then converted to a limit-deterministic generalized Büchi automaton (LDGBA). The problem is redefined as finding an optimal policy on the product of PL-POMDP with LDGBA based on model-checking techniques to satisfy the complex task. We implement deep Q learning with long short-term memory (LSTM) to process the observation history and task recognition. Our contributions include the proposed method, the utilization of LTL and LDGBA, and the LSTM-enhanced deep Q learning. We demonstrate the applicability of the proposed method by conducting simulations in various environments, including grid worlds, a virtual office, and a multi-agent warehouse. The simulation results demonstrate that our proposed method effectively addresses environment, action, and observation uncertainties. This indicates its potential for real-world applications, including the control of unmanned aerial vehicles (UAVs).

Submitted 30 April, 2023; originally announced May 2023.

Comments: 32 pages, 22 figures, submitted to Autonomous Agents and Multi-Agent Systems

arXiv:2304.13966 [pdf, ps, other]

Two kinds of numerical algorithms for ultra-slow diffusion equations

Authors: Min Cai, Changpin Li, Yu Wang

Abstract: In this article, two kinds of numerical algorithms are derived for the ultra-slow (or superslow) diffusion equation in one and two space dimensions, where the ultra-slow diffusion is characterized by the Caputo-Hadamard fractional derivative of order $α\in (0,1)$. To describe the spatial interaction, the Riesz fractional derivative and the fractional Laplacian are used in one and two space dimensions, respectively. The Caputo-Hadamard derivative is discretized by two typical approximate formulae, i.e., L2-1$_σ$ and L1-2 methods. The spatial fractional derivatives are discretized by second-order finite difference methods. When L2-1$_σ$ discretization is used, the derived numerical scheme is unconditionally stable with error estimate $\mathcal{O}(τ^{2}+h^{2})$ for all $α\in (0, 1)$, in which $τ$ and $h$ are temporal and spatial stepsizes, respectively. When L1-2 discretization is used, the derived numerical scheme is stable with error estimate $\mathcal{O}(τ^{3-α}+h^{2})$ for $α\in (0, 0.3738)$. The illustrative examples displayed are in line with the theoretical analysis.

Submitted 27 April, 2023; originally announced April 2023.

MSC Class: 35R11; 65M06

arXiv:2304.09176 [pdf, other]

Enhancing Personalized Ranking With Differentiable Group AUC Optimization

Authors: Xiao Sun, Bo Zhang, Chenrui Zhang, Han Ren, Mingchen Cai

Abstract: AUC is a common metric for evaluating the performance of a classifier. However, most classifiers are trained with cross entropy, which does not optimize the AUC metric directly and leaves a gap between the training and evaluation stages. In this paper, we propose the PDAOM loss, a Personalized and Differentiable AUC Optimization method with Maximum violation, which can be directly applied when training a binary classifier and optimized with gradient-based methods. Specifically, we construct the pairwise exponential loss with difficult pairs of positive and negative samples within sub-batches grouped by user ID, aiming to guide the classifier to pay attention to the relation between hard-to-distinguish pairs of opposite samples from the perspective of independent users. Compared to the original form of pairwise exponential loss, the proposed PDAOM loss not only improves the AUC and GAUC metrics in the offline evaluation, but also reduces the computational complexity of the training objective. Furthermore, online evaluation of the PDAOM loss on the 'Guess What You Like' feed recommendation application in Meituan shows a 1.40% increase in click count and a 0.65% increase in order count compared to the baseline model, which is a significant improvement in this well-developed online life service recommendation system.

Submitted 17 April, 2023; originally announced April 2023.

Comments: NeuRec@ICDM 2022
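
The abstract above describes a pairwise exponential loss built from the hardest positive/negative pair within each user's sub-batch. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the authors' PDAOM implementation: the function name, the choice of exactly one pair per user (lowest-scored positive against highest-scored negative), and the averaging over users are all assumptions made here.

```python
import math
from collections import defaultdict


def max_violation_pairwise_loss(user_ids, scores, labels):
    """Hypothetical per-user max-violation pairwise exponential loss.

    For each user, pair the lowest-scored positive with the highest-scored
    negative and apply exp(-(s_pos - s_neg)). Users lacking either class
    contribute nothing. The result is averaged over contributing users.
    """
    groups = defaultdict(lambda: {"pos": [], "neg": []})
    for uid, score, label in zip(user_ids, scores, labels):
        groups[uid]["pos" if label == 1 else "neg"].append(score)

    losses = []
    for group in groups.values():
        if not group["pos"] or not group["neg"]:
            continue  # no positive/negative pair can be formed for this user
        hardest_pos = min(group["pos"])  # positive the model ranks lowest
        hardest_neg = max(group["neg"])  # negative the model ranks highest
        losses.append(math.exp(-(hardest_pos - hardest_neg)))
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    users = ["u1", "u1", "u1", "u2", "u2", "u3"]
    scores = [2.0, 0.5, 1.0, 3.0, 3.0, 0.9]
    labels = [1, 1, 0, 1, 0, 1]
    print(max_violation_pairwise_loss(users, scores, labels))
```

Using one pair per user keeps the cost linear in the batch size, whereas enumerating all positive/negative pairs is quadratic; this is one plausible reading of the reduced complexity the abstract mentions.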

arXiv:2304.04609 [pdf]

Inverse design of artificial skins

Authors: Zhiguang Liu, Minkun Cai, Shenda Hong, Junli Shi, Sai Xie, Chang Liu, Huifeng Du, James D. Morin, Gang Li, Wang Liu, Hong Wang, Ke Tang, Nicholas X. Fang, Chuan Fei Guo

Abstract: Mimicking the perceptual functions of human cutaneous mechanoreceptors, artificial skins or flexible pressure sensors can transduce tactile stimuli to quantitative electrical signals. Conventional methods to design such devices follow a forward structure-to-property routine based on trial-and-error experiments/simulations, which take months or longer to determine one solution valid for one specific material. Target-oriented inverse design that shows far higher output efficiency has proven effective in other fields, but is still absent for artificial skins because of the difficulties in acquiring big data. Here, we report a property-to-structure inverse design of artificial skins based on small dataset machine learning, exhibiting a comprehensive efficiency at least four orders of magnitude higher than the conventional routine. The inverse routine can predict hundreds of solutions that overcome the intrinsic signal saturation problem for linear response in hours, and the solutions are valid to a variety of materials. Our results demonstrate that the inverse design allowed by small dataset is an efficient and powerful tool to target multifarious applications of artificial skins, which can potentially advance the fields of intelligent robots, advanced healthcare, and human-machine interfaces.

Submitted 10 April, 2023; originally announced April 2023.

arXiv:2304.00790 [pdf, other]

LQR-CBF-RRT*: Safe and Optimal Motion Planning

Authors: Guang Yang, Mingyu Cai, Ahmad Ahmad, Amanda Prorok, Roberto Tron, Calin Belta

Abstract: We present LQR-CBF-RRT*, an incremental sampling-based algorithm for offline motion planning. Our framework leverages the strength of Control Barrier Functions (CBFs) and Linear Quadratic Regulators (LQR) to generate safety-critical and optimal trajectories for a robot with dynamics described by an affine control system. CBFs are used for safety guarantees, while LQRs are employed for optimal control synthesis during edge extensions. Popular CBF-based formulations for safety critical control require solving Quadratic Programs (QPs), which can be computationally expensive. Moreover, LQR-based controllers require repetitive applications of first-order Taylor approximations for nonlinear systems, which can also create an additional computational burden. To improve the motion planning efficiency, we verify the satisfaction of the CBF constraints directly in edge extension to avoid the burden of solving the QPs. We store computed optimal LQR gain matrices in a hash table to avoid re-computation during the local linearization of the rewiring procedure. Lastly, we utilize the Cross-Entropy Method for importance sampling to improve sampling efficiency. Our results show that the proposed planner surpasses its counterparts in computational efficiency and performs well in an experimental setup.

Submitted 27 September, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

arXiv:2303.15790 [pdf, other]

doi 10.1007/s11467-023-1333-z

STCF Conceptual Design Report: Volume 1 -- Physics & Detector

Authors: M. Achasov, X. C. Ai, R. Aliberti, L. P. An, Q. An, X. Z. Bai, Y. Bai, O. Bakina, A. Barnyakov, V. Blinov, V. Bobrovnikov, D. Bodrov, A. Bogomyagkov, A. Bondar, I. Boyko, Z. H. Bu, F. M. Cai, H. Cai, J. J. Cao, Q. H. Cao, Z. Cao, Q. Chang, K. T. Chao, D. Y. Chen, H. Chen , et al. (413 additional authors not shown)

Abstract: The Super $τ$-Charm facility (STCF) is an electron-positron collider proposed by the Chinese particle physics community. It is designed to operate in a center-of-mass energy range from 2 to 7 GeV with a peak luminosity of $0.5\times 10^{35}{\rm cm}^{-2}{\rm s}^{-1}$ or higher. The STCF will produce a data sample about a factor of 100 larger than that by the present $τ$-Charm factory -- the BEPCII, providing a unique platform for exploring the asymmetry of matter-antimatter (charge-parity violation), in-depth studies of the internal structure of hadrons and the nature of non-perturbative strong interactions, as well as searching for exotic hadrons and physics beyond the Standard Model. The STCF project in China is under development with an extensive R&D program. This document presents the physics opportunities at the STCF, describes conceptual designs of the STCF detector system, and discusses future plans for detector R&D and physics case studies.

Submitted 5 October, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

Journal ref: Front. Phys. 19(1), 14701 (2024)

arXiv:2303.04525 [pdf, other]

Continuity-Aware Latent Interframe Information Mining for Reliable UAV Tracking

Authors: Changhong Fu, Mutian Cai, Sihang Li, Kunhan Lu, Haobo Zuo, Chongjun Liu

Abstract: Unmanned aerial vehicle (UAV) tracking is crucial for autonomous navigation and has broad applications in robotic automation fields. However, reliable UAV tracking remains a challenging task due to various difficulties like frequent occlusion and aspect ratio change. Additionally, most existing work focuses on explicit information to improve tracking performance, ignoring potential interframe connections. To address the above issues, this work proposes a novel framework with continuity-aware latent interframe information mining for reliable UAV tracking, i.e., ClimRT. Specifically, a new efficient continuity-aware latent interframe information mining network (ClimNet) is proposed for UAV tracking, which can generate a highly effective latent frame between two adjacent frames. Besides, a novel location-continuity Transformer (LCT) is designed to fully explore continuity-aware spatial-temporal information, thereby markedly enhancing UAV tracking. Extensive qualitative and quantitative experiments on three authoritative aerial benchmarks strongly validate the robustness and reliability of ClimRT in UAV tracking performance. Furthermore, real-world tests on the aerial platform validate its practicability and effectiveness. The code and demo materials are released at https://github.com/vision4robotics/ClimRT.

Submitted 8 March, 2023; originally announced March 2023.

Comments: 2023 IEEE International Conference on Robotics and Automation (ICRA)

arXiv:2303.02566 [pdf, other]

MFAI: A Scalable Bayesian Matrix Factorization Approach to Leveraging Auxiliary Information

Authors: Zhiwei Wang, Fa Zhang, Cong Zheng, Xianghong Hu, Mingxuan Cai, Can Yang

Abstract: In various practical situations, matrix factorization methods suffer from poor data quality, such as high data sparsity and low signal-to-noise ratio (SNR). Here, we consider a matrix factorization problem by utilizing auxiliary information, which is massively available in real-world applications, to overcome the challenges caused by poor data quality. Unlike existing methods that mainly rely on simple linear models to combine auxiliary information with the main data matrix, we propose to integrate gradient boosted trees in the probabilistic matrix factorization framework to effectively leverage auxiliary information (MFAI). Thus, MFAI naturally inherits several salient features of gradient boosted trees, such as the capability of flexibly modeling nonlinear relationships and robustness to irrelevant features and missing values in auxiliary information. The parameters in MFAI can be automatically determined under the empirical Bayes framework, making it adaptive to the utilization of auxiliary information and immune to overfitting. Moreover, MFAI is computationally efficient and scalable to large datasets by exploiting variational inference. We demonstrate the advantages of MFAI through comprehensive numerical results from simulation studies and real data analyses. Our approach is implemented in the R package mfair available at https://github.com/YangLabHKUST/mfair.

Submitted 12 February, 2024; v1 submitted 4 March, 2023; originally announced March 2023.

arXiv:2302.10491 [pdf, ps, other]

The Laplacian spectral ratio of connected graphs

Authors: Zhen Lin, Jiajia Wang, Min Cai

Abstract: Let $G$ be a simple connected undirected graph. The Laplacian spectral ratio of $G$, denoted by $R_L(G)$, is defined as the quotient between the largest and second smallest Laplacian eigenvalues of $G$, which is closely related to the structural parameters of a graph (or network), such as diameter, $t$-tough, perfect matching, average density of cuts, and synchronizability. In this paper, we obtain some bounds on the Laplacian spectral ratio, which improve the known results. In addition, we give counter-examples to the conjectured upper bound on the Laplacian spectral ratio of trees, and propose a new conjecture.

Submitted 21 February, 2023; originally announced February 2023.

Comments: 17 pages, 3 figures

MSC Class: 05C05; 05C50
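
The definition of $R_L(G)$ in the abstract above (largest Laplacian eigenvalue divided by the second smallest) can be computed directly for small graphs. The sketch below is illustrative only and not taken from the paper: it uses a plain cyclic Jacobi iteration to stay dependency-free, and all function names are hypothetical.

```python
import math


def laplacian(n, edges):
    """Graph Laplacian L = D - A for an undirected simple graph on n vertices."""
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
    return lap


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # apply the rotation to columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # apply the rotation to rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def laplacian_spectral_ratio(n, edges):
    """R_L(G) = largest / second-smallest Laplacian eigenvalue (G connected)."""
    eig = symmetric_eigenvalues(laplacian(n, edges))
    return eig[-1] / eig[1]


if __name__ == "__main__":
    path3 = [(0, 1), (1, 2)]
    k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
    print(laplacian_spectral_ratio(3, path3))
    print(laplacian_spectral_ratio(4, k4))
```

For the path on three vertices the Laplacian eigenvalues are 0, 1, 3, so the ratio is 3; for the complete graph $K_4$ they are 0, 4, 4, 4, so the ratio is 1, the smallest value possible.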

arXiv:2301.06017 [pdf]

Miniature Magnetic Nano islands in a Morphotropic Cobaltite Matrix

Authors: Shengru Chen, Dongke Rong, Yue Xu, Miming Cai, Xinyan Li, Qinghua Zhang, Shuai Xu, Yan-Xing Shang, Haitao Hong, Ting Cui, Qiao Jin, Jia-Ou Wang, Haizhong Guo, Lin Gu, Qiang Zheng, Can Wang, Jinxing Zhang, Gang-Qin Liu, Kui-juan Jin, Er-Jia Guo

Abstract: High-density magnetic memories are key components in spintronics, quantum computing, and energy-efficient electronics. Reduced dimensionality and magnetic domain stability at the nanoscale are essential for the miniaturization of magnetic storage units. Yet, inducing magnetic order and selectively tuning spin-orbital coupling at specific locations have remained challenging. Here we demonstrate the construction of switchable magnetic nano-islands in a nonmagnetic matrix based on cobaltite homo-structures. The magnetic and electronic states are laterally modified by epitaxial strain, which is regionally controlled by freestanding membranes. Atomically sharp grain boundaries isolate the crosstalk between magnetically distinct regions. The minimal size of magnetic nano-islands reaches 35 nm in diameter, enabling an areal density of 400 Gbit per square inch. Besides providing an ideal platform for precisely controlled read and write schemes, this methodology can enable scalable and patterned memories on silicon and flexible substrates for various applications.

Submitted 14 January, 2023; originally announced January 2023.

Comments: 20 pages, 4 figures
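
The areal density quoted in the nano-island abstract can be related to the 35 nm island diameter with a short unit conversion. This is a back-of-the-envelope sketch, not the authors' calculation: it assumes one bit per island and a square lattice, and the function names are illustrative.

```python
import math

NM_PER_INCH = 2.54e7  # 1 inch = 2.54 cm = 2.54e7 nm


def area_per_bit_nm2(gbit_per_square_inch: float) -> float:
    """Area available to each bit, in nm^2, at the given areal density."""
    bits_per_square_inch = gbit_per_square_inch * 1e9
    return NM_PER_INCH ** 2 / bits_per_square_inch


def square_lattice_pitch_nm(gbit_per_square_inch: float) -> float:
    """Centre-to-centre spacing if bits sit on a square lattice (an assumption)."""
    return math.sqrt(area_per_bit_nm2(gbit_per_square_inch))


if __name__ == "__main__":
    pitch = square_lattice_pitch_nm(400.0)
    print(f"Area per bit: {area_per_bit_nm2(400.0):.0f} nm^2")
    print(f"Pitch: {pitch:.1f} nm for islands 35 nm in diameter")
```

Under these assumptions each bit has roughly 1600 nm^2, a pitch of about 40 nm, which leaves a gap of a few nanometres between neighbouring 35 nm islands.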

arXiv:2212.09588 [pdf, other]

Query Enhanced Knowledge-Intensive Conversation via Unsupervised Joint Modeling

Authors: Mingzhu Cai, Siqi Bao, Xin Tian, Huang He, Fan Wang, Hua Wu

Abstract: In this paper, we propose an unsupervised query enhanced approach for knowledge-intensive conversations, namely QKConv. There are three modules in QKConv: a query generator, an off-the-shelf knowledge selector, and a response generator. QKConv is optimized through joint training, which produces the response by exploring multiple candidate queries and leveraging corresponding selected knowledge. The joint training relies solely on the dialogue context and target response, with no need for extra query annotations or knowledge provenances. To evaluate the effectiveness of the proposed QKConv, we conduct experiments on three representative knowledge-intensive conversation datasets: conversational question-answering, task-oriented dialogue, and knowledge-grounded conversation. Experimental results reveal that QKConv performs better than all unsupervised methods across three datasets and achieves competitive performance compared to supervised methods.

Submitted 26 May, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: Accepted for publication at ACL2023

arXiv:2212.02007 [pdf, other]

Mixed Cloud Control Testbed: Validating Vehicle-Road-Cloud Integration via Mixed Digital Twin

Authors: Jianghong Dong, Qing Xu, Jiawei Wang, Chunying Yang, Mengchi Cai, Chaoyi Chen, Jianqiang Wang, Keqiang Li

Abstract: Reliable and efficient validation technologies are critical for the recent development of multi-vehicle cooperation and vehicle-road-cloud integration. In this paper, we introduce our miniature experimental platform, Mixed Cloud Control Testbed (MCCT), developed based on a new notion of Mixed Digital Twin (mixedDT). Combining Mixed Reality with Digital Twin, mixedDT integrates the virtual and physical spaces into a mixed one, where physical entities coexist and interact with virtual entities via their digital counterparts. Under the framework of mixedDT, MCCT contains three major experimental platforms in the physical, virtual and mixed spaces respectively, and provides a unified access for various human-machine interfaces and external devices such as driving simulators. A cloud unit, where the mixed experimental platform is deployed, is responsible for fusing multi-platform information and assigning control instructions, contributing to synchronous operation and real-time cross-platform interaction. Particularly, MCCT allows for multi-vehicle coordination composed of different multi-source vehicles (e.g., physical vehicles, virtual vehicles and human-driven vehicles). Validations on vehicle platooning demonstrate the flexibility and scalability of MCCT.

Submitted 4 December, 2022; originally announced December 2022.

Comments: 13 pages, 13 figures

arXiv:2211.11157 [pdf, other]

On the Nonexistence of a Strong Minimal Pair

Authors: Mingzhong Cai, Yiqun Liu, Yong Liu, Cheng Peng, Yue Yang

Abstract: Two nonzero recursively enumerable (r.e.) degrees $\mathbf{a}$ and $\mathbf{b}$ form a strong minimal pair if $\mathbf{a} \wedge \mathbf{b}=\mathbf{0}$ and $\mathbf{b}\vee \mathbf{x}\geq \mathbf{a}$ for any nonzero r.e. degree $\mathbf{x}\leq \mathbf{a}$. We prove that there is no strong minimal pair in the r.e. degrees. Our construction goes beyond the usual $\mathbf{0}'''$-priority arguments and we give some evidence to show that it needs $\mathbf{0}^{(4)}$-priority arguments.

Submitted 20 November, 2022; originally announced November 2022.

MSC Class: 03D25

arXiv:2211.10758 [pdf, ps, other]

A priori error estimates of two fully discrete coupled schemes for Biot's consolidation model

Authors: Huipeng Gu, Mingchao Cai, Jingzhi Li, Guoliang Ju

Abstract: This paper concentrates on a priori error estimates of two fully discrete coupled schemes for Biot's consolidation model based on the three-field formulation introduced by Oyarzua et al. (SIAM Journal on Numerical Analysis, 2016). The spatial discretizations are based on the Taylor-Hood finite elements combined with Lagrange elements for the three primary variables. For time discretization, we consider two methods. One uses the backward Euler method, and the other applies a combination of the backward Euler and Crank-Nicolson methods. A priori error estimates show that the two schemes are unconditionally convergent with optimal error orders. Detailed numerical experiments are presented to validate the theoretical analysis.

Submitted 19 November, 2022; originally announced November 2022.

arXiv:2211.10007 [pdf, other]

doi 10.3847/2041-8213/aca32f

First wide field-of-view X-ray observations by a lobster eye focusing telescope in orbit

Authors: C. Zhang, Z. X. Ling, X. J. Sun, S. L. Sun, Y. Liu, Z. D. Li, Y. L. Xue, Y. F. Chen, Y. F. Dai, Z. Q. Jia, H. Y. Liu, X. F. Zhang, Y. H. Zhang, S. N. Zhang, F. S. Chen, Z. W. Cheng, W. Fu, Y. X. Han, H. Li, J. F. Li, Y. Li, P. R. Liu, X. H. Ma, Y. J. Tang, C. B. Wang , et al. (53 additional authors not shown)

Abstract: As a novel X-ray focusing technology, lobster eye micro-pore optics (MPO) feature both a wide observing field of view and true imaging capability, promising sky monitoring with significantly improved sensitivity and spatial resolution in soft X-rays. Since first proposed by Angel (1979), the optics have been extensively studied, developed and trialed over the past decades. In this Letter, we report on the first-light results from a flight experiment of the Lobster Eye Imager for Astronomy ($LEIA$), a pathfinder of the wide-field X-ray telescope of the Einstein Probe mission. The piggyback imager, launched in July 2022, has a mostly un-vignetted field of view of $18.6^\circ \times 18.6^\circ$. Its spatial resolution is in the range of 4$-$7 arcmin in FWHM and the focal spot effective area is 2$-$3 cm$^2$, both showing only mild fluctuations across the field of view. We present images of the Galactic center region, Sco X-1 and the diffuse Cygnus Loop nebula taken in snapshot observations over 0.5$-$4 keV. These are truly wide-field X-ray images of celestial bodies observed, for the first time, by a focusing imaging telescope. Initial analyses of the in-flight data show excellent agreement between the observed images and the on-ground calibration and simulations. The instrument and its characterization are briefly described, as well as the flight experiment. The results provide a solid basis for the development of the present and proposed wide-field X-ray missions using lobster eye MPO.

Submitted 17 November, 2022; originally announced November 2022.

Comments: 11 pages, 4 figures. Accepted for publication in Astrophysical Journal Letters

arXiv:2211.09381 [pdf, other]

Token-level Speaker Change Detection Using Speaker Difference and Speech Content via Continuous Integrate-and-fire

Authors: Zhiyun Fan, Zhenlin Liang, Linhao Dong, Yi Liu, Shiyu Zhou, Meng Cai, Jun Zhang, Zejun Ma, Bo Xu

Abstract: In multi-talker scenarios such as meetings and conversations, speech processing systems are usually required to segment the audio and then transcribe each segment. These two stages are addressed separately by speaker change detection (SCD) and automatic speech recognition (ASR). Most previous SCD systems rely solely on speaker information and ignore the importance of speech content. In this paper, we propose a novel SCD system that considers both cues of speaker difference and speech content. These two cues are converted into token-level representations by the continuous integrate-and-fire (CIF) mechanism and then combined for detecting speaker changes on the token acoustic boundaries. We evaluate the performance of our approach on a public real-recorded meeting dataset, AISHELL-4. The experimental results show that our method outperforms a competitive frame-level baseline system by 2.45% equal coverage-purity (ECP). In addition, we demonstrate the importance of speech content and speaker difference to the SCD task, and the advantages of conducting SCD on the token acoustic boundaries compared with conducting SCD frame by frame.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2211.03885 [pdf, other]

Learned Smartphone ISP on Mobile GPUs with Deep Learning, Mobile AI & AIM 2022 Challenge: Report

Authors: Andrey Ignatov, Radu Timofte, Shuai Liu, Chaoyu Feng, Furui Bai, Xiaotao Wang, Lei Lei, Ziyao Yi, Yan Xiang, Zibin Liu, Shaoqing Li, Keming Shi, Dehui Kong, Ke Xu, Minsu Kwon, Yaqi Wu, Jiesi Zheng, Zhihao Fan, Xun Wu, Feng Zhang, Albert No, Minhyeok Cho, Zewen Chen, Xiaze Zhang, Ran Li , et al. (13 additional authors not shown)

Abstract: The role of mobile cameras has increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format Fujifilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon 8 Gen 1 GPU that provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs, being able to process Full HD photos in less than 20-50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.

Submitted 7 November, 2022; originally announced November 2022.

arXiv:2210.12364 [pdf, other]

doi 10.18653/v1/2022.findings-emnlp.137

FCGEC: Fine-Grained Corpus for Chinese Grammatical Error Correction

Authors: Lvxiaowei Xu, Jianwang Wu, Jiawei Peng, Jiayu Fu, Ming Cai

Abstract: Grammatical Error Correction (GEC) has recently been broadly applied in automatic correction and proofreading systems. However, Chinese GEC remains immature due to limited high-quality data from native speakers in terms of category and scale. In this paper, we present FCGEC, a fine-grained corpus to detect, identify and correct grammatical errors. FCGEC is a human-annotated corpus with multiple references, consisting of 41,340 sentences collected mainly from multi-choice questions in public school Chinese examinations. Furthermore, we propose a Switch-Tagger-Generator (STG) baseline model to correct grammatical errors in low-resource settings. Experimental results illustrate that STG outperforms other GEC benchmark models on our FCGEC. However, there exists a significant gap between benchmark models and humans that encourages future models to bridge it.

Submitted 22 October, 2022; originally announced October 2022.

Comments: Long paper, accepted at the Findings of EMNLP 2022

arXiv:2210.10560 [pdf, other]

Dilepton production in the photodisintegration of the deuteron

Authors: Mengchu Cai, Tianbo Liu, Bo-Qiang Ma

Abstract: We study the lepton pair production in the photodisintegration of the deuteron process. The complete seven-fold differential cross section is calculated via the Bethe-Heitler mechanism with final state interactions taken into account. The deuteron bound state is described by a relativistic covariant deuteron-nucleon vertex. With numerical results, we find that the differential cross section has a strong dependence on the lepton azimuthal angle in the small polar angle region and sharp peaks appear in the dependence on the invariant mass of the produced lepton pair or the two nucleons in the final state. We demonstrate that such a nearly singular feature originates from the collinearity between the produced lepton or antilepton and the incident photon, and it is physically regularized by the lepton mass in our calculation. The final state interaction between the knocked-out nucleon and the recoil nucleon redistributes the differential cross section over the missing momentum, with a significant enhancement at large missing momentum and a suppression in the intermediate region. With a further decomposition of the final state interaction contribution, it is found that the on-shell term dominates the near quasi-elastic region while the off-shell term dominates the other end. In addition, we examine the contribution from the interference between the proton amplitude and the neutron amplitude, which as expected is found negligible even if the proton-neutron rescattering is included. The result in this work can serve as an input for the analysis and background estimation of multiple exclusive measurements at Jefferson Lab and future electron-ion colliders.

Submitted 19 October, 2022; originally announced October 2022.

Comments: 30 pages, 16 figures

arXiv:2210.04204 [pdf, ps, other]

Lasso trigonometric polynomial approximation for periodic function recovery in equidistant points

Authors: Congpei An, Mou Cai

Abstract: In this paper, we propose a fully discrete soft thresholding trigonometric polynomial approximation on $[-\pi,\pi]$, named Lasso trigonometric interpolation. This approximation is an $\ell_1$-regularized discrete least squares approximation under the same conditions of classical trigonometric interpolation on an equidistant grid. Lasso trigonometric interpolation is sparse and meanwhile it is an efficient tool to deal with noisy data. We theoretically analyze Lasso trigonometric interpolation for continuous periodic functions. The principal results show that the $L_2$ error bound of Lasso trigonometric interpolation is less than that of classical trigonometric interpolation, which improves the robustness of trigonometric interpolation. This paper also presents numerical results on Lasso trigonometric interpolation on $[-\pi,\pi]$, with or without the presence of data errors.

Submitted 21 September, 2023; v1 submitted 9 October, 2022; originally announced October 2022.

Comments: 18 pages, 5 figures

arXiv:2210.01910 [pdf, other]

Learning Signal Temporal Logic through Neural Network for Interpretable Classification

Authors: Danyang Li, Mingyu Cai, Cristian-Ioan Vasile, Roberto Tron

Abstract: Machine learning techniques using neural networks have achieved promising success for time-series data classification. However, the models that they produce are challenging to verify and interpret. In this paper, we propose an explainable neural-symbolic framework for the classification of time-series behaviors. In particular, we use an expressive formal language, namely Signal Temporal Logic (STL), to constrain the search of the computation graph for a neural network. We design a novel time function and sparse softmax function to improve the soundness and precision of the neural-STL framework. As a result, we can efficiently learn a compact STL formula for the classification of time-series data through off-the-shelf gradient-based tools. We demonstrate the computational efficiency, compactness, and interpretability of the proposed method through driving scenarios and naval surveillance case studies, compared with state-of-the-art baselines.

Submitted 30 June, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

arXiv:2210.01162 [pdf, other]

Learning Minimally-Violating Continuous Control for Infeasible Linear Temporal Logic Specifications

Authors: Mingyu Cai, Makai Mann, Zachary Serlin, Kevin Leahy, Cristian-Ioan Vasile

Abstract: This paper explores continuous-time control synthesis for target-driven navigation to satisfy complex high-level tasks expressed as linear temporal logic (LTL). We propose a model-free framework using deep reinforcement learning (DRL) where the underlying dynamic system is unknown (an opaque box). Unlike prior work, this paper considers scenarios where the given LTL specification might be infeasible and therefore cannot be accomplished globally. Instead of modifying the given LTL formula, we provide a general DRL-based approach to satisfy it with minimal violation. To do this, we transform a previously multi-objective DRL problem, which requires simultaneous automata satisfaction and minimum violation cost, into a single objective. By guiding the DRL agent with a sampling-based path planning algorithm for the potentially infeasible LTL task, the proposed approach mitigates the myopic tendencies of DRL, which are often an issue when learning general LTL tasks that can have long or infinite horizons. This is achieved by decomposing an infeasible LTL formula into several reach-avoid sub-tasks with shorter horizons, which can be trained in a modular DRL architecture. Furthermore, we overcome the challenge of the exploration process for DRL in complex and cluttered environments by using path planners to design rewards that are dense in the configuration space. The benefits of the presented approach are demonstrated through testing on various complex nonlinear systems and compared with state-of-the-art baselines. A video demonstration can be found at https://youtu.be/jBhx6Nv224E.

Submitted 16 March, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

arXiv:2209.07459 [pdf, other]

A Robotic Visual Grasping Design: Rethinking Convolution Neural Network with High-Resolutions

Authors: Zhangli Zhou, Shaochen Wang, Ziyang Chen, Mingyu Cai, Zhen Kan

Abstract: High-resolution representations are important for vision-based robotic grasping problems. Existing works generally encode the input images into low-resolution representations via sub-networks and then recover high-resolution representations. This will lose spatial information, and errors introduced by the decoder will be more serious when multiple types of objects are considered or objects are far away from the camera. To address these issues, we revisit the design paradigm of CNN for robotic perception tasks. We demonstrate that using parallel branches as opposed to serial stacked convolutional layers will be a more powerful design for robotic visual grasping tasks. In particular, guidelines of neural network design are provided for robotic perception tasks, e.g., high-resolution representation and lightweight design, which respond to the challenges in different manipulation scenarios. We then develop a novel grasping visual architecture referred to as HRG-Net, a parallel-branch structure that always maintains a high-resolution representation and repeatedly exchanges information across resolutions. Extensive experiments validate that these two designs can effectively enhance the accuracy of visual-based grasping and accelerate network training. We show a series of comparative experiments in real physical environments on YouTube: https://youtu.be/Jhlsp-xzHFY.

Submitted 15 September, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

arXiv:2209.04260 [pdf, other]

doi 10.1103/PhysRevD.106.063026

Search for relativistic fractionally charged particles in space

Authors: DAMPE Collaboration, F. Alemanno, C. Altomare, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, M. Y. Cui, T. S. Cui, Y. X. Cui, H. T. Dai, A. De-Benedittis, I. De Mitri, F. de Palma, M. Deliyergiyev, A. Di Giovanni, M. Di Santo , et al. (126 additional authors not shown)

Abstract: More than a century after the performance of the oil drop experiment, the possible existence of fractionally charged particles (FCPs) still remains unsettled. The search for FCPs is crucial for some extensions of the Standard Model in particle physics. Most of the previously conducted searches for FCPs in cosmic rays were based on experiments underground or at high altitudes. However, there have been few searches for FCPs in cosmic rays carried out in orbit other than AMS-01 flown by a space shuttle and BESS by a balloon at the top of the atmosphere. In this study, we conduct an FCP search in space based on on-orbit data obtained using the DArk Matter Particle Explorer (DAMPE) satellite over a period of five years. Unlike underground experiments, which require an FCP energy of the order of hundreds of GeV, our FCP search starts at only a few GeV. An upper limit of $6.2\times 10^{-10}~\mathrm{cm^{-2}\,sr^{-1}\,s^{-1}}$ is obtained for the flux. Our results demonstrate that DAMPE exhibits higher sensitivity than experiments of similar types by three orders of magnitude, which more stringently restricts the conditions for the existence of FCPs in primary cosmic rays.

Submitted 9 September, 2022; originally announced September 2022.

Comments: 19 pages, 6 figures, accepted by PRD

Report number: 106, 063026

Journal ref: Physical Review D 106.6 (2022): 063026

arXiv:2209.01481 [pdf, ps, other]

Decomposition of Frobenius pushforwards of line bundles on wonderful compactifications

Authors: Merrick Cai, Vasily Krylov

Abstract: De Concini-Procesi introduced varieties known as wonderful compactifications, which are smooth projective compactifications of semisimple adjoint groups $G$. We study the Frobenius pushforwards of invertible sheaves on the wonderful compactifications, and in particular their decomposition into locally free subsheaves. We give necessary and sufficient conditions for a specific line bundle to be a direct summand of the Frobenius pushforward of another line bundle, formulated in terms of the weight lattice of $\widetilde{G}$, the universal cover of $G$ (identified with the Picard group of the wonderful compactification). In the case of $G=\mathsf{PSL}_n$, we offer lower bounds on the multiplicities (as direct summands) for those line bundles satisfying the sufficient conditions. We also decompose Frobenius pushforwards of line bundles into a direct sum of vector subbundles, whose ranks are determined by invariants on the weight lattice of $G$. We study a particular block which decomposes as a direct sum of line bundles, and identify the line bundles which appear in this block. Finally, we present two approaches to compute the class of the Frobenius pushforward of line bundles on wonderful compactifications in the rational Grothendieck group and in the rational Chow group.

Submitted 3 September, 2022; originally announced September 2022.

Comments: 54 pages

arXiv:2208.12931 [pdf, other]

How to relate potential outcomes: Estimating individual treatment effects under a given specified partial correlation

Authors: Mingyang Cai, Stef van Buuren, Gerko Vink

Abstract: In most medical research, the average treatment effect is used to evaluate a treatment's performance. However, precision medicine requires knowledge of individual treatment effects: What is the difference between a unit's measurement under treatment and control conditions? In most treatment effect studies, such answers are not possible because the outcomes under both experimental conditions are not jointly observed. This makes the problem of causal inference a missing data problem. We propose to solve this problem by imputing the individual potential outcomes under a specified partial correlation (SPC), thereby allowing for heterogeneous treatment effects. We demonstrate in simulation that our proposed methodology yields valid inferences for the marginal distribution of potential outcomes. We highlight that the posterior distribution of individual treatment effects varies with different specified partial correlations. This property can be used to study the sensitivity of optimal treatment outcomes under different correlation specifications. In a practical example on HIV-1 treatment data, we demonstrate that the proposed methodology generalises to real-world data. Imputing under the SPC, therefore, opens up a wealth of possibilities for studying heterogeneous treatment effects on incomplete data and the further adaptation of individual treatment effects.

Submitted 27 August, 2022; originally announced August 2022.

arXiv:2208.12930 [pdf, ps, other]

Joint distribution properties of Fully Conditional Specification under the normal linear model with normal inverse-gamma priors

Authors: Mingyang Cai, Stef van Buuren, Gerko Vink

Abstract: Fully conditional specification (FCS) is a convenient and flexible multiple imputation approach. It specifies a sequence of simple regression models instead of a potentially complex joint density for missing variables. However, FCS may not converge to a stationary distribution. Many authors have studied the convergence properties of FCS when priors of conditional models are non-informative. We extend to the case of informative priors. This paper evaluates the convergence properties of the normal linear model with normal inverse-gamma priors. The theoretical and simulation results prove the convergence of FCS and show the equivalence of prior specification under the joint model and a set of conditional models when the analysis model is a linear regression with normal inverse-gamma priors.

Submitted 27 August, 2022; originally announced August 2022.

arXiv:2208.12929 [pdf, other]

Graphical and numerical diagnostic tools to assess multiple imputation models by posterior predictive checking

Authors: Mingyang Cai, Stef van Buuren, Gerko Vink

Abstract: Missing data are often dealt with by multiple imputation. A crucial part of the multiple imputation process is selecting sensible models to generate plausible values for incomplete data. A method based on posterior predictive checking is proposed to diagnose imputation models. To assess the congeniality of imputation models, the proposed diagnostic method compares the observed data with their replicates generated under corresponding posterior predictive distributions. If the imputation model is congenial with the substantive model, the observed data are expected to be located in the centre of corresponding predictive posterior distributions. Simulation and application are designed to investigate the proposed diagnostic method for parametric and semi-parametric imputation approaches, continuous and discrete incomplete variables, univariate and multivariate missingness patterns. The results show the validity of the proposed diagnostic method.

Submitted 27 August, 2022; originally announced August 2022.

Showing 51–100 of 291 results for author: Cai, M