Search | arXiv e-print repository

Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask

Authors: Zhaoyuan Su, Ammar Ahmed, Zirui Wang, Ali Anwar, Yue Cheng

Abstract: As the number of pre-trained machine learning (ML) models is growing exponentially, data reduction tools are not catching up. Existing data reduction techniques are not specifically designed for pre-trained model (PTM) dataset files. This is largely due to a lack of understanding of the patterns and characteristics of these datasets, especially those relevant to data reduction and compressibility. This paper presents the first, exhaustive analysis to date of PTM datasets on storage compressibility. Our analysis spans different types of data reduction and compression techniques, from hash-based data deduplication, data similarity detection, to dictionary-coding compression. Our analysis explores these techniques at three data granularity levels, from model layers, model chunks, to model parameters. We draw new observations that indicate that modern data reduction tools are not effective when handling PTM datasets. There is a pressing need for new compression methods that take into account PTMs' data characteristics for effective storage reduction. Motivated by our findings, we design ELF, a simple yet effective, error-bounded, lossy floating-point compression method. ELF transforms floating-point parameters in such a way that the common exponent field of the transformed parameters can be completely eliminated to save storage space. We develop Elves, a compression framework that integrates ELF along with several other data reduction methods. Elves uses the most effective method to compress PTMs that exhibit different patterns. Evaluation shows that Elves achieves an overall compression ratio of $1.52\times$, which is $1.31\times$, $1.32\times$ and $1.29\times$ higher than a general-purpose compressor (zstd), an error-bounded lossy compressor (SZ3), and the uniform model quantization, respectively, with negligible model accuracy loss.
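The abstract above does not spell out how the exponent field is eliminated. The following is only a minimal sketch of one way such a transform could work, not the authors' ELF implementation. It assumes parameters lie in [-1, 1), uses a hypothetical affine mapping into [1, 2) where every float32 value shares the exponent bits 127, and keeps a hypothetical `keep_bits` leading mantissa bits to bound the error. The function names `elf_encode` and `elf_decode` are illustrative.

```python
import struct

def elf_encode(x: float, keep_bits: int = 12) -> int:
    """Sketch: map x in [-1, 1) into [1, 2) and keep only leading mantissa bits."""
    if not -1.0 <= x < 1.0:
        raise ValueError("this sketch assumes parameters in [-1, 1)")
    y = 1.0 + (x + 1.0) / 2.0  # assumed mapping; y lies in [1, 2)
    bits = struct.unpack("<I", struct.pack("<f", y))[0]
    if bits >> 23 != 127:
        # float32 rounding pushed y up to 2.0; clamp to the largest value below 2.0
        bits = (127 << 23) | 0x7FFFFF
    mantissa = bits & 0x7FFFFF  # sign and exponent are constant, so they are dropped
    return mantissa >> (23 - keep_bits)

def elf_decode(code: int, keep_bits: int = 12) -> float:
    """Sketch: restore the constant exponent (127) and invert the mapping."""
    bits = (127 << 23) | (code << (23 - keep_bits))
    y = struct.unpack("<f", struct.pack("<I", bits))[0]
    return (y - 1.0) * 2.0 - 1.0
```

Under these assumptions each parameter is stored in `keep_bits` bits instead of 32, and the reconstruction error stays below roughly 2^-(keep_bits - 1), because truncating the mantissa of a value in [1, 2) loses less than 2^-keep_bits and the inverse mapping doubles that error.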

Submitted 20 February, 2024; originally announced February 2024.

Comments: This paper presents the first, exhaustive analysis to date of PTM datasets on storage compressibility. Motivated by our findings, we design ELF, a simple yet effective, error-bounded, lossy floating-point compression method

ACM Class: H.2.7

arXiv:2402.12821 [pdf, other]

Identifying Factual Inconsistencies in Summaries: Grounding Model Inference via Task Taxonomy

Authors: Liyan Xu, Zhenlin Su, Mo Yu, ** Xu, **ho D. Choi, Jie Zhou, Fei Liu

Abstract: Factual inconsistencies pose a significant hurdle for the faithful summarization by generative models. While a major direction to enhance inconsistency detection is to derive stronger Natural Language Inference (NLI) models, we propose an orthogonal aspect that underscores the importance of incorporating task-specific taxonomy into the inference. To this end, we consolidate key error types of inconsistent facts in summaries, and incorporate them to facilitate both the zero-shot and supervised paradigms of LLMs. Extensive experiments on ten datasets of five distinct domains suggest that zero-shot LLM inference could benefit from the explicit solution space depicted by the error type taxonomy, and achieves state-of-the-art performance overall, surpassing specialized non-LLM baselines, as well as recent LLM baselines. We further distill models that fuse the taxonomy into parameters through our designed prompt completions and supervised training strategies, efficiently substituting state-of-the-art zero-shot inference with much larger LLMs.

Submitted 19 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.11842 [pdf, other]

CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking

Authors: Zian Su, Xiangzhe Xu, Ziyang Huang, Zhuo Zhang, Yapeng Ye, Jianjun Huang, Xiangyu Zhang

Abstract: Transformer based code models have impressive performance in many software engineering tasks. However, their effectiveness degrades when symbols are missing or not informative. The reason is that the model may not learn to pay attention to the right correlations/contexts without the help of symbols. We propose a new method to pre-train general code models when symbols are lacking. We observe that in such cases, programs degenerate to something written in a very primitive language. We hence propose to use program analysis to extract contexts a priori (instead of relying on symbols and masked language modeling as in vanilla models). We then leverage a novel attention masking method to only allow the model attending to these contexts, e.g., bi-directional program dependence transitive closures and token co-occurrences. In the meantime, the inherent self-attention mechanism is utilized to learn which of the allowed attentions are more important compared to others. To realize the idea, we enhance the vanilla tokenization and model architecture of a BERT model, construct and utilize attention masks, and introduce a new pre-training algorithm. We pre-train this BERT-like model from scratch, using a dataset of 26 million stripped binary functions with explicit program dependence information extracted by our tool. We apply the model in three downstream tasks: binary similarity, type inference, and malware family classification. Our pre-trained model can improve the SOTAs in these tasks from 53% to 64%, 49% to 60%, and 74% to 94%, respectively. It also substantially outperforms other general pre-training techniques of code understanding models.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.10754 [pdf, other]

When Dataflow Analysis Meets Large Language Models

Authors: Chengpeng Wang, Wuqi Zhang, Zian Su, Xiangzhe Xu, Xiaoheng Xie, Xiangyu Zhang

Abstract: Dataflow analysis is a powerful code analysis technique that reasons about dependencies between program values, offering support for code optimization, program comprehension, and bug detection. Existing approaches require the successful compilation of the subject program and customizations for downstream applications. This paper introduces LLMDFA, an LLM-powered dataflow analysis framework that analyzes arbitrary code snippets without requiring a compilation infrastructure and automatically synthesizes downstream applications. Inspired by summary-based dataflow analysis, LLMDFA decomposes the problem into three sub-problems, which are effectively resolved by several essential strategies, including few-shot chain-of-thought prompting and tool synthesis. Our evaluation has shown that the design can mitigate the hallucination and improve the reasoning ability, obtaining high precision and recall in detecting dataflow-related bugs upon benchmark programs, outperforming state-of-the-art (classic) tools, including a very recent industrial analyzer.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 15 pages, 16 figures, 5 tables

MSC Class: 68N30; 68T01 ACM Class: D.3.0; D.2.4; I.2.5; I.2.6

arXiv:2402.09260 [pdf, other]

doi 10.1145/3613904.3642482

Evaluating the Experience of LGBTQ+ People Using Large Language Model Based Chatbots for Mental Health Support

Authors: Zilin Ma, Yiyang Mei, Yinru Long, Zhaoyuan Su, Krzysztof Z. Gajos

Abstract: LGBTQ+ individuals are increasingly turning to chatbots powered by large language models (LLMs) to meet their mental health needs. However, little research has explored whether these chatbots can adequately and safely provide tailored support for this demographic. We interviewed 18 LGBTQ+ and 13 non-LGBTQ+ participants about their experiences with LLM-based chatbots for mental health needs. LGBTQ+ participants relied on these chatbots for mental health support, likely due to an absence of support in real life. Notably, while LLMs offer prompt support, they frequently fall short in grasping the nuances of LGBTQ-specific challenges. Although fine-tuning LLMs to address LGBTQ+ needs can be a step in the right direction, it isn't the panacea. The deeper issue is entrenched in societal discrimination. Consequently, we call on future researchers and designers to look beyond mere technical refinements and advocate for holistic strategies that confront and counteract the societal biases burdening the LGBTQ+ community.

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.00422 [pdf, other]

doi 10.1109/TPAMI.2023.3300513

Lightweight Pixel Difference Networks for Efficient Visual Representation Learning

Authors: Zhuo Su, Jiehua Zhang, Longguang Wang, Hua Zhang, Zhen Liu, Matti Pietikäinen, Li Liu

Abstract: Recently, there have been tremendous efforts in developing lightweight Deep Neural Networks (DNNs) with satisfactory accuracy, which can enable the ubiquitous deployment of DNNs in edge devices. The core challenge of developing compact and efficient DNNs lies in how to balance the competing goals of achieving high accuracy and high efficiency. In this paper we propose two novel types of convolutions, dubbed Pixel Difference Convolution (PDC) and Binary PDC (Bi-PDC), which enjoy the following benefits: capturing higher-order local differential information, computationally efficient, and able to be integrated with existing DNNs. With PDC and Bi-PDC, we further present two lightweight deep networks named Pixel Difference Networks (PiDiNet) and Binary PiDiNet (Bi-PiDiNet) respectively to learn highly efficient yet more accurate representations for visual tasks including edge detection and object recognition. Extensive experiments on popular datasets (BSDS500, ImageNet, LFW, YTF, etc.) show that PiDiNet and Bi-PiDiNet achieve the best accuracy-efficiency trade-off. For edge detection, PiDiNet is the first network that can be trained without ImageNet, and can achieve the human-level performance on BSDS500 at 100 FPS and with $<$1M parameters. For object recognition, among existing Binary DNNs, Bi-PiDiNet achieves the best accuracy and a nearly $2\times$ reduction of computational cost on ResNet18. Code available at https://github.com/hellozhuo/pidinet.
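The abstract does not define the pixel difference operation itself. As a rough illustration only, and not the code from the linked repository, the sketch below assumes one simple variant in which each kernel weight multiplies the difference between a neighbouring pixel and the centre pixel of a 3x3 patch, and contrasts it with an ordinary convolution response. The function names are hypothetical.

```python
def vanilla_conv(patch, weights):
    """Ordinary 3x3 convolution response: sum of w_i * x_i."""
    return sum(weights[r][c] * patch[r][c] for r in range(3) for c in range(3))

def central_pixel_difference(patch, weights):
    """Assumed difference variant: sum of w_i * (x_i - x_center)."""
    center = patch[1][1]
    return sum(
        weights[r][c] * (patch[r][c] - center)
        for r in range(3)
        for c in range(3)
    )

# A vertical edge, and the same edge with a uniform brightness offset of 10.
edge = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
brighter = [[v + 10 for v in row] for row in edge]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
```

With these inputs the difference-based response is the same for `edge` and `brighter`, while the ordinary response changes with the offset, which is one intuition for why difference-based operators suit edge detection.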

Submitted 1 February, 2024; originally announced February 2024.

Comments: We design a novel lightweight convolutional operator for computer vision tasks. Both full-precision networks and BNNs are developed. Accepted by TPAMI

arXiv:2401.16659 [pdf, other]

History-Aware Conversational Dense Retrieval

Authors: Fengran Mo, Chen Qu, Kelong Mao, Tianyu Zhu, Zhan Su, Kaiyu Huang, Jian-Yun Nie

Abstract: Conversational search facilitates complex information retrieval by enabling multi-turn interactions between users and the system. Supporting such interactions requires a comprehensive understanding of the conversational inputs to formulate a good search query based on historical information. In particular, the search query should include the relevant information from the previous conversation turns. However, current approaches for conversational dense retrieval primarily rely on fine-tuning a pre-trained ad-hoc retriever using the whole conversational search session, which can be lengthy and noisy. Moreover, existing approaches are limited by the amount of manual supervision signals in the existing datasets. To address the aforementioned issues, we propose a History-Aware Conversational Dense Retrieval (HAConvDR) system, which incorporates two ideas: context-denoised query reformulation and automatic mining of supervision signals based on the actual impact of historical turns. Experiments on two public conversational search datasets demonstrate the improved history modeling capability of HAConvDR, in particular for long conversations with topic shifts.

Submitted 28 May, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted to Findings of ACL 2024

arXiv:2401.14860 [pdf, ps, other]

On Log-Concave-Tailed Chaoses and the Restricted Isometry Property

Authors: Guozheng Dai, Zhonggen Su, Vladimir Ulyanov, Hanchao Wang

Abstract: In this paper, we obtain a $p$-th moment bound for the suprema of a log-concave-tailed nonhomogeneous chaos process, which is optimal in some special cases. A crucial ingredient of the proof is a novel decoupling inequality, which may be of independent interest. With this $p$-th moment bound, we show two uniform Hanson-Wright type deviation inequalities for $α$-subexponential entries ($1\le α\le 2$), which recover some known results. As applications, we prove the restricted isometry property of partial random circulant matrices and time-frequency structured random matrices induced by standard $α$-subexponential vectors ($1\le α\le 2$), which extends the previously known results for the subgaussian case.

Submitted 12 May, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

MSC Class: 60B20; 60G15; 65C50

arXiv:2401.09263 [pdf, ps, other]

Deviation Inequalities for the Spectral Norm of Structured Random Matrices

Authors: Guozheng Dai, Zhonggen Su

Abstract: We study the deviation inequality for the spectral norm of structured random matrices with non-gaussian entries. In particular, we establish an optimal bound for the $p$-th moment of the spectral norm by transferring the spectral norm into the suprema of canonical processes. A crucial ingredient of our proof is a comparison of weak and strong moments. As an application, we show a deviation inequality for the smallest singular value of a rectangular random matrix.

Submitted 12 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.09071 [pdf, other]

Rethinking Spectral Graph Neural Networks with Spatially Adaptive Filtering

Authors: **gwei Guo, Kaizhu Huang, ** Yi, Zixian Su, Rui Zhang

Abstract: Whilst spectral Graph Neural Networks (GNNs) are theoretically well-founded in the spectral domain, their practical reliance on polynomial approximation implies a profound linkage to the spatial domain. As previous studies rarely examine spectral GNNs from the spatial perspective, their spatial-domain interpretability remains elusive, e.g., what information is essentially encoded by spectral GNNs in the spatial domain? In this paper, to answer this question, we investigate the theoretical connection between spectral filtering and spatial aggregation, unveiling an intrinsic interaction that spectral filtering implicitly leads the original graph to an adapted new graph, explicitly computed for spatial aggregation. Both theoretical and empirical investigations reveal that the adapted new graph not only exhibits non-locality but also accommodates signed edge weights to reflect label consistency among nodes. These findings thus highlight the interpretable role of spectral GNNs in the spatial domain and inspire us to rethink graph spectral filters beyond the fixed-order polynomials, which neglect global information. Built upon the theoretical findings, we revisit the state-of-the-art spectral GNNs and propose a novel Spatially Adaptive Filtering (SAF) framework, which leverages the adapted new graph by spectral filtering for an auxiliary non-local aggregation. Notably, our SAF comprehensively models both node similarity and dissimilarity from a global perspective, therefore alleviating persistent deficiencies of GNNs related to long-range dependencies and graph heterophily. Extensive experiments over 13 node classification benchmarks demonstrate the superiority of our proposed framework to the state-of-the-art methods.

Submitted 22 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.09027 [pdf, ps, other]

Exact Homomorphic Encryption

Authors: Zheng-Yao Su, Ming-Chung Tsai

Abstract: Inspired by the concept of fault tolerance quantum computation, this article proposes a framework dubbed Exact Homomorphic Encryption, EHE, enabling exact computations on encrypted data without the need for pre-decryption. The introduction of quantum gates is a critical step for constructing the message encryption and the computation encryption within the framework. Of significance is that both encryptions are respectively accomplished in a multivariate polynomial set generated by quantum gates. Two fundamental traits of quantum gates, the invertibility and the noncommutativity, establish the success of EHE. The encrypted computation is exact because its encryption transformation is conducted with invertible gates. In the same vein, decryptions for both an encrypted message and encrypted computation are exact. The second trait of noncommutativity among applied quantum gates brings forth the security for the two encryptions. Toward the message encryption, a plaintext is encoded into a ciphertext via a polynomial set generated by a product of noncommuting gates randomly chosen. In the computation encryption, a desired operation is encoded into an encrypted polynomial set generated by another product of noncommuting gates. The encrypted computation is then the evaluation of the encrypted polynomial set on the ciphertext and is referred to as the cryptovaluation. EHE is not only attainable on quantum computers, but also straightforwardly realizable on traditional computing environments. Surpassing the standard security 2^128 of quantum resilience, both the encryptions further reach a security greater than the suggested threshold 2^1024 and are characterized as hyper quantum-resilient. Thanks to the two essential traits of quantum gates, this framework can be regarded as the initial tangible manifestation of the concept noncommutative cryptography.

Submitted 8 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.08944 [pdf, other]

Optimal local filtering operation for enhancing quantum entanglement

Authors: Zhaofeng Su, Nina Sukhodoeva

Abstract: Quantum entanglement is an indispensable resource for many significant quantum information processing tasks. Thus, distilling more entanglement from less entangled resource is a task of practical significance and has been investigated for decades. The literature [Verstraete et al., Phys. Rev. A 64, 010101 (2001), https://link.aps.org/doi/10.1103/PhysRevA.64.010101] considered a scenario to increase the entanglement by local filtering operation and qualitatively derived the variance relation of entanglement. We investigate the scenario with general two-qubit resources to find the optimal strategy of filtering operations. We obtain the upper bound for the ratio of entanglement increase and find the corresponding optimal local filtering operation to achieve the maximal ratio. Our analysis shows that the upper bound ratio grows with the length of local Bloch vector while the success probability decreases with it. We further extend the research to investigate the optimal measurement strategy by considering general measurement. Our result shows that local measurement cannot increase the expectation of quantum entanglement, which gives more analytical evidence to the well known fact that local operation cannot create quantum entanglement.

Submitted 16 January, 2024; originally announced January 2024.

Comments: 11 pages, 1 figures

arXiv:2401.04538 [pdf, other]

UBfuzz: Finding Bugs in Sanitizer Implementations

Authors: Shaohua Li, Zhendong Su

Abstract: In this paper, we propose a testing framework for validating sanitizer implementations in compilers. Our core components are (1) a program generator specifically designed for producing programs containing undefined behavior (UB), and (2) a novel test oracle for sanitizer testing. The program generator employs Shadow Statement Insertion, a general and effective approach for introducing UB into a valid seed program. The generated UB programs are subsequently utilized for differential testing of multiple sanitizer implementations. Nevertheless, discrepant sanitizer reports may stem from either compiler optimization or sanitizer bugs. To accurately determine if a discrepancy is caused by sanitizer bugs, we introduce a new test oracle called crash-site mapping. We have incorporated our techniques into UBfuzz, a practical tool for testing sanitizers. Over a five-month testing period, UBfuzz successfully found 31 bugs in both GCC and LLVM sanitizers. These bugs reveal the serious false negative problems in sanitizers, where certain UBs in programs went unreported. This research paves the way for further investigation in this crucial area of study.

Submitted 9 January, 2024; originally announced January 2024.

Comments: accepted to ASPLOS 2024

arXiv:2401.03636 [pdf, other]

A Perturbed Value-Function-Based Interior-Point Method for Perturbed Pessimistic Bilevel Problems

Authors: Haimei Huo, Risheng Liu, Zhixun Su

Abstract: Bilevel optimization serves as a powerful tool for many machine learning applications. Perturbed pessimistic bilevel problem PBP$ε$, with $ε$ being an arbitrary positive number, is a variant of the bilevel problem to deal with the case where there are multiple solutions in the lower level problem. However, the provably convergent algorithms for PBP$ε$ with a nonlinear lower level problem are lacking. To fill the gap, we consider in the paper the problem PBP$ε$ with a nonlinear lower level problem. By introducing a log-barrier function to replace the inequality constraint associated with the value function of the lower level problem, and approximating this value function, an algorithm named Perturbed Value-Function-based Interior-point Method (PVFIM) is proposed. We present a stationary condition for PBP$ε$, which has not been given before, and we show that PVFIM can converge to a stationary point of PBP$ε$. Finally, experiments are presented to verify the theoretical results and to show the application of the algorithm to GAN.

Submitted 7 January, 2024; originally announced January 2024.

arXiv:2312.14590 [pdf, other]

SIG: Speaker Identification in Literature via Prompt-Based Generation

Authors: Zhenlin Su, Liyan Xu, ** Xu, Jiangnan Li, Mingdu Huangfu

Abstract: Identifying speakers of quotations in narratives is an important task in literary analysis, with challenging scenarios including the out-of-domain inference for unseen speakers, and non-explicit cases where there are no speaker mentions in surrounding context. In this work, we propose a simple and effective approach SIG, a generation-based method that verbalizes the task and quotation input based on designed prompt templates, which also enables easy integration of other auxiliary tasks that further bolster the speaker identification performance. The prediction can either come from direct generation by the model, or be determined by the highest generation probability of each speaker candidate. Based on our approach design, SIG supports out-of-domain evaluation, and achieves open-world classification paradigm that is able to accept any forms of candidate input. We perform both cross-domain evaluation and in-domain evaluation on PDNC, the largest dataset of this task, where empirical results suggest that SIG outperforms previous baselines of complicated designs, as well as the zero-shot ChatGPT, especially excelling at those hard non-explicit scenarios by up to 17% improvement. Additional experiments on another dataset WP further corroborate the efficacy of SIG.

Submitted 19 February, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: Accepted to AAAI 2024

arXiv:2312.13596 [pdf, ps, other]

Anchoring Path for Inductive Relation Prediction in Knowledge Graphs

Authors: Zhixiang Su, Di Wang, Chunyan Miao, Lizhen Cui

Abstract: Aiming to accurately predict missing edges representing relations between entities, which are pervasive in real-world Knowledge Graphs (KGs), relation prediction plays a critical role in enhancing the comprehensiveness and utility of KGs. Recent research focuses on path-based methods due to their inductive and explainable properties. However, these methods face a great challenge when lots of reasoning paths do not form Closed Paths (CPs) in the KG. To address this challenge, we propose Anchoring Path Sentence Transformer (APST) by introducing Anchoring Paths (APs) to alleviate the reliance on CPs. Specifically, we develop a search-based description retrieval method to enrich entity descriptions and an assessment mechanism to evaluate the rationality of APs. APST takes both APs and CPs as the inputs of a unified Sentence Transformer architecture, enabling comprehensive predictions and high-quality explanations. We evaluate APST on three public datasets and achieve state-of-the-art (SOTA) performance in 30 of 36 transductive, inductive, and few-shot experimental settings.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2312.10655 [pdf, other]

Practical Non-Intrusive GUI Exploration Testing with Visual-based Robotic Arms

Authors: Shengcheng Yu, Chunrong Fang, Mingzhe Du, Yuchen Ling, Zhenyu Chen, Zhendong Su

Abstract: GUI testing is significant in the SE community. Most existing frameworks are intrusive and only support some specific platforms. With the development of distinct scenarios, diverse embedded systems or customized operating systems on different devices do not support existing intrusive GUI testing frameworks. Some approaches adopt robotic arms to replace the interface invoking of mobile apps under test and use computer vision technologies to identify GUI elements. However, some challenges are unsolved. First, existing approaches assume that GUI screens are fixed so that they cannot be adapted to diverse systems with different screen conditions. Second, existing approaches use XY-plane robotic arms, which cannot flexibly simulate testing operations. Third, existing approaches ignore compatibility bugs and only focus on crash bugs. A more practical approach is required for the non-intrusive scenario. We propose RoboTest, a practical non-intrusive GUI testing framework with visual robotic arms. RoboTest integrates novel GUI screen and widget detection algorithms, adaptive to detecting screens of different sizes and then to extracting GUI widgets from the detected screens. Then, a set of testing operations is applied with a 4-DOF robotic arm, which effectively and flexibly simulates human testing operations. During app exploration, RoboTest integrates the Principle of Proximity-guided exploration strategy, choosing close widgets of the previous targets to reduce robotic arm movement overhead and improve exploration efficiency. RoboTest can effectively detect some compatibility bugs beyond crash bugs with a GUI comparison on different devices of the same test operations. We evaluate RoboTest with 20 mobile apps, with a case study on an embedded system. The results show that RoboTest can effectively, efficiently, and generally explore AUTs to find bugs and reduce exploration time overhead.

Submitted 17 December, 2023; originally announced December 2023.

Comments: Accepted by the 46th International Conference on Software Engineering (ICSE 2024)

arXiv:2312.09486 [pdf, other]

Unraveling Batch Normalization for Realistic Test-Time Adaptation

Authors: Zixian Su, Jingwei Guo, Kai Yao, Xi Yang, Qiufeng Wang, Kaizhu Huang

Abstract: While recent test-time adaptations exhibit efficacy by adjusting batch normalization to narrow domain disparities, their effectiveness diminishes with realistic mini-batches due to inaccurate target estimation. As previous attempts merely introduce source statistics to mitigate this issue, the fundamental problem of inaccurate target estimation still persists, leaving the intrinsic test-time domain shifts unresolved. This paper delves into the problem of mini-batch degradation. By unraveling batch normalization, we discover that the inexact target statistics largely stem from the substantially reduced class diversity in batch. Drawing upon this insight, we introduce a straightforward tool, Test-time Exponential Moving Average (TEMA), to bridge the class diversity gap between training and testing batches. Importantly, our TEMA adaptively extends the scope of typical methods beyond the current batch to incorporate a diverse set of class information, which in turn boosts an accurate target estimation. Built upon this foundation, we further design a novel layer-wise rectification strategy to consistently promote test-time performance. Our proposed method enjoys a unique advantage as it requires neither training nor tuning parameters, offering a truly hassle-free solution. It significantly enhances model robustness against shifted domains and maintains resilience in diverse real-world scenarios with various batch sizes, achieving state-of-the-art performance on several major benchmarks. Code is available at https://github.com/kiwi12138/RealisticTTA.

Submitted 13 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI 2024

arXiv:2312.08591 [pdf, other]

Joint2Human: High-quality 3D Human Generation via Compact Spherical Embedding of 3D Joints

Authors: Muxin Zhang, Qiao Feng, Zhuo Su, Chao Wen, Zhou Xue, Kun Li

Abstract: 3D human generation is increasingly significant in various applications. However, the direct use of 2D generative methods in 3D generation often results in losing local details, while methods that reconstruct geometry from generated images struggle with global view consistency. In this work, we introduce Joint2Human, a novel method that leverages 2D diffusion models to generate detailed 3D human geometry directly, ensuring both global structure and local details. To achieve this, we employ the Fourier occupancy field (FOF) representation, enabling the direct generation of 3D shapes as preliminary results with 2D generative models. With the proposed high-frequency enhancer and the multi-view recarving strategy, our method can seamlessly integrate the details from different views into a uniform global shape. To better utilize the 3D human prior and enhance control over the generated geometry, we introduce a compact spherical embedding of 3D joints. This allows for an effective guidance of pose during the generation process. Additionally, our method can generate 3D humans guided by textual inputs. Our experimental results demonstrate the capability of our method to ensure global structure, local details, high resolution, and low computational cost simultaneously. More results and the code can be found on our project page at http://cic.tju.edu.cn/faculty/likun/projects/Joint2Human.

Submitted 6 April, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.03461 [pdf, other]

HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting

Authors: Yuheng Jiang, Zhehao Shen, Penghao Wang, Zhuo Su, Yu Hong, Yingliang Zhang, Jingyi Yu, Lan Xu

Abstract: We have recently seen tremendous progress in photo-real human modeling and rendering. Yet, efficiently rendering realistic human performance and integrating it into the rasterization pipeline remains challenging. In this paper, we present HiFi4G, an explicit and compact Gaussian-based approach for high-fidelity human performance rendering from dense footage. Our core intuition is to marry the 3D Gaussian representation with non-rigid tracking, achieving a compact and compression-friendly representation. We first propose a dual-graph mechanism to obtain motion priors, with a coarse deformation graph for effective initialization and a fine-grained Gaussian graph to enforce subsequent constraints. Then, we utilize a 4D Gaussian optimization scheme with adaptive spatial-temporal regularizers to effectively balance the non-rigid prior and Gaussian updating. We also present a companion compression scheme with residual compensation for immersive experiences on various platforms. It achieves a substantial compression rate of approximately 25 times, with less than 2MB of storage per frame. Extensive experiments demonstrate the effectiveness of our approach, which significantly outperforms existing approaches in terms of optimization speed, rendering quality, and storage overhead.

Submitted 7 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

arXiv:2311.11722 [pdf, other]

Sparse4D v3: Advancing End-to-End 3D Detection and Tracking

Authors: Xuewu Lin, Zixiang Pei, Tianwei Lin, Lichao Huang, Zhizhong Su

Abstract: In autonomous driving perception systems, 3D detection and tracking are the two fundamental tasks. This paper delves deeper into this field, building upon the Sparse4D framework. We introduce two auxiliary training tasks (Temporal Instance Denoising and Quality Estimation) and propose decoupled attention to make structural improvements, leading to significant enhancements in detection performance. Additionally, we extend the detector into a tracker using a straightforward approach that assigns instance ID during inference, further highlighting the advantages of query-based algorithms. Extensive experiments conducted on the nuScenes benchmark validate the effectiveness of the proposed improvements. With ResNet50 as the backbone, we witnessed enhancements of 3.0%, 2.2%, and 7.6% in mAP, NDS, and AMOTA, achieving 46.9%, 56.1%, and 49.0%, respectively. Our best model achieved 71.9% NDS and 67.7% AMOTA on the nuScenes test set. Code will be released at https://github.com/linxuewu/Sparse4D.

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.07186 [pdf]

doi 10.1021/acsphotonics.3c01163

Nonlinear dielectric geometric-phase metasurface with simultaneous structure and lattice symmetry design

Authors: Bingyi Liu, René Geromel, Zhaoxian Su, Kai Guo, Yongtian Wang, Zhongyi Guo, Lingling Huang, Thomas Zentgraf

Abstract: In this work, we utilize thin dielectric meta-atoms placed on a silver substrate to efficiently enhance and manipulate the third harmonic generation. We theoretically and experimentally reveal that when the structural symmetry of the meta-atom is incompatible with the lattice symmetry of an array, some generalized nonlinear geometric phases appear, which offers new possibilities for harmonic generation control beyond the accessible symmetries governed by the selection rule. The underlying mechanism is attributed to the modified rotation of the effective principal axis of a dense meta-atom array, where the strong coupling among the units gives rise to a generalized linear geometric phase modulation on the pump light. Therefore, nonlinear geometric phases carried by the third-harmonic emissions are the natural result of the wave-mixing process among the modes excited at the fundamental frequency. This mechanism further points out a new strategy to predict the nonlinear geometric phases delivered by the nanostructures according to their linear responses. Our design is simple and efficient, and offers alternatives for the nonlinear meta-devices that are capable of flexible photon generation and manipulation.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2311.04527 [pdf, other]

Extended Paper: API-driven Program Synthesis for Testing Static Typing Implementations

Authors: Thodoris Sotiropoulos, Stefanos Chaliasos, Zhendong Su

Abstract: We introduce a novel approach for testing static typing implementations based on the concept of API-driven program synthesis. The idea is to synthesize type-intensive but small and well-typed programs by leveraging and combining application programming interfaces (APIs) derived from existing software libraries. Our primary insight is backed up by real-world evidence: a significant number of compiler typing bugs are caused by small test cases that employ APIs from the standard library of the language under test. This is attributed to the inherent complexity of the majority of these APIs, which often exercise a wide range of sophisticated type-related features. The main contribution of our approach is the ability to produce small client programs with increased feature coverage, without bearing the burden of generating the corresponding well-formed API definitions from scratch. To validate diverse aspects of static typing procedures (i.e., soundness, precision of type inference), we also enrich our API-driven approach with fault-injection and semantics-preserving modes, along with their corresponding test oracles. We evaluate our implemented tool, Thalia, on testing the static typing implementations of the compilers for three popular languages, namely, Scala, Kotlin, and Groovy. Thalia has uncovered 84 typing bugs (77 confirmed and 22 fixed), most of which are triggered by test cases featuring APIs that rely on parametric polymorphism, overloading, and higher-order functions. Our comparison with state-of-the-art shows that Thalia yields test programs with distinct characteristics, offering additional and complementary benefits.

Submitted 8 November, 2023; originally announced November 2023.

arXiv:2311.03811 [pdf, other]

Controlling FSR in Selective Classification

Authors: Guanlan Zhao, Zhonggen Su

Abstract: Uncertainty quantification and false selection error rate (FSR) control are crucial in many high-consequence scenarios, so we need models with good interpretability. This article introduces the optimality function for the binary classification problem in selective classification. We prove the optimality of this function in oracle situations and provide a data-driven method under the condition of exchangeability. We demonstrate it can control global FSR with the finite sample assumption and successfully extend the above situation from binary to multi-class classification. Furthermore, we demonstrate that FSR can still be controlled without exchangeability, ultimately completing the proof using the martingale method.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2310.19531 [pdf, other]

MiLe Loss: a New Loss for Mitigating the Bias of Learning Difficulties in Generative Language Models

Authors: Zhenpeng Su, Xing Wu, Xue Bai, Zijia Lin, Hui Chen, Guiguang Ding, Wei Zhou, Songlin Hu

Abstract: Generative language models are usually pretrained on large text corpus via predicting the next token (i.e., sub-word/word/phrase) given the previous ones. Recent works have demonstrated the impressive performance of large generative language models on downstream tasks. However, existing generative language models generally neglect an inherent challenge in text corpus during training, i.e., the imbalance between frequent tokens and infrequent ones. It can lead a language model to be dominated by common and easy-to-learn tokens, thereby overlooking the infrequent and difficult-to-learn ones. To alleviate that, we propose a MiLe Loss function for mitigating the bias of learning difficulties with tokens. During training, it can dynamically assess the learning difficulty of a to-be-learned token, according to the information entropy of the corresponding predicted probability distribution over the vocabulary. Then it scales the training loss adaptively, trying to lead the model to focus more on the difficult-to-learn tokens. On the Pile dataset, we train generative language models at different scales of 468M, 1.2B, and 6.7B parameters. Experiments reveal that models incorporating the proposed MiLe Loss can gain consistent performance improvement on downstream benchmarks.

Submitted 28 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: This paper has been accepted by NAACL 2024

arXiv:2310.13518 [pdf, other]

Vision-Based Mobile App GUI Testing: A Survey

Authors: Shengcheng Yu, Chunrong Fang, Ziyuan Tuo, Quanjun Zhang, Chunyang Chen, Zhenyu Chen, Zhendong Su

Abstract: Graphical User Interface (GUI) has become one of the most significant parts of mobile applications (apps). It is a direct bridge between mobile apps and end users, which directly affects the end user's experience. Neglecting GUI quality can undermine the value and effectiveness of the entire mobile app solution. Significant research efforts have been devoted to GUI testing, one effective method to ensure mobile app quality. By conducting rigorous GUI testing, developers can ensure that the visual and interactive elements of the mobile apps not only meet functional requirements but also provide a seamless and user-friendly experience. However, traditional solutions, relying on the source code or layout files, have met challenges in both effectiveness and efficiency due to the gap between what is obtained and what app GUI actually presents. Vision-based mobile app GUI testing approaches emerged with the development of computer vision technologies and have achieved promising progress. In this survey paper, we provide a comprehensive investigation of the state-of-the-art techniques on 226 papers, among which 78 are vision-based studies. This survey covers different topics of GUI testing, like GUI test generation, GUI test record & replay, GUI testing framework, etc. Specifically, the research emphasis of this survey is placed mostly on how vision-based techniques outperform traditional solutions and have gradually taken a vital place in the GUI testing field. Based on the investigation of existing studies, we outline the challenges and opportunities of (vision-based) mobile app GUI testing and propose promising research directions with the combination of emerging techniques.

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2310.10365 [pdf, other]

doi 10.1103/PhysRevLett.131.133601

Berry Curvature and Bulk-Boundary Correspondence from Transport Measurement for Photonic Chern Bands

Authors: Chao Chen, Run-Ze Liu, Jizhou Wu, Zu-En Su, Xing Ding, Jian Qin, Lin Wang, Wei-Wei Zhang, Yu He, Xi-Lin Wang, Chao-Yang Lu, Li Li, Barry C. Sanders, Xiong-Jun Liu, Jian-Wei Pan

Abstract: Berry curvature is a fundamental element to characterize topological quantum physics, while a full measurement of Berry curvature in momentum space was not reported for topological states. Here we achieve two-dimensional Berry curvature reconstruction in a photonic quantum anomalous Hall system via Hall transport measurement of a momentum-resolved wave packet. Integrating measured Berry curvature over the two-dimensional Brillouin zone, we obtain Chern numbers corresponding to -1 and 0. Further, we identify bulk-boundary correspondence by measuring topology-linked chiral edge states at the boundary. The full topological characterization of photonic Chern bands from Berry curvature, Chern number, and edge transport measurements enables our photonic system to serve as a versatile platform for further in-depth study of novel topological physics.

Submitted 16 October, 2023; originally announced October 2023.

Journal ref: Phys. Rev. Lett. 131, 133601 (25 September 2023)

arXiv:2310.08817 [pdf]

Exploring the relationship between response time sequence in scale answering process and severity of insomnia: a machine learning approach

Authors: Zhao Su, Rongxun Liu, Keyin Zhou, Xinru Wei, Ning Wang, Zexin Lin, Yuanchen Xie, Jie Wang, Fei Wang, Shenzhong Zhang, Xizhe Zhang

Abstract: Objectives: The study aims to investigate the relationship between insomnia and response time. Additionally, it aims to develop a machine learning model to predict the presence of insomnia in participants using response time data. Methods: A mobile application was designed to administer scale tests and collect response time data from 2729 participants. The relationship between symptom severity and response time was explored, and a machine learning model was developed to predict the presence of insomnia. Results: The result revealed a statistically significant difference (p<.001) in the total response time between participants with or without insomnia symptoms. A correlation was observed between the severity of specific insomnia aspects and response times at the individual questions level. The machine learning model demonstrated a high predictive accuracy of 0.743 in predicting insomnia symptoms based on response time data. Conclusions: These findings highlight the potential utility of response time data to evaluate cognitive and psychological measures, demonstrating the effectiveness of using response time as a diagnostic tool in the assessment of insomnia.

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.05180 [pdf, other]

Blockchain-Envisioned UAV-Aided Disaster Relief Networks: Challenges and Solutions

Authors: Yuntao Wang, Qinnan Hu, Zhendong Li, Zhou Su, Ruidong Li, Xiang Zou, Jian Zhou

Abstract: Natural or man-made disasters pose significant challenges for delivering critical relief to affected populations due to disruptions in critical infrastructures and logistics networks. Unmanned aerial vehicles (UAVs)-aided disaster relief networks (UDRNs) leverage UAVs to assist existing ground relief networks by swiftly assessing affected areas and timely delivering lifesaving supplies. To meet the growing demands for collaborative, trust-free, and transparent UDRN services, blockchain-based UDRNs emerge as a promising approach through immutable ledgers and distributed smart contracts. However, several efficiency and security challenges hinder the deployment of blockchain-based UDRNs, including the lack of cooperation between smart contracts, lack of dynamic audit for smart contract vulnerabilities, and low forensics robustness against transaction malleability attacks. Towards efficient and secure blockchain-based UDRNs, this paper presents potential solutions: (i) a series of collaborative smart contracts for coordinated relief management, (ii) a dynamic contract audit mechanism to prevent known/unknown contract vulnerabilities; and (iii) a robust transaction forensics strategy with on/off-chain cooperation to resist transaction malleability attacks. Our prototype implementation and experimental results demonstrate the feasibility and effectiveness of our approach. Lastly, we outline key open research issues crucial to advancing this emerging field.

Submitted 24 May, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

Comments: 8 pages

arXiv:2310.00033 [pdf]

OriWheelBot: An origami-wheeled robot

Authors: Jie Liu, Zufeng Pang, Zhiyong Li, Guilin Wen, Zhoucheng Su, Junfeng He, Kaiyue Liu, Dezheng Jiang, Zenan Li, Shouyan Chen, Yang Tian, Yi Min Xie, Zhenpei Wang, Zhuangjian Liu

Abstract: Origami-inspired robots with multiple advantages, such as being lightweight, requiring less assembly, and exhibiting exceptional deformability, have received substantial and sustained attention. However, the existing origami-inspired robots are usually of limited functionalities and developing feature-rich robots is very challenging. Here, we report an origami-wheeled robot (OriWheelBot) with variable width and outstanding sand walking versatility. The OriWheelBot's ability to adjust wheel width over obstacles is achieved by origami wheels made of Miura origami. An improved version, called iOriWheelBot, is also developed to automatically judge the width of the obstacles. Three actions, namely direct pass, variable width pass, and direct return, will be carried out depending on the width of the channel between the obstacles. We have identified two motion mechanisms, i.e., sand-digging and sand-pushing, with the latter being more conducive to walking on the sand. We have systematically examined numerous sand walking characteristics, including carrying loads, climbing a slope, walking on a slope, and navigating sand pits, small rocks, and sand traps. The OriWheelBot can change its width by 40%, has a loading-carrying ratio of 66.7% on flat sand and can climb a 17-degree sand incline. The OriWheelBot can be useful for planetary subsurface exploration and disaster area rescue.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 23 pages, 7 figures

arXiv:2309.09498 [pdf, other]

doi 10.1109/MNET.2024.3389734

Combating Advanced Persistent Threats: Challenges and Solutions

Authors: Yuntao Wang, Han Liu, Zhendong Li, Zhou Su, Jiliang Li

Abstract: The rise of advanced persistent threats (APTs) has marked a significant cybersecurity challenge, characterized by sophisticated orchestration, stealthy execution, extended persistence, and targeting valuable assets across diverse sectors. Provenance graph-based kernel-level auditing has emerged as a promising approach to enhance visibility and traceability within intricate network environments. However, it still faces challenges including reconstructing complex lateral attack chains, detecting dynamic evasion behaviors, and defending smart adversarial subgraphs. To bridge the research gap, this paper proposes an efficient and robust APT defense scheme leveraging provenance graphs, including a network-level distributed audit model for cost-effective lateral attack reconstruction, a trust-oriented APT evasion behavior detection strategy, and a hidden Markov model based adversarial subgraph defense approach. Through prototype implementation and extensive experiments, we validate the effectiveness of our system. Lastly, crucial open research directions are outlined in this emerging field.

Submitted 12 April, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: This work has been accepted by IEEE NETWORK in April 2024. 9 pages, 5 figures, 1 table

arXiv:2309.09412 [pdf]

Cross-attention-based saliency inference for predicting cancer metastasis on whole slide images

Authors: Ziyu Su, Mostafa Rezapour, Usama Sajjad, Shuo Niu, Metin Nafi Gurcan, Muhammad Khalid Khan Niazi

Abstract: Although multiple instance learning (MIL) methods are widely used for automatic tumor detection on whole slide images (WSI), they suffer from the extreme class imbalance within the small tumor WSIs. This occurs when the tumor comprises only a few isolated cells. For early detection, it is of utmost importance that MIL algorithms can identify small tumors, even when they are less than 1% of the size of the WSI. Existing studies have attempted to address this issue using attention-based architectures and instance selection-based methodologies, but have not yielded significant improvements. This paper proposes cross-attention-based salient instance inference MIL (CASiiMIL), which involves a novel saliency-informed attention mechanism, to identify breast cancer lymph node micro-metastasis on WSIs without the need for any annotations. Apart from this new attention mechanism, we introduce a negative representation learning algorithm to facilitate the learning of saliency-informed attention weights for improved sensitivity on tumor WSIs. The proposed model outperforms the state-of-the-art MIL methods on two popular tumor metastasis detection datasets, and demonstrates great cross-center generalizability. In addition, it exhibits excellent accuracy in classifying WSIs with small tumor lesions. Moreover, we show that the proposed model has excellent interpretability attributed to the saliency-informed attention weights. We strongly believe that the proposed method will pave the way for training algorithms for early tumor detection on large datasets where acquiring fine-grained annotations is practically impossible.

Submitted 17 September, 2023; originally announced September 2023.

arXiv:2309.09304 [pdf, other]

Mobile Metaverse: A Road Map from Metaverse to Metavehicles

Authors: Yilong Hui, Gaosheng Zhao, Nan Cheng, Haibo Zhou, Zhou Su

Abstract: With the rapid development of communication technologies and extended reality (XR), the services and applications of the Metaverse are gradually entering our lives. However, the current development of the Metaverse provides users with services that are homogeneous with the user experience that the Internet has brought in the past, making them more like an extension of the Internet. In addition, as a mobile application carrier for the Metaverse, it is also worth considering how vehicles with diverse onboard components can develop in synergy with the Metaverse. In this article, we focus on the core of the Metaverse, namely user experience, and provide a road map from Metaverse to Metaverse vehicles (Metavehicles). Specifically, we first elaborate on six features of the Metaverse from the perspective of user experience and propose a hierarchical framework for the Metaverse based on the evolutionary logic of the features. Under the guidance of this framework, we discuss the empowerment of onboard components of Metavehicles on the development of the Metaverse, and analyze the service experience that Metavehicles can bring to two types of users, namely drivers and passengers. Finally, considering the differentiated development levels of Metaverse and autonomous driving, we further establish a hierarchical framework for Metavehicles from three aspects (i.e., enhance Metaverse, enhance driving experience, and enhance entertainment experience), providing an evolutionary path for the development of Metavehicles.

Submitted 17 September, 2023; originally announced September 2023.

Comments: 7 pages, 5 figures

arXiv:2309.08189 [pdf, ps, other]

Rates of convergence in the distances of Kolmogorov and Wasserstein for standardized martingales

Authors: Xiequan Fan, Zhonggen Su

Abstract: We give some rates of convergence in the distances of Kolmogorov and Wasserstein for standardized martingales with differences having finite variances. For the Kolmogorov distances, we present some exact Berry-Esseen bounds for martingales, which generalizes some Berry-Esseen bounds due to Bolthausen. For the Wasserstein distance, with Stein's method and Lindeberg's telescoping sum argument, the rates of convergence in martingale central limit theorems recover the classical rates for sums of i.i.d.\ random variables, and therefore they are believed to be optimal.

Submitted 15 September, 2023; originally announced September 2023.

Comments: 31 pages

MSC Class: Primary 60G42; 60F05; Secondary 60E15

arXiv:2309.08106 [pdf, other]

Data-Driven Goal Recognition in Transhumeral Prostheses Using Process Mining Techniques

Authors: Zihang Su, Tianshi Yu, Nir Lipovetzky, Alireza Mohammadi, Denny Oetomo, Artem Polyvyanyy, Sebastian Sardina, Ying Tan, Nick van Beest

Abstract: A transhumeral prosthesis restores missing anatomical segments below the shoulder, including the hand. Active prostheses utilize real-valued, continuous sensor data to recognize patient target poses, or goals, and proactively move the artificial limb. Previous studies have examined how well the data collected in stationary poses, without considering the time steps, can help discriminate the goals. In this case study paper, we focus on using time series data from surface electromyography electrodes and kinematic sensors to sequentially recognize patients' goals. Our approach involves transforming the data into discrete events and training an existing process mining-based goal recognition system. Results from data collected in a virtual reality setting with ten subjects demonstrate the effectiveness of our proposed goal recognition approach, which achieves significantly better precision and recall than the state-of-the-art machine learning techniques and is less confident when wrong, which is beneficial when approximating smoother movements of prostheses.

Submitted 14 September, 2023; originally announced September 2023.

Comments: The 5th International Conference on Process Mining (ICPM 2023)

ACM Class: I.2.4; I.2.9

arXiv:2309.05423 [pdf, other]

Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP

Authors: Jinzuomu Zhong, Yang Li, Hui Huang, Korin Richmond, Jie Liu, Zhiba Su, Jing Guo, Benlai Tang, Fengjie Zhu

Abstract: In expressive and controllable Text-to-Speech (TTS), explicit prosodic features significantly improve the naturalness and controllability of synthesised speech. However, manual prosody annotation is labor-intensive and inconsistent. To address this issue, a two-stage automatic annotation pipeline is novelly proposed in this paper. In the first stage, we use contrastive pretraining of Speech-Silence and Word-Punctuation (SSWP) pairs to enhance prosodic information in latent representations. In the second stage, we build a multi-modal prosody annotator, comprising pretrained encoders, a text-speech fusing scheme, and a sequence classifier. Experiments on English prosodic boundaries demonstrate that our method achieves state-of-the-art (SOTA) performance with 0.72 and 0.93 f1 score for Prosodic Word and Prosodic Phrase boundary respectively, while bearing remarkable robustness to data scarcity.

Submitted 11 June, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

arXiv:2309.02731 [pdf, other]

HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus

Authors: Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu

Abstract: ChatGPT has gained significant interest due to its impressive performance, but people are increasingly concerned about its potential risks, particularly around the detection of AI-generated content (AIGC), which is often difficult for untrained humans to identify. Current datasets utilized for detecting ChatGPT-generated text primarily center around question-answering, yet they tend to disregard tasks that possess semantic-invariant properties, such as summarization, translation, and paraphrasing. Our primary studies demonstrate that detecting model-generated text on semantic-invariant tasks is more difficult. To fill this gap, we introduce a more extensive and comprehensive dataset that considers more types of tasks than previous work, including semantic-invariant tasks. In addition, the model after a large number of task instruction fine-tuning shows a strong powerful performance. Owing to its previous success, we further instruct fine-tuning T\textit{k}-instruct and build a more powerful detection system.

Submitted 25 January, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

Comments: This paper has been accepted by CIKM2023 workshop

arXiv:2309.00251 [pdf, other]

Optimal Repair Strategy Against Advanced Persistent Threats Under Time-Varying Networks

Authors: Zixuan Wang, Jiliang Li, Yuntao Wang, Zhou Su, Shui Yu, Weizhi Meng

Abstract: Advanced persistent threat (APT) is a kind of stealthy, sophisticated, and long-term cyberattack that has brought severe financial losses and critical infrastructure damages. Existing works mainly focus on APT defense under stable network topologies, while the problem under time-varying dynamic networks (e.g., vehicular networks) remains unexplored, which motivates our work. Besides, the spatiotemporal dynamics in defense resources, complex attackers' lateral movement behaviors, and lack of timely defense make APT defense a challenging issue under time-varying networks. In this paper, we propose a novel game-theoretical APT defense approach to promote real-time and optimal defense strategy-making under both periodic time-varying and general time-varying environments. Specifically, we first model the interactions between attackers and defenders in an APT process as a dynamic APT repair game, and then formulate the APT damage minimization problem as the precise prevention and control (PPAC) problem. To derive the optimal defense strategy under both latency and defense resource constraints, we further devise an online optimal control-based mechanism integrated with two backtracking-forward algorithms to fastly derive the near-optimal solution of the PPAC problem in real time. Extensive experiments are carried out, and the results demonstrate that our proposed scheme can efficiently obtain optimal defense strategy in 54481 ms under seven attack-defense interactions with 9.64$\%$ resource occupancy in stimulated periodic time-varying and general time-varying networks. Besides, even under static networks, our proposed scheme still outperforms existing representative APT defense approaches in terms of service stability and defense resource utilization.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2308.11168 [pdf, ps, other]

Discretized Normal Approximation of Sums of Locally Dependent Random Variables via Stein's Method

Authors: Zhonggen Su, Xiaolin Wang

Abstract: Let $\{X_{i}, i\in J\}$ be a family of locally dependent non-negative integer-valued random variables with finite expectation and variance. We consider the sum $W=\sum_{i\in J}X_i$ and establish general error upper bounds for the total variation distance $d_{TV}(W, Y^{d})$, where $Y^{d}$ is the discretized normal distribution. The major ingredient of the proof is to approximate $W$ by a three-parametric intermediate random variable $M$ based on Stein's method. As applications, we study in detail four well-known examples, which are counting vertices of all edges point inward, birthday problem, counting monochromatic edges in uniformly colored graphs, and triangles in the Erdős-Rényi random graph. Through delicate analysis and computations we obtain sharper upper error bounds than existing results.

Submitted 21 August, 2023; originally announced August 2023.

arXiv:2308.08855 [pdf, other]

Realistic Full-Body Tracking from Sparse Observations via Joint-Level Modeling

Authors: Xiaozheng Zheng, Zhuo Su, Chao Wen, Zhou Xue, Xiaojie Jin

Abstract: To bridge the physical and virtual worlds for rapidly developed VR/AR applications, the ability to realistically drive 3D full-body avatars is of great significance. Although real-time body tracking with only the head-mounted displays (HMDs) and hand controllers is heavily under-constrained, a carefully designed end-to-end neural network is of great potential to solve the problem by learning from large-scale motion data. To this end, we propose a two-stage framework that can obtain accurate and smooth full-body motions with the three tracking signals of head and hands only. Our framework explicitly models the joint-level features in the first stage and utilizes them as spatiotemporal tokens for alternating spatial and temporal transformer blocks to capture joint-level correlations in the second stage. Furthermore, we design a set of loss terms to constrain the task of a high degree of freedom, such that we can exploit the potential of our joint-level modeling. With extensive experiments on the AMASS motion dataset and real-captured data, we validate the effectiveness of our designs and show our proposed method can achieve more accurate and smooth motion compared to existing approaches.

Submitted 17 August, 2023; originally announced August 2023.

Comments: Accepted to ICCV 2023. Project page: https://zxz267.github.io/AvatarJLM

arXiv:2308.08730 [pdf, other]

Learning A Coarse-to-Fine Diffusion Transformer for Image Restoration

Authors: Liyan Wang, Qinyu Yang, Cong Wang, Wei Wang, Jinshan Pan, Zhixun Su

Abstract: Recent years have witnessed the remarkable performance of diffusion models in various vision tasks. However, for image restoration that aims to recover clear images with sharper details from given degraded observations, diffusion-based methods may fail to recover promising results due to inaccurate noise estimation. Moreover, simple constraining noises cannot effectively learn complex degradation information, which subsequently hinders the model capacity. To solve the above problems, we propose a coarse-to-fine diffusion Transformer (C2F-DFT) for image restoration. Specifically, our C2F-DFT contains diffusion self-attention (DFSA) and diffusion feed-forward network (DFN) within a new coarse-to-fine training scheme. The DFSA and DFN respectively capture the long-range diffusion dependencies and learn hierarchy diffusion representation to facilitate better restoration. In the coarse training stage, our C2F-DFT estimates noises and then generates the final clean image by a sampling algorithm. To further improve the restoration quality, we propose a simple yet effective fine training scheme. It first exploits the coarse-trained diffusion model with fixed steps to generate restoration results, which then would be constrained with corresponding ground-truth ones to optimize the models to remedy the unsatisfactory results affected by inaccurate noise estimation. Extensive experiments show that C2F-DFT significantly outperforms diffusion-based restoration method IR-SDE and achieves competitive performance compared with Transformer-based state-of-the-art methods on $3$ tasks, including image deraining, image deblurring, and real image denoising. Code is available at https://github.com/wlydlut/C2F-DFT.

Submitted 8 October, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

Comments: 13 pages, 10 figures

arXiv:2308.05925 [pdf, other]

CaPhy: Capturing Physical Properties for Animatable Human Avatars

Authors: Zhaoqi Su, Liangxiao Hu, Siyou Lin, Hongwen Zhang, Shengping Zhang, Justus Thies, Yebin Liu

Abstract: We present CaPhy, a novel method for reconstructing animatable human avatars with realistic dynamic properties for clothing. Specifically, we aim for capturing the geometric and physical properties of the clothing from real observations. This allows us to apply novel poses to the human avatar with physically correct deformations and wrinkles of the clothing. To this end, we combine unsupervised training with physics-based losses and 3D-supervised training using scanned data to reconstruct a dynamic model of clothing that is physically realistic and conforms to the human scans. We also optimize the physical parameters of the underlying physical model from the scans by introducing gradient constraints of the physics-based losses. In contrast to previous work on 3D avatar reconstruction, our method is able to generalize to novel poses with realistic dynamic cloth deformations. Experiments on several subjects demonstrate that our method can estimate the physical properties of the garments, resulting in superior quantitative and qualitative results compared with previous methods.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2308.04189 [pdf, other]

Yak: An Asynchronous Bundled Data Pipeline Description Language

Authors: Carsten Nielsen, Zhe Su, Giacomo Indiveri

Abstract: The design of asynchronous circuits typically requires a judicious definition of signals and modules, combined with a proper specification of their timing constraints, which can be a complex and error-prone process, using standard Hardware Description Languages (HDLs). In this paper we introduce Yak, a new dataflow description language for asynchronous bundled data circuits. Yak allows designers to generate Verilog and timing constraints automatically, from a textual description of bundled data control flow structures and combinational logic blocks. The timing constraints are generated using the Local Clock Set methodology and can be consumed by standard industry tools. Yak includes ergonomic language features such as structured bindings of channels undergoing fork and join operations, named value scope propagation along channels, and channel typing. Here we present Yak's language front-end and compare the automated synthesis and layout results of an example circuit with a manual constraint specification approach.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.04171 [pdf, other]

Core interface optimization for multi-core neuromorphic processors

Authors: Zhe Su, Hyunjung Hwang, Tristan Torchet, Giacomo Indiveri

Abstract: Hardware implementations of Spiking Neural Networks (SNNs) represent a promising approach to edge-computing for applications that require low-power and low-latency, and which cannot resort to external cloud-based computing services. However, most solutions proposed so far either support only relatively small networks, or take up significant hardware resources, to implement large networks. To realize large-scale and scalable SNNs it is necessary to develop an efficient asynchronous communication and routing fabric that enables the design of multi-core architectures. In particular the core interface that manages inter-core spike communication is a crucial component as it represents the bottleneck of Power-Performance-Area (PPA) especially for the arbitration architecture and the routing memory. In this paper we present an arbitration mechanism with the corresponding asynchronous encoding pipeline circuits, based on hierarchical arbiter trees. The proposed scheme reduces the latency by more than 70% in sparse-event mode, compared to the state-of-the-art arbitration architectures, with lower area cost. The routing memory makes use of asynchronous Content Addressable Memory (CAM) with Current Sensing Completion Detection (CSCD), which saves approximately 46% energy, and achieves a 40% increase in throughput against conventional asynchronous CAM using configurable delay lines, at the cost of only a slight increase in area. In addition as it radically reduces the core interface resources in multi-core neuromorphic processors, the arbitration architecture and CAM architecture we propose can be also applied to a wide range of general asynchronous circuits and systems.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2308.00078 [pdf]

Mobile Apps for Children's Health and Wellbeing: Design Features and Future Opportunities

Authors: Jamie Lee, Zhaoyuan Su, Yunan Chen

Abstract: Mobile health apps hold great potential for promoting children's health and wellbeing. However, there is limited understanding of how these technologies are currently designed to support children with their health concerns or wellness goals. To gain insight into the current landscape of mobile apps designed for children's health, we retrieved and reviewed 43 apps from IOS and Google Play store that are specifically marketed for children. Our qualitative analysis identified the dominant health focuses and goals of children's mobile health apps. We analyzed the primary users and their expectations as well as the methods of engagement and involvement adopted. Based on our findings, we discussed the opportunities to support children with chronic illnesses through mobile apps, design for dual use, and design for age appropriateness and digital health safety. This study provides insights and recommendations for app designers, health researchers, and policymakers on strategies for engaging children and parents while also promoting children's health and wellbeing through mobile technology.

Submitted 31 July, 2023; originally announced August 2023.

Comments: Paper accepted for the proceedings of the 2023 American Medical Informatics Association Annual Symposium (AMIA)

arXiv:2307.15917 [pdf, other]

doi 10.1103/PhysRevLett.132.093403

Observation of photoassociation resonances in ultracold atom-molecule collisions

Authors: Jin Cao, Bo-Yuan Wang, Huan Yang, Zhi-Jie Fan, Zhen Su, Jun Rui, Bo Zhao, Jian-Wei Pan

Abstract: Photoassociation of ultracold atoms is a resonant light-assisted collision process, in which two colliding atoms absorb a photon and form an excited molecule. Since the first observation about three decades ago, the photoassociation of ultracold atoms has made a significant impact on the study of ultracold atoms and molecules. Extending the photoassociation of atoms to the photoassociation of atom-molecule pairs or molecule-molecule pairs will offer many new opportunities in the study of precision polyatomic molecular spectroscopy, formation of ultracold polyatomic molecules, and quantum control of molecular collisions and reactions. However, the high density of states and the photoexcitation of the collision complex by the trapping laser make photoassociation into well-defined quantum states of polyatomic molecules extremely difficult. Here we report on the observation of photoassociation resonances in ultracold collisions between $^{23}$Na$^{40}$K molecules and $^{40}$K atoms. We perform photoassociation in a long-wavelength optical dipole trap to form deeply bound triatomic molecules in the electronically excited states. The atom-molecule Feshbach resonance is used to enhance the free-bound Franck-Condon overlap. The photoassociation into well-defined quantum states of excited triatomic molecules is identified by observing resonantly enhanced loss features. These loss features depend on the polarization of the photoassociation lasers, allowing us to assign the rotational quantum numbers. The observation of ultracold atom-molecule photoassociation resonances paves the way toward preparing ground-state triatomic molecules, provides a new high-resolution spectroscopy technique for polyatomic molecules, and is also important to atom-molecule Feshbach resonances.

Submitted 29 July, 2023; originally announced July 2023.

Comments: 8 pages, 4 figures

Journal ref: Physical Review Letters 132, 093403 (2024)

arXiv:2307.15810 [pdf]

Understanding the Benefits and Challenges of Using Large Language Model-based Conversational Agents for Mental Well-being Support

Authors: Zilin Ma, Yiyang Mei, Zhaoyuan Su

Abstract: Conversational agents powered by large language models (LLM) have increasingly been utilized in the realm of mental well-being support. However, the implications and outcomes associated with their usage in such a critical field remain somewhat ambiguous and unexplored. We conducted a qualitative analysis of 120 posts, encompassing 2917 user comments, drawn from the most popular subreddit focused on mental health support applications powered by large language models (u/Replika). This exploration aimed to shed light on the advantages and potential pitfalls associated with the integration of these sophisticated models in conversational agents intended for mental health support. We found the app (Replika) beneficial in offering on-demand, non-judgmental support, boosting user confidence, and aiding self-discovery. Yet, it faced challenges in filtering harmful content, sustaining consistent communication, remembering new information, and mitigating users' overdependence. The stigma attached further risked isolating users socially. We strongly assert that future researchers and designers must thoroughly evaluate the appropriateness of employing LLMs for mental well-being support, ensuring their responsible and effective application.

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2307.08871 [pdf]

Combining X-ray Nano-CT and XANES Techniques for 3D Operando Monitoring of Lithiation Spatial Composition evolution in NMC Electrode

Authors: Tuan-Tu Nguyen, Jiahui Xu, Zeliang Su, Vincent De Andrade, Alejandro A. Franco, Bruno Delobel, Charles Delacourt, Arnaud Demortière

Abstract: In this study, we present a well-defined methodology for conducting Operando X-ray absorption near-edge structure spectroscopy (XANES) in conjunction with transmission X-ray nano computed tomography (TXM-nanoCT) experiments on the LiNi$_{0.5}$Mn$_{0.3}$Co$_{0.2}$O$_2$ (NMC) cathode electrode. To minimize radiation-induced damage to the sample during charge and discharge cycles and to gain a comprehensive 3D perspective of the (de)lithiation process of the active material, we propose a novel approach that relies on employing only three energy levels, strategically positioned at pre-edge, edge, and post-edge. By adopting this technique, we successfully track the various (de)lithiation states within the three-dimensional space during partial cycling. Furthermore, we are able to extract the nanoscale lithium distribution within individual secondary particles. Our observations reveal the formation of a core-shell structure during lithiation and we also identify that not all surface areas of the particles exhibit activity during the process. Notably, lithium intercalation exhibits a distinct preference, leading to non-uniform lithiation degrees across different electrode locations. The proposed methodology is not limited to the NMC cathode electrode but can be extended to study realistic dedicated electrodes with high active material (AM) density, facilitating exploration and quantification of heterogeneities and inhomogeneous lithiation within such electrodes. This multi-scale insight into the (de)lithiation process and lithiation heterogeneities within the electrodes is expected to provide valuable knowledge for optimizing electrode design and ultimately enhancing electrode performance in the context of material science and battery materials research.

Submitted 9 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

Comments: 6 figures (SI, 3 figures)

arXiv:2307.08609 [pdf, other]

Overlapping Batch Confidence Intervals on Statistical Functionals Constructed from Time Series: Application to Quantiles, Optimization, and Estimation

Authors: Ziwei Su, Raghu Pasupathy, Yingchieh Yeh, Peter W. Glynn

Abstract: We propose a general purpose confidence interval procedure (CIP) for statistical functionals constructed using data from a stationary time series. The procedures we propose are based on derived distribution-free analogues of the $χ^2$ and Student's $t$ random variables for the statistical functional context, and hence apply in a wide variety of settings including quantile estimation, gradient estimation, M-estimation, CVAR-estimation, and arrival process rate estimation, apart from more traditional statistical settings. Like the method of subsampling, we use overlapping batches of time series data to estimate the underlying variance parameter; unlike subsampling and the bootstrap, however, we assume that the implied point estimator of the statistical functional obeys a central limit theorem (CLT) to help identify the weak asymptotics (called OB-x limits, x=I,II,III) of batched Studentized statistics. The OB-x limits, certain functionals of the Wiener process parameterized by the size of the batches and the extent of their overlap, form the essential machinery for characterizing dependence, and consequently the correctness of the proposed CIPs. The message from extensive numerical experimentation is that in settings where a functional CLT on the point estimator is in effect, using \emph{large overlapping batches} alongside OB-x critical values yields confidence intervals that are often of significantly higher quality than those obtained from more generic methods like subsampling or the bootstrap. We illustrate using examples from CVaR estimation, ARMA parameter estimation, and NHPP rate estimation; R and MATLAB code for OB-x critical values is available at~\texttt{web.ics.purdue.edu/~pasupath/}.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 43 pages, 4 figures

MSC Class: 62F40 (Primary) 60F17; 62M10 (Secondary)

arXiv:2307.08043 [pdf, ps, other]

STAR-RIS Enhanced Joint Physical Layer Security and Covert Communications for Multi-antenna mmWave Systems

Authors: Han Xiao, Xiaoyan Hu, Ang Li, Wenjie Wang, Zhou Su, Kai-Kit Wong, Kun Yang

Abstract: This paper investigates the utilization of simultaneously transmitting and reflecting RIS (STAR-RIS) in supporting joint physical layer security (PLS) and covert communications (CCs) in a multi-antenna millimeter wave (mmWave) system, where the base station (BS) communicates with both covert and security users while defeating eavesdropping by wardens with the help of a STAR-RIS. Specifically, analytical derivations are performed to obtain the closed-form expression of the warden's minimum detection error probability (DEP). Furthermore, the asymptotic result of the minimum DEP and the lower bound of the secure rates are derived, considering the practical assumption that the BS only knows the statistical channel state information (CSI) between the STAR-RIS and the wardens. Subsequently, an optimization problem is formulated with the aim of maximizing the average sum of the covert rate and the minimum secure rate while ensuring the covert requirement and quality of service (QoS) for legal users by jointly optimizing the active and passive beamformers. Due to the strong coupling among variables, an iterative algorithm based on the alternating strategy and the semi-definite relaxation (SDR) method is proposed to solve the non-convex optimization problem. Simulation results indicate that the performance of the proposed STAR-RIS-assisted scheme greatly surpasses that of the conventional RIS scheme, which validates the superiority of STAR-RIS in simultaneously implementing PLS and CCs.

Submitted 16 July, 2023; originally announced July 2023.

Showing 51–100 of 489 results for author: Su, Z