Showing 1–50 of 166 results for author: Du, K
  1. arXiv:2407.01245  [pdf, other]

    cs.AI cs.CY

    SINKT: A Structure-Aware Inductive Knowledge Tracing Model with Large Language Model

    Authors: Lingyue Fu, Hao Guan, Kounianhua Du, Jianghao Lin, Wei Xia, Weinan Zhang, Ruiming Tang, Yasheng Wang, Yong Yu

    Abstract: Knowledge Tracing (KT) aims to determine whether students will respond correctly to the next question, which is a crucial task in intelligent tutoring systems (ITS). In educational KT scenarios, transductive ID-based methods often face severe data sparsity and cold start problems, where interactions between individual students and questions are sparse, and new questions and concepts consistently a…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.18825  [pdf, other]

    cs.IR

    ELCoRec: Enhance Language Understanding with Co-Propagation of Numerical and Categorical Features for Recommendation

    Authors: Jizheng Chen, Kounianhua Du, Jianghao Lin, Bo Chen, Ruiming Tang, Weinan Zhang

    Abstract: Large language models have been flourishing in the natural language processing (NLP) domain, and their potential for recommendation has been paid much attention to. Despite the intelligence shown by the recommendation-oriented finetuned models, LLMs struggle to fully understand the user behavior patterns due to their innate weakness in interpreting numerical features and the overhead for long cont…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.08982  [pdf, other]

    quant-ph

    A Novel Quantum LSTM Network

    Authors: Yifan Zhou, Chong Cheng Xu, Mingi Song, Yew Kee Wong, Kangsong Du

    Abstract: The rapid evolution of artificial intelligence has led to the widespread adoption of Long Short-Term Memory (LSTM) networks, known for their effectiveness in processing sequential data. However, LSTMs are constrained by inherent limitations such as the vanishing gradient problem and substantial computational demands. The advent of quantum computing presents a revolutionary approach to overcoming t…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 13 pages, 7 Figures

  4. arXiv:2406.00012  [pdf, other]

    cs.IR cs.AI

    Extracting Essential and Disentangled Knowledge for Recommendation Enhancement

    Authors: Kounianhua Du, Jizheng Chen, Jianghao Lin, Menghui Zhu, Bo Chen, Shuai Li, Ruiming Tang

    Abstract: Recommender models play a vital role in various industrial scenarios, while often faced with the catastrophic forgetting problem caused by the fast shifting data distribution, e.g., the evolving user interests, click signals fluctuation during sales promotions, etc. To alleviate this problem, a common approach is to reuse knowledge from the historical data. However, preserving the vast and fast-ac…

    Submitted 20 May, 2024; originally announced June 2024.

  5. arXiv:2406.00011  [pdf, other]

    cs.IR cs.AI

    DisCo: Towards Harmonious Disentanglement and Collaboration between Tabular and Semantic Space for Recommendation

    Authors: Kounianhua Du, Jizheng Chen, Jianghao Lin, Yunjia Xi, Hangyu Wang, Xinyi Dai, Bo Chen, Ruiming Tang, Weinan Zhang

    Abstract: Recommender systems play important roles in various applications such as e-commerce, social media, etc. Conventional recommendation methods usually model the collaborative signals within the tabular representation space. Despite the personalization modeling and the efficiency, the latent semantic dependencies are omitted. Methods that introduce semantics into recommendation then emerge, injecting…

    Submitted 4 June, 2024; v1 submitted 20 May, 2024; originally announced June 2024.

  6. arXiv:2405.16444  [pdf, other]

    cs.LG

    CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion

    Authors: Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, Junchen Jiang

    Abstract: Large language models (LLMs) often incorporate multiple text chunks in their inputs to provide the necessary contexts. To speed up the prefill of the long LLM inputs, one can pre-compute the KV cache of a text and re-use the KV cache when the context is reused as the prefix of another LLM input. However, the reused text chunks are not always the input prefix, and when they are not, their precomput…

    Submitted 3 June, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  7. arXiv:2405.12442  [pdf, other]

    cs.IR cs.AI

    Learning Structure and Knowledge Aware Representation with Large Language Models for Concept Recommendation

    Authors: Qingyao Li, Wei Xia, Kounianhua Du, Qiji Zhang, Weinan Zhang, Ruiming Tang, Yong Yu

    Abstract: Concept recommendation aims to suggest the next concept for learners to study based on their knowledge states and the human knowledge system. While knowledge states can be predicted using knowledge tracing models, previous approaches have not effectively integrated the human knowledge system into the process of designing these educational models. In the era of rapidly evolving Large Language Model…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 11 pages, 8 figures

  8. arXiv:2405.06902   

    cs.LG stat.ML

    Causal Inference from Slowly Varying Nonstationary Processes

    Authors: Kang Du, Yu Xiang

    Abstract: Causal inference from observational data following the restricted structural causal models (SCM) framework hinges largely on the asymmetry between cause and effect from the data generating mechanisms, such as non-Gaussianity or non-linearity. This methodology can be adapted to stationary time series, yet inferring causal relationships from nonstationary time series remains a challenging task. In t…

    Submitted 29 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

    Comments: This work was intended as a replacement of arXiv:2012.13025 and any subsequent updates will appear there

  9. arXiv:2405.02355  [pdf, other]

    cs.SE cs.AI

    CodeGRAG: Extracting Composed Syntax Graphs for Retrieval Augmented Cross-Lingual Code Generation

    Authors: Kounianhua Du, Renting Rui, Huacan Chai, Lingyue Fu, Wei Xia, Yasheng Wang, Ruiming Tang, Yong Yu, Weinan Zhang

    Abstract: Utilizing large language models to generate codes has shown promising meaning in software development revolution. Despite the intelligence shown by the general large language models, their specificity in code generation can still be improved due to the syntactic gap and mismatched vocabulary existing among natural language and different programming languages. In addition, programming languages are…

    Submitted 2 May, 2024; originally announced May 2024.

  10. arXiv:2404.18164  [pdf, other]

    math.PR

    Empirical approximation to invariant measures of mean-field Langevin dynamics

    Authors: Wenjing Cao, Kai Du

    Abstract: This paper is concerned with the approximation to invariant measures for Langevin dynamics of McKean--Vlasov type. Under dissipativity and Lipschitz conditions, we prove that the empirical measures of both the mean-field and self-interacting Langevin dynamics converge to the invariant measure in the Wasserstein distance. Numerical experiments are conducted to illustrate theoretical results.

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 21 pages, 1 figure

    MSC Class: 60B10; 37M25; 82C31; 60H10

  11. arXiv:2404.15245  [pdf, other]

    stat.ME cs.LG

    Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification

    Authors: Austin Goddard, Kang Du, Yu Xiang

    Abstract: Making predictions in an unseen environment given data from multiple training environments is a challenging task. We approach this problem from an invariance perspective, focusing on binary classification to shed light on general nonlinear data generation mechanisms. We identify a unique form of invariance that exists solely in a binary setting that allows us to train models invariant over environ…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted to the 2024 International Symposium on Information Theory (ISIT)

  12. arXiv:2404.07490  [pdf, other]

    cond-mat.str-el

    Low-energy spin dynamics in a Kitaev material Na3Ni2BiO6 investigated by NMR

    Authors: Xinyu Shi, Yi Cui, Yanyan Shangguan, Xiaoyu Xu, Zhanlong Wu, Ze Hu, Shuo Li, Kefan Du, Ying Chen, Long Ma, Zhengxin Liu, Jinsheng Wen, Jinshan Zhang, Weiqiang Yu

    Abstract: We performed 23Na NMR and magnetization measurements on an S = 1, quasi-2D honeycomb lattice antiferromagnet Na3Ni2BiO6. A large positive Curie-Weiss constant of 22.9 K is observed. The NMR spectra at low fields are consistent with a "zigzag" magnetic order, indicating a large easy-axis anisotropy. With field applied along the c* axis, the NMR spectra confirm the existence of a 1/3-magnetization p…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 7 pages, 7 figures

  13. arXiv:2404.04633  [pdf, other]

    cs.CL

    Context versus Prior Knowledge in Language Models

    Authors: Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell

    Abstract: To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to…

    Submitted 16 June, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: Long paper accepted at ACL 2024

  14. arXiv:2404.02547  [pdf, ps, other]

    math.PR math.AP

    Well-posedness of the obstacle problem for stochastic nonlinear diffusion equations: an entropy formulation

    Authors: Kai Du, Ruoyang Liu

    Abstract: In this paper, we establish the existence, uniqueness and stability results for the obstacle problem associated with a degenerate nonlinear diffusion equation perturbed by conservative gradient noise. Our approach revolves round introducing a new entropy formulation for stochastic variational inequalities. As a consequence, we obtain a novel well-posedness result for the obstacle problem of determ…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 36 pages

    MSC Class: 60H15; 35K86; 35K65; 47J20

  15. arXiv:2404.00633  [pdf, other]

    cs.CV

    IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions

    Authors: Zhijun Tu, Kunpeng Du, Hanting Chen, Hailing Wang, Wei Li, Jie Hu, Yunhe Wang

    Abstract: Recent advances have demonstrated the powerful capability of transformer architecture in image restoration. However, our analysis indicates that existing transformer-based methods cannot establish both exact global and local dependencies simultaneously, which are critical for restoring the details and missing content of degraded images. To this end, we present an efficient image processing trans…

    Submitted 31 March, 2024; originally announced April 2024.

  16. arXiv:2403.17555  [pdf, ps, other]

    math.PR

    Particle approximation for a conditional McKean--Vlasov stochastic differential equation

    Authors: Kai Du, Yunzhang Li, Yuyang Ye

    Abstract: In this paper, we construct a type of interacting particle systems to approximate a class of stochastic differential equations whose coefficients depend on the conditional probability distributions of the processes given partial observations. After proving the well-posedness and regularity of the particle systems, we establish a quantitative convergence result for the empirical measures of the partic…

    Submitted 26 March, 2024; originally announced March 2024.

  17. arXiv:2403.16520  [pdf, other]

    cs.CV

    CMViM: Contrastive Masked Vim Autoencoder for 3D Multi-modal Representation Learning for AD classification

    Authors: Guangqian Yang, Kangrui Du, Zhihan Yang, Ye Du, Yongping Zheng, Shujun Wang

    Abstract: Alzheimer's disease (AD) is an incurable neurodegenerative condition leading to cognitive and functional deterioration. Given the lack of a cure, prompt and precise AD diagnosis is vital, a complex process dependent on multiple factors and multi-modal data. While successful efforts have been made to integrate multi-modal representation learning into medical datasets, scant attention has been given…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 11 pages, 1 figure

  18. arXiv:2403.12559  [pdf, other]

    cs.CV cs.LG

    Confidence Self-Calibration for Multi-Label Class-Incremental Learning

    Authors: Kaile Du, Yifan Zhou, Fan Lyu, Yuyang Li, Chen Lu, Guangcan Liu

    Abstract: The partial label challenge in Multi-Label Class-Incremental Learning (MLCIL) arises when only the new classes are labeled during training, while past and future labels remain unavailable. This issue leads to a proliferation of false-positive errors due to erroneously high confidence multi-label predictions, exacerbating catastrophic forgetting within the disjoint label space. In this paper, we ai…

    Submitted 19 March, 2024; originally announced March 2024.

  19. arXiv:2403.11434  [pdf, other]

    cs.NI cs.DC

    Earth+: on-board satellite imagery compression leveraging historical earth observations

    Authors: Kuntai Du, Yihua Cheng, Peder Olsen, Shadi Noghabi, Ranveer Chandra, Junchen Jiang

    Abstract: With the increasing deployment of earth observation satellite constellations, the downlink (satellite-to-ground) capacity often limits the freshness, quality, and coverage of the imagery data available to applications on the ground. To overcome the downlink limitation, we present Earth+, a new satellite imagery compression system that, instead of compressing each image individually, pinpoints and…

    Submitted 17 March, 2024; originally announced March 2024.

  20. arXiv:2403.01714  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Molecular intercalation in the van der Waals antiferromagnets FePS3 and NiPS3

    Authors: Cong Li, Ze Hu, Xiaofei Hou, Sheng Xu, Zhanlong Wu, Kefan Du, Shuo Li, Xiaoyu Xu, Ying Chen, Zeyu Wang, Tiancheng Mu, Tian-Long Xia, Yanfeng Guo, B. Normand, Weiqiang Yu, Yi Cui

    Abstract: We have performed electrochemical treatment of the van der Waals antiferromagnetic materials FePS$_3$ and NiPS$_3$ with the ionic liquid EMIM-BF$_4$, achieving significant molecular intercalation. Mass analysis of the intercalated compounds, EMIM$_x$-FePS$_3$ and EMIM$_x$-NiPS$_3$, indicated respective intercalation levels, $x$, of approximately 27\% and 37\%, and X-ray diffraction measurements de…

    Submitted 3 March, 2024; originally announced March 2024.

    Journal ref: Physical Review B 109, 184407(2024)

  21. arXiv:2402.11492  [pdf, other]

    eess.SY

    Exponential Cluster Synchronization in Fast Switching Network Topologies: A Pinning Control Approach with Necessary and Sufficient Conditions

    Authors: Ku Du, Yu Kang

    Abstract: This research investigates the synchronization problem among multiple agents operating within a dynamic fast switching network topology. We concentrate on cluster synchronization within a coupled linear system under pinning control, providing both necessary and sufficient conditions. As a pivotal aspect, this paper aims to present the weakest possible conditions to make the coup…

    Submitted 18 February, 2024; originally announced February 2024.

  22. arXiv:2402.08182  [pdf, other]

    cs.LG stat.ML

    Variational Continual Test-Time Adaptation

    Authors: Fan Lyu, Kaile Du, Yuyang Li, Hanyu Zhao, Zhang Zhang, Guangcan Liu, Liang Wang

    Abstract: The prior drift is crucial in Continual Test-Time Adaptation (CTTA) methods that only use unlabeled test data, as it can cause significant error propagation. In this paper, we introduce VCoTTA, a variational Bayesian approach to measure uncertainties in CTTA. At the source stage, we transform a pre-trained deterministic model into a Bayesian Neural Network (BNN) via a variational warm-up strategy,…

    Submitted 12 February, 2024; originally announced February 2024.

  23. arXiv:2402.06884  [pdf, other]

    stat.ML cs.LG

    Low-Rank Approximation of Structural Redundancy for Self-Supervised Learning

    Authors: Kang Du, Yu Xiang

    Abstract: We study the data-generating mechanism for reconstructive SSL to shed light on its effectiveness. With an infinite amount of labeled samples, we provide a sufficient and necessary condition for perfect linear approximation. The condition reveals a full-rank component that preserves the label classes of Y, along with a redundant component. Motivated by the condition, we propose to approximate the r…

    Submitted 27 May, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted to the 3rd Conference on Causal Learning and Reasoning (CLeaR)

  24. arXiv:2402.02547  [pdf]

    cs.AI cs.CL

    Integration of cognitive tasks into artificial general intelligence test for large models

    Authors: Youzhi Qu, Chen Wei, Penghui Du, Wenxin Che, Chi Zhang, Wanli Ouyang, Yatao Bian, Feiyang Xu, Bin Hu, Kai Du, Haiyan Wu, Jia Liu, Quanying Liu

    Abstract: During the evolution of large models, performance evaluation is necessarily performed to assess their capabilities and ensure safety before practical application. However, current model evaluations mainly rely on specific tasks and datasets, lacking a united framework for assessing the multidimensional intelligence of large models. In this perspective, we advocate for a comprehensive framework of…

    Submitted 5 March, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  25. Eloquent: A More Robust Transmission Scheme for LLM Token Streaming

    Authors: Hanchen Li, Yuhan Liu, Yihua Cheng, Siddhant Ray, Kuntai Du, Junchen Jiang

    Abstract: To render each generated token in real-time for users, the Large Language Model (LLM) server generates tokens one by one and streams each token (or group of a few tokens) through the network to the user right after generation, which we refer to as LLM token streaming. However, under unstable network conditions, the LLM token streaming experience could suffer greatly from stalls since one packet lo…

    Submitted 16 June, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: In SIGCOMM Workshop on Networks for AI Computing (NAIC '24)

  26. arXiv:2401.11788  [pdf, other]

    math.NA

    Obtaining the pseudoinverse solution of singular range-symmetric linear systems with GMRES-type methods

    Authors: Kui Du, Jia-Jun Fan, Fang Wang

    Abstract: It is well known that for singular inconsistent range-symmetric linear systems, the generalized minimal residual (GMRES) method determines a least squares solution without breakdown. The reached least squares solution may or may not be the pseudoinverse solution. We show that a lift strategy can be used to obtain the pseudoinverse solution. In addition, we propose a new iterative method named RSMAR…

    Submitted 22 January, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 22 pages, 4 figures

    MSC Class: 15A06; 15A09; 65F10; 65F25; 65F50

  27. arXiv:2401.08221  [pdf, other]

    cs.LG cs.AI

    Towards Causal Relationship in Indefinite Data: Baseline Model and New Datasets

    Authors: Hang Chen, Xinyu Yang, Keqing Du

    Abstract: Integrating deep learning and causal discovery has encouraged us to spot that learning causal structures and representations in dialogue and video is full of challenges. We define these data forms as "Indefinite Data", characterized by multi-structure data and multi-value representations. Unlike existing adaptable data forms, Indefinite Data still faces gaps in datasets and methods. To address th…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: If you are interested in the two new datasets, pls contact us by email

  28. arXiv:2401.02608  [pdf, other]

    math.NA

    GPBiLQ and GPQMR: Two iterative methods for unsymmetric partitioned linear systems

    Authors: Kui Du, Jia-Jun Fan, Fang Wang

    Abstract: We introduce two iterative methods, GPBiLQ and GPQMR, for solving unsymmetric partitioned linear systems. The basic mechanism underlying GPBiLQ and GPQMR is a novel simultaneous tridiagonalization via biorthogonality that allows for short-recurrence iterative schemes. Similar to the biconjugate gradient method, it is possible to develop another method, GPBiCG, whose iterate (if it exists) can be o…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: 22 pages, 4 figures

    MSC Class: 15A06; 65F10; 65F25; 65F50

  29. arXiv:2401.01490  [pdf]

    physics.optics physics.app-ph

    Chirality tuning and reversing with resonant phase-change metasurfaces

    Authors: Xinbo Sha, Kang Du, Yixuan Zeng, Fangxing Lai, Jun Yin, Hanxu Zhang, Bo Song, Jiecai Han, Shumin Xiao, Yuri Kivshar, Qinghai Song

    Abstract: Dynamic control of circular dichroism in photonic structures is critically important for compact spectrometers, stereoscopic displays, and information processing exploiting multiple degrees of freedom. Metasurfaces can help miniaturize chiral devices but only produce static and limited chiral responses. While external stimuli are able to tune resonances, their modulations are often weak, and rever…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 14 pages, 4 figures

  30. arXiv:2312.07081  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci

    Giant X-ray circular dichroism in a time-reversal invariant altermagnet

    Authors: Jun Okamoto, Ru-Pan Wang, Yen-Yi Chu, Hung-Wei Shiu, Amol Singh, Hsiao-Yu Huang, Chung-Yu Mou, Sucitto Teh, Horng-Tay Jeng, Kai Du, Xianghan Xu, Sang-Wook Cheong, Chao-Hung Du, Chien-Te Chen, Atsushi Fujimori, Di-Jing Huang

    Abstract: X-ray circular dichroism, arising from the contrast in X-ray absorption between opposite photon helicities, serves as a spectroscopic tool to measure the magnetization of ferromagnetic materials and identify the handedness of chiral crystals. Antiferromagnets with crystallographic chirality typically lack X-ray magnetic circular dichroism because of time-reversal symmetry, yet exhibit weak X-ray n…

    Submitted 23 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted by Advanced Materials (2024.2.16) Revised title: Giant X-ray circular dichroism in a time-reversal invariant altermagnet Revised drafts: Main 14 pages, 4 figures, and SI 20 pages, 8 figures

  31. arXiv:2311.18567  [pdf, other]

    cs.CL

    Grammatical Gender's Influence on Distributional Semantics: A Causal Perspective

    Authors: Karolina Stańczak, Kevin Du, Adina Williams, Isabelle Augenstein, Ryan Cotterell

    Abstract: How much meaning influences gender assignment across languages is an active area of research in modern linguistics and cognitive science. We can view current approaches as aiming to determine where gender assignment falls on a spectrum, from being fully arbitrarily determined to being largely semantically determined. For the latter case, there is a formulation of the neo-Whorfian hypothesis, which…

    Submitted 30 November, 2023; originally announced November 2023.

  32. arXiv:2311.12401   

    cs.CV cs.MM

    CASR: Refining Action Segmentation via Marginalizing Frame-levle Causal Relationships

    Authors: Keqing Du, Xinyu Yang, Hang Chen

    Abstract: Integrating deep learning and causal discovery has increased the interpretability of Temporal Action Segmentation (TAS) tasks. However, frame-level causal relationships exist many complicated noises outside the segment-level, making it infeasible to directly express macro action semantics. Thus, we propose Causal Abstraction Segmentation Refiner (CASR), which can refine TAS results from various mo…

    Submitted 26 January, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: We found that the paper needs to be modified in the model and all experiments must be re-run, so we request to withdraw the current version

  33. arXiv:2311.12200  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci

    Hydrogen-induced tunable remanent polarization in a perovskite nickelate

    Authors: Yifan Yuan, Michele Kotiuga, Tae Joon Park, Yuanyuan Ni, Arnob Saha, Hua Zhou, Jerzy T. Sadowski, Abdullah Al-Mahboob, Haoming Yu, Kai Du, Minning Zhu, Sunbin Deng, Ravindra S. Bisht, Xiao Lyu, Chung-Tse Michael Wu, Peide D. Ye, Abhronil Sengupta, Sang-Wook Cheong, Xiaoshan Xu, Karin M. Rabe, Shriram Ramanathan

    Abstract: Materials with field-tunable polarization are of broad interest to condensed matter sciences and solid-state device technologies. Here, using hydrogen (H) donor doping, we modify the room temperature metallic phase of a perovskite nickelate NdNiO3 into an insulating phase with both metastable dipolar polarization and space-charge polarization. We then demonstrate transient negative differential ca…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 13 pages, 5 figures

  34. arXiv:2311.11428  [pdf, other]

    math.PR

    Self-interacting approximation to McKean-Vlasov long-time limit: a Markov chain Monte Carlo method

    Authors: Kai Du, Zhenjie Ren, Florin Suciu, Songbo Wang

    Abstract: For a certain class of McKean--Vlasov processes, we introduce proxy processes that substitute the mean-field interaction with self-interaction, employing a weighted occupation measure. Our study encompasses two key achievements. First, we demonstrate the ergodicity of the self-interacting dynamics, under broad conditions, by applying the reflection coupling method. Second, in scenarios where the d…

    Submitted 14 January, 2024; v1 submitted 19 November, 2023; originally announced November 2023.

    Comments: 42 pages, 1 figure; contains a minor correction

  35. arXiv:2311.07240  [pdf, other]

    astro-ph.GA

    The \ion{H}{I}-rich Ultra-diffuse Galaxies follow the Extended Schmidt Law

    Authors: Sai Zhai, Yong Shi, Zhi-Yu Zhang, Jun-Zhi Wang, Yu Gao, Qiusheng Gu, Tao Wang, Kaiyi Du, Xiaoling Yu, Xin Li

    Abstract: The \ion{H}{I}-rich ultra-diffuse galaxies (HUDGs) offer a unique case for studies of star formation laws (SFLs) as they host low star formation efficiency (SFE) and low-metallicity environments where gas is predominantly atomic. We collect a sample of six HUDGs in the field and investigate their location in the extended Schmidt law(…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: 6 pages, 4 figures, accepted for publication in MNRAS

  36. arXiv:2311.00923  [pdf, other]

    cs.LG stat.ME

    A Review and Roadmap of Deep Causal Model from Different Causal Structures and Representations

    Authors: Hang Chen, Keqing Du, Chenguang Li, Xinyu Yang

    Abstract: The fusion of causal models with deep learning introducing increasingly intricate data sets, such as the causal associations within images or between textual components, has surfaced as a focal research area. Nonetheless, the broadening of original causal concepts and theories to such complex, non-statistical data has been met with serious challenges. In response, our study proposes redefinitions…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: under review

  37. arXiv:2310.19302  [pdf, other]

    math.PR

    Empirical approximation to invariant measures of non-degenerate McKean-Vlasov dynamics

    Authors: Wenjing Cao, Kai Du

    Abstract: This paper studies the approximation of invariant measures of McKean-Vlasov dynamics with non-degenerate additive noise. While prior findings necessitated a strong monotonicity condition on the McKean-Vlasov process, we expand these results to encompass dissipative and weak interaction scenarios. Utilizing a reflection coupling technique, we prove that the empirical measures of the McKean-Vlasov p…

    Submitted 23 January, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: 21 pages, 1 figure; typos corrected, email address updated

    MSC Class: 60B10; 37M25; 60F25; 60H10

  38. arXiv:2310.18634  [pdf, other]

    cs.LG

    SSL Framework for Causal Inconsistency between Structures and Representations

    Authors: Hang Chen, Xinyu Yang, Keqing Du

    Abstract: The cross-pollination of deep learning and causal discovery has catalyzed a burgeoning field of research seeking to elucidate causal relationships within non-statistical data forms like images, videos, and text. Such data, often being named `indefinite data', exhibit unique challenges-inconsistency between causal structure and representation, which are not common in conventional data forms. To tac…

    Submitted 28 October, 2023; originally announced October 2023.

  39. arXiv:2310.07240  [pdf, other]

    cs.NI cs.LG

    CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving

    Authors: Yuhan Liu, Hanchen Li, Yihua Cheng, Siddhant Ray, Yuyang Huang, Qizheng Zhang, Kuntai Du, Jiayi Yao, Shan Lu, Ganesh Ananthanarayanan, Michael Maire, Henry Hoffmann, Ari Holtzman, Junchen Jiang

    Abstract: As large language models (LLMs) take on complex tasks, their inputs are supplemented with longer contexts that incorporate domain knowledge or user-specific information. Yet using long contexts poses a challenge for responsive LLM systems, as nothing can be generated until the whole context is processed by the LLM. CacheGen is a fast context-loading module for LLM systems. First, CacheGen uses…

    Submitted 30 April, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  40. arXiv:2310.04685  [pdf, other]

    cs.SE cs.AI cs.NI

    Automatic and Efficient Customization of Neural Networks for ML Applications

    Authors: Yuhan Liu, Chengcheng Wan, Kuntai Du, Henry Hoffmann, Junchen Jiang, Shan Lu, Michael Maire

    Abstract: ML APIs have greatly relieved application developers of the burden to design and train their own neural network models -- classifying objects in an image can now be as simple as one line of Python code to call an API. However, these APIs offer the same pre-trained models regardless of how their output is used by different applications. This can be suboptimal as not all ML inference errors can caus…

    Submitted 7 October, 2023; originally announced October 2023.

  41. Spin-Mediated Direct Photon Scattering by Plasmons in BiTeI

    Authors: A. C. Lee, S. Sarkar, K. Du, H. -H. Kung, C. J. Won, K. Wang, S. -W. Cheong, S. Maiti, G. Blumberg

    Abstract: We use polarization resolved Raman spectroscopy to demonstrate that for a 3D giant Rashba system the bulk plasmon collective mode can directly couple to the Raman response even in the long wavelength $\mathbf q \rightarrow 0$ limit. Although conventional theory predicts the plasmon spectral weight to be suppressed as the square of its quasi-momentum and thus negligibly weak in the Raman spectra, w…

    Submitted 18 February, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: Editors' Suggestion

    Journal ref: Phys. Rev. B 109, L041111 (2024)

  42. arXiv:2310.02422

    cs.LG cs.AI cs.DC cs.MM cs.NI

    OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation

    Authors: Kuntai Du, Yuhan Liu, Yitian Hao, Qizheng Zhang, Haodong Wang, Yuyang Huang, Ganesh Ananthanarayanan, Junchen Jiang

    Abstract: Deep learning inference on streaming media data, such as object detection in video or LiDAR feeds and text extraction from audio waves, is now ubiquitous. To achieve high inference accuracy, these applications typically require significant network bandwidth to gather high-fidelity data and extensive GPU resources to run deep neural networks (DNNs). While the high demand for network bandwidth and G…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: SoCC '23

  43. arXiv:2309.06118

    cs.CV

    CHITNet: A Complementary to Harmonious Information Transfer Network for Infrared and Visible Image Fusion

    Authors: Yafei Zhang, Keying Du, Huafeng Li, Zhengtao Yu, Yu Liu

    Abstract: Current infrared and visible image fusion (IVIF) methods go to great lengths to excavate complementary features and design complex fusion strategies, which is extremely challenging. To this end, we rethink the IVIF outside the box, proposing a complementary to harmonious information transfer network (CHITNet). It reasonably transfers complementary information into harmonious one, which integrates…

    Submitted 25 October, 2023; v1 submitted 12 September, 2023; originally announced September 2023.

  44. arXiv:2309.05793

    cs.CV cs.AI

    PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models

    Authors: Li Chen, Mengyi Zhao, Yiheng Liu, Mingxu Ding, Yangyang Song, Shizun Wang, Xu Wang, Hao Yang, Jing Liu, Kang Du, Min Zheng

    Abstract: Personalized text-to-image generation has emerged as a powerful and sought-after tool, empowering users to create customized images based on their specific concepts and prompts. However, existing approaches to personalization encounter multiple challenges, including long tuning times, large storage requirements, the necessity for multiple input images per identity, and limitations in preserving id…

    Submitted 11 September, 2023; originally announced September 2023.

  45. arXiv:2309.01940

    cs.CL cs.AI

    CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

    Authors: Lingyue Fu, Huacan Chai, Shuang Luo, Kounianhua Du, Weiming Zhang, Longteng Fan, Jiayi Lei, Renting Rui, Jianghao Lin, Yuchen Fang, Yifan Liu, Jingkuan Wang, Siyuan Qi, Kangning Zhang, Weinan Zhang, Yong Yu

    Abstract: With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers. Evaluating the programming capabilities of LLMs is crucial as it reflects the multifaceted abilities of LLMs, and it has numerous downstream applications. In this paper, we propose CodeApex, a bilingual benchmark data…

    Submitted 11 March, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: 33 pages

  46. arXiv:2308.11131

    cs.IR cs.AI

    ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation

    Authors: Jianghao Lin, Rong Shan, Chenxu Zhu, Kounianhua Du, Bo Chen, Shigang Quan, Ruiming Tang, Yong Yu, Weinan Zhang

    Abstract: With large language models (LLMs) achieving remarkable breakthroughs in natural language processing (NLP) domains, LLM-enhanced recommender systems have received much attention and have been actively explored currently. In this paper, we focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks. First and foremost, we identify and formulate the li…

    Submitted 26 June, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted by WWW 2024. Full and More Readable Version

  47. arXiv:2308.03610

    cs.CV

    AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose

    Authors: Huichao Zhang, Bowen Chen, Hao Yang, Liao Qu, Xu Wang, Li Chen, Chao Long, Feida Zhu, Kang Du, Min Zheng

    Abstract: Creating expressive, diverse and high-quality 3D avatars from highly customized text descriptions and pose guidance is a challenging task, due to the intricacy of modeling and texturing in 3D that ensure details and various styles (realistic, fictional, etc). We present AvatarVerse, a stable pipeline for generating expressive high-quality 3D avatars from nothing but text descriptions and pose guid…

    Submitted 7 August, 2023; originally announced August 2023.

  48. arXiv:2308.00391

    cs.LG cs.AI

    Counterfactual Graph Transformer for Traffic Flow Prediction

    Authors: Ying Yang, Kai Du, Xingyuan Dai, Jianwu Fang

    Abstract: Traffic flow prediction (TFP) is a fundamental problem of the Intelligent Transportation System (ITS), as it models the latent spatial-temporal dependency of traffic flow for potential congestion prediction. Recent graph-based models with multiple kinds of attention mechanisms have achieved promising performance. However, existing methods for traffic flow prediction tend to inherit the bias patter…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted by ITSC 2023

  49. arXiv:2307.16460

    math.NA

    On Krylov subspace methods for skew-symmetric and shifted skew-symmetric linear systems

    Authors: Kui Du, Jia-Jun Fan, Xiao-Hui Sun, Fang Wang, Ya-Lan Zhang

    Abstract: Krylov subspace methods for solving linear systems of equations involving skew-symmetric matrices have gained recent attention. Numerical equivalences among Krylov subspace methods for nonsingular skew-symmetric linear systems have been given in Greif et al. [SIAM J. Matrix Anal. Appl., 37 (2016), pp. 1071--1087]. In this work, we extend the results of Greif et al. to singular skew-symmetric linea…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: 23 pages, 3 figures

  50. arXiv:2307.11956

    cond-mat.str-el

    Light-induced electronic polarization in antiferromagnetic Cr2O3

    Authors: Xinshu Zhang, Tyler Carbin, Adrian B. Culver, Kai Du, Kefeng Wang, Sang-Wook Cheong, Rahul Roy, Anshul Kogar

    Abstract: In a solid, the electronic subsystem can exhibit incipient order with lower point group symmetry than the crystal lattice. External fields that couple to electronic order parameters have rarely been investigated, however, despite their potential importance to inducing exotic effects. Here, we show that when inversion symmetry is broken by the antiferromagnetic (AFM) order in Cr2O3, transmitting a…

    Submitted 9 January, 2024; v1 submitted 21 July, 2023; originally announced July 2023.