Showing 1–50 of 201 results for author: Deng, L

Searching in archive cs.
  1. arXiv:2405.08707  [pdf, other]

    cs.LG

    Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory

    Authors: Xueyan Niu, Bo Bai, Lei Deng, Wei Han

    Abstract: Increasing the size of a Transformer model does not always lead to enhanced performance. This phenomenon cannot be explained by the empirical scaling laws. Furthermore, improved generalization ability occurs as the model memorizes the training samples. We present a theoretical framework that sheds light on the memorization process and performance dynamics of transformer-based language models. We m…

    Submitted 14 May, 2024; originally announced May 2024.

  2. arXiv:2405.07919  [pdf, other]

    cs.CV

    Exploring the Low-Pass Filtering Behavior in Image Super-Resolution

    Authors: Haoyu Deng, Zijing Xu, Yule Duan, Xiao Wu, Wenjie Shu, Liang-Jian Deng

    Abstract: Deep neural networks for image super-resolution have shown significant advantages over traditional approaches like interpolation. However, they are often criticized as 'black boxes' compared to traditional approaches which have solid mathematical foundations. In this paper, we attempt to interpret the behavior of deep neural networks using theories from signal processing. We first report…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  3. Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents

    Authors: Yanfei Dong, Lambert Deng, Jiazheng Zhang, Xiaodong Yu, Ting Lin, Francesco Gelli, Soujanya Poria, Wee Sun Lee

    Abstract: Documents that consist of diverse templates and exhibit complex spatial structures pose a challenge for document entity classification. We propose KNN-former, which incorporates a new kind of spatial bias in attention calculation based on the K-nearest-neighbor (KNN) graph of document entities. We limit entities' attention only to their local radius defined by the KNN graph. We also use combinator…

    Submitted 8 May, 2024; originally announced May 2024.

  4. arXiv:2405.04929  [pdf, ps, other]

    cs.IR

    Enabling Roll-up and Drill-down Operations in News Exploration with Knowledge Graphs for Due Diligence and Risk Management

    Authors: Sha Wang, Yuchen Li, Hanhua Xiao, Zhifeng Bao, Lambert Deng, Yanfei Dong

    Abstract: Efficient news exploration is crucial in real-world applications, particularly within the financial sector, where numerous control and risk assessment tasks rely on the analysis of public news reports. The current processes in this domain predominantly rely on manual efforts, often involving keyword-based searches and the compilation of extensive keyword lists. In this paper, we introduce NCEXPLORE…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: The paper was accepted by ICDE 2024

  5. arXiv:2404.19652  [pdf, other]

    cs.CV cs.AI

    VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization

    Authors: Yuliang Liu, Mingxin Huang, Hao Yan, Linger Deng, Weijia Wu, Hao Lu, Chunhua Shen, Lianwen Jin, Xiang Bai

    Abstract: Text spotting, a task involving the extraction of textual information from image or video sequences, faces challenges in cross-domain adaption, such as image-to-image and image-to-video generalization. In this paper, we introduce a new method, termed VimTS, which enhances the generalization ability of the model by achieving better synergy among different tasks. Typically, we propose a Prompt Queri…

    Submitted 14 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

  6. arXiv:2404.16174  [pdf, other]

    cs.HC cs.CV cs.LG

    MiMICRI: Towards Domain-centered Counterfactual Explanations of Cardiovascular Image Classification Models

    Authors: Grace Guo, Lifu Deng, Animesh Tandon, Alex Endert, Bum Chul Kwon

    Abstract: The recent prevalence of publicly accessible, large medical imaging datasets has led to a proliferation of artificial intelligence (AI) models for cardiovascular image classification and analysis. At the same time, the potentially significant impacts of these models have motivated the development of a range of explainable AI (XAI) methods that aim to explain model predictions given certain image i…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 14 pages, 6 figures, ACM FAccT 2024

  7. arXiv:2404.15174  [pdf, other]

    cs.CV

    Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion

    Authors: Yu-Jie Liang, Zihan Cao, Liang-Jian Deng, Xiao Wu

    Abstract: Recently, implicit neural representations (INR) have made significant strides in various vision-related domains, providing a novel solution for Multispectral and Hyperspectral Image Fusion (MHIF) tasks. However, INR is prone to losing high-frequency information and is confined to the lack of global perceptual capabilities. To address these issues, this paper introduces a Fourier-enhanced Implicit…

    Submitted 23 April, 2024; originally announced April 2024.

  8. arXiv:2404.11537  [pdf, other]

    cs.CV eess.IV

    SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening

    Authors: Yu Zhong, Xiao Wu, Liang-Jian Deng, Zihan Cao

    Abstract: Pansharpening is a significant image fusion technique that merges the spatial content and spectral characteristics of remote sensing images to generate high-resolution multispectral images. Recently, denoising diffusion probabilistic models have been gradually applied to visual tasks, enhancing controllable image generation through low-rank adaptation (LoRA). In this paper, we introduce a spatial-…

    Submitted 17 April, 2024; originally announced April 2024.

  9. arXiv:2404.11416  [pdf, other]

    cs.CV

    Neural Shrödinger Bridge Matching for Pansharpening

    Authors: Zihan Cao, Xiao Wu, Liang-Jian Deng

    Abstract: Recent diffusion probabilistic models (DPM) in the field of pansharpening have been gradually gaining attention and have achieved state-of-the-art (SOTA) performance. In this paper, we identify shortcomings in directly applying DPMs to the task of pansharpening as an inverse problem: 1) initiating sampling directly from Gaussian noise neglects the low-resolution multispectral image (LRMS) as a pri…

    Submitted 17 April, 2024; originally announced April 2024.

  10. arXiv:2404.10004  [pdf]

    cs.LG physics.soc-ph stat.AP

    A Strategy Transfer and Decision Support Approach for Epidemic Control in Experience Shortage Scenarios

    Authors: X. Xiao, P. Chen, X. Cao, K. Liu, L. Deng, D. Zhao, Z. Chen, Q. Deng, F. Yu, H. Zhang

    Abstract: Epidemic outbreaks can cause critical health concerns and severe global economic crises. For countries or regions with new infectious disease outbreaks, it is essential to generate preventive strategies by learning lessons from others with similar risk profiles. A Strategy Transfer and Decision Support Approach (STDSA) is proposed based on the profile similarity evaluation. There are four steps in…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 20 pages, 9 figures

  11. arXiv:2404.09293  [pdf, other]

    cs.CV

    A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion

    Authors: Zihan Cao, Xiao Wu, Liang-Jian Deng, Yu Zhong

    Abstract: In image fusion tasks, images from different sources possess distinct characteristics. This has driven the development of numerous methods to explore better ways of fusing them while preserving their respective characteristics. Mamba, as a state space model, has emerged in the field of natural language processing. Recently, many studies have attempted to extend Mamba to vision tasks. However, due…

    Submitted 14 April, 2024; originally announced April 2024.

  12. arXiv:2404.07932  [pdf, other]

    cs.CV eess.IV

    FusionMamba: Efficient Image Fusion with State Space Model

    Authors: Siran Peng, Xiangyu Zhu, Haoyu Deng, Zhen Lei, Liang-Jian Deng

    Abstract: Image fusion aims to generate a high-resolution multi/hyper-spectral image by combining a high-resolution image with limited spectral information and a low-resolution image with abundant spectral data. Current deep learning (DL)-based methods for image fusion primarily rely on CNNs or Transformers to extract features and merge different types of data. While CNNs are efficient, their receptive fiel…

    Submitted 10 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  13. arXiv:2404.07543  [pdf, other]

    cs.CV eess.IV

    Content-Adaptive Non-Local Convolution for Remote Sensing Pansharpening

    Authors: Yule Duan, Xiao Wu, Haoyu Deng, Liang-Jian Deng

    Abstract: Currently, machine learning-based methods for remote sensing pansharpening have progressed rapidly. However, existing pansharpening methods often do not fully exploit differentiating regional information in non-local spaces, thereby limiting the effectiveness of the methods and resulting in redundant learning parameters. In this paper, we introduce a so-called content-adaptive non-local convolutio…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  14. arXiv:2404.01121  [pdf, other]

    cs.CV eess.IV

    CMT: Cross Modulation Transformer with Hybrid Loss for Pansharpening

    Authors: Wen-Jie Shu, Hong-Xia Dou, Rui Wen, Xiao Wu, Liang-Jian Deng

    Abstract: Pansharpening aims to enhance remote sensing image (RSI) quality by merging high-resolution panchromatic (PAN) with multispectral (MS) images. However, prior techniques struggled to optimally fuse PAN and MS images for enhanced spatial and spectral information, due to a lack of a systematic framework capable of effectively coordinating their individual strengths. In response, we present the Cross…

    Submitted 1 April, 2024; originally announced April 2024.

  15. arXiv:2403.17040  [pdf]

    cs.AI cs.LG cs.NE

    Enhancing Graph Representation Learning with Attention-Driven Spiking Neural Networks

    Authors: Huifeng Yin, Mingkun Xu, Jing Pei, Lei Deng

    Abstract: Graph representation learning has become a crucial task in machine learning and data mining due to its potential for modeling complex structures such as social networks, chemical compounds, and biological systems. Spiking neural networks (SNNs) have recently emerged as a promising alternative to traditional neural networks for graph learning tasks, benefiting from their ability to efficiently enco…

    Submitted 25 March, 2024; originally announced March 2024.

  16. arXiv:2403.16674  [pdf, other]

    cs.NE cs.AI cs.LG

    Understanding the Functional Roles of Modelling Components in Spiking Neural Networks

    Authors: Huifeng Yin, Hanle Zheng, Jiayi Mao, Siyuan Ding, Xing Liu, Mingkun Xu, Yifan Hu, Jing Pei, Lei Deng

    Abstract: Spiking neural networks (SNNs), inspired by the neural circuits of the brain, are promising in achieving high computational efficiency with biological fidelity. Nevertheless, it is quite difficult to optimize SNNs because the functional roles of their modelling components remain unclear. By designing and evaluating several variants of the classic model, we systematically investigate the functional…

    Submitted 25 March, 2024; originally announced March 2024.

  17. arXiv:2403.05818  [pdf]

    cs.LG q-bio.QM

    PR-NET: Leveraging Pathway Refined Network Structures for Prostate Cancer Patient Condition Prediction

    Authors: R. Li, J. Liu, X. L. Deng, X. Liu, J. C. Guo, W. Y. Wu, L. Yang

    Abstract: The diagnosis and monitoring of Castrate Resistant Prostate Cancer (CRPC) are crucial for cancer patients, but the current models (such as P-NET) have limitations in terms of parameter count, generalization, and cost. To address the issue, we develop a more accurate and efficient Prostate Cancer patient condition prediction model, named PR-NET. By compressing and optimizing the network structure o…

    Submitted 12 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  18. arXiv:2402.16810  [pdf]

    cs.CL

    OncoGPT: A Medical Conversational Model Tailored with Oncology Domain Expertise on a Large Language Model Meta-AI (LLaMA)

    Authors: Fujian Jia, Xin Liu, Lixi Deng, Jiwen Gu, Chunchao Pu, Tunan Bai, Mengjiang Huang, Yuanzhi Lu, Kang Liu

    Abstract: In the past year, there has been a growing trend in applying Large Language Models (LLMs) to the field of medicine, particularly with the advent of advanced language models such as ChatGPT developed by OpenAI. However, there is limited research on LLMs specifically addressing oncology-related queries. The primary aim of this research was to develop a specialized language model that demonstrates im…

    Submitted 26 February, 2024; originally announced February 2024.

  19. arXiv:2402.12655  [pdf, other]

    cs.SI stat.AP

    Ego Group Partition: A Novel Framework for Improving Ego Experiments in Social Networks

    Authors: Lu Deng, Jingjing Zhang, Yong Wang, Chuan Chen

    Abstract: Estimating the average treatment effect in social networks is challenging due to individuals influencing each other. One approach to address interference is ego cluster experiments, where each cluster consists of a central individual (ego) and its peers (alters). Clusters are randomized, and only the effects on egos are measured. In this work, we propose an improved framework for ego cluster exper…

    Submitted 19 February, 2024; originally announced February 2024.

  20. arXiv:2402.12653  [pdf, other]

    cs.SI stat.AP

    Unbiased Estimation for Total Treatment Effect Under Interference Using Aggregated Dyadic Data

    Authors: Lu Deng, Yilin Li, Jingjing Zhang, Yong Wang, Chuan Chen

    Abstract: In social media platforms, user behavior is often influenced by interactions with other users, complicating the accurate estimation of causal effects in traditional A/B experiments. This study investigates situations where an individual's outcome can be broken down into the sum of multiple pairwise outcomes, a reflection of user interactions. These outcomes, referred to as dyadic data, are prevale…

    Submitted 19 February, 2024; originally announced February 2024.

  21. arXiv:2402.08934  [pdf, other]

    eess.IV cs.CV

    Extreme Video Compression with Pre-trained Diffusion Models

    Authors: Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, Deniz Gündüz

    Abstract: Diffusion models have achieved remarkable success in generating high quality image and video data. More recently, they have also been used for image compression with high perceptual quality. In this paper, we present a novel approach to extreme video compression leveraging the predictive power of diffusion-based generative models at the decoder. The conditional diffusion model takes several neural…

    Submitted 13 February, 2024; originally announced February 2024.

  22. arXiv:2402.02235  [pdf, other]

    cs.CV

    Image Fusion via Vision-Language Model

    Authors: Zixiang Zhao, Lilun Deng, Haowen Bai, Yukun Cui, Zhipeng Zhang, Yulun Zhang, Haotong Qin, Dongdong Chen, Jiangshe Zhang, Peng Wang, Luc Van Gool

    Abstract: Image fusion integrates essential information from multiple source images into a single composite, emphasizing the highlighting structure and textures, and refining imperfect areas. Existing methods predominantly focus on pixel-level and semantic visual features for recognition. However, they insufficiently explore the deeper semantic information at a text-level beyond vision. Therefore, we introd…

    Submitted 3 February, 2024; originally announced February 2024.

  23. arXiv:2401.06252  [pdf]

    cs.CV cs.LG

    AGSPNet: A framework for parcel-scale crop fine-grained semantic change detection from UAV high-resolution imagery with agricultural geographic scene constraints

    Authors: Shaochun Li, Yanjun Wang, Hengfan Cai, Lina Deng, Yunhao Lin

    Abstract: Real-time and accurate information on fine-grained changes in crop cultivation is of great significance for crop growth monitoring, yield prediction and agricultural structure adjustment. Aiming at the problems of serious spectral confusion in visible high-resolution unmanned aerial vehicle (UAV) images of different phases, interference of large complex background and salt-and-pepper noise by exis…

    Submitted 11 January, 2024; originally announced January 2024.

  24. arXiv:2401.04150  [pdf, other]

    cs.CV

    Two-stream joint matching method based on contrastive learning for few-shot action recognition

    Authors: Long Deng, Ziqiang Li, Bingxin Zhou, Zhongming Chen, Ao Li, Yongxin Ge

    Abstract: Although few-shot action recognition based on metric learning paradigm has achieved significant success, it fails to address the following issues: (1) inadequate action relation modeling and underutilization of multi-modal information; (2) challenges in handling video matching problems with different lengths and speeds, and video matching problems with misalignment of video sub-actions. To address…

    Submitted 8 January, 2024; originally announced January 2024.

  25. arXiv:2312.13778  [pdf, other]

    cs.CV

    Progressive Evolution from Single-Point to Polygon for Scene Text

    Authors: Linger Deng, Mingxin Huang, Xudong Xie, Yuliang Liu, Lianwen Jin, Xiang Bai

    Abstract: The advancement of text shape representations towards compactness has enhanced text detection and spotting performance, but at a high annotation cost. Current models use single-point annotations to reduce costs, yet they lack sufficient localization information for downstream applications. To overcome this limitation, we introduce Point2Polygon, which can efficiently transform single-points into c…

    Submitted 10 May, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted in ICDAR 2024

  26. arXiv:2312.11935  [pdf, other]

    cs.AI

    Parameterized Decision-making with Multi-modal Perception for Autonomous Driving

    Authors: Yuyang Xia, Shuncheng Liu, Quanlin Yu, Liwei Deng, You Zhang, Han Su, Kai Zheng

    Abstract: Autonomous driving is an emerging technology that has advanced rapidly over the last decade. Modern transportation is expected to benefit greatly from a wise decision-making framework of autonomous vehicles, including the improvement of mobility and the minimization of risks and travel time. However, existing methods either ignore the complexity of environments only fitting straight roads, or igno…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: IEEE International Conference on Data Engineering (ICDE2024)

  27. arXiv:2312.09571  [pdf, other]

    cs.CL cs.IT

    Extending Context Window of Large Language Models via Semantic Compression

    Authors: Weizhi Fei, Xueyan Niu, Pingyi Zhou, Lu Hou, Bo Bai, Lei Deng, Wei Han

    Abstract: Transformer-based Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses. This constraint restricts their applicability in scenarios involving long texts. We propose a novel semantic compression method that enables generalization to texts that are 6-8 times longer, without incurring significant computational c…

    Submitted 15 December, 2023; originally announced December 2023.

  28. arXiv:2312.07943  [pdf, other]

    cs.CV

    ReFusion: Learning Image Fusion from Reconstruction with Learnable Loss via Meta-Learning

    Authors: Haowen Bai, Zixiang Zhao, Jiangshe Zhang, Yichen Wu, Lilun Deng, Yukun Cui, Shuang Xu, Baisong Jiang

    Abstract: Image fusion aims to combine information from multiple source images into a single one with more comprehensive informational content. The significant challenges for deep learning-based image fusion algorithms are the lack of a definitive ground truth as well as the corresponding distance measurement, with current manually given loss functions constraining the flexibility of model and generalizability…

    Submitted 11 March, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

  29. arXiv:2312.06197  [pdf, other]

    cs.SD cs.MM eess.AS

    MART: Learning Hierarchical Music Audio Representations with Part-Whole Transformer

    Authors: Dong Yao, Jieming Zhu, Jiahao Xun, Shengyu Zhang, Zhou Zhao, Liqun Deng, Wenqiao Zhang, Zhenhua Dong, Xin Jiang

    Abstract: Recent research in self-supervised contrastive learning of music representations has demonstrated remarkable results across diverse downstream tasks. However, a prevailing trend in existing methods involves representing equally-sized music clips in either waveform or spectrogram formats, often overlooking the intrinsic part-whole hierarchies within music. In our quest to comprehend the bottom-up s…

    Submitted 19 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Short paper accepted by WWW 2024. This is revised and condensed based on the previous version titled "Music-PAW: Learning Music Representations via Hierarchical Part-whole Interaction and Contrast". For more experimental details and discussions, please refer to the original long paper at arXiv:2312.06197v1

  30. arXiv:2311.11652  [pdf, other]

    cs.AI

    Web News Timeline Generation with Extended Task Prompting

    Authors: Sha Wang, Yuchen Li, Hanhua Xiao, Lambert Deng, Yanfei Dong

    Abstract: The creation of news timeline is essential for a comprehensive and contextual understanding of events as they unfold over time. This approach aids in discerning patterns and trends that might be obscured when news is viewed in isolation. By organizing news in a chronological sequence, it becomes easier to track the development of stories, understand the interrelation of events, and grasp the broad…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 4 pages

  31. arXiv:2310.14576  [pdf, other]

    cs.LG cs.CV

    Tensor Decomposition Based Attention Module for Spiking Neural Networks

    Authors: Haoyu Deng, Ruijie Zhu, Xuerui Qiu, Yule Duan, Malu Zhang, Liangjian Deng

    Abstract: The attention mechanism has been proven to be an effective way to improve spiking neural network (SNN). However, based on the fact that the current SNN input data flow is split into tensors to process on GPUs, none of the previous works consider the properties of tensors to implement an attention module. This inspires us to rethink current SNN from the perspective of tensor-relevant theories. Usin…

    Submitted 10 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by Knowledge-Based Systems

  32. arXiv:2310.06140  [pdf, other]

    cs.CC math.CO

    NP-Hardness of Tensor Network Contraction Ordering

    Authors: Jianyu Xu, Hanwen Zhang, Ling Liang, Lei Deng, Yuan Xie, Guoqi Li

    Abstract: We study the optimal order (or sequence) of contracting a tensor network with a minimal computational cost. We consider 2 different versions of this optimal sequence: one that minimizes the operation number (OMS) and one that minimizes the time complexity (CMS). Existing results only show that OMS is NP-hard, with no conclusion on the CMS problem. In this work, we first reduce CMS to CMS-0, which is a sub-pro…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: Jianyu Xu and Hanwen Zhang are equal contributors. 10 pages (reference and appendix excluded), 20 pages in total, 6 figures

    MSC Class: 05C35; 05C76 ACM Class: F.2.2

  33. arXiv:2309.15889  [pdf, other]

    eess.IV cs.CV cs.IT cs.LG cs.MM

    High Perceptual Quality Wireless Image Delivery with Denoising Diffusion Models

    Authors: Selim F. Yilmaz, Xueyan Niu, Bo Bai, Wei Han, Lei Deng, Deniz Gunduz

    Abstract: We consider the image transmission problem over a noisy wireless channel via deep learning-based joint source-channel coding (DeepJSCC) along with a denoising diffusion probabilistic model (DDPM) at the receiver. Specifically, we are interested in the perception-distortion trade-off in the practical finite block length regime, in which separate source and channel coding can be highly suboptimal. W…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: 6 pages, 4 figures

  34. arXiv:2308.12605  [pdf, other]

    cs.CV cs.AI

    APLA: Additional Perturbation for Latent Noise with Adversarial Training Enables Consistency

    Authors: Yupu Yao, Shangqi Deng, Zihan Cao, Harry Zhang, Liang-Jian Deng

    Abstract: Diffusion models have exhibited promising progress in video generation. However, they often struggle to retain consistent details within local regions across frames. One underlying cause is that traditional diffusion models approximate Gaussian noise distribution by utilizing predictive noise, without fully accounting for the impact of inherent information within the input itself. Additionally, th…

    Submitted 1 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

  35. arXiv:2308.06582  [pdf, other]

    cs.NE

    Gated Attention Coding for Training High-performance and Efficient Spiking Neural Networks

    Authors: Xuerui Qiu, Rui-Jie Zhu, Yuhong Chou, Zhaorui Wang, Liang-jian Deng, Guoqi Li

    Abstract: Spiking neural networks (SNNs) are emerging as an energy-efficient alternative to traditional artificial neural networks (ANNs) due to their unique spike-based event-driven nature. Coding is crucial in SNNs as it converts external input stimuli into spatio-temporal feature sequences. However, most existing deep SNNs rely on direct coding that generates powerless spike representation and lacks the…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 12 pages, 7 figures

    Report number: accepted to Proceedings of the AAAI Conference on Artificial Intelligence 38 (AAAI 24)

  36. arXiv:2308.05564  [pdf, other]

    econ.EM cs.LG q-fin.ST stat.CO

    Large Skew-t Copula Models and Asymmetric Dependence in Intraday Equity Returns

    Authors: Lin Deng, Michael Stanley Smith, Worapree Maneesoonthorn

    Abstract: Skew-t copula models are attractive for the modeling of financial data because they allow for asymmetric and extreme tail dependence. We show that the copula implicit in the skew-t distribution of Azzalini and Capitanio (2003) allows for a higher level of pairwise asymmetric dependence than two popular alternative skew-t copulas. Estimation of this copula in high dimensions is challenging, and we…

    Submitted 7 March, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

  37. arXiv:2308.03866  [pdf, other]

    cs.CL cs.AI

    Trusting Language Models in Education

    Authors: Jogi Suda Neto, Li Deng, Thejaswi Raya, Reza Shahbazi, Nick Liu, Adhitya Venkatesh, Miral Shah, Neeru Khosla, Rodrigo Capobianco Guido

    Abstract: Language Models are being widely used in Education. Even though modern deep learning models achieve very good performance on question-answering tasks, sometimes they make errors. To avoid misleading students by showing wrong answers, it is important to calibrate the confidence - that is, the prediction probability - of these models. In our work, we propose to use an XGBoost on top of BERT to outpu…

    Submitted 7 August, 2023; originally announced August 2023.

  38. arXiv:2307.12656  [pdf, other]

    cs.CV math.OC

    A Theoretically Guaranteed Quaternion Weighted Schatten p-norm Minimization Method for Color Image Restoration

    Authors: Qing-Hua Zhang, Liang-Tian He, Yi-Lun Wang, Liang-Jian Deng, Jun Liu

    Abstract: Inspired by the fact that the matrix formulated by nonlocal similar patches in a natural image is of low rank, the rank approximation issue has been extensively investigated over the past decades, among which weighted nuclear norm minimization (WNNM) and weighted Schatten $p$-norm minimization (WSNM) are two prevailing methods that have shown great superiority in various image restoration (IR) problem…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 46 pages, 10 figures; references added

  39. arXiv:2307.09775  [pdf, other]

    cs.IR cs.SD eess.AS

    DisCover: Disentangled Music Representation Learning for Cover Song Identification

    Authors: Jiahao Xun, Shengyu Zhang, Yanting Yang, Jieming Zhu, Liqun Deng, Zhou Zhao, Zhenhua Dong, Ruiqi Li, Lichao Zhang, Fei Wu

    Abstract: In the field of music information retrieval (MIR), cover song identification (CSI) is a challenging task that aims to identify cover versions of a query song from a massive collection. Existing works still suffer from high intra-song variances and inter-song correlations, due to the entangled nature of version-specific and version-invariant factors in their modeling. In this work, we set the goal…

    Submitted 19 July, 2023; originally announced July 2023.

  40. arXiv:2307.07288  [pdf, other]

    cs.CV

    Implicit Neural Feature Fusion Function for Multispectral and Hyperspectral Image Fusion

    Authors: ShangQi Deng, RuoCheng Wu, Liang-Jian Deng, Ran Ran, Gemine Vivone

    Abstract: Multispectral and Hyperspectral Image Fusion (MHIF) is a practical task that aims to fuse a high-resolution multispectral image (HR-MSI) and a low-resolution hyperspectral image (LR-HSI) of the same scene to obtain a high-resolution hyperspectral image (HR-HSI). Benefiting from powerful inductive bias capability, CNN-based methods have achieved great success in the MHIF task. However, they lack ce…

    Submitted 29 October, 2023; v1 submitted 14 July, 2023; originally announced July 2023.

  41. Game Theory and Coverage Optimization Based Multihop Routing Protocol for Network Lifetime in Wireless Sensor Networks

    Authors: Yindi Yao, Xiong Li, Yanpeng Cui, Lang Deng, Chen Wang

    Abstract: Wireless sensor networks (WSNs) are self-organizing monitoring networks with a large number of randomly deployed microsensor nodes to collect various physical information to realize tasks such as intelligent perception, efficient control, and decision-making. However, WSN nodes are powered by batteries, so they will run out of energy after a certain time. This energy limitation will greatly constr…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: 14 pages, 13 figures, 3 tables

    Journal ref: in IEEE Sensors Journal, vol. 22, no. 13, pp. 13739-13752, July, 2022

  42. arXiv:2306.06872  [pdf, other]

    cs.CL

    History Semantic Graph Enhanced Conversational KBQA with Temporal Information Modeling

    Authors: Hao Sun, Yang Li, Liwei Deng, Bowen Li, Binyuan Hui, Binhua Li, Yunshi Lan, Yan Zhang, Yongbin Li

    Abstract: Context information modeling is an important task in conversational KBQA. However, existing methods usually assume the independence of utterances and model them in isolation. In this paper, we propose a History Semantic Graph Enhanced KBQA model (HSGE) that is able to effectively model long-range semantic dependencies in conversation history while maintaining low computational cost. The framework…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023 Main Conference

  43. arXiv:2305.13652  [pdf, ps, other]

    cs.CL eess.AS

    Cross-lingual Knowledge Transfer and Iterative Pseudo-labeling for Low-Resource Speech Recognition with Transducers

    Authors: Jan Silovsky, Liuhui Deng, Arturo Argueta, Tresi Arvizo, Roger Hsiao, Sasha Kuznietsov, Yiu-Chang Lin, Xiaoqiang Xiao, Yuanyuan Zhang

    Abstract: Voice technology has become ubiquitous recently. However, the accuracy, and hence experience, in different languages varies significantly, which makes the technology not equally inclusive. The availability of data for different languages is one of the key factors affecting accuracy, especially in training of all-neural end-to-end automatic speech recognition systems. Cross-lingual knowledge tran…

    Submitted 22 May, 2023; originally announced May 2023.

  44. arXiv:2305.12881  [pdf, other]

    cs.CV cs.MM

    Building an Invisible Shield for Your Portrait against Deepfakes

    Authors: Jiazhi Guan, Tianshu Hu, Hang Zhou, Zhizhi Guo, Lirui Deng, Chengbin Quan, Errui Ding, Youjian Zhao

    Abstract: The issue of detecting deepfakes has garnered significant attention in the research community, with the goal of identifying facial manipulations for abuse prevention. Although recent studies have focused on developing generalized models that can detect various types of deepfakes, their performance is not always reliable and stable, which poses limitations in real-world applications. Instead of…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: under review

  45. arXiv:2304.04774  [pdf, other]

    cs.CV cs.AI eess.IV

    DDRF: Denoising Diffusion Model for Remote Sensing Image Fusion

    Authors: ZiHan Cao, ShiQi Cao, Xiao Wu, JunMing Hou, Ran Ran, Liang-Jian Deng

    Abstract: Denoising diffusion model, as a generative model, has received a lot of attention in the field of image generation recently, thanks to its powerful generation capability. However, diffusion models have not yet received sufficient research in the field of image fusion. In this article, we introduce diffusion model to the image fusion field, treating the image fusion task as image-to-image translatio…

    Submitted 10 April, 2023; originally announced April 2023.

  46. arXiv:2303.07758  [pdf, other]

    cs.LG cs.SI

    Traffic4cast at NeurIPS 2022 -- Predict Dynamics along Graph Edges from Sparse Node Data: Whole City Traffic and ETA from Stationary Vehicle Detectors

    Authors: Moritz Neun, Christian Eichenberger, Henry Martin, Markus Spanring, Rahul Siripurapu, Daniel Springer, Leyan Deng, Chenwang Wu, Defu Lian, Min Zhou, Martin Lumiste, Andrei Ilie, Xinhua Wu, Cheng Lyu, Qing-Long Lu, Vishal Mahajan, Yichao Lu, Jiezhang Li, Junjun Li, Yue-Jiao Gong, Florian Grötschla, Joël Mathys, Ye Wei, He Haitao, Hui Fang, et al. (5 additional authors not shown)

    Abstract: The global trends of urbanization and increased personal mobility force us to rethink the way we live and use urban space. The Traffic4cast competition series tackles this problem in a data-driven way, advancing the latest methods in machine learning for modeling complex spatial systems over time. In this edition, our dynamic road graph data combine information from road maps, $10^{12}$ probe data…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: Pre-print under review, submitted to Proceedings of Machine Learning Research

  47. arXiv:2211.00641  [pdf, other]

    cs.LG cs.AI

    Transposed Variational Auto-encoder with Intrinsic Feature Learning for Traffic Forecasting

    Authors: Leyan Deng, Chenwang Wu, Defu Lian, Min Zhou

    Abstract: In this technical report, we present our solutions to the Traffic4cast 2022 core challenge and extended challenge. In this competition, the participants are required to predict the traffic states for the future 15-minute based on the vehicle counter data in the previous hour. Compared to other competitions in the same series, this year focuses on the prediction of different data sources and sparse…

    Submitted 15 December, 2022; v1 submitted 30 October, 2022; originally announced November 2022.

    Comments: 8 pages, 5 figures

  48. arXiv:2210.14515  [pdf, other]

    eess.AS cs.SD

    UFO2: A unified pre-training framework for online and offline speech recognition

    Authors: Li Fu, Siqi Li, Qingtao Li, Liping Deng, Fangzhu Li, Lu Fan, Meng Chen, Xiaodong He

    Abstract: In this paper, we propose a Unified pre-training Framework for Online and Offline (UFO2) Automatic Speech Recognition (ASR), which 1) simplifies the two separate training workflows for online and offline modes into one process, and 2) improves the Word Error Rate (WER) performance with limited utterance annotating. Specifically, we extend the conventional offline-mode Self-Supervised Learning (SSL…

    Submitted 3 April, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted by ICASSP 2023

  49. Compressing Explicit Voxel Grid Representations: fast NeRFs become also small

    Authors: Chenxi Lola Deng, Enzo Tartaglione

    Abstract: NeRFs have revolutionized the world of per-scene radiance field reconstruction because of their intrinsic compactness. One of the main limitations of NeRFs is their slow rendering speed, both at training and inference time. Recent research focuses on the optimization of an explicit voxel grid (EVG) that represents the scene, which can be paired with neural networks to learn radiance fields. This a…

    Submitted 23 October, 2022; originally announced October 2022.

  50. arXiv:2210.12214  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    Optimizing Bilingual Neural Transducer with Synthetic Code-switching Text Generation

    Authors: Thien Nguyen, Nathalie Tran, Liuhui Deng, Thiago Fraga da Silva, Matthew Radzihovsky, Roger Hsiao, Henry Mason, Stefan Braun, Erik McDermott, Dogan Can, Pawel Swietojanski, Lyan Verwimp, Sibel Oyman, Tresi Arvizo, Honza Silovsky, Arnab Ghoshal, Mathieu Martel, Bharat Ram Ambati, Mohamed Ali

    Abstract: Code-switching describes the practice of using more than one language in the same sentence. In this study, we investigate how to optimize a neural transducer based bilingual automatic speech recognition (ASR) model for code-switching speech. Focusing on the scenario where the ASR model is trained without supervised code-switching data, we found that semi-supervised training and synthetic code-swit…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: 5 pages, 1 figure, submitted to ICASSP 2023, *: equal contributions