Showing 1–19 of 19 results for author: Lian, H

  1. arXiv:2405.10707

    cs.CV

    HARIS: Human-Like Attention for Reference Image Segmentation

    Authors: Mengxi Zhang, Heqing Lian, Yiming Liu, Jie Chen

    Abstract: Referring image segmentation (RIS) aims to locate the particular region corresponding to the language expression. Existing methods incorporate features from different modalities in a bottom-up manner. This design may get some unnecessary image-text pairs, which leads to an inaccurate segmentation mask. In this paper, we propose a referring image segmentation method called HARIS, which intro…

    Submitted 21 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  2. arXiv:2404.17808

    cs.CL

    Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

    Authors: Haoran Lian, Yizhe Xiong, Jianwei Niu, Shasha Mo, Zhenpeng Su, Zijia Lin, Peng Liu, Hui Chen, Guiguang Ding

    Abstract: Byte Pair Encoding (BPE) serves as a foundation method for text tokenization in the Natural Language Processing (NLP) field. Despite its wide adoption, the original BPE algorithm harbors an inherent flaw: it inadvertently introduces a frequency imbalance for tokens in the text corpus. Since BPE iteratively merges the most frequent token pair in the text corpus while keeping all tokens that have be…

    Submitted 27 April, 2024; originally announced April 2024.

  3. arXiv:2404.17785

    cs.CL

    Temporal Scaling Law for Large Language Models

    Authors: Yizhe Xiong, Xiansheng Chen, Xin Ye, Hui Chen, Zijia Lin, Haoran Lian, Zhenpeng Su, Jianwei Niu, Guiguang Ding

    Abstract: Recently, Large Language Models (LLMs) have been widely adopted in a wide range of tasks, leading to increasing attention towards the research on how scaling LLMs affects their performance. Existing works, termed Scaling Laws, have discovered that the final test loss of LLMs scales as power-laws with model size, computational budget, and dataset size. However, the temporal change of the test loss…

    Submitted 16 June, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures; Under review

  4. arXiv:2404.08242

    cs.NE cs.AI

    RLEMMO: Evolutionary Multimodal Optimization Assisted By Deep Reinforcement Learning

    Authors: Hongqiao Lian, Zeyuan Ma, Hongshu Guo, Ting Huang, Yue-Jiao Gong

    Abstract: Solving multimodal optimization problems (MMOP) requires finding all optimal solutions, which is challenging with limited function evaluations. Although existing works strike a balance between exploration and exploitation through hand-crafted adaptive strategies, they require certain expert knowledge and are hence inflexible in dealing with MMOP with different properties. In this paper, we propose RLEMMO, a Meta…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted as full paper at GECCO 2024

  5. arXiv:2403.01494

    eess.AS cs.SD eess.SP

    PAVITS: Exploring Prosody-aware VITS for End-to-End Emotional Voice Conversion

    Authors: Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian

    Abstract: In this paper, we propose Prosody-aware VITS (PAVITS) for emotional voice conversion (EVC), aiming to achieve two major objectives of EVC: high content naturalness and high emotional naturalness, which are crucial for meeting the demands of human perception. To improve the content naturalness of converted audio, we have developed an end-to-end EVC architecture inspired by the high audio quality of…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: Accepted to ICASSP2024

  6. arXiv:2401.10536

    cs.CL

    Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition

    Authors: Yong Wang, Cheng Lu, Hailun Lian, Yan Zhao, Björn Schuller, Yuan Zong, Wenming Zheng

    Abstract: Swin-Transformer has demonstrated remarkable success in computer vision by leveraging its hierarchical feature representation based on Transformer. In speech signals, emotional information is distributed across different scales of speech features, e.g., word, phrase, and utterance. Drawing on this inspiration, this paper presents a hierarchical speech Transformer with shifted windows to aggregate…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  7. arXiv:2401.09752

    cs.SD cs.LG eess.AS

    Improving Speaker-independent Speech Emotion Recognition Using Dynamic Joint Distribution Adaptation

    Authors: Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn Schuller, Wenming Zheng

    Abstract: In speaker-independent speech emotion recognition, the training and testing samples are collected from diverse speakers, leading to a multi-domain shift challenge across the feature distributions of data from different speakers. Consequently, when the trained model is confronted with data from new speakers, its performance tends to degrade. To address the issue, we propose a Dynamic Joint Distribu…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  8. arXiv:2312.06466

    cs.SD eess.AS

    Towards Domain-Specific Cross-Corpus Speech Emotion Recognition Approach

    Authors: Yan Zhao, Yuan Zong, Hailun Lian, Cheng Lu, Jingang Shi, Wenming Zheng

    Abstract: Cross-corpus speech emotion recognition (SER) poses a challenge due to feature distribution mismatch, potentially degrading the performance of established SER methods. In this paper, we tackle this challenge by proposing a novel transfer subspace learning method called acoustic knowledge-guided transfer linear regression (AKTLR). Unlike existing approaches, which often overlook domain-specific know…

    Submitted 11 December, 2023; originally announced December 2023.

  9. arXiv:2310.03992

    cs.SD eess.AS

    Layer-Adapted Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

    Authors: Yan Zhao, Yuan Zong, Jincen Wang, Hailun Lian, Cheng Lu, Li Zhao, Wenming Zheng

    Abstract: In this paper, we propose a new unsupervised domain adaptation (DA) method called layer-adapted implicit distribution alignment networks (LIDAN) to address the challenge of cross-corpus speech emotion recognition (SER). LIDAN extends our previous ICASSP work, deep implicit distribution alignment networks (DIDAN), whose key contribution lies in the introduction of a novel regularization term called…

    Submitted 5 October, 2023; originally announced October 2023.

  10. arXiv:2308.14568

    cs.SD eess.AS

    Time-Frequency Transformer: A Novel Time Frequency Joint Learning Method for Speech Emotion Recognition

    Authors: Yong Wang, Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Sunan Li

    Abstract: In this paper, we propose a novel time-frequency joint learning method for speech emotion recognition, called Time-Frequency Transformer. Its advantage is that the Time-Frequency Transformer can excavate global emotion patterns in the time-frequency domain of the speech signal while modeling the local emotional correlations in the time domain and frequency domain respectively. For this purpose, we firs…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted by International Conference on Neural Information Processing (ICONIP2023)

  11. arXiv:2306.01491

    cs.SD

    Learning Local to Global Feature Aggregation for Speech Emotion Recognition

    Authors: Cheng Lu, Hailun Lian, Wenming Zheng, Yuan Zong, Yan Zhao, Sunan Li

    Abstract: The Transformer has recently emerged in speech emotion recognition (SER). However, its equal patch division not only damages frequency information but also ignores local emotion correlations across frames, which are key cues to represent emotion. To handle the issue, we propose a Local to Global Feature Aggregation learning (LGFA) for SER, which can aggregate long-term emotion correlations at differe…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: This paper has been accepted at INTERSPEECH 2023

  12. arXiv:2302.08921

    cs.SD cs.CL eess.AS

    Deep Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

    Authors: Yan Zhao, Jincen Wang, Yuan Zong, Wenming Zheng, Hailun Lian, Li Zhao

    Abstract: In this paper, we propose a novel deep transfer learning method called deep implicit distribution alignment networks (DIDAN) to deal with the cross-corpus speech emotion recognition (SER) problem, in which the labeled training (source) and unlabeled testing (target) speech signals come from different corpora. Specifically, DIDAN first adopts a simple deep regression network consisting of a set of conv…

    Submitted 17 February, 2023; originally announced February 2023.

  13. arXiv:2210.12430

    cs.SD cs.LG cs.MM eess.AS

    Speech Emotion Recognition via an Attentive Time-Frequency Neural Network

    Authors: Cheng Lu, Wenming Zheng, Hailun Lian, Yuan Zong, Chuangao Tang, Sunan Li, Yan Zhao

    Abstract: Spectrogram is commonly used as the input feature of deep neural networks to learn the high(er)-level time-frequency pattern of speech signal for speech emotion recognition (SER). Generally, different emotions correspond to specific energy activations both within frequency bands and time frames on spectrogram, which indicates the frequency and time domains are both essential to r…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: This paper has been accepted as a regular paper in IEEE Transactions on Computational Social Systems

  14. arXiv:2210.03460

    eess.IV cs.CV

    Flexible Alignment Super-Resolution Network for Multi-Contrast MRI

    Authors: Yiming Liu, Mengxi Zhang, Weiqin Zhang, Bo Jiang, Bo Hou, Dan Liu, Jie Chen, Heqing Lian

    Abstract: Magnetic resonance imaging plays an essential role in clinical diagnosis by acquiring the structural information of biological tissue. Recently, many multi-contrast MRI super-resolution networks achieve good effects. However, most studies ignore the impact of the inappropriate foreground scale and patch size of multi-contrast MRI, which probably leads to inappropriate feature alignment. To tackle…

    Submitted 8 January, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

  15. Online Deep Learning from Doubly-Streaming Data

    Authors: Heng Lian, John Scovil Atwood, Bojian Hou, Jian Wu, Yi He

    Abstract: This paper investigates a new online learning problem with doubly-streaming data, where the data streams are described by feature spaces that constantly evolve, with new features emerging and old features fading away. The challenges of this problem are twofold: 1) Data samples ceaselessly flowing in may carry shifted patterns over time, requiring learners to update and hence adapt on the fly. 2) New…

    Submitted 14 September, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: Accepted by ACMMM 2022. The legend mistake in Figure 4 has been corrected

  16. arXiv:1110.1915

    math.CO cs.CC

    Further hardness results on the rainbow vertex-connection number of graphs

    Authors: Lily Chen, Xueliang Li, Huishu Lian

    Abstract: A vertex-colored graph $G$ is rainbow vertex-connected if any pair of vertices in $G$ are connected by a path whose internal vertices have distinct colors, which was introduced by Krivelevich and Yuster. The rainbow vertex-connection number of a connected graph $G$, denoted by $rvc(G)$, is the smallest number of colors that are needed in order to make $G$ rainbow vertex-connected. In a…

    Submitted 9 October, 2011; originally announced October 2011.

    Comments: 10 pages

    MSC Class: 05C15; 05C40; 68Q17; 68Q25; 90C27

  17. arXiv:0906.0434

    cs.CV math.NA stat.ME

    Total Variation, Adaptive Total Variation and Nonconvex Smoothly Clipped Absolute Deviation Penalty for Denoising Blocky Images

    Authors: Aditya Chopra, Heng Lian

    Abstract: The total variation-based image denoising model has been generalized and extended in numerous ways, improving its performance in different contexts. We propose a new penalty function motivated by the recent progress in the statistical literature on high-dimensional variable selection. Using a particular instantiation of the majorization-minimization algorithm, the optimization problem can be eff…

    Submitted 2 June, 2009; originally announced June 2009.

  18. arXiv:0802.1258

    cs.CV cs.LG

    Bayesian Nonlinear Principal Component Analysis Using Random Fields

    Authors: Heng Lian

    Abstract: We propose a novel model for nonlinear dimension reduction motivated by the probabilistic formulation of principal component analysis. Nonlinearity is achieved by specifying different transformation matrices at different locations of the latent space and smoothing the transformation using a Markov random field type prior. The computation is made feasible by the recent advances in sampling from v…

    Submitted 9 February, 2008; originally announced February 2008.

  19. arXiv:0709.1771

    cs.CV

    Variational local structure estimation for image super-resolution

    Authors: Heng Lian

    Abstract: Super-resolution is an important but difficult problem in image/video processing. If a video sequence or some training set other than the given low-resolution image is available, this kind of extra information can greatly aid in the reconstruction of the high-resolution image. The problem is substantially more difficult with only a single low-resolution image on hand. The image reconstruction me…

    Submitted 12 September, 2007; originally announced September 2007.

    Comments: 9 pages