Showing 1–50 of 82 results for author: Lian, J

Searching in archive cs.
  1. arXiv:2405.21027  [pdf, other]

    cs.GT cs.AI cs.LG cs.MA

    Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles

    Authors: Jiesong Lian

    Abstract: A popular approach for solving zero-sum games is to maintain populations of policies to approximate the Nash Equilibrium (NE). Previous studies have shown that the Policy Space Response Oracle (PSRO) algorithm is an effective multi-agent reinforcement learning framework for solving such games. However, repeatedly training new policies from scratch to approximate Best Response (BR) to opponents' mixed…

    Submitted 21 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: 20 pages, 5 figures

  2. arXiv:2405.10640  [pdf, other]

    cs.SI

    COMET: NFT Price Prediction with Wallet Profiling

    Authors: Tianfu Wang, Liwei Deng, Chao Wang, Jianxun Lian, Yue Yan, Nicholas Jing Yuan, Qi Zhang, Hui Xiong

    Abstract: As the non-fungible token (NFT) market flourishes, price prediction emerges as a pivotal direction for investors gaining valuable insight to maximize returns. However, existing works suffer from a lack of practical definitions and standardized evaluations, limiting their practical application. Moreover, the influence of users' multi-behaviour transactions that are publicly accessible on NFT price…

    Submitted 29 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Accepted by KDD 2024 (ADS Track)

  3. arXiv:2405.00527  [pdf, other]

    cs.DB

    ChatBI: Towards Natural Language to Complex Business Intelligence SQL

    Authors: Jinqing Lian, Xinyi Liu, Yingxia Shao, Yang Dong, Ming Wang, Zhang Wei, Tianqi Wan, Ming Dong, Hailin Yan

    Abstract: The Natural Language to SQL (NL2SQL) technology provides non-expert users who are unfamiliar with databases the opportunity to use SQL for data analysis. Converting Natural Language to Business Intelligence (NL2BI) is a popular practical scenario for NL2SQL in actual production systems. Compared to NL2SQL, NL2BI introduces more challenges. In this paper, we propose ChatBI, a comprehensive and eff…

    Submitted 1 May, 2024; originally announced May 2024.

  4. arXiv:2404.07687  [pdf, other]

    cs.CV

    Chaos in Motion: Unveiling Robustness in Remote Heart Rate Measurement through Brain-Inspired Skin Tracking

    Authors: Jie Wang, Jing Lian, Minjie Ma, Junqiang Lei, Chunbiao Li, Bin Li, Jizhao Liu

    Abstract: Heart rate is an important physiological indicator of human health status. Existing remote heart rate measurement methods typically involve facial detection followed by signal extraction from the region of interest (ROI). These SOTA methods have three serious problems: (a) inaccuracies or even failures in detection caused by environmental influences or subject movement; (b) failures for special patie…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 8 pages, 10 figures

  5. RecAI: Leveraging Large Language Models for Next-Generation Recommender Systems

    Authors: Jianxun Lian, Yuxuan Lei, Xu Huang, Jing Yao, Wei Xu, Xing Xie

    Abstract: This paper introduces RecAI, a practical toolkit designed to augment or even revolutionize recommender systems with the advanced capabilities of Large Language Models (LLMs). RecAI provides a suite of tools, including Recommender AI Agent, Recommendation-oriented Language Models, Knowledge Plugin, RecExplainer, and Evaluator, to facilitate the integration of LLMs into recommender systems from mult…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 4 pages. Webconf 2024 demo track

    MSC Class: 68T50

  6. arXiv:2403.05063  [pdf, other]

    cs.IR cs.AI

    Aligning Large Language Models for Controllable Recommendations

    Authors: Wensheng Lu, Jianxun Lian, Wei Zhang, Guanghua Li, Mingyang Zhou, Hao Liao, Xing Xie

    Abstract: Inspired by the exceptional general intelligence of Large Language Models (LLMs), researchers have begun to explore their application in pioneering the next generation of recommender systems - systems that are conversational, explainable, and controllable. However, existing literature primarily concentrates on integrating domain-specific knowledge into LLMs to enhance accuracy, often neglecting th…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: 13 pages

    MSC Class: 68T50

  7. arXiv:2403.04483  [pdf, other]

    cs.AI cs.CL

    GraphInstruct: Empowering Large Language Models with Graph Understanding and Reasoning Capability

    Authors: Zihan Luo, Xiran Song, Hong Huang, Jianxun Lian, Chenhao Zhang, Jinqi Jiang, Xing Xie

    Abstract: Evaluating and enhancing the general capabilities of large language models (LLMs) has been an important research topic. Graphs are a common data structure in the real world, and understanding graph data is a crucial part of advancing general intelligence. To evaluate and enhance the graph understanding abilities of LLMs, in this paper, we propose a benchmark named GraphInstruct, which comprehensive…

    Submitted 2 April, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: 9 pages

  8. arXiv:2403.00529  [pdf, other]

    cs.SD cs.LG eess.AS

    VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis

    Authors: Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee

    Abstract: Achieving nuanced and accurate emulation of human voice has been a longstanding goal in artificial intelligence. Although significant progress has been made in recent years, the mainstream of speech synthesis models still relies on supervised speaker modeling and explicit reference utterances. However, there are many aspects of human voice, such as emotion, intonation, and speaking style, for whic…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: preprint

  9. arXiv:2402.18899  [pdf, other]

    cs.IR

    Aligning Language Models for Versatile Text-based Item Retrieval

    Authors: Yuxuan Lei, Jianxun Lian, Jing Yao, Mingqi Wu, Defu Lian, Xing Xie

    Abstract: This paper addresses the gap between general-purpose text embeddings and the specific demands of item retrieval tasks. We demonstrate the shortcomings of existing models in capturing the nuances necessary for zero-shot performance on item retrieval tasks. To overcome these limitations, we propose generating an in-domain dataset from ten tasks tailored to unlocking models' representation ability for ite…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 4 pages, 1 figure, 4 tables

  10. arXiv:2402.14493  [pdf, ps, other]

    cs.DS

    An Improved Pseudopolynomial Time Algorithm for Subset Sum

    Authors: Lin Chen, Jiayi Lian, Yuchen Mao, Guochuan Zhang

    Abstract: We investigate pseudo-polynomial time algorithms for Subset Sum. Given a multi-set $X$ of $n$ positive integers and a target $t$, Subset Sum asks whether some subset of $X$ sums to $t$. Bringmann proposes an $\tilde{O}(n + t)$-time algorithm [Bringmann SODA'17], and an open question has naturally arisen: can Subset Sum be solved in $O(n + w)$ time? Here $w$ is the maximum integer in $X$. We make a…

    Submitted 4 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: In the first version, we falsely claimed that our algorithm is also able to reconstruct a subset that sums to t. In the latest version, we removed this false claim and explained why we cannot do reconstruction

  11. arXiv:2402.11426  [pdf, ps, other]

    cs.DS

    Approximating Partition in Near-Linear Time

    Authors: Lin Chen, Jiayi Lian, Yuchen Mao, Guochuan Zhang

    Abstract: We propose an $\widetilde{O}(n + 1/\eps)$-time FPTAS (Fully Polynomial-Time Approximation Scheme) for the classical Partition problem. This is the best possible (up to a polylogarithmic factor) assuming SETH (Strong Exponential Time Hypothesis) [Abboud, Bringmann, Hermelin, and Shabtay'22]. Prior to our work, the best known FPTAS for Partition runs in $\widetilde{O}(n + 1/\eps^{5/4})$ time [Deng,…

    Submitted 6 April, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: To appear in STOC2024

  12. arXiv:2402.02411  [pdf, other]

    eess.IV cs.CV

    Physics-Inspired Degradation Models for Hyperspectral Image Fusion

    Authors: Jie Lian, Lizhi Wang, Lin Zhu, Renwei Dian, Zhiwei Xiong, Hua Huang

    Abstract: The fusion of a low-spatial-resolution hyperspectral image (LR-HSI) with a high-spatial-resolution multispectral image (HR-MSI) has garnered increasing research interest. However, most fusion methods solely focus on the fusion algorithm itself and overlook the degradation models, which results in unsatisfactory performance in practical scenarios. To fill this gap, we propose physics-inspired degra…

    Submitted 4 February, 2024; originally announced February 2024.

  13. arXiv:2401.10015  [pdf, other]

    cs.CL eess.AS

    Towards Hierarchical Spoken Language Dysfluency Modeling

    Authors: Jiachen Lian, Gopala Anumanchipalli

    Abstract: Speech disfluency modeling is the bottleneck for both speech therapy and language learning. However, there is no effective AI solution to systematically tackle this problem. We solidify the concept of disfluent speech and disfluent speech modeling. We then present the Hierarchical Unconstrained Disfluency Modeling (H-UDM) approach, the hierarchical extension of UDM that addresses both disfluency trans…

    Submitted 21 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 2024 EACL. Hierarchical extension of our previous workshop paper arXiv:2312.12810

  14. arXiv:2401.08649  [pdf, other]

    cs.NE cs.LG

    Deep Pulse-Coupled Neural Networks

    Authors: Zexiang Yi, Jing Lian, Yunliang Qi, Zhaofei Yu, Huajin Tang, Yide Ma, Jizhao Liu

    Abstract: Spiking Neural Networks (SNNs) capture the information processing mechanism of the brain by taking advantage of spiking neurons, such as the Leaky Integrate-and-Fire (LIF) model neuron, which incorporates temporal dynamics and transmits information via discrete and asynchronous spikes. However, the simplified biological properties of LIF ignore the neuronal coupling and dendritic structure of real…

    Submitted 24 December, 2023; originally announced January 2024.

  15. arXiv:2401.06633  [pdf, other]

    cs.IR cs.AI

    Ada-Retrieval: An Adaptive Multi-Round Retrieval Paradigm for Sequential Recommendations

    Authors: Lei Li, Jianxun Lian, Xiao Zhou, Xing Xie

    Abstract: Retrieval models aim at selecting a small set of item candidates which match the preference of a given user. They play a vital role in large-scale recommender systems since subsequent models such as rankers highly depend on the quality of item candidates. However, most existing retrieval models employ a single-round inference paradigm, which may not adequately capture the dynamic nature of user pr…

    Submitted 31 January, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: 9 pages, Accepted to AAAI2024

  16. arXiv:2312.12810  [pdf, other]

    eess.AS cs.SD

    Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection

    Authors: Jiachen Lian, Carly Feng, Naasir Farooqi, Steve Li, Anshul Kashyap, Cheol Jun Cho, Peter Wu, Robbie Netzorg, Tingle Li, Gopala Krishna Anumanchipalli

    Abstract: Dysfluent speech modeling requires time-accurate and silence-aware transcription at both the word-level and phonetic-level. However, current research in dysfluency modeling primarily focuses on either transcription or detection, and the performance of each aspect remains limited. In this work, we present an unconstrained dysfluency modeling (UDM) approach that addresses both transcription and dete…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 2023 ASRU

  17. arXiv:2312.11111  [pdf, other]

    cs.AI cs.CL cs.HC

    The Good, The Bad, and Why: Unveiling Emotions in Generative AI

    Authors: Cheng Li, Jindong Wang, Yixuan Zhang, Kaijie Zhu, Xinyi Wang, Wenxin Hou, Jianxun Lian, Fang Luo, Qiang Yang, Xing Xie

    Abstract: Emotion significantly impacts our daily behaviors and interactions. While recent generative AI models, such as large language models, have shown impressive performance in various tasks, it remains unclear whether they truly comprehend emotions. This paper aims to address this gap by incorporating psychological theories to gain a holistic understanding of emotions in generative AI models. Specifica…

    Submitted 7 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: International Conference on Machine Learning (ICML) 2024; an extension to EmotionPrompt (arXiv:2307.11760)

  18. arXiv:2311.14304  [pdf, other]

    cs.LG

    AdaMedGraph: Adaboosting Graph Neural Networks for Personalized Medicine

    Authors: Jie Lian, Xufang Luo, Caihua Shan, Dongqi Han, Varut Vardhanabhuti, Dongsheng Li

    Abstract: Precision medicine tailored to individual patients has gained significant attention in recent times. Machine learning techniques are now employed to process personalized data from various sources, including images, genetics, and assessments. These techniques have demonstrated good outcomes in many clinical prediction tasks. Notably, the approach of constructing graphs by linking similar patients a…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 9 pages

  19. RecExplainer: Aligning Large Language Models for Explaining Recommendation Models

    Authors: Yuxuan Lei, Jianxun Lian, Jing Yao, Xu Huang, Defu Lian, Xing Xie

    Abstract: Recommender systems are widely used in online services, with embedding-based models being particularly popular due to their expressiveness in representing complex signals. However, these models often function as a black box, making them less transparent and reliable for both users and developers. Recently, large language models (LLMs) have demonstrated remarkable intelligence in understanding, rea…

    Submitted 22 June, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: 12 pages, 9 figures, 5 tables

  20. arXiv:2311.10779  [pdf, other]

    cs.IR cs.AI

    Knowledge Plugins: Enhancing Large Language Models for Domain-Specific Recommendations

    Authors: Jing Yao, Wei Xu, Jianxun Lian, Xiting Wang, Xiaoyuan Yi, Xing Xie

    Abstract: The significant progress of large language models (LLMs) provides a promising opportunity to build human-like systems for various practical applications. However, when applied to specific task domains, an LLM pre-trained on a general-purpose corpus may exhibit a deficit or inadequacy in two types of domain-specific knowledge. One is a comprehensive set of domain data that is typically large-scale…

    Submitted 16 November, 2023; originally announced November 2023.

  21. arXiv:2310.13260  [pdf, other]

    cs.IR

    A Data-Centric Multi-Objective Learning Framework for Responsible Recommendation Systems

    Authors: Xu Huang, Jianxun Lian, Hao Wang, Defu Lian, Xing Xie

    Abstract: Recommendation systems effectively guide users in locating their desired information within extensive content repositories. Generally, a recommendation model is optimized to enhance accuracy metrics from a user utility standpoint, such as click-through rate or matching relevance. However, a responsible industrial recommendation system must address not only user utility (responsibility to users) bu…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 10 pages

  22. arXiv:2310.05962  [pdf, other]

    cs.IT cs.LG eess.SP

    Improving the Performance of R17 Type-II Codebook with Deep Learning

    Authors: Ke Ma, Yiliang Sang, Yang Ming, Jin Lian, Chang Tian, Zhaocheng Wang

    Abstract: The Type-II codebook in Release 17 (R17) exploits the angular-delay-domain partial reciprocity between uplink and downlink channels to select part of angular-delay-domain ports for measuring and feeding back the downlink channel state information (CSI), where the performance of existing deep learning enhanced CSI feedback methods is limited due to the deficiency of sparse structures. To address th…

    Submitted 13 September, 2023; originally announced October 2023.

    Comments: Accepted by IEEE GLOBECOM 2023, conference version of arXiv:2305.08081

  23. arXiv:2309.15203  [pdf, other]

    cs.CR cs.HC eess.SP

    Eve Said Yes: AirBone Authentication for Head-Wearable Smart Voice Assistant

    Authors: Chenpei Huang, Hui Zhong, Jie Lian, Pavana Prakash, Dian Shi, Yuan Xu, Miao Pan

    Abstract: Recent advances in machine learning and natural language processing have fostered the enormous prosperity of smart voice assistants and their services, e.g., Alexa, Google Home, Siri, etc. However, voice spoofing attacks are deemed to be one of the major challenges of voice control security, and they never stop evolving, as with deep-learning-based voice conversion and speech synthesis techniques. To so…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 13 pages, 12 figures

  24. arXiv:2309.14764  [pdf, other]

    cs.CV

    InvKA: Gait Recognition via Invertible Koopman Autoencoder

    Authors: Fan Li, Dong Liang, Jing Lian, Qidong Liu, Hegui Zhu, Jizhao Liu

    Abstract: Most current gait recognition methods suffer from poor interpretability and high computational cost. To improve interpretability, we investigate gait features in the embedding space based on Koopman operator theory. The transition matrix in this space, namely the Koopman operator, captures complex kinematic features of gait cycles. The diagonal elements of the operator matrix can represent the over…

    Submitted 27 September, 2023; v1 submitted 26 September, 2023; originally announced September 2023.

  25. arXiv:2309.12239  [pdf, other]

    cs.DB

    ContTune: Continuous Tuning by Conservative Bayesian Optimization for Distributed Stream Data Processing Systems

    Authors: Jinqing Lian, Xinyi Zhang, Yingxia Shao, Zenglin Pu, Qingfeng Xiang, Yawen Li, Bin Cui

    Abstract: The past decade has seen rapid growth of distributed stream data processing systems. Under these systems, a stream application is realized as a Directed Acyclic Graph (DAG) of operators, where the level of parallelism of each operator has a substantial impact on its overall performance. However, finding optimal levels of parallelism remains challenging. Most existing methods are heavily coupled wi…

    Submitted 21 September, 2023; originally announced September 2023.

  26. arXiv:2309.09088  [pdf, other]

    cs.SD eess.AS

    Enhancing GAN-Based Vocoders with Contrastive Learning Under Data-limited Condition

    Authors: Haoming Guo, Seth Z. Zhao, Jiachen Lian, Gopala Anumanchipalli, Gerald Friedland

    Abstract: Vocoder models have recently achieved substantial progress in generating authentic audio comparable to human quality while significantly reducing memory requirement and inference time. However, these data-hungry generative models require large-scale audio data for learning good representations. In this paper, we apply contrastive learning methods in training the vocoder to improve the perceptual q…

    Submitted 18 December, 2023; v1 submitted 16 September, 2023; originally announced September 2023.

  27. arXiv:2308.16505  [pdf, other]

    cs.IR cs.AI

    Recommender AI Agent: Integrating Large Language Models for Interactive Recommendations

    Authors: Xu Huang, Jianxun Lian, Yuxuan Lei, Jing Yao, Defu Lian, Xing Xie

    Abstract: Recommender models excel at providing domain-specific item recommendations by leveraging extensive user behavior data. Despite their ability to act as lightweight domain experts, they struggle to perform versatile tasks such as providing explanations and engaging in conversations. On the other hand, large language models (LLMs) represent a significant step towards artificial general intelligence,…

    Submitted 29 January, 2024; v1 submitted 31 August, 2023; originally announced August 2023.

    Comments: 18 pages, 17 figures, 7 tables

  28. arXiv:2308.07821  [pdf, ps, other]

    cs.DS

    A Nearly Quadratic-Time FPTAS for Knapsack

    Authors: Lin Chen, Jiayi Lian, Yuchen Mao, Guochuan Zhang

    Abstract: We investigate polynomial-time approximation schemes for the classic 0-1 knapsack problem. The previous algorithm by Deng, Jin, and Mao (SODA'23) has approximation factor $1 + \eps$ with running time $\widetilde{O}(n + \frac{1}{\eps^{2.2}})$. There is a lower bound of $(n + \frac{1}{\eps})^{2-o(1)}$ conditioned on the hypothesis that $(\min, +)$ has no truly subquadratic algorithm. We close the ga…

    Submitted 29 April, 2024; v1 submitted 15 August, 2023; originally announced August 2023.

  29. arXiv:2308.02925  [pdf, other]

    cs.AI cs.IR cs.SI

    ConvFormer: Revisiting Transformer for Sequential User Modeling

    Authors: Hao Wang, Jianxun Lian, Mingqi Wu, Haoxuan Li, Jiajun Fan, Wanyue Xu, Chaozhuo Li, Xing Xie

    Abstract: Sequential user modeling, a critical task in personalized recommender systems, focuses on predicting the next item a user would prefer, requiring a deep understanding of user behavior sequences. Despite the remarkable success of Transformer-based models across various domains, their full potential in comprehending user behavior remains untapped. In this paper, we re-examine Transformer-like archit…

    Submitted 8 October, 2023; v1 submitted 5 August, 2023; originally announced August 2023.

  30. arXiv:2307.12582  [pdf, ps, other]

    cs.DS

    Faster Algorithms for Bounded Knapsack and Bounded Subset Sum Via Fine-Grained Proximity Results

    Authors: Lin Chen, Jiayi Lian, Yuchen Mao, Guochuan Zhang

    Abstract: We investigate pseudopolynomial-time algorithms for Bounded Knapsack and Bounded Subset Sum. Recent years have seen a growing interest in settling their fine-grained complexity with respect to various parameters. For Bounded Knapsack, the number of items $n$ and the maximum item weight $w_{\max}$ are two of the most natural parameters that have been studied extensively in the literature. The previ…

    Submitted 4 December, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: To appear in SODA2024

  31. arXiv:2307.11760  [pdf, other]

    cs.CL cs.AI cs.HC

    Large Language Models Understand and Can be Enhanced by Emotional Stimuli

    Authors: Cheng Li, Jindong Wang, Yixuan Zhang, Kaijie Zhu, Wenxin Hou, Jianxun Lian, Fang Luo, Qiang Yang, Xing Xie

    Abstract: Emotional intelligence significantly impacts our daily behaviors and interactions. Although Large Language Models (LLMs) are increasingly viewed as a stride toward artificial general intelligence, exhibiting impressive performance in numerous tasks, it is still uncertain if LLMs can genuinely grasp psychological emotional stimuli. Understanding and responding to emotional cues gives humans a disti…

    Submitted 12 November, 2023; v1 submitted 13 July, 2023; originally announced July 2023.

    Comments: Technical report; updated the std error for human study; short version (v1) was accepted by LLM@IJCAI'23; 32 pages; more work: https://llm-enhance.github.io/

  32. arXiv:2306.12111  [pdf, other]

    cs.CV

    A Comprehensive Study on the Robustness of Image Classification and Object Detection in Remote Sensing: Surveying and Benchmarking

    Authors: Shaohui Mei, Jiawei Lian, Xiaofei Wang, Yuru Su, Mingyang Ma, Lap-Pui Chau

    Abstract: Deep neural networks (DNNs) have found widespread applications in interpreting remote sensing (RS) imagery. However, it has been demonstrated in previous works that DNNs are vulnerable to different types of noises, particularly adversarial noises. Surprisingly, there has been a lack of comprehensive studies on the robustness of RS tasks, prompting us to undertake a thorough survey and benchmark on…

    Submitted 15 September, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  33. Toward Cost-effective Adaptive Random Testing: An Approximate Nearest Neighbor Approach

    Authors: Rubing Huang, Chenhui Cui, Junlong Lian, Dave Towey, Weifeng Sun, Haibo Chen

    Abstract: Adaptive Random Testing (ART) enhances the testing effectiveness (including fault-detection capability) of Random Testing (RT) by increasing the diversity of the random test cases throughout the input domain. Many ART algorithms have been investigated such as Fixed-Size-Candidate-Set ART (FSCS) and Restricted Random Testing (RRT), and have been widely used in many practical applications. Despite i…

    Submitted 19 March, 2024; v1 submitted 27 May, 2023; originally announced May 2023.

    Comments: To be published in IEEE Transactions on Software Engineering

  34. arXiv:2305.17371  [pdf, other]

    cs.CL

    Towards Better Entity Linking with Multi-View Enhanced Distillation

    Authors: Yi Liu, Yuan Tian, Jianxun Lian, Xinlong Wang, Yanan Cao, Fang Fang, Wen Zhang, Haizhen Huang, Denvy Deng, Qi Zhang

    Abstract: Dense retrieval is widely used for entity linking to retrieve entities from large-scale knowledge bases. Mainstream techniques are based on a dual-encoder framework, which encodes mentions and entities independently and calculates their relevances via rough interaction metrics, resulting in difficulty in explicitly modeling multiple mention-relevant parts within entities to match divergent mention…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023 Main Conference

  35. arXiv:2305.08081  [pdf, other]

    cs.IT cs.AI

    Deep Learning Empowered Type-II Codebook: New Paradigm for Enhancing CSI Feedback

    Authors: Ke Ma, Yiliang Sang, Yang Ming, Jin Lian, Chang Tian, Zhaocheng Wang

    Abstract: Deep learning based channel state information (CSI) feedback in frequency division duplex systems has drawn much attention in both academia and industry. In this paper, we focus on integrating the Type-II codebook in the beyond fifth-generation (B5G) wireless systems with deep learning to enhance the performance of CSI feedback. In contrast to its counterpart in Release 16, the Type-II codebook in…

    Submitted 30 May, 2023; v1 submitted 14 May, 2023; originally announced May 2023.

    Comments: This updated version has been submitted to IEEE for possible publication. Copyright may be transferred without notice

  36. Towards Explainable Collaborative Filtering with Taste Clusters Learning

    Authors: Yuntao Du, Jianxun Lian, Jing Yao, Xiting Wang, Mingqi Wu, Lu Chen, Yunjun Gao, Xing Xie

    Abstract: Collaborative Filtering (CF) is a widely used and effective technique for recommender systems. In recent decades, there have been significant advancements in latent embedding-based CF methods for improved accuracy, such as matrix factorization, neural collaborative filtering, and LightGCN. However, the explainability of these models has not been fully explored. Adding explainability to recommendat…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: Accepted to WWW 2023

  37. arXiv:2303.01130  [pdf, other]

    cs.IR cs.AI

    Distillation from Heterogeneous Models for Top-K Recommendation

    Authors: SeongKu Kang, Wonbin Kweon, Dongha Lee, Jianxun Lian, Xing Xie, Hwanjo Yu

    Abstract: Recent recommender systems have shown remarkable performance by using an ensemble of heterogeneous models. However, it is exceedingly costly because it requires resources and inference latency proportional to the number of models, which remains the bottleneck for production. Our work aims to transfer the ensemble knowledge of heterogeneous teachers to a lightweight student model using knowledge di…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: TheWebConf'23

  38. CBA: Contextual Background Attack against Optical Aerial Detection in the Physical World

    Authors: Jiawei Lian, Xiaofei Wang, Yuru Su, Mingyang Ma, Shaohui Mei

    Abstract: Patch-based physical attacks have increasingly aroused concerns. However, most existing methods focus on obscuring targets captured on the ground, and some of these methods are simply extended to deceive aerial detectors. They smear the targeted objects in the physical world with the elaborated adversarial patches, which can only slightly sway the aerial detectors' prediction and with weak att…

    Submitted 23 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  39. arXiv:2302.13487  [pdf, other]

    cs.CV

    Contextual adversarial attack against aerial detection in the physical world

    Authors: Jiawei Lian, Xiaofei Wang, Yuru Su, Mingyang Ma, Shaohui Mei

    Abstract: Deep Neural Networks (DNNs) have been extensively utilized in aerial detection. However, DNNs' sensitivity and vulnerability to maliciously elaborated adversarial examples have progressively garnered attention. Recently, physical attacks have gradually become a hot issue because they are more practical in the real world, which poses great threats to some security-critical applications. In this pape…

    Submitted 26 February, 2023; originally announced February 2023.

  40. arXiv:2302.06419  [pdf, other]

    eess.AS cs.AI cs.CL

    AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations

    Authors: Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli

    Abstract: Self-supervision has shown great potential for audio-visual speech recognition by vastly reducing the amount of labeled data required to build good systems. However, existing methods are either not entirely end-to-end or do not train joint representations of both modalities. In this paper, we introduce AV-data2vec which addresses these challenges and builds audio-visual representations based on pr…

    Submitted 21 January, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: 2023 ASRU

  41. Benchmarking Adversarial Patch Against Aerial Detection

    Authors: Jiawei Lian, Shaohui Mei, Shun Zhang, Mingyang Ma

    Abstract: DNNs are vulnerable to adversarial examples, which poses great security concerns for security-critical systems. In this paper, a novel adaptive-patch-based physical attack (AP-PA) framework is proposed, which aims to generate adversarial patches that are adaptive in both physical dynamics and varying scales, and by which the particular targets can be hidden from being detected. Furthermore, the ad…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: 14 pages, 14 figures

  42. arXiv:2210.16498  [pdf, other]

    eess.AS cs.SD

    Articulatory Representation Learning Via Joint Factor Analysis and Neural Matrix Factorization

    Authors: Jiachen Lian, Alan W Black, Yijing Lu, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli

    Abstract: Articulatory representation learning is fundamental research in modeling the neural speech production system. Our previous work has established a deep paradigm to decompose the articulatory kinematics data into gestures, which explicitly model the phonological and linguistic structure encoded with the human speech production mechanism, and corresponding gestural scores. We continue with this line of w…

    Submitted 20 February, 2023; v1 submitted 29 October, 2022; originally announced October 2022.

    Comments: Accepted to 2023 ICASSP. Camera Ready

  43. arXiv:2208.09953  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Do-AIQ: A Design-of-Experiment Approach to Quality Evaluation of AI Mislabel Detection Algorithm

    Authors: J. Lian, K. Choi, B. Veeramani, A. Hu, L. Freeman, E. Bowen, X. Deng

    Abstract: The quality of Artificial Intelligence (AI) algorithms is of significant importance for confidently adopting algorithms in various applications such as cybersecurity, healthcare, and autonomous driving. This work presents a principled framework of using a design-of-experimental approach to systematically evaluate the quality of AI algorithms, named as Do-AIQ. Specifically, we focus on investigatin…

    Submitted 21 August, 2022; originally announced August 2022.

  44. arXiv:2208.01250  [pdf, other]

    cs.IR

    Geometric Interaction Augmented Graph Collaborative Filtering

    Authors: Yiding Zhang, Chaozhuo Li, Senzhang Wang, Jianxun Lian, Xing Xie

    Abstract: Graph-based collaborative filtering is capable of capturing the essential and abundant collaborative signals from the high-order interactions, and thus received increasingly research interests. Conventionally, the embeddings of users and items are defined in the Euclidean spaces, along with the propagation on the interaction graphs. Meanwhile, recent works point out that the high-order interaction…

    Submitted 2 August, 2022; originally announced August 2022.

  45. arXiv:2206.02512  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    UTTS: Unsupervised TTS with Conditional Disentangled Sequential Variational Auto-encoder

    Authors: Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu

    Abstract: In this paper, we propose a novel unsupervised text-to-speech (UTTS) framework which does not require text-audio pairs for the TTS acoustic modeling (AM). UTTS is a multi-speaker speech synthesizer that supports zero-shot voice cloning, it is developed from a perspective of disentangled speech representation learning. The framework offers a flexible choice of a speaker's duration model, timbre fea…

    Submitted 11 October, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Under Review

  46. arXiv:2206.00212  [pdf, ps, other]

    cs.IR

    Negative Sampling for Contrastive Representation Learning: A Review

    Authors: Lanling Xu, Jianxun Lian, Wayne Xin Zhao, Ming Gong, Linjun Shou, Daxin Jiang, Xing Xie, Ji-Rong Wen

    Abstract: The learn-to-compare paradigm of contrastive representation learning (CRL), which compares positive samples with negative ones for representation learning, has achieved great success in a wide range of domains, including natural language processing, computer vision, information retrieval and graph learning. While many research works focus on data augmentations, nonlinear transformations or other c…

    Submitted 31 May, 2022; originally announced June 2022.

    Comments: 6 pages

  47. Ada-Ranker: A Data Distribution Adaptive Ranking Paradigm for Sequential Recommendation

    Authors: Xinyan Fan, Jianxun Lian, Wayne Xin Zhao, Zheng Liu, Chaozhuo Li, Xing Xie

    Abstract: A large-scale recommender system usually consists of recall and ranking modules. The goal of ranking modules (aka rankers) is to elaborately discriminate users' preference on item candidates proposed by recall modules. With the success of deep learning techniques in various domains, we have witnessed the mainstream rankers evolve from traditional models to deep neural models. However, the way that…

    Submitted 22 May, 2022; originally announced May 2022.

    Comments: 12 pages

  48. arXiv:2205.05227  [pdf, ps, other]

    eess.AS cs.AI cs.CL cs.SD

    Towards Improved Zero-shot Voice Conversion with Conditional DSVAE

    Authors: Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu

    Abstract: Disentangling content and speaking style information is essential for zero-shot non-parallel voice conversion (VC). Our previous study investigated a novel framework with disentangled sequential variational autoencoder (DSVAE) as the backbone for information decomposition. We have demonstrated that simultaneous disentangling content embedding and speaker embedding from one utterance is feasible fo…

    Submitted 20 June, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: Accepted to 2022 Interspeech. Demo: https://jlian2.github.io/Improved-Voice-Conversion-with-Conditional-DSVAE/

  49. arXiv:2204.00465  [pdf, other]

    eess.AS cs.AI eess.SP

    Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition

    Authors: Jiachen Lian, Alan W Black, Louis Goldstein, Gopala Krishna Anumanchipalli

    Abstract: Most of the research on data-driven speech representation learning has focused on raw audios in an end-to-end manner, paying little attention to their internal phonological or gestural structure. This work, investigating the speech representations derived from articulatory kinematics signals, uses a neural implementation of convolutive sparse matrix factorization to decompose the articulatory data…

    Submitted 20 June, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: Accepted to 2022 Interspeech. Code is publicly available at https://github.com/Berkeley-Speech-Group/ema_gesture

  50. arXiv:2203.16705  [pdf, other]

    eess.AS cs.AI cs.CL eess.SP

    Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion

    Authors: Jiachen Lian, Chunlei Zhang, Dong Yu

    Abstract: Traditional studies on voice conversion (VC) have made progress with parallel training data and known speakers. Good voice conversion quality is obtained by exploring better alignment modules or expressive mapping functions. In this study, we investigate zero-shot VC from a novel perspective of self-supervised disentangled speech representation learning. Specifically, we achieve the disentanglemen…

    Submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted to 2022 ICASSP