Showing 1–50 of 103 results for author: Liao, L

  1. arXiv:2406.15522  [pdf, other]

    cs.GT econ.EM math.ST stat.AP

    Statistical Inference and A/B Testing in Fisher Markets and Paced Auctions

    Authors: Luofeng Liao, Christian Kroer

    Abstract: We initiate the study of statistical inference and A/B testing for two market equilibrium models: linear Fisher market (LFM) equilibrium and first-price pacing equilibrium (FPPE). LFM arises from fair resource allocation systems such as allocation of food to food banks and notification opportunities to different types of notifications. For LFM, we assume that the data observed is captured by the c…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2301.02276, arXiv:2209.15422

  2. arXiv:2406.15177  [pdf, other]

    cs.MM

    EmpathyEar: An Open-source Avatar Multimodal Empathetic Chatbot

    Authors: Hao Fei, Han Zhang, Bin Wang, Lizi Liao, Qian Liu, Erik Cambria

    Abstract: This paper introduces EmpathyEar, a pioneering open-source, avatar-based multimodal empathetic chatbot, to fill the gap in traditional text-only empathetic response generation (ERG) systems. Leveraging the advancements of a large language model, combined with multimodal encoders and generators, EmpathyEar supports user inputs in any combination of text, sound, and vision, and produces multimodal e…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Demonstration Paper

  3. arXiv:2406.05374  [pdf, other]

    cs.CL

    Planning Like Human: A Dual-process Framework for Dialogue Planning

    Authors: Tao He, Lizi Liao, Yixin Cao, Yuanxing Liu, Ming Liu, Zerui Chen, Bing Qin

    Abstract: In proactive dialogue, the challenge lies not just in generating responses but in steering conversations toward predetermined goals, a task where Large Language Models (LLMs) typically struggle due to their reactive nature. Traditional approaches to enhance dialogue planning in LLMs, ranging from elaborate prompt engineering to the integration of policy networks, either face efficiency issues or d…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 24 pages, 5 figures, ACL 2024 main conference

  4. arXiv:2406.02472  [pdf, other]

    cs.CL

    Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding

    Authors: Zhihan Zhang, Yixin Cao, Chenchen Ye, Yunshan Ma, Lizi Liao, Tat-Seng Chua

    Abstract: The digital landscape is rapidly evolving with an ever-increasing volume of online news, emphasizing the need for swift and precise analysis of complex events. We refer to the complex events composed of many news articles over an extended period as Temporal Complex Event (TCE). This paper proposes a novel approach using Large Language Models (LLMs) to systematically extract and analyze the event c…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024

  5. arXiv:2406.01326  [pdf, other]

    cs.CV

    TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

    Authors: Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Shu Wei, Binghong Wu, Lei Liao, Yongjie Ye, Hao Liu, Houqiang Li, Can Huang

    Abstract: Tables contain factual and quantitative data accompanied by various structures and contents that pose challenges for machine comprehension. Previous methods generally design task-specific architectures and objectives for individual tasks, resulting in modal isolation and intricate workflows. In this paper, we present a novel large vision-language model, TabPedia, equipped with a concept synergy me…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 20 pages, 8 figures

  6. arXiv:2404.12803  [pdf, other]

    cs.CV cs.LG

    TextSquare: Scaling up Text-Centric Visual Instruction Tuning

    Authors: Jingqun Tang, Chunhui Lin, Zhen Zhao, Shu Wei, Binghong Wu, Qi Liu, Hao Feng, Yang Li, Siqi Wang, Lei Liao, Wei Shi, Yuliang Liu, Hao Liu, Yuan Xie, Xiang Bai, Can Huang

    Abstract: Text-centric visual question answering (VQA) has made great strides with the development of Multimodal Large Language Models (MLLMs), yet open-source models still fall short of leading models like GPT4V and Gemini, partly due to a lack of extensive, high-quality instruction tuning data. To this end, we introduce a new approach for creating a massive, high-quality instruction-tuning dataset, Square…

    Submitted 19 April, 2024; originally announced April 2024.

  7. arXiv:2404.12670  [pdf, other]

    cs.IR cs.CL cs.HC

    Towards Human-centered Proactive Conversational Agents

    Authors: Yang Deng, Lizi Liao, Zhonghua Zheng, Grace Hui Yang, Tat-Seng Chua

    Abstract: Recent research on proactive conversational agents (PCAs) mainly focuses on improving the system's capabilities in anticipating and planning action sequences to accomplish tasks and achieve goals before users articulate their requests. This perspectives paper highlights the importance of moving towards building human-centered PCAs that emphasize human needs and expectations, and that considers eth…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGIR 2024 (Perspectives Track)

  8. arXiv:2404.04925  [pdf, other]

    cs.CL

    Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers

    Authors: Libo Qin, Qiguang Chen, Yuhang Zhou, Zhi Chen, Yinghui Li, Lizi Liao, Min Li, Wanxiang Che, Philip S. Yu

    Abstract: Multilingual Large Language Models are capable of using powerful Large Language Models to handle and respond to queries in multiple languages, which achieves remarkable success in multilingual natural language processing tasks. Despite these breakthroughs, there still remains a lack of a comprehensive survey to summarize existing approaches and recent developments in this field. To this end, in th…

    Submitted 7 April, 2024; originally announced April 2024.

  9. arXiv:2404.01923  [pdf, other]

    cs.CL cs.AI

    SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation

    Authors: Shasha Guo, Lizi Liao, Jing Zhang, Yanling Wang, Cuiping Li, Hong Chen

    Abstract: Knowledge base question generation (KBQG) aims to generate natural language questions from a set of triplet facts extracted from KB. Existing methods have significantly boosted the performance of KBQG via pre-trained language models (PLMs) thanks to the richly endowed semantic knowledge. With the advance of pre-training techniques, large language models (LLMs) (e.g., GPT-3.5) undoubtedly possess m…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted by NAACL 2024 Findings

  10. arXiv:2403.17636  [pdf, other]

    cs.CL

    Mix-Initiative Response Generation with Dynamic Prefix Tuning

    Authors: Yuxiang Nie, Heyan Huang, Xian-Ling Mao, Lizi Liao

    Abstract: Mixed initiative serves as one of the key factors in controlling conversation directions. For a speaker, responding passively or leading proactively would result in rather different responses. However, most dialogue systems focus on training a holistic response generation model without any distinction among different initiatives. It leads to the cross-contamination problem, where the model confuse…

    Submitted 27 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted to the main conference of NAACL 2024

  11. arXiv:2403.16080  [pdf, other]

    cs.CV

    PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

    Authors: Xiaoyun Zheng, Liwei Liao, Xufeng Li, Jianbo Jiao, Rongjie Wang, Feng Gao, Shiqi Wang, Ronggang Wang

    Abstract: High-quality human reconstruction and photo-realistic rendering of a dynamic scene is a long-standing problem in computer vision and graphics. Despite considerable efforts invested in developing various capture systems and reconstruction algorithms, recent advancements still struggle with loose or oversized clothing and overly complex poses. In part, this is due to the challenges of acquiring high…

    Submitted 2 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: CVPR2024(accepted). Project page: https://pku-dymvhumans.github.io

  12. arXiv:2403.10827   

    cs.CL

    Multi-party Response Generation with Relation Disentanglement

    Authors: Tianhao Dai, Chengyu Huang, Lizi Liao

    Abstract: Existing neural response generation models have achieved impressive improvements for two-party conversations, which assume that utterances are sequentially organized. However, many real-world dialogues involve multiple interlocutors and the structure of conversational context is much more complex, e.g. utterances from different interlocutors can occur "in parallel". Facing this challenge, there ar…

    Submitted 22 March, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

    Comments: The paper needs systematic polishment to consider recent development in dialogue

  13. arXiv:2403.02825  [pdf, other]

    cs.IR

    Contrastive Pre-training for Deep Session Data Understanding

    Authors: Zixuan Li, Lizi Liao, Yunshan Ma, Tat-Seng Chua

    Abstract: Session data has been widely used for understanding user's behavior in e-commerce. Researchers are trying to leverage session data for different tasks, such as purchase intention prediction, remaining length prediction, recommendation, etc., as it provides context clues about the user's dynamic interests. However, online shopping session data is semi-structured and complex in nature, which contain…

    Submitted 5 March, 2024; originally announced March 2024.

  14. arXiv:2403.02754  [pdf, other]

    cs.IR

    Learning to Ask Critical Questions for Assisting Product Search

    Authors: Zixuan Li, Lizi Liao, Tat-Seng Chua

    Abstract: Product search plays an essential role in eCommerce. It was treated as a special type of information retrieval problem. Most existing works make use of historical data to improve the search performance, which do not take the opportunity to ask for user's current interest directly. Some session-aware methods take the user's clicks within the session as implicit feedback, but it is still just a gues…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: SIGIR eCom'22

  15. arXiv:2402.18267  [pdf, ps, other]

    cs.CL cs.AI

    A Survey on Neural Question Generation: Methods, Applications, and Prospects

    Authors: Shasha Guo, Lizi Liao, Cuiping Li, Tat-Seng Chua

    Abstract: In this survey, we present a detailed examination of the advancements in Neural Question Generation (NQG), a field leveraging neural network techniques to generate relevant questions from diverse inputs like knowledge bases, texts, and images. The survey begins with an overview of NQG's background, encompassing the task's problem formulation, prevalent benchmark datasets, established evaluation me…

    Submitted 7 May, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by IJCAI 2024

  16. Learning Invariant Inter-pixel Correlations for Superpixel Generation

    Authors: Sen Xu, Shikui Wei, Tao Ruan, Lixin Liao

    Abstract: Deep superpixel algorithms have made remarkable strides by substituting hand-crafted features with learnable ones. Nevertheless, we observe that existing deep superpixel methods, serving as mid-level representation operations, remain sensitive to the statistical properties (e.g., color distribution, high-level semantics) embedded within the training dataset. Consequently, learnable features exhibi…

    Submitted 9 April, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI24

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 6351-6359 (2024)

  17. arXiv:2402.17152  [pdf, other]

    cs.LG cs.IR

    Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations

    Authors: Jiaqi Zhai, Lucy Liao, Xing Liu, Yueming Wang, Rui Li, Xuan Cao, Leon Gao, Zhaojie Gong, Fangda Gu, Michael He, Yinghai Lu, Yu Shi

    Abstract: Large-scale recommendation systems are characterized by their reliance on high cardinality, heterogeneous features and the need to handle tens of billions of user actions on a daily basis. Despite being trained on huge volume of data with thousands of features, most Deep Learning Recommendation Models (DLRMs) in industry fail to scale with compute. Inspired by success achieved by Transformers in…

    Submitted 5 May, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 26 pages, 13 figures. ICML'24. Code available at https://github.com/facebookresearch/generative-recommenders

  18. arXiv:2402.16641  [pdf, other]

    cs.CV

    Towards Open-ended Visual Quality Comparison

    Authors: Haoning Wu, Hanwei Zhu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Annan Wang, Wenxiu Sun, Qiong Yan, Xiaohong Liu, Guangtao Zhai, Shiqi Wang, Weisi Lin

    Abstract: Comparative settings (e.g. pairwise choice, listwise ranking) have been adopted by a wide range of subjective studies for image quality assessment (IQA), as it inherently standardizes the evaluation criteria across different observers and offer more clear-cut responses. In this work, we extend the edge of emerging large multi-modality models (LMMs) to further advance visual quality comparison into…

    Submitted 4 March, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Fix typos

  19. arXiv:2402.07322  [pdf, other]

    math.ST cs.GT econ.EM

    Interference Among First-Price Pacing Equilibria: A Bias and Variance Analysis

    Authors: Luofeng Liao, Christian Kroer, Sergei Leonenkov, Okke Schrijvers, Liang Shi, Nicolas Stier-Moses, Congshan Zhang

    Abstract: Online A/B testing is widely used in the internet industry to inform decisions on new feature roll-outs. For online marketplaces (such as advertising markets), standard approaches to A/B testing may lead to biased results when buyers operate under a budget constraint, as budget consumption in one arm of the experiment impacts performance of the other arm. To counteract this interference, one can u…

    Submitted 11 February, 2024; originally announced February 2024.

  20. arXiv:2402.02303  [pdf, other]

    math.ST cs.GT econ.EM stat.AP

    Bootstrapping Fisher Market Equilibrium and First-Price Pacing Equilibrium

    Authors: Luofeng Liao, Christian Kroer

    Abstract: The linear Fisher market (LFM) is a basic equilibrium model from economics, which also has applications in fair and efficient resource allocation. First-price pacing equilibrium (FPPE) is a model capturing budget-management mechanisms in first-price auctions. In certain practical settings such as advertising auctions, there is an interest in performing statistical inference over these models. A po…

    Submitted 11 February, 2024; v1 submitted 3 February, 2024; originally announced February 2024.

    Comments: fix author names

  21. arXiv:2401.14009  [pdf, other]

    cs.SI

    On the Feasibility of Simple Transformer for Dynamic Graph Modeling

    Authors: Yuxia Wu, Yuan Fang, Lizi Liao

    Abstract: Dynamic graph modeling is crucial for understanding complex structures in web graphs, spanning applications in social networks, recommender systems, and more. Most existing methods primarily emphasize structural dependencies and their temporal changes. However, these approaches often overlook detailed temporal aspects or struggle with long-term dependencies. Furthermore, many solutions overly comp…

    Submitted 26 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: accepted by WWW'24

  22. arXiv:2312.17090  [pdf, other]

    cs.CV cs.CL cs.LG

    Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

    Authors: Haoning Wu, Zicheng Zhang, Weixia Zhang, Chaofeng Chen, Liang Liao, Chunyi Li, Yixuan Gao, Annan Wang, Erli Zhang, Wenxiu Sun, Qiong Yan, Xiongkuo Min, Guangtao Zhai, Weisi Lin

    Abstract: The explosion of visual content available online underscores the requirement for an accurate machine assessor to robustly evaluate scores across diverse types of visual contents. While recent studies have demonstrated the exceptional potentials of large multi-modality models (LMMs) on a wide range of related fields, in this work, we explore how to teach them for visual rating aligned with human op…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Technical Report

  23. arXiv:2312.15291  [pdf, other]

    cs.CL

    Reverse Multi-Choice Dialogue Commonsense Inference with Graph-of-Thought

    Authors: Li Zheng, Hao Fei, Fei Li, Bobo Li, Lizi Liao, Donghong Ji, Chong Teng

    Abstract: With the proliferation of dialogic data across the Internet, the Dialogue Commonsense Multi-choice Question Answering (DC-MCQ) task has emerged as a response to the challenge of comprehending user queries and intentions. Although prevailing methodologies exhibit effectiveness in addressing single-choice questions, they encounter difficulties in handling multi-choice queries due to the heightened i…

    Submitted 26 December, 2023; v1 submitted 23 December, 2023; originally announced December 2023.

    Comments: This paper has been accepted by the 38th Annual AAAI Conference on Artificial Intelligence (AAAI'24, FEBRUARY 20-27, 2024, VANCOUVER, CANADA)

  24. arXiv:2312.12439  [pdf, other]

    cs.CV physics.optics

    Single-pixel 3D imaging based on fusion temporal data of single photon detector and millimeter-wave radar

    Authors: Tingqin Lai, Xiaolin Liang, Yi Zhu, Xinyi Wu, Lianye Liao, Xuelin Yuan, ** Su, Shihai Sun

    Abstract: Recently, there has been increased attention towards 3D imaging using single-pixel single-photon detection (also known as temporal data) due to its potential advantages in terms of cost and power efficiency. However, to eliminate the symmetry blur in the reconstructed images, a fixed background is required. This paper proposes a fusion-data-based 3D imaging method that utilizes a single-pixel sing…

    Submitted 20 October, 2023; originally announced December 2023.

    Comments: Accepted by Chinese Optics Letters, and comments are welcome

    Journal ref: Chinese Optics Letters, Vol.2, No.2, 2024

  25. arXiv:2312.05616  [pdf, other]

    cs.CV

    Iterative Token Evaluation and Refinement for Real-World Super-Resolution

    Authors: Chaofeng Chen, Shangchen Zhou, Liang Liao, Haoning Wu, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: Real-world image super-resolution (RWSR) is a long-standing problem as low-quality (LQ) images often have complex and unidentified degradations. Existing methods such as Generative Adversarial Networks (GANs) or continuous diffusion models present their own issues including GANs being difficult to train while continuous diffusion models requiring numerous inference steps. In this paper, we propose…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: To appear in AAAI2024, https://github.com/chaofengc/ITER

  26. arXiv:2311.15657  [pdf, other]

    cs.CV

    Enhancing Diffusion Models with Text-Encoder Reinforcement Learning

    Authors: Chaofeng Chen, Annan Wang, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: Text-to-image diffusion models are typically trained to optimize the log-likelihood objective, which presents challenges in meeting specific requirements for downstream tasks, such as image aesthetics and image-text alignment. Recent research addresses this issue by refining the diffusion U-Net using human rewards through reinforcement learning or direct backpropagation. However, many of them over…

    Submitted 27 November, 2023; originally announced November 2023.

  27. arXiv:2311.09008  [pdf, other]

    cs.CL

    End-to-end Task-oriented Dialogue: A Survey of Tasks, Methods, and Future Directions

    Authors: Libo Qin, Wenbo Pan, Qiguang Chen, Lizi Liao, Zhou Yu, Yue Zhang, Wanxiang Che, Min Li

    Abstract: End-to-end task-oriented dialogue (EToD) can directly generate responses in an end-to-end fashion without modular training, which attracts escalating popularity. The advancement of deep neural networks, especially the successful use of large pre-trained models, has further led to significant progress in EToD research in recent years. In this paper, we present a thorough review and provide a unifie…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted at EMNLP2023

  28. arXiv:2311.06783  [pdf, other]

    cs.CV cs.MM

    Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

    Authors: Haoning Wu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Annan Wang, Kaixin Xu, Chunyi Li, Jingwen Hou, Guangtao Zhai, Geng Xue, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: Multi-modality foundation models, as represented by GPT-4V, have brought a new paradigm for low-level visual perception and understanding tasks, that can respond to a broad range of natural human instructions in a model. While existing foundation models have shown exciting potentials on low-level visual tasks, their related abilities are still preliminary and need to be improved. In order to enhan…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: 16 pages, 11 figures, page 12-16 as appendix

  29. arXiv:2311.02791  [pdf, other]

    cs.CV

    MirrorCalib: Utilizing Human Pose Information for Mirror-based Virtual Camera Calibration

    Authors: Longyun Liao, Rong Zheng, Andrew Mitchell

    Abstract: In this paper, we present the novel task of estimating the extrinsic parameters of a virtual camera relative to a real camera in exercise videos with a mirror. This task poses a significant challenge in scenarios where the views from the real and mirrored cameras have no overlap or share salient features. To address this issue, prior knowledge of a human body and 2D joint locations are utilized to…

    Submitted 17 May, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: Accepted by AVSS2024

  30. arXiv:2310.20305  [pdf]

    cs.CV

    Bilateral Network with Residual U-blocks and Dual-Guided Attention for Real-time Semantic Segmentation

    Authors: Liang Liao, Liang Wan, Mingsheng Liu, Shusheng Li

    Abstract: When some application scenarios need to use semantic segmentation technology, like automatic driving, the primary concern comes to real-time performance rather than extremely high segmentation accuracy. To achieve a good trade-off between speed and accuracy, two-branch architecture has been proposed in recent years. It treats spatial information and semantics information separately which allows th…

    Submitted 31 October, 2023; originally announced October 2023.

  31. arXiv:2309.14181  [pdf, other]

    cs.CV cs.AI cs.MM

    Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision

    Authors: Haoning Wu, Zicheng Zhang, Erli Zhang, Chaofeng Chen, Liang Liao, Annan Wang, Chunyi Li, Wenxiu Sun, Qiong Yan, Guangtao Zhai, Weisi Lin

    Abstract: The rapid evolution of Multi-modality Large Language Models (MLLMs) has catalyzed a shift in computer vision from specialized models to general-purpose foundation models. Nevertheless, there is still an inadequacy in assessing the abilities of MLLMs on low-level visual perception and understanding. To address this gap, we present Q-Bench, a holistic benchmark crafted to systematically evaluate pot…

    Submitted 1 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: 27 pages, 11 tables, with updated results

  32. arXiv:2308.14393  [pdf]

    eess.SY cs.RO

    Research on the Influence of Underwater Environment on the Dynamic Performance of the Mechanical Leg of a Deep-sea Crawling and Swimming Robot

    Authors: Lihui Liao, Baoren Li, Dijia Zhang, Luping Gao, Mboulé Ngwa, Jingmin Du

    Abstract: The performance of underwater crawling and adjustment of the body posture for underwater manipulating of the deep-sea crawling and swimming robot (DCSR) is directly influenced by the dynamic performance of the underwater mechanical legs (UWML), as it serves as the executive mechanism of the DCSR. Compared with the mechanical legs of legged robots working on land, the UWML of the DCSR not only poss…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: conference for 2023 IEEE 9th International Conference on Fluid Power and Mechatronics (FPM2023)

    MSC Class: 93C40 ACM Class: C.5

  33. arXiv:2308.12001  [pdf, other]

    cs.CV

    Local Distortion Aware Efficient Transformer Adaptation for Image Quality Assessment

    Authors: Kangmin Xu, Liang Liao, Jing Xiao, Chaofeng Chen, Haoning Wu, Qiong Yan, Weisi Lin

    Abstract: Image Quality Assessment (IQA) constitutes a fundamental task within the field of computer vision, yet it remains an unresolved challenge, owing to the intricate distortion conditions, diverse image contents, and limited availability of data. Recently, the community has witnessed the emergence of numerous large-scale pretrained foundation models, which greatly benefit from dramatically increased d…

    Submitted 23 August, 2023; originally announced August 2023.

  34. arXiv:2308.11584  [pdf, other]

    cs.CL cs.AI

    Building Emotional Support Chatbots in the Era of LLMs

    Authors: Zhonghua Zheng, Lizi Liao, Yang Deng, Liqiang Nie

    Abstract: The integration of emotional support into various conversational scenarios presents profound societal benefits, such as social interactions, mental health counseling, and customer service. However, there are unsolved challenges that hinder real-world applications in this field, including limited data availability and the absence of well-accepted model training paradigms. This work endeavors to nav…

    Submitted 17 August, 2023; originally announced August 2023.

  35. arXiv:2308.09277  [pdf, ps, other]

    cs.GT cs.DS

    Greedy-Based Online Fair Allocation with Adversarial Input: Enabling Best-of-Many-Worlds Guarantees

    Authors: Zongjun Yang, Luofeng Liao, Christian Kroer

    Abstract: We study an online allocation problem with sequentially arriving items and adversarially chosen agent values, with the goal of balancing fairness and efficiency. Our goal is to study the performance of algorithms that achieve strong guarantees under other input models such as stochastic inputs, in order to achieve robust guarantees against a variety of inputs. To that end, we study the PACE (Pacin…

    Submitted 17 August, 2023; originally announced August 2023.

  36. arXiv:2308.04502  [pdf, other]

    cs.CL

    Revisiting Disentanglement and Fusion on Modality and Context in Conversational Multimodal Emotion Recognition

    Authors: Bobo Li, Hao Fei, Lizi Liao, Yu Zhao, Chong Teng, Tat-Seng Chua, Donghong Ji, Fei Li

    Abstract: It has been a hot research topic to enable machines to understand human emotions in multimodal contexts under dialogue scenarios, which is tasked with multimodal emotion analysis in conversation (MM-ERC). MM-ERC has received consistent attention in recent years, where a diverse range of methods has been proposed for securing better task performance. Most existing works treat MM-ERC as a standard m…

    Submitted 12 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  37. arXiv:2308.03060  [pdf, other]

    cs.CV

    TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment

    Authors: Chaofeng Chen, Jiadi Mo, Jingwen Hou, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks. Inspired by the characteristics of the human visual system, existing methods typically use a combination of global and local representations (i.e., multi-scale features) to achieve superior performance. However, most of them adopt simple linear fusion of multi-sc…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: 13 pages, 8 figures, 10 tables. In submission

  38. arXiv:2308.02621  [pdf, other]

    cs.CV cs.LG

    Color Image Recovery Using Generalized Matrix Completion over Higher-Order Finite Dimensional Algebra

    Authors: Liang Liao, Zhuang Guo, Qi Gao, Yan Wang, Fajun Yu, Qifeng Zhao, Stephen John Maybank

    Abstract: To improve the accuracy of color image completion with missing entries, we present a recovery method based on generalized higher-order scalars. We extend the traditional second-order matrix model to a more comprehensive higher-order matrix equivalent, called the "t-matrix" model, which incorporates a pixel neighborhood expansion strategy to characterize the local pixel constraints. This "t-matrix"…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: 24 pages; 9 figures

  39. arXiv:2307.08221  [pdf, other]

    cs.RO

    NDT-Map-Code: A 3D global descriptor for real-time loop closure detection in lidar SLAM

    Authors: Lizhou Liao, Wenlei Yan, Li Sun, Xinhui Bai, Zhenxing You, Hongyuan Yuan, Chunyun Fu

    Abstract: Loop-closure detection, also known as place recognition, aiming to identify previously visited locations, is an essential component of a SLAM system. Existing research on lidar-based loop closure heavily relies on dense point cloud and 360 FOV lidars. This paper proposes an out-of-the-box NDT (Normal Distribution Transform) based global descriptor, NDT-Map-Code, designed for both on-road driving a…

    Submitted 20 March, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: 8 pages, 6 figures, 4 tables

  40. arXiv:2307.00968  [pdf, other]

    cs.LG cs.AI

    REAL: A Representative Error-Driven Approach for Active Learning

    Authors: Cheng Chen, Yong Wang, Lizi Liao, Yueguo Chen, Xiaoyong Du

    Abstract: Given a limited labeling budget, active learning (AL) aims to sample the most informative instances from an unlabeled pool to acquire labels for subsequent model training. To achieve this, AL typically measures the informativeness of unlabeled instances based on uncertainty and diversity. However, it does not consider erroneous instances with their neighborhood error density, which have great pote…

    Submitted 5 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted by ECML/PKDD 2023

  41. arXiv:2306.11528  [pdf, other]

    cs.CV

    TransRef: Multi-Scale Reference Embedding Transformer for Reference-Guided Image Inpainting

    Authors: Liang Liao, Taorong Liu, Delin Chen, Jing Xiao, Zheng Wang, Chia-Wen Lin, Shin'ichi Satoh

    Abstract: Image inpainting for completing complicated semantic environments and diverse hole patterns of corrupted images is challenging even for state-of-the-art learning-based inpainting methods trained on large-scale data. A reference image capturing the same scene of a corrupted image offers informative guidance for completing the corrupted image as it shares similar texture and structure priors to that…

    Submitted 20 June, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Under review

  42. arXiv:2306.03975  [pdf, other]

    cs.CL

    Revisiting Conversation Discourse for Dialogue Disentanglement

    Authors: Bobo Li, Hao Fei, Fei Li, Shengqiong Wu, Lizi Liao, Yinwei Wei, Tat-Seng Chua, Donghong Ji

    Abstract: Dialogue disentanglement aims to detach the chronologically ordered utterances into several independent sessions. Conversation utterances are essentially organized and described by the underlying discourse, and thus dialogue disentanglement requires the full understanding and harnessing of the intrinsic discourse attribute. In this paper, we propose enhancing dialogue disentanglement by taking ful…

    Submitted 10 June, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: under review

  43. arXiv:2305.13626  [pdf, other]

    cs.CL

    Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration

    Authors: Yang Deng, Lizi Liao, Liang Chen, Hongru Wang, Wenqiang Lei, Tat-Seng Chua

    Abstract: Conversational systems based on Large Language Models (LLMs), such as ChatGPT, show exceptional proficiency in context understanding and response generation. However, despite their impressive capabilities, they still possess limitations, such as providing randomly-guessed answers to ambiguous queries or failing to refuse users' requests, both of which are considered aspects of a conversational age…

    Submitted 14 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted by EMNLP 2023 Findings

  44. arXiv:2305.12726  [pdf, other]

    cs.CV cs.CL cs.MM

    Towards Explainable In-the-Wild Video Quality Assessment: A Database and a Language-Prompted Approach

    Authors: Haoning Wu, Erli Zhang, Liang Liao, Chaofeng Chen, Jingwen Hou, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: The proliferation of in-the-wild videos has greatly expanded the Video Quality Assessment (VQA) problem. Unlike early definitions that usually focus on limited distortion types, VQA on in-the-wild videos is especially challenging as it could be affected by complicated factors, including various distortions and diverse contents. Though subjective studies have collected overall quality scores for th…

    Submitted 3 August, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Proceedings of the 31st ACM International Conference on Multimedia (MM '23)

  45. arXiv:2305.06799  [pdf, other]

    cs.CV

    GCFAgg: Global and Cross-view Feature Aggregation for Multi-view Clustering

    Authors: Weiqing Yan, Yuanyang Zhang, Chenlei Lv, Chang Tang, Guanghui Yue, Liang Liao, Weisi Lin

    Abstract: Multi-view clustering can partition data samples into their categories by learning a consensus representation in an unsupervised way and has received more and more attention in recent years. However, most existing deep clustering methods learn consensus representation or view-specific representations from multiple views via view-wise aggregation, where they ignore the structure relationship of all sa…

    Submitted 11 May, 2023; originally announced May 2023.

  46. arXiv:2305.05105  [pdf, other]

    eess.SP cs.AI cs.LG

    TinyML Design Contest for Life-Threatening Ventricular Arrhythmia Detection

    Authors: Zhenge Jia, Dawei Li, Cong Liu, Liqi Liao, Xiaowei Xu, Lichuan **, Yiyu Shi

    Abstract: The first ACM/IEEE TinyML Design Contest (TDC) held at the 41st International Conference on Computer-Aided Design (ICCAD) in 2022 is a challenging, multi-month, research and development competition. TDC'22 focuses on real-world medical problems that require the innovation and implementation of artificial intelligence/machine learning (AI/ML) algorithms on implantable devices. The challenge problem…

    Submitted 26 August, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: The paper is about the first TinyML design contest for healthcare

  47. arXiv:2305.04049  [pdf, other]

    cs.CL

    Actively Discovering New Slots for Task-oriented Conversation

    Authors: Yuxia Wu, Tianhao Dai, Zhedong Zheng, Lizi Liao

    Abstract: Existing task-oriented conversational search systems heavily rely on domain ontologies with pre-defined slots and candidate value sets. In practical applications, these prerequisites are hard to meet, due to the emerging new user requirements and ever-changing scenarios. To mitigate these issues for better interaction performance, there are efforts working towards detecting out-of-vocabulary value…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: 11 pages, 5 figures, 3 tables

  48. arXiv:2305.00767  [pdf, other]

    cs.CV cs.LG

    RViDeformer: Efficient Raw Video Denoising Transformer with a Larger Benchmark Dataset

    Authors: Huanjing Yue, Cong Cao, Lei Liao, Jingyu Yang

    Abstract: In recent years, raw video denoising has garnered increased attention due to the consistency with the imaging process and well-studied noise modeling in the raw domain. However, two problems still hinder the denoising performance. Firstly, there is no large dataset with realistic motions for supervised raw video denoising, as capturing noisy and clean frames for real dynamic scenes is difficult. T…

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: 16 pages, 15 figures

  49. arXiv:2304.14672  [pdf, other]

    cs.CV

    Towards Robust Text-Prompted Semantic Criterion for In-the-Wild Video Quality Assessment

    Authors: Haoning Wu, Liang Liao, Annan Wang, Chaofeng Chen, Jingwen Hou, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: The proliferation of videos collected during in-the-wild natural settings has pushed the development of effective Video Quality Assessment (VQA) methodologies. Contemporary supervised opinion-driven VQA strategies predominantly hinge on training from expensive human annotations for quality scores, which limited the scale and distribution of VQA datasets and consequently led to unsatisfactory gener…

    Submitted 28 April, 2023; originally announced April 2023.

    Comments: 13 pages, 10 figures, under review

  50. arXiv:2302.13269  [pdf, other]

    cs.CV cs.MM

    Exploring Opinion-unaware Video Quality Assessment with Semantic Affinity Criterion

    Authors: Haoning Wu, Liang Liao, Jingwen Hou, Chaofeng Chen, Erli Zhang, Annan Wang, Wenxiu Sun, Qiong Yan, Weisi Lin

    Abstract: Recent learning-based video quality assessment (VQA) algorithms are expensive to implement due to the cost of data collection of human quality opinions, and are less robust across various scenarios due to the biases of these opinions. This motivates our exploration of opinion-unaware (a.k.a. zero-shot) VQA approaches. Existing approaches only consider low-level naturalness in spatial or temporal d…

    Submitted 26 February, 2023; originally announced February 2023.