Showing 1–50 of 229 results for author: Ma, Q

  1. arXiv:2406.17438  [pdf, other]

    cs.CV

    Implicit-Zoo: A Large-Scale Dataset of Neural Implicit Functions for 2D Images and 3D Scenes

    Authors: Qi Ma, Danda Pani Paudel, Ender Konukoglu, Luc Van Gool

    Abstract: Neural implicit functions have demonstrated significant importance in various areas such as computer vision, graphics. Their advantages include the ability to represent complex shapes and scenes with high fidelity, smooth interpolation capabilities, and continuous representations. Despite these benefits, the development and analysis of implicit functions have been limited by the lack of comprehens…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.16079  [pdf, other]

    cs.CL cs.AI

    EERPD: Leveraging Emotion and Emotion Regulation for Improving Personality Detection

    Authors: Zheng Li, Dawei Zhu, Qilong Ma, Weimin Xiong, Sujian Li

    Abstract: Personality is a fundamental construct in psychology, reflecting an individual's behavior, thinking, and emotional patterns. Previous researches have made some progress in personality detection, primarily by utilizing the whole text to predict personality. However, these studies generally tend to overlook psychological knowledge: they rarely apply the well-established correlations between emotion…

    Submitted 23 June, 2024; originally announced June 2024.

  3. arXiv:2406.15000  [pdf, other]

    cs.CL cs.AI

    Unveiling the Impact of Multi-Modal Interactions on User Engagement: A Comprehensive Evaluation in AI-driven Conversations

    Authors: Lichao Zhang, Jia Yu, Shuai Zhang, Long Li, Yangyang Zhong, Guanbao Liang, Yuming Yan, Qing Ma, Fangsheng Weng, Fayu Pan, Jing Li, Renjun Xu, Zhenzhong Lan

    Abstract: Large Language Models (LLMs) have significantly advanced user-bot interactions, enabling more complex and coherent dialogues. However, the prevalent text-only modality might not fully exploit the potential for effective user engagement. This paper explores the impact of multi-modal interactions, which incorporate images and audio alongside text, on user engagement in chatbot conversations. We cond…

    Submitted 21 June, 2024; originally announced June 2024.

  4. arXiv:2406.14880  [pdf, other]

    cs.LG cs.LO

    Pathformer: Recursive Path Query Encoding for Complex Logical Query Answering

    Authors: Chongzhi Zhang, Zhiping Peng, Junhao Zheng, Linghao Wang, Ruifeng Shi, Qianli Ma

    Abstract: Complex Logical Query Answering (CLQA) over incomplete knowledge graphs is a challenging task. Recently, Query Embedding (QE) methods are proposed to solve CLQA by performing multi-hop logical reasoning. However, most of them only consider historical query context information while ignoring future information, which leads to their failure to capture the complex dependencies behind the elements of…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE

  5. arXiv:2406.09701  [pdf, other]

    cs.SE

    Towards Effectively Detecting and Explaining Vulnerabilities Using Large Language Models

    Authors: Qiheng Mao, Zhenhao Li, Xing Hu, Kui Liu, Xin Xia, Jianling Sun

    Abstract: Software vulnerabilities pose significant risks to the security and integrity of software systems. Prior studies have proposed a series of approaches to vulnerability detection using deep learning or pre-trained models. However, there is still a lack of vulnerability's detailed explanation for understanding apart from detecting its occurrence. Recently, large language models (LLMs) have shown a re…

    Submitted 14 June, 2024; originally announced June 2024.

  6. arXiv:2406.07385  [pdf, other]

    cs.GT cs.CC

    Disrupting Bipartite Trading Networks: Matching for Revenue Maximization

    Authors: Luca D'Amico-Wong, Yannai A. Gonczarowski, Gary Qiurui Ma, David C. Parkes

    Abstract: We model the role of an online platform disrupting a market with unit-demand buyers and unit-supply sellers. Each seller can transact with a subset of the buyers whom she already knows, as well as with any additional buyers to whom she is introduced by the platform. Given these constraints on trade, prices and transactions are induced by a competitive equilibrium. The platform's revenue is proport…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted at the Twenty-Fifth ACM Conference on Economics and Computation (EC'24), 2024

  7. arXiv:2406.06391  [pdf, other]

    cs.LG cs.CL

    Towards Lifelong Learning of Large Language Models: A Survey

    Authors: Junhao Zheng, Shengjie Qiu, Chengming Shi, Qianli Ma

    Abstract: As the applications of large language models (LLMs) expand across diverse fields, the ability of these models to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods, relying on static datasets, are increasingly inadequate for coping with the dynamic nature of real-world information. Lifelong learning, also known as continual or incremental le…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 37 pages

  8. arXiv:2406.03625  [pdf, other]

    cs.CV cs.AI

    Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories

    Authors: Yan Zhang, Sergey Prokudin, Marko Mihajlovic, Qianli Ma, Siyu Tang

    Abstract: Understanding the dynamics of generic 3D scenes is fundamentally challenging in computer vision, essential in enhancing applications related to scene reconstruction, motion tracking, and avatar creation. In this work, we address the task as the problem of inferring dense, long-range motion of 3D points. By observing a set of point trajectories, we aim to learn an implicit motion field parameterize…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: cvpr24 post camera ready

  9. arXiv:2406.02377  [pdf, other]

    cs.IR cs.AI cs.CL

    XRec: Large Language Models for Explainable Recommendation

    Authors: Qiyao Ma, Xubin Ren, Chao Huang

    Abstract: Recommender systems help users navigate information overload by providing personalized recommendations aligned with their preferences. Collaborative Filtering (CF) is a widely adopted approach, but while advanced techniques like graph neural networks (GNNs) and self-supervised learning (SSL) have enhanced CF models for better user representations, they often lack the ability to provide explanation…

    Submitted 4 June, 2024; originally announced June 2024.

  10. arXiv:2405.20710  [pdf, other]

    cs.IR

    Information Maximization via Variational Autoencoders for Cross-Domain Recommendation

    Authors: Xuying Ning, Wujiang Xu, Xiaolei Liu, Mingming Ha, Qiongxu Ma, Youru Li, Linxun Chen, Yongfeng Zhang

    Abstract: Cross-Domain Sequential Recommendation (CDSR) methods aim to address the data sparsity and cold-start problems present in Single-Domain Sequential Recommendation (SDSR). Existing CDSR methods typically rely on overlapping users, designing complex cross-domain modules to capture users' latent interests that can propagate across different domains. However, their propagated informative information is…

    Submitted 31 May, 2024; originally announced May 2024.

  11. arXiv:2405.20336  [pdf, other]

    cs.CV cs.SD eess.AS

    RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text

    Authors: Jiaben Chen, Xin Yan, Yihang Chen, Siyuan Cen, Qinwei Ma, Haoyu Zhen, Kaizhi Qian, Lie Lu, Chuang Gan

    Abstract: In this work, we introduce a challenging task for simultaneously generating 3D holistic body motions and singing vocals directly from textual lyrics inputs, advancing beyond existing works that typically address these two modalities in isolation. To facilitate this, we first collect the RapVerse dataset, a large dataset containing synchronous rapping vocals, lyrics, and high-quality 3D holistic bo…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Project website: https://vis-www.cs.umass.edu/RapVerse

  12. arXiv:2405.15403  [pdf, other]

    cs.LG stat.ML

    Fine-Grained Dynamic Framework for Bias-Variance Joint Optimization on Data Missing Not at Random

    Authors: Mingming Ha, Xuewen Tao, Wenfang Lin, Qionxu Ma, Wujiang Xu, Linxun Chen

    Abstract: In most practical applications such as recommendation systems, display advertising, and so forth, the collected data often contains missing values and those missing values are generally missing-not-at-random, which deteriorates the prediction performance of models. Some existing estimators and regularizers attempt to achieve unbiased estimation to improve the predictive performance. However, varia…

    Submitted 24 May, 2024; originally announced May 2024.

  13. arXiv:2405.15124  [pdf, other]

    cs.LG cs.AI

    Scaling Law for Time Series Forecasting

    Authors: Jingzhe Shi, Qinwei Ma, Huan Ma, Lei Li

    Abstract: Scaling law that rewards large datasets, complex models and enhanced data granularity has been observed in various fields of deep learning. Yet, studies on time series forecasting have cast doubt on scaling behaviors of deep learning methods for time series forecasting: while more training data improves performance, more capable models do not always outperform less capable models, and longer input…

    Submitted 26 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 20 pages

  14. arXiv:2405.10160  [pdf, other]

    cs.CV cs.AI

    PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning

    Authors: Jiancheng Pan, Muyuan Ma, Qing Ma, Cong Bai, Shengyong Chen

    Abstract: Remote sensing image-text retrieval constitutes a foundational aspect of remote sensing interpretation tasks, facilitating the alignment of vision and language representations. This paper introduces a prior instruction representation (PIR) learning paradigm that draws on prior knowledge to instruct adaptive learning of vision and text representations. Based on PIR, a domain-adapted remote sensing…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  15. arXiv:2405.08328  [pdf, other]

    cs.NI

    Towards Multi-Task Generative-AI Edge Services with an Attention-based Diffusion DRL Approach

    Authors: Yaju Liu, Xi Lin, Siyuan Li, Gaolei Li, Qinghua Mao, Jianhua Li

    Abstract: As an emerging paradigm of content creation, AI-Generated Content (AIGC) has been widely adopted by a large number of edge end users. However, the requests for generated content from AIGC users have obvious diversity, and there remains a notable lack of research addressing the variance in user demands for AIGC services. This gap underscores a critical need for suitable AIGC service selection mecha…

    Submitted 14 May, 2024; originally announced May 2024.

  16. arXiv:2405.03435  [pdf, other]

    cond-mat.dis-nn cs.AI cs.LG

    A method for quantifying the generalization capabilities of generative models for solving Ising models

    Authors: Qunlong Ma, Zhi Ma, Ming Gao

    Abstract: For Ising models with complex energy landscapes, whether the ground state can be found by neural networks depends heavily on the Hamming distance between the training datasets and the ground state. Despite the fact that various recently proposed generative models have shown good performance in solving Ising models, there is no adequate discussion on how to quantify their generalization capabilitie…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures

    Journal ref: Mach. Learn.: Sci. Technol. 5 (2024) 025011

  17. arXiv:2404.18262  [pdf, other]

    cs.AI

    Generating Situated Reflection Triggers about Alternative Solution Paths: A Case Study of Generative AI for Computer-Supported Collaborative Learning

    Authors: Atharva Naik, Jessica Ruhan Yin, Anusha Kamath, Qianou Ma, Sherry Tongshuang Wu, Charles Murray, Christopher Bogart, Majd Sakr, Carolyn P. Rose

    Abstract: An advantage of Large Language Models (LLMs) is their contextualization capability - providing different responses based on student inputs like solution strategy or prior discussion, to potentially better engage students than standard feedback. We present a design and evaluation of a proof-of-concept LLM application to offer students dynamic and contextualized feedback. Specifically, we augment an…

    Submitted 28 April, 2024; originally announced April 2024.

  18. arXiv:2404.16831  [pdf, other]

    cs.CV

    The Third Monocular Depth Estimation Challenge

    Authors: Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, Yiping Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, et al. (16 additional authors not shown)

    Abstract: This paper discusses the results of the third edition of the Monocular Depth Estimation Challenge (MDEC). The challenge focuses on zero-shot generalization to the challenging SYNS-Patches dataset, featuring complex scenes in natural and indoor settings. As with the previous edition, methods can use any form of supervision, i.e. supervised or self-supervised. The challenge received a total of 19 su…

    Submitted 27 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: To appear in CVPRW2024

  19. arXiv:2404.12006  [pdf, other]

    cs.CL

    Variational Multi-Modal Hypergraph Attention Network for Multi-Modal Relation Extraction

    Authors: Qian Li, Cheng Ji, Shu Guo, Yong Zhao, Qianren Mao, Shangguang Wang, Yuntao Wei, Jianxin Li

    Abstract: Multi-modal relation extraction (MMRE) is a challenging task that aims to identify relations between entities in text leveraging image information. Existing methods are limited by their neglect of the multiple entity pairs in one sentence sharing very similar contextual information (ie, the same text and image), resulting in increased difficulty in the MMRE task. To address this limitation, we pro…

    Submitted 18 April, 2024; originally announced April 2024.

  20. arXiv:2404.08951  [pdf, other]

    cs.CV cs.LG

    Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

    Authors: Qinghe Ma, Jian Zhang, Lei Qi, Qian Yu, Yinghuan Shi, Yang Gao

    Abstract: Both limited annotation and domain shift are prevalent challenges in medical image segmentation. Traditional semi-supervised segmentation and unsupervised domain adaptation methods address one of these issues separately. However, the coexistence of limited annotation and domain shift is quite common, which motivates us to introduce a novel and challenging scenario: Mixed Domain Semi-supervised med…

    Submitted 13 April, 2024; originally announced April 2024.

  21. arXiv:2404.06225  [pdf, other]

    cond-mat.stat-mech cond-mat.dis-nn cs.LG

    Message Passing Variational Autoregressive Network for Solving Intractable Ising Models

    Authors: Qunlong Ma, Zhi Ma, Jinlong Xu, Hairui Zhang, Ming Gao

    Abstract: Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks. Learning a probability distribution of energy configuration or finding the ground states of a disordered, fully connected Ising model is essential for statistical mechanics and NP-hard problems. Despite tremen…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 18 pages, 14 figures

  22. arXiv:2404.01343  [pdf, other]

    cs.CL cs.AI

    CHOPS: CHat with custOmer Profile Systems for Customer Service with LLMs

    Authors: Jingzhe Shi, Jialuo Li, Qinwei Ma, Zaiwen Yang, Huan Ma, Lei Li

    Abstract: Businesses and software platforms are increasingly turning to Large Language Models (LLMs) such as GPT-3.5, GPT-4, GLM-3, and LLaMa-2 for chat assistance with file access or as reasoning agents for customer service. However, current LLM-based customer service models have limited integration with customer profiles and lack the operational capabilities necessary for effective service. Moreover, exis…

    Submitted 15 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: 14 pages

  23. arXiv:2403.19826   

    cs.AI

    Segmentation Re-thinking Uncertainty Estimation Metrics for Semantic Segmentation

    Authors: Qitian Ma, Shyam Nanda Rai, Carlo Masone, Tatiana Tommasi

    Abstract: In the domain of computer vision, semantic segmentation emerges as a fundamental application within machine learning, wherein individual pixels of an image are classified into distinct semantic categories. This task transcends traditional accuracy metrics by incorporating uncertainty quantification, a critical measure for assessing the reliability of each segmentation prediction. Such quantificati…

    Submitted 8 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Premature Submission: accidentally submitted before it was ready

  24. arXiv:2403.12865  [pdf, other]

    cs.RO

    PE-Planner: A Performance-Enhanced Quadrotor Motion Planner for Autonomous Flight in Complex and Dynamic Environments

    Authors: Jiaxin Qiu, Qingchen Liu, Jiahu Qin, Dewang Cheng, Yawei Tian, Qichao Ma

    Abstract: The role of a motion planner is pivotal in quadrotor applications, yet existing methods often struggle to adapt to complex environments, limiting their ability to achieve fast, safe, and robust flight. In this letter, we introduce a performance-enhanced quadrotor motion planner designed for autonomous flight in complex environments including dense obstacles, dynamic obstacles, and unknown disturba…

    Submitted 19 March, 2024; originally announced March 2024.

  25. arXiv:2403.03736  [pdf, other]

    cs.CV cs.LG eess.IV

    Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer

    Authors: Naifu Xue, Qi Mao, Zijian Wang, Yuan Zhang, Siwei Ma

    Abstract: Recent progress in generative compression technology has significantly improved the perceptual quality of compressed data. However, these advancements primarily focus on producing high-frequency details, often overlooking the ability of generative models to capture the prior distribution of image content, thus impeding further bitrate reduction in extreme compression scenarios (<0.05 bpp). Motivat…

    Submitted 6 March, 2024; originally announced March 2024.

  26. arXiv:2402.19371  [pdf]

    cs.CL cs.AI cs.IR

    OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models

    Authors: Jenish Maharjan, Anurag Garikipati, Navan Preet Singh, Leo Cyrus, Mayank Sharma, Madalina Ciobanu, Gina Barnes, Rahul Thapa, Qingqing Mao, Ritankar Das

    Abstract: LLMs have become increasingly capable at accomplishing a range of specialized-tasks and can be utilized to expand equitable access to medical knowledge. Most medical LLMs have involved extensive fine-tuning, leveraging specialized medical data and significant, thus costly, amounts of computational power. Many of the top performing LLMs are proprietary and their access is limited to very few resear…

    Submitted 29 February, 2024; originally announced February 2024.

  27. arXiv:2402.15680  [pdf, other]

    cs.LG

    Overcoming Pitfalls in Graph Contrastive Learning Evaluation: Toward Comprehensive Benchmarks

    Authors: Qian Ma, Hongliang Chi, Hengrui Zhang, Kay Liu, Zhiwei Zhang, Lu Cheng, Suhang Wang, Philip S. Yu, Yao Ma

    Abstract: The rise of self-supervised learning, which operates without the need for labeled data, has garnered significant interest within the graph learning community. This enthusiasm has led to the development of numerous Graph Contrastive Learning (GCL) techniques, all aiming to create a versatile graph encoder that leverages the wealth of unlabeled data for various downstream tasks. However, the current…

    Submitted 23 February, 2024; originally announced February 2024.

  28. arXiv:2402.15153  [pdf, other]

    cs.CL cs.LG

    Self-Adaptive Reconstruction with Contrastive Learning for Unsupervised Sentence Embeddings

    Authors: Junlong Liu, Xichen Shang, Huawen Feng, Junhao Zheng, Qianli Ma

    Abstract: Unsupervised sentence embeddings task aims to convert sentences to semantic vector representations. Most previous works directly use the sentence representations derived from pretrained language models. However, due to the token bias in pretrained language models, the models can not capture the fine-grained semantics in sentences, which leads to poor predictions. To address this issue, we propose…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 8 pages, 3 figures

  29. arXiv:2402.14609  [pdf, other]

    cs.LG cs.AI cs.CR cs.DB

    FedCQA: Answering Complex Queries on Multi-Source Knowledge Graphs via Federated Learning

    Authors: Qi Hu, Weifeng Jiang, Haoran Li, Zihao Wang, Jiaxin Bai, Qianren Mao, Yangqiu Song, Lixin Fan, Jianxin Li

    Abstract: Complex logical query answering is a challenging task in knowledge graphs (KGs) that has been widely studied. The ability to perform complex logical reasoning is essential and supports various graph reasoning-based downstream tasks, such as search engines. Recent approaches are proposed to represent KG entities and logical queries into embedding vectors and find answers to logical queries from the…

    Submitted 25 February, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

  30. arXiv:2402.14145  [pdf, other]

    stat.ML cs.LG stat.ME

    Multiply Robust Estimation for Local Distribution Shifts with Multiple Domains

    Authors: Steven Wilkins-Reeves, Xu Chen, Qi Ma, Christine Agarwal, Aude Hofleitner

    Abstract: Distribution shifts are ubiquitous in real-world machine learning applications, posing a challenge to the generalization of models trained on one data distribution to another. We focus on scenarios where data distributions vary across multiple segments of the entire population and only make local assumptions about the differences between training and test (deployment) distributions within each seg…

    Submitted 3 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: 9 pages, 4 figures

  31. arXiv:2402.12954  [pdf, other]

    cs.LG cs.AI cs.LO

    Conditional Logical Message Passing Transformer for Complex Query Answering

    Authors: Chongzhi Zhang, Zhiping Peng, Junhao Zheng, Qianli Ma

    Abstract: Complex Query Answering (CQA) over Knowledge Graphs (KGs) is a challenging task. Given that KGs are usually incomplete, neural models are proposed to solve CQA by performing multi-hop logical reasoning. However, most of them cannot perform well on both one-hop and multi-hop queries simultaneously. Recent work proposes a logical message passing mechanism based on the pre-trained neural link predict…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 13 pages, 3 figures, and 12 tables

  32. arXiv:2402.10447  [pdf, other]

    cs.CL cs.LG

    Incremental Sequence Labeling: A Tale of Two Shifts

    Authors: Shengjie Qiu, Junhao Zheng, Zhen Liu, Yicheng Luo, Qianli Ma

    Abstract: The incremental sequence labeling task involves continuously learning new classes over time while retaining knowledge of the previous ones. Our investigation identifies two significant semantic shifts: E2O (where the model mislabels an old entity as a non-entity) and O2E (where the model labels a non-entity or old entity as a new entity). Previous research has predominantly focused on addressing t…

    Submitted 27 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: accepted to ACL 2024

  33. arXiv:2402.10063  [pdf, other]

    cs.LG

    Balancing the Causal Effects in Class-Incremental Learning

    Authors: Junhao Zheng, Ruiyan Wang, Chongzhi Zhang, Huawen Feng, Qianli Ma

    Abstract: Class-Incremental Learning (CIL) is a practical and challenging problem for achieving general artificial intelligence. Recently, Pre-Trained Models (PTMs) have led to breakthroughs in both visual and natural language processing tasks. Despite recent studies showing PTMs' potential ability to learn sequentially, a plethora of work indicates the necessity of alleviating the catastrophic forgetting o…

    Submitted 15 February, 2024; originally announced February 2024.

  34. arXiv:2402.08526  [pdf, other]

    cs.LG cs.CL

    Can LLMs Learn New Concepts Incrementally without Forgetting?

    Authors: Junhao Zheng, Shengjie Qiu, Qianli Ma

    Abstract: Large Language Models (LLMs) have achieved remarkable success across various tasks, yet their ability to learn incrementally without forgetting remains underexplored. Incremental learning (IL) is crucial as it enables models to acquire new knowledge while retaining previously learned information, akin to human learning. Existing benchmarks for IL are insufficient due to data leakage issues and the…

    Submitted 18 June, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: 28 pages

  35. EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches

    Authors: Pengcheng An, Jiawen Zhu, Zibo Zhang, Yifei Yin, Qingyuan Ma, Che Yan, Linghao Du, Jian Zhao

    Abstract: Voice messages, by nature, prevent users from gauging the emotional tone without fully diving into the audio content. This hinders the shared emotional experience at the pre-retrieval stage. Research scarcely explored "Emotional Teasers"-pre-retrieval cues offering a glimpse into an awaiting message's emotional tone without disclosing its content. We introduce EmoWear, a smartwatch voice messaging…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: To appear at ACM CHI '24

  36. arXiv:2402.05952  [pdf, other]

    cs.LG cs.AI cs.CL

    Advancing Graph Representation Learning with Large Language Models: A Comprehensive Survey of Techniques

    Authors: Qiheng Mao, Zemin Liu, Chenghao Liu, Zhuo Li, Jianling Sun

    Abstract: The integration of Large Language Models (LLMs) with Graph Representation Learning (GRL) marks a significant evolution in analyzing complex data structures. This collaboration harnesses the sophisticated linguistic capabilities of LLMs to improve the contextual understanding and adaptability of graph models, thereby broadening the scope and potential of GRL. Despite a growing body of research dedi…

    Submitted 4 February, 2024; originally announced February 2024.

  37. arXiv:2402.05410  [pdf, other]

    cs.CV

    SpirDet: Towards Efficient, Accurate and Lightweight Infrared Small Target Detector

    Authors: Qianchen Mao, Qiang Li, Bingshu Wang, Yongjun Zhang, Tao Dai, C. L. Philip Chen

    Abstract: In recent years, the detection of infrared small targets using deep learning methods has garnered substantial attention due to notable advancements. To improve the detection capability of small targets, these methods commonly maintain a pathway that preserves high-resolution features of sparse and tiny targets. However, it can result in redundant and expensive computations. To tackle this challeng…

    Submitted 8 February, 2024; originally announced February 2024.

  38. arXiv:2402.03492  [pdf, other]

    eess.IV cs.CV

    Beyond Strong labels: Weakly-supervised Learning Based on Gaussian Pseudo Labels for The Segmentation of Ellipse-like Vascular Structures in Non-contrast CTs

    Authors: Qixiang Ma, Antoine Łucas, Huazhong Shu, Adrien Kaladji, Pascal Haigron

    Abstract: Deep-learning-based automated segmentation of vascular structures in preoperative CT scans contributes to computer-assisted diagnosis and intervention procedure in vascular diseases. While CT angiography (CTA) is the common standard, non-contrast CT imaging is significant as a contrast-risk-free alternative, avoiding complications associated with contrast agents. However, the challenges of labor-i…

    Submitted 10 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  39. arXiv:2402.02514  [pdf, other]

    eess.IV cs.CV cs.LG

    Deep Supervision by Gaussian Pseudo-label-based Morphological Attention for Abdominal Aorta Segmentation in Non-Contrast CTs

    Authors: Qixiang Ma, Antoine Lucas, Adrien Kaladji, Pascal Haigron

    Abstract: The segmentation of the abdominal aorta in non-contrast CT images is a non-trivial task for computer-assisted endovascular navigation, particularly in scenarios where contrast agents are unsuitable. While state-of-the-art deep learning segmentation models have been proposed recently for this task, they are trained on manually annotated strong labels. However, the inherent ambiguity in the boundary…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by 21st IEEE International Symposium on Biomedical Imaging

  40. arXiv:2402.02425  [pdf, other]

    cs.LG physics.flu-dyn

    DeepLag: Discovering Deep Lagrangian Dynamics for Intuitive Fluid Prediction

    Authors: Qilong Ma, Haixu Wu, Lanxiang Xing, Shangchen Miao, Mingsheng Long

    Abstract: Accurately predicting the future fluid is vital to extensive areas such as meteorology, oceanology, and aerodynamics. However, since the fluid is usually observed from the Eulerian perspective, its moving and intricate dynamics are seriously obscured and confounded in static grids, bringing thorny challenges to the prediction. This paper introduces a new Lagrangian-Eulerian combined paradigm to ta…

    Submitted 5 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  41. arXiv:2402.00262  [pdf]

    cs.AI

    Computational Experiments Meet Large Language Model Based Agents: A Survey and Perspective

    Authors: Qun Ma, Xiao Xue, Deyu Zhou, Xiangning Yu, Donghua Liu, Xuwen Zhang, Zihan Zhao, Yifan Shen, Peilin Ji, Juanjuan Li, Gang Wang, Wanpeng Ma

    Abstract: Computational experiments have emerged as a valuable method for studying complex systems, involving the algorithmization of counterfactuals. However, accurately representing real social systems in Agent-based Modeling (ABM) is challenging due to the diverse and intricate characteristics of humans, including bounded rationality and heterogeneity. To address this limitation, the integration of Large…

    Submitted 31 January, 2024; originally announced February 2024.

  42. arXiv:2401.17812  [pdf, other]

    cs.NI cs.AI

    Deterministic Computing Power Networking: Architecture, Technologies and Prospects

    Authors: Qingmin Jia, Yujiao Hu, Xiaomao Zhou, Qianpiao Ma, Kai Guo, Huayu Zhang, Renchao Xie, Tao Huang, Yunjie Liu

    Abstract: With the development of new Internet services such as computation-intensive and delay-sensitive tasks, the traditional "Best Effort" network transmission mode has been greatly challenged. The network system is urgently required to provide end-to-end transmission determinacy and computing determinacy for new applications to ensure the safe and efficient operation of services. Based on the research…

    Submitted 31 January, 2024; originally announced January 2024.

  43. arXiv:2401.09181  [pdf, other]

    cs.LG

    Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer

    Authors: Junhao Zheng, Qianli Ma, Zhen Liu, Binquan Wu, Huawen Feng

    Abstract: Multimodal Continual Instruction Tuning (MCIT) enables Multimodal Large Language Models (MLLMs) to meet continuously emerging requirements without expensive retraining. MCIT faces two major obstacles: catastrophic forgetting (where old knowledge is forgotten) and negative forward transfer (where the performance of future tasks is degraded). Although existing methods have greatly alleviated catastr…

    Submitted 26 June, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  44. arXiv:2401.05507  [pdf, other]

    cs.CL cs.AI

    InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

    Authors: Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

    Abstract: In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks. These tasks require agents to solve complex tasks end-to-end by interacting with an execution environment. This benchmark contains DAEval, a dataset consisting of 257 data analysis questions derived from 52 CSV files, and an agent framework which incorpora…

    Submitted 11 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: 27 pages, 7 figures, work in progress

  45. arXiv:2401.04507  [pdf, other]

    cs.CL cs.AI

    TechGPT-2.0: A large language model project to solve the task of knowledge graph construction

    Authors: Jiaqi Wang, Yuying Chang, Zhong Li, Ning An, Qi Ma, Lei Hei, Haibo Luo, Yifei Lu, Feiliang Ren

    Abstract: Large language models have exhibited robust performance across diverse natural language processing tasks. This report introduces TechGPT-2.0, a project designed to enhance the capabilities of large language models specifically in knowledge graph construction tasks, including named entity recognition (NER) and relationship triple extraction (RTE) tasks in NLP applications. Additionally, it serves a…

    Submitted 9 January, 2024; originally announced January 2024.

  46. arXiv:2312.15622  [pdf, other]

    cs.CV cs.AI cs.MM

    Scalable Face Image Coding via StyleGAN Prior: Towards Compression for Human-Machine Collaborative Vision

    Authors: Qi Mao, Chongyu Wang, Meng Wang, Shiqi Wang, Ruijie Chen, Libiao Jin, Siwei Ma

    Abstract: The accelerated proliferation of visual content and the rapid development of machine vision technologies bring significant challenges in delivering visual data on a gigantic scale, which shall be effectively represented to satisfy both human and machine requirements. In this work, we investigate how hierarchical representations derived from the advanced generative prior facilitate constructing an…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE TIP

  47. arXiv:2312.11396  [pdf, other]

    cs.CV cs.AI

    MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

    Authors: Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou

    Abstract: Recent diffusion-based image editing approaches have exhibited impressive editing capabilities in images with simple compositions. However, localized editing in complex scenarios has not been well-studied in the literature, despite its growing real-world demands. Existing mask-based inpainting methods fall short of retaining the underlying structure within the edit region. Meanwhile, mask-free att…

    Submitted 21 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: For the project page, see https://mag-edit.github.io/

  48. arXiv:2312.07887  [pdf, other]

    cs.CL cs.LG

    Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models

    Authors: Junhao Zheng, Shengjie Qiu, Qianli Ma

    Abstract: Incremental Learning (IL) has been a long-standing problem in both vision and Natural Language Processing (NLP) communities. In recent years, as Pre-trained Language Models (PLMs) have achieved remarkable progress in various NLP downstream tasks, utilizing PLMs as backbones has become a common practice in recent research of IL in NLP. Most assume that catastrophic forgetting is the biggest obstacl…

    Submitted 27 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted to ACL 2024 (main conference)

  49. arXiv:2312.07248  [pdf, ps, other]

    cs.LG cs.AI

    Multi-Granularity Framework for Unsupervised Representation Learning of Time Series

    Authors: Chengyang Ye, Qiang Ma

    Abstract: Representation learning plays a critical role in the analysis of time series data and has high practical value across a wide range of applications, including trend analysis, time series data retrieval and forecasting. In practice, data confusion is a significant issue as it can considerably impact the effectiveness and accuracy of data analysis, machine learning models and decision-making processe…

    Submitted 12 December, 2023; originally announced December 2023.

  50. arXiv:2312.01919  [pdf, other]

    cs.CV

    COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction

    Authors: Qihang Ma, Xin Tan, Yanyun Qu, Lizhuang Ma, Zhizhong Zhang, Yuan Xie

    Abstract: The autonomous driving community has shown significant interest in 3D occupancy prediction, driven by its exceptional geometric perception and general object recognition capabilities. To achieve this, current works try to construct a Tri-Perspective View (TPV) or Occupancy (OCC) representation extending from the Bird-Eye-View perception. However, compressed views like TPV representation lose 3D ge…

    Submitted 11 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: CVPR 2024. Code is available at https://github.com/NotACracker/COTR