Showing 1–50 of 419 results for author: Zhu, M

Searching in archive cs.
  1. arXiv:2406.19966  [pdf, other]

    cs.CL

    Simulating Financial Market via Large Language Model based Agents

    Authors: Shen Gao, Yuntao Wen, Minghang Zhu, Jianing Wei, Yuhan Cheng, Qunzi Zhang, Shuo Shang

    Abstract: Most economic theories typically assume that financial market participants are fully rational individuals and use mathematical models to simulate human behavior in financial markets. However, human behavior is often not entirely rational and is challenging to predict accurately with mathematical models. In this paper, we propose Agent-based Simulated Financial M…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.19693  [pdf, other]

    cs.RO cs.CV

    MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

    Authors: Jinming Li, Yichen Zhu, Zhiyuan Xu, Jindong Gu, Minjie Zhu, Xin Liu, Ning Liu, Yaxin Peng, Feifei Feng, Jian Tang

    Abstract: It is fundamentally challenging for robots to serve as useful assistants in human environments because this requires addressing a spectrum of sub-problems across robotics, including perception, language understanding, reasoning, and planning. The recent advancements in Multimodal Large Language Models (MLLMs) have demonstrated their exceptional abilities in solving complex mathematical problems, m…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2406.19672  [pdf, other]

    cs.CV

    Beyond First-Order: A Multi-Scale Approach to Finger Knuckle Print Biometrics

    Authors: Chengrui Gao, Ziyuan Yang, Andrew Beng Jin Teoh, Min Zhu

    Abstract: Recently, finger knuckle prints (FKPs) have gained attention due to their rich textural patterns, positioning them as a promising biometric for identity recognition. Prior FKP recognition methods predominantly leverage first-order feature descriptors, which capture intricate texture details but fail to account for structural information. Emerging research, however, indicates that second-order text…

    Submitted 28 June, 2024; originally announced June 2024.

  4. arXiv:2406.18691  [pdf, other]

    cs.CV

    Geometric Features Enhanced Human-Object Interaction Detection

    Authors: Manli Zhu, Edmond S. L. Ho, Shuang Chen, Longzhi Yang, Hubert P. H. Shum

    Abstract: Cameras are essential vision instruments to capture images for pattern detection and measurement. Human-object interaction (HOI) detection is one of the most popular pattern detection approaches for captured human-centric visual scenes. Recently, Transformer-based models have become the dominant approach for HOI detection due to their advanced network architectures and thus promising results. Howe…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted to IEEE TIM

  5. arXiv:2406.18518  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

    Authors: Zuxin Liu, Thai Hoang, Jianguo Zhang, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, Caiming Xiong

    Abstract: The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scal…

    Submitted 26 June, 2024; originally announced June 2024.

  6. arXiv:2406.18017  [pdf, other]

    cs.IT cs.ET

    Dependence Analysis and Structured Construction for Batched Sparse Code

    Authors: Jiaxin Qing, Xiaohong Cai, Yijun Fan, Mingyang Zhu, Raymond W. Yeung

    Abstract: In coding theory, codes are usually designed with a certain level of randomness to facilitate analysis and accommodate different channel conditions. However, the resulting random code constructed can be suboptimal in practical implementations. Represented by a bipartite graph, the Batched Sparse Code (BATS Code) is a randomly constructed erasure code that utilizes network coding to achieve near-op…

    Submitted 25 June, 2024; originally announced June 2024.

  7. arXiv:2406.16978  [pdf, other]

    cs.LG cs.AI cs.RO

    MetaFollower: Adaptable Personalized Autonomous Car Following

    Authors: Xianda Chen, Kehua Chen, Meixin Zhu, Hao Yang, Shaojie Shen, Xuesong Wang, Yinhai Wang

    Abstract: Car-following (CF) modeling, a fundamental component in microscopic traffic simulation, has attracted increasing interest of researchers in the past decades. In this study, we propose an adaptable personalized car-following framework -MetaFollower, by leveraging the power of meta-learning. Specifically, we first utilize Model-Agnostic Meta-Learning (MAML) to extract common driving knowledge from v…

    Submitted 23 June, 2024; originally announced June 2024.

  8. arXiv:2406.16531  [pdf, other]

    cs.CV

    GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

    Authors: Yirui Chen, Xudong Huang, Quan Zhang, Wei Li, Mingjian Zhu, Qiangyu Yan, Simiao Li, Hanting Chen, Hailin Hu, Jie Yang, Wei Liu, Jie Hu

    Abstract: The extraordinary ability of generative models emerges as a new trend in image editing and generating realistic images, posing a serious threat to the trustworthiness of multimedia data and driving the research of image manipulation detection and location(IMDL). However, the lack of a large-scale data foundation makes IMDL task unattainable. In this paper, a local manipulation pipeline is designed…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Code page: https://github.com/chenyirui/GIM

  9. arXiv:2406.14862  [pdf, other]

    cs.LG cs.CL cs.CV

    LatentExplainer: Explaining Latent Representations in Deep Generative Models with Multi-modal Foundation Models

    Authors: Mengdan Zhu, Raasikh Kanjiani, Jiahui Lu, Andrew Choi, Qirui Ye, Liang Zhao

    Abstract: Deep generative models like VAEs and diffusion models have advanced various generation tasks by leveraging latent variables to learn data distributions and generate high-quality samples. Despite the field of explainable AI making strides in interpreting machine learning models, understanding latent variables in generative models remains challenging. This paper introduces LatentExplainer, a framewo…

    Submitted 28 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

  10. arXiv:2406.12577  [pdf, other]

    cs.CV

    Cephalometric Landmark Detection across Ages with Prototypical Network

    Authors: Han Wu, Chong Wang, Lanzhuju Mei, Tong Yang, Min Zhu, Dinggang Shen, Zhiming Cui

    Abstract: Automated cephalometric landmark detection is crucial in real-world orthodontic diagnosis. Current studies mainly focus on only adult subjects, neglecting the clinically crucial scenario presented by adolescents whose landmarks often exhibit significantly different appearances compared to adults. Hence, an open question arises about how to develop a unified and effective detection algorithm across…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: MICCAI 2024

  11. arXiv:2406.11689  [pdf, other]

    cs.CV

    Lightweight Model Pre-training via Language Guided Knowledge Distillation

    Authors: Mingsheng Li, Lin Zhang, Mingzhen Zhu, Zilong Huang, Gang Yu, Jiayuan Fan, Tao Chen

    Abstract: This paper studies the problem of pre-training for small models, which is essential for many mobile devices. Current state-of-the-art methods on this problem transfer the representational knowledge of a large network (as a Teacher) into a smaller model (as a Student) using self-supervised distillation, improving the performance of the small model on downstream tasks. However, existing approaches a…

    Submitted 17 June, 2024; originally announced June 2024.

  12. arXiv:2406.10540  [pdf, other]

    cs.AI cs.NE cs.RO

    Generating and Evolving Reward Functions for Highway Driving with Large Language Models

    Authors: Xu Han, Qiannan Yang, Xianda Chen, Xiaowen Chu, Meixin Zhu

    Abstract: Reinforcement Learning (RL) plays a crucial role in advancing autonomous driving technologies by maximizing reward functions to achieve the optimal policy. However, crafting these reward functions has been a complex, manual process in many practices. To reduce this complexity, we introduce a novel framework that integrates Large Language Models (LLMs) with RL to improve reward function design in a…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures

  13. arXiv:2406.10290  [pdf, other]

    cs.CL cs.AI cs.LG

    MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases

    Authors: Rithesh Murthy, Liangwei Yang, Juntao Tan, Tulika Manoj Awalgaonkar, Yilun Zhou, Shelby Heinecke, Sachin Desai, Jason Wu, Ran Xu, Sarah Tan, Jianguo Zhang, Zhiwei Liu, Shirley Kokane, Zuxin Liu, Ming Zhu, Huan Wang, Caiming Xiong, Silvio Savarese

    Abstract: The deployment of Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices has gained significant attention due to the benefits of enhanced privacy, stability, and personalization. However, the hardware constraints of mobile devices necessitate the use of models with fewer parameters and model compression techniques like quantization. Currently, there is limited understand…

    Submitted 12 June, 2024; originally announced June 2024.

  14. arXiv:2406.06558  [pdf, other]

    cs.CL cs.AI

    Enhancing Text Authenticity: A Novel Hybrid Approach for AI-Generated Text Detection

    Authors: Ye Zhang, Qian Leng, Mengran Zhu, Rui Ding, Yue Wu, Jintong Song, Yulu Gong

    Abstract: The rapid advancement of Large Language Models (LLMs) has ushered in an era where AI-generated text is increasingly indistinguishable from human-generated content. Detecting AI-generated text has become imperative to combat misinformation, ensure content authenticity, and safeguard against malicious uses of AI. In this paper, we propose a novel hybrid approach that combines traditional TF-IDF tech…

    Submitted 1 June, 2024; originally announced June 2024.

  15. arXiv:2406.04501  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models

    Authors: Max Zhu, Adrián Bazaga, Pietro Liò

    Abstract: Learning computational fluid dynamics (CFD) traditionally relies on computationally intensive simulations of the Navier-Stokes equations. Recently, large language models (LLMs) have shown remarkable pattern recognition and reasoning abilities in natural language processing (NLP) and computer vision (CV). However, these models struggle with the complex geometries inherent in fluid dynamics. We intr…

    Submitted 6 June, 2024; originally announced June 2024.

  16. arXiv:2406.03733  [pdf, other]

    cs.LG cs.AI

    Credit Card Fraud Detection Using Advanced Transformer Model

    Authors: Chang Yu, Yongshun Xu, Jin Cao, Ye Zhang, Yinxin Jin, Mengran Zhu

    Abstract: With the proliferation of various online and mobile payment systems, credit card fraud has emerged as a significant threat to financial security. This study focuses on innovative applications of the latest Transformer models for more robust and precise fraud detection. To ensure the reliability of the data, we meticulously processed the data sources, balancing the dataset to address the issue of d…

    Submitted 21 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: This paper have been received by https://ieee-metacom.org/

  17. arXiv:2406.02435  [pdf, other]

    cs.CV

    Generative Active Learning for Long-tailed Instance Segmentation

    Authors: Muzhi Zhu, Chengxiang Fan, Hao Chen, Yang Liu, Weian Mao, Xiaogang Xu, Chunhua Shen

    Abstract: Recently, large-scale language-image generative models have gained widespread attention and many works have utilized generated data from these models to further enhance the performance of perception tasks. However, not all generated data can positively impact downstream models, and these methods do not thoroughly explore how to better select and utilize generated data. On the other hand, there is…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted by ICML 2024

  18. arXiv:2406.00416  [pdf, other]

    stat.ML cs.LG eess.SP

    Representation and De-interleaving of Mixtures of Hidden Markov Processes

    Authors: Jiadi Bao, Mengtao Zhu, Yunjie Li, Shafei Wang

    Abstract: De-interleaving of the mixtures of Hidden Markov Processes (HMPs) generally depends on its representation model. Existing representation models consider Markov chain mixtures rather than hidden Markov, resulting in the lack of robustness to non-ideal situations such as observation noise or missing observations. Besides, de-interleaving methods utilize a search-based strategy, which is time-consumi…

    Submitted 1 June, 2024; originally announced June 2024.

    Comments: 13 pages, 9 figures, submitted to IEEE transactions on Signal Processing

  19. arXiv:2406.00012  [pdf, other]

    cs.IR cs.AI

    Extracting Essential and Disentangled Knowledge for Recommendation Enhancement

    Authors: Kounianhua Du, Jizheng Chen, Jianghao Lin, Menghui Zhu, Bo Chen, Shuai Li, Ruiming Tang

    Abstract: Recommender models play a vital role in various industrial scenarios, while often faced with the catastrophic forgetting problem caused by the fast shifting data distribution, e.g., the evolving user interests, click signals fluctuation during sales promotions, etc. To alleviate this problem, a common approach is to reuse knowledge from the historical data. However, preserving the vast and fast-ac…

    Submitted 20 May, 2024; originally announced June 2024.

  20. arXiv:2405.19740  [pdf, other]

    cs.CL cs.AI cs.CY

    PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations

    Authors: Jiatong Li, Renjun Hu, Kunzhe Huang, Yan Zhuang, Qi Liu, Mengxiao Zhu, Xing Shi, Wei Lin

    Abstract: Expert-designed close-ended benchmarks serve as vital tools in assessing the knowledge capacity of large language models (LLMs). Despite their widespread use, concerns have mounted regarding their reliability due to limited test scenarios and an unavoidable risk of data contamination. To rectify this, we present PertEval, a toolkit devised for in-depth probing of LLMs' knowledge capacity through k…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 23 pages, 12 figures, 10 tables

  21. arXiv:2405.18610  [pdf, other]

    cs.LG cs.AI

    DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime

    Authors: Zhiyao Luo, Mingcheng Zhu, Fenglin Liu, Jiali Li, Yangchen Pan, Jiandong Zhou, Tingting Zhu

    Abstract: Reinforcement learning (RL) has garnered increasing recognition for its potential to optimise dynamic treatment regimes (DTRs) in personalised medicine, particularly for drug dosage prescriptions and medication recommendations. However, a significant challenge persists: the absence of a unified framework for simulating diverse healthcare scenarios and a comprehensive analysis to benchmark the effe…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 13 pages for main content

  22. arXiv:2405.17278  [pdf, ps, other]

    cs.RO cs.CV

    EF-Calib: Spatiotemporal Calibration of Event- and Frame-Based Cameras Using Continuous-Time Trajectories

    Authors: Shaoan Wang, Zhanhua Xin, Yaoqing Hu, Dongyue Li, Mingzhu Zhu, Junzhi Yu

    Abstract: Event camera, a bio-inspired asynchronous triggered camera, offers promising prospects for fusion with frame-based cameras owing to its low latency and high dynamic range. However, calibrating stereo vision systems that incorporate both event and frame-based cameras remains a significant challenge. In this letter, we present EF-Calib, a spatiotemporal calibration framework for event- and frame-bas…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  23. arXiv:2405.16498  [pdf, other]

    cs.LG

    On Sequential Loss Approximation for Continual Learning

    Authors: Menghao Waiyan William Zhu, Ercan Engin Kuruoğlu

    Abstract: We introduce for continual learning Autodiff Quadratic Consolidation (AQC), which approximates the previous loss function with a quadratic function, and Neural Consolidation (NC), which approximates the previous loss function with a neural network. Although they are not scalable to large neural networks, they can be used with a fixed pre-trained feature extractor. We empirically study these method…

    Submitted 26 May, 2024; originally announced May 2024.

  24. arXiv:2405.16134  [pdf, other]

    cs.CV

    Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack

    Authors: Mingli Zhu, Siyuan Liang, Baoyuan Wu

    Abstract: Deep neural networks face persistent challenges in defending against backdoor attacks, leading to an ongoing battle between attacks and defenses. While existing backdoor defense strategies have shown promising performance on reducing attack success rates, can we confidently claim that the backdoor threat has truly been eliminated from the model? To address it, we re-investigate the characteristics…

    Submitted 30 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  25. arXiv:2405.15842  [pdf, other]

    cs.SE cs.LG

    Model Cascading for Code: Reducing Inference Costs with Model Cascading for LLM Based Code Generation

    Authors: Boyuan Chen, Mingzhi Zhu, Brendan Dolan-Gavitt, Muhammad Shafique, Siddharth Garg

    Abstract: The rapid development of large language models (LLMs) has led to significant advancements in code completion tasks. While larger models have higher accuracy, they also cost much more to run. Meanwhile, model cascading has been proven effective to conserve computational resources while enhancing accuracy in LLMs on natural language generation tasks. It generates output with the smallest model in a…

    Submitted 24 May, 2024; originally announced May 2024.

  26. arXiv:2405.13923  [pdf, other]

    cs.CL

    Why Not Transform Chat Large Language Models to Non-English?

    Authors: Xiang Geng, Ming Zhu, Jiahuan Li, Zhejian Lai, Wei Zou, Shuaijie She, Jiaxin Guo, Xiaofeng Zhao, Yinglu Li, Yuang Li, Chang Su, Yanqing Zhao, Xinglin Lyu, Min Zhang, Jiajun Chen, Hao Yang, Shujian Huang

    Abstract: The scarcity of non-English data limits the development of non-English large language models (LLMs). Transforming English-centric LLMs to non-English has been identified as an effective and resource-efficient method. Previous works start from base LLMs and perform knowledge distillation (KD) with data generated by stronger LLMs, e.g. GPT-4. Compared to base LLMs, chat LLMs are further optimized fo…

    Submitted 31 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  27. arXiv:2405.13516  [pdf, other]

    cs.CL cs.LG

    LIRE: listwise reward enhancement for preference alignment

    Authors: Mingye Zhu, Yi Liu, Lei Zhang, Junbo Guo, Zhendong Mao

    Abstract: Recently, tremendous strides have been made to align the generation of Large Language Models (LLMs) with human values to mitigate toxic or unhelpful content. Leveraging Reinforcement Learning from Human Feedback (RLHF) proves effective and is widely adopted by researchers. However, implementing RLHF is complex, and its sensitivity to hyperparameters renders achieving stable performance and scalabi…

    Submitted 4 June, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024 Findings

  28. arXiv:2405.10185  [pdf, other]

    cs.CV

    DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

    Authors: Chengxiang Fan, Muzhi Zhu, Hao Chen, Yang Liu, Weijia Wu, Huaqi Zhang, Chunhua Shen

    Abstract: Instance segmentation is data-hungry, and as model capacity increases, data scale becomes crucial for improving the accuracy. Most instance segmentation datasets today require costly manual annotation, limiting their data scale. Models trained on such data are prone to overfitting on the training set, especially for those rare categories. While recent works have delved into exploiting generative m…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024, codes are available at https://github.com/aim-uofa/DiverGen

  29. arXiv:2405.07827  [pdf, other]

    cs.MM cs.AI cs.CV

    Automatic Recognition of Food Ingestion Environment from the AIM-2 Wearable Sensor

    Authors: Yuning Huang, Mohamed Abul Hassan, Jiangpeng He, Janine Higgins, Megan McCrory, Heather Eicher-Miller, Graham Thomas, Edward O Sazonov, Fengqing Maggie Zhu

    Abstract: Detecting an ingestion environment is an important aspect of monitoring dietary intake. It provides insightful information for dietary assessment. However, it is a challenging problem where human-based reviewing can be tedious, and algorithm-based review suffers from data imbalance and perceptual aliasing problems. To address these issues, we propose a neural network-based method with a two-stage…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: Accepted at CVPRw 2024

  30. arXiv:2405.03199  [pdf, other]

    cs.LG

    Boosting MLPs with a Coarsening Strategy for Long-Term Time Series Forecasting

    Authors: Nannan Bian, Minhong Zhu, Li Chen, Weiran Cai

    Abstract: Deep learning methods have been exerting their strengths in long-term time series forecasting. However, they often struggle to strike a balance between expressive power and computational efficiency. Resorting to multi-layer perceptrons (MLPs) provides a compromising solution, yet they suffer from two critical problems caused by the intrinsic point-wise mapping mode, in terms of deficient contextua…

    Submitted 20 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  31. arXiv:2405.02791  [pdf, other]

    cs.CV cs.AI

    Efficient Text-driven Motion Generation via Latent Consistency Training

    Authors: Mengxian Hu, Minghao Zhu, Xun Zhou, Qingqing Yan, Shu Li, Chengju Liu, Qijun Chen

    Abstract: Motion diffusion models excel at text-driven motion generation but struggle with real-time inference since motion sequences are time-axis redundant and solving reverse diffusion trajectory involves tens or hundreds of sequential iterations. In this paper, we propose a Motion Latent Consistency Training (MLCT) framework, which allows for large-scale skip sampling of compact motion latent representa…

    Submitted 25 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

  32. arXiv:2405.00435  [pdf, other]

    cs.HC

    CultiVerse: Towards Cross-Cultural Understanding for Paintings with Large Language Model

    Authors: Wei Zhang, Wong Kam-Kwai, Biying Xu, Yiwen Ren, Yuhuai Li, Minfeng Zhu, Yingchaojie Feng, Wei Chen

    Abstract: The integration of new technology with cultural studies enhances our understanding of cultural heritage but often struggles to connect with diverse audiences. It is challenging to align personal interpretations with the intended meanings across different cultures. Our study investigates the important factors in appreciating art from a cross-cultural perspective. We explore the application of Large…

    Submitted 1 May, 2024; originally announced May 2024.

  33. arXiv:2405.00026  [pdf]

    cs.CE cs.AI

    Enhancing Credit Card Fraud Detection A Neural Network and SMOTE Integrated Approach

    Authors: Mengran Zhu, Ye Zhang, Yulu Gong, Changxin Xu, Yafei Xiang

    Abstract: Credit card fraud detection is a critical challenge in the financial sector, demanding sophisticated approaches to accurately identify fraudulent transactions. This research proposes an innovative methodology combining Neural Networks (NN) and Synthetic Minority Over-sampling Technique (SMOTE) to enhance the detection performance. The study addresses the inherent imbalance in credit card transact…

    Submitted 26 February, 2024; originally announced May 2024.

  34. arXiv:2404.18304  [pdf, other]

    cs.IR cs.AI

    Retrieval-Oriented Knowledge for Click-Through Rate Prediction

    Authors: Huanshuo Liu, Bo Chen, Menghui Zhu, Jianghao Lin, Jiarui Qin, Yang Yang, Hao Zhang, Ruiming Tang

    Abstract: Click-through rate (CTR) prediction plays an important role in personalized recommendations. Recently, sample-level retrieval-based models (e.g., RIM) have achieved remarkable performance by retrieving and aggregating relevant samples. However, their inefficiency at the inference stage makes them impractical for industrial applications. To overcome this issue, this paper proposes a universal plug-…

    Submitted 28 April, 2024; originally announced April 2024.

  35. arXiv:2404.16152  [pdf, ps, other]

    cs.IT eess.SP

    Rethinking Grant-Free Protocol in mMTC

    Authors: Minhao Zhu, Yifei Sun, Lizhao You, Zhaorui Wang, Ya-Feng Liu, Shuguang Cui

    Abstract: This paper revisits the identity detection problem under the current grant-free protocol in massive machine-type communications (mMTC) by asking the following question: for stable identity detection performance, is it enough to permit active devices to transmit preambles without any handshaking with the base station (BS)? Specifically, in the current grant-free protocol, the BS blindly allocates a…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE for possible publication

  36. arXiv:2404.16023  [pdf, other]

    stat.AP cs.LG

    Learning Car-Following Behaviors Using Bayesian Matrix Normal Mixture Regression

    Authors: Chengyuan Zhang, Kehua Chen, Meixin Zhu, Hai Yang, Lijun Sun

    Abstract: Learning and understanding car-following (CF) behaviors are crucial for microscopic traffic simulation. Traditional CF models, though simple, often lack generalization capabilities, while many data-driven methods, despite their robustness, operate as "black boxes" with limited interpretability. To bridge this gap, this work introduces a Bayesian Matrix Normal Mixture Regression (MNMR) model that s…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 6 pages, Accepted by the 35th IEEE Intelligent Vehicles Symposium

  37. arXiv:2404.12242  [pdf, other]

    cs.CL

    CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

    Authors: Mengna Zhu, Zijie Xu, Kaisheng Zeng, Kaiming Xiao, Mao Wang, Wenjun Ke, Hongbin Huang

    Abstract: Extracting structured event knowledge, including event triggers and corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance. However, event extraction in the military field faces the data scarcity problem, which impedes the research of event extraction models in this domain. To alleviate this problem, we propose CMNEE,…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 13 pages, 7 figures, accepted to LREC-COLING 2024

  38. arXiv:2404.11943  [pdf, other]

    cs.HC

    AgentCoord: Visually Exploring Coordination Strategy for LLM-based Multi-Agent Collaboration

    Authors: Bo Pan, Jiaying Lu, Ke Wang, Li Zheng, Zhen Wen, Yingchaojie Feng, Minfeng Zhu, Wei Chen

    Abstract: The potential of automatic task-solving through Large Language Model (LLM)-based multi-agent collaboration has recently garnered widespread attention from both the research community and industry. While utilizing natural language to coordinate multiple agents presents a promising avenue for democratizing agent technology for general users, designing coordination strategies remains challenging with…

    Submitted 18 April, 2024; originally announced April 2024.

  39. arXiv:2404.10985  [pdf, ps, other]

    cs.CV stat.ML

    Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images

    Authors: Junbiao Pang, Zailin Dong, Jiaxin Deng, Mengyuan Zhu, Yunwei Zhang

    Abstract: Parsing Computer-Aided Design (CAD) drawings is a fundamental step for CAD revision, semantic-based management, and the generation of 3D prototypes in both the architecture and engineering industries. Labeling symbols from a CAD drawing is a challenging yet notorious task from a practical point of view. In this work, we propose to label and spot symbols from CAD images that are converted from CAD…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 10 pages, 10 figures, 6 tables

  40. arXiv:2404.09322  [pdf]

    cs.DC cs.AI

    The intelligent prediction and assessment of financial information risk in the cloud computing model

    Authors: Yufu Wang, Mingwei Zhu, Jiaqiang Yuan, Guanghui Wang, Hong Zhou

    Abstract: Cloud computing (cloud computing) is a kind of distributed computing, referring to the network "cloud" will be a huge data calculation and processing program into countless small programs, and then, through the system composed of multiple servers to process and analyze these small programs to get the results and return to the user. This report explores the intersection of cloud computing and finan…

    Submitted 14 April, 2024; originally announced April 2024.

  41. arXiv:2404.08793  [pdf, other]

    cs.CR cs.CL cs.HC

    JailbreakLens: Visual Analysis of Jailbreak Attacks Against Large Language Models

    Authors: Yingchaojie Feng, Zhizhang Chen, Zhining Kang, Sijia Wang, Minfeng Zhu, Wei Zhang, Wei Chen

    Abstract: The proliferation of large language models (LLMs) has underscored concerns regarding their security vulnerabilities, notably against jailbreak attacks, where adversaries design jailbreak prompts to circumvent safety mechanisms for potential misuse. Addressing these concerns necessitates a comprehensive analysis of jailbreak prompts to evaluate LLMs' defensive capabilities and identify potential we…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Submitted to VIS 2024

  42. arXiv:2404.08004  [pdf, other]

    cs.LG cs.RO

    GRANP: A Graph Recurrent Attentive Neural Process Model for Vehicle Trajectory Prediction

    Authors: Yuhao Luo, Kehua Chen, Meixin Zhu

    Abstract: As a vital component in autonomous driving, accurate trajectory prediction effectively prevents traffic accidents and improves driving efficiency. To capture complex spatial-temporal dynamics and social interactions, recent studies developed models based on advanced deep-learning methods. On the other hand, recent studies have explored the use of deep generative models to further account for traje…

    Submitted 9 April, 2024; originally announced April 2024.

  43. arXiv:2404.06591  [pdf, other]

    physics.soc-ph cs.IR

    Milgram's experiment in the knowledge space: Individual navigation strategies

    Authors: Manran Zhu, János Kertész

    Abstract: Data deluge characteristic for our times has led to information overload, posing a significant challenge to effectively finding our way through the digital landscape. Addressing this issue requires an in-depth understanding of how we navigate through the abundance of information. Previous research has discovered multiple patterns in how individuals navigate in the geographic, social, and informati…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 25 pages, 8 figures

  44. arXiv:2404.02937  [pdf, other]

    cs.LG cs.AI

    Towards Responsible and Reliable Traffic Flow Prediction with Large Language Models

    Authors: Xusen Guo, Qiming Zhang, Junyue Jiang, Mingxing Peng, Hao, Yang, Meixin Zhu

    Abstract: Traffic forecasting is crucial for intelligent transportation systems. It has experienced significant advancements thanks to the power of deep learning in capturing latent patterns of traffic data. However, recent deep-learning architectures require intricate model designs and lack an intuitive understanding of the mapping from input data to predicted results. Achieving both accuracy and responsib…

    Submitted 21 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 27 pages, 8 figures

  45. arXiv:2404.01151  [pdf, other]

    cs.CV

    Detect2Interact: Localizing Object Key Field in Visual Question Answering (VQA) with LLMs

    Authors: Jialou Wang, Manli Zhu, Yulei Li, Honglei Li, Longzhi Yang, Wai Lok Woo

    Abstract: Localization plays a crucial role in enhancing the practicality and precision of VQA systems. By enabling fine-grained identification and interaction with specific parts of an object, it significantly improves the system's ability to provide contextually relevant and spatially accurate responses, crucial for applications in dynamic environments like robotics and augmented reality. However, traditi…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted to IEEE Intelligent Systems

  46. arXiv:2404.01150  [pdf, other]

    cs.RO

    Visual-inertial state estimation based on Chebyshev polynomial optimization

    Authors: Hongyu Zhang, Maoran Zhu, Qi Cai, Yuanxin Wu

    Abstract: This paper proposes an innovative state estimation method for visual-inertial fusion based on Chebyshev polynomial optimization. Specifically, the pose is modeled as a Chebyshev polynomial of a certain order, and its time derivatives are used to calculate linear acceleration and angular velocity, which, along with inertial measurements, constitute dynamic constraints. This is coupled with a visual…

    Submitted 1 April, 2024; originally announced April 2024.

  47. arXiv:2404.00712  [pdf, other]

    cs.LG cs.AI cs.CY cs.IR

    Survey of Computerized Adaptive Testing: A Machine Learning Perspective

    Authors: Qi Liu, Yan Zhuang, Haoyang Bi, Zhenya Huang, Weizhe Huang, Jiatong Li, Junhao Yu, Zirui Liu, Zirui Hu, Yuting Hong, Zachary A. Pardos, Haiping Ma, Mengxiao Zhu, Shijin Wang, Enhong Chen

    Abstract: Computerized Adaptive Testing (CAT) provides an efficient and tailored method for assessing the proficiency of examinees, by dynamically adjusting test questions based on their performance. Widely adopted across diverse fields like education, healthcare, sports, and sociology, CAT has revolutionized testing practices. While traditional methods rely on psychometrics and statistics, the increasing c…

    Submitted 4 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

  48. arXiv:2403.19345  [pdf]

    cs.IR cs.AI

    Intelligent Classification and Personalized Recommendation of E-commerce Products Based on Machine Learning

    Authors: Kangming Xu, Huiming Zhou, Haotian Zheng, Mingwei Zhu, Qi Xin

    Abstract: With the rapid evolution of the Internet and the exponential proliferation of information, users encounter information overload and the conundrum of choice. Personalized recommendation systems play a pivotal role in alleviating this burden by aiding users in filtering and selecting information tailored to their preferences and requirements. Such systems not only enhance user experience and satisfa…

    Submitted 28 March, 2024; originally announced March 2024.

  49. arXiv:2403.18660  [pdf, other]

    cs.GR cs.CV

    InstructBrush: Learning Attention-based Instruction Optimization for Image Editing

    Authors: Ruoyu Zhao, Qingnan Fan, Fei Kou, Shuai Qin, Hong Gu, Wei Wu, Pengcheng Xu, Mingrui Zhu, Nannan Wang, Xinbo Gao

    Abstract: In recent years, instruction-based image editing methods have garnered significant attention in image editing. However, despite encompassing a wide range of editing priors, these methods are helpless when handling editing tasks that are challenging to accurately describe through language. We propose InstructBrush, an inversion method for instruction-based image editing methods to bridge this gap.…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Project Page: https://royzhao926.github.io/InstructBrush/

  50. arXiv:2403.18344  [pdf, other]

    cs.AI

    LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models

    Authors: Mingxing Peng, Xusen Guo, Xianda Chen, Meixin Zhu, Kehua Chen, Hao, Yang, Xuesong Wang, Yinhai Wang

    Abstract: To ensure safe driving in dynamic environments, autonomous vehicles should possess the capability to accurately predict the lane change intentions of surrounding vehicles in advance and forecast their future trajectories. Existing motion prediction approaches have ample room for improvement, particularly in terms of long-term prediction accuracy and interpretability. In this paper, we address thes…

    Submitted 27 March, 2024; originally announced March 2024.