Showing 1–50 of 275 results for author: Jia, Z

Searching in archive cs.
  1. arXiv:2406.17145  [pdf, other]

    cs.DC cs.AI cs.LG

    GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism

    Authors: Byungsoo Jeon, Mengdi Wu, Shiyi Cao, Sunghyun Kim, Sunghyun Park, Neeraj Aggarwal, Colin Unger, Daiyaan Arfeen, Peiyuan Liao, Xupeng Miao, Mohammad Alizadeh, Gregory R. Ganger, Tianqi Chen, Zhihao Jia

    Abstract: Deep neural networks (DNNs) continue to grow rapidly in size, making them infeasible to train on a single device. Pipeline parallelism is commonly used in existing DNN systems to support large-scale DNN training by partitioning a DNN into multiple stages, which concurrently perform DNN training for different micro-batches in a pipeline fashion. However, existing pipeline-parallel approaches only c…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.16747  [pdf, other]

    cs.CL cs.LG

    Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers

    Authors: Chao Lou, Zixia Jia, Zilong Zheng, Kewei Tu

    Abstract: Accommodating long sequences efficiently in autoregressive Transformers, especially within an extended context window, poses significant challenges due to the quadratic computational complexity and substantial KV memory requirements inherent in self-attention mechanisms. In this work, we introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome these computational and me…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: preprint

  3. arXiv:2406.16294  [pdf, other]

    cs.CL cs.AI

    LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments

    Authors: Zixia Jia, Mengmeng Wang, Baichen Tong, Song-Chun Zhu, Zilong Zheng

    Abstract: Recent advances in Large Language Models (LLMs) have shown inspiring achievements in constructing autonomous agents that rely on language descriptions as inputs. However, it remains unclear how well LLMs can function as few-shot or zero-shot embodied agents in dynamic interactive environments. To address this gap, we introduce LangSuitE, a versatile and simulation-free testbed featuring 6 represen…

    Submitted 23 June, 2024; originally announced June 2024.

  4. arXiv:2406.16293  [pdf, other]

    cs.CL cs.AI

    Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial Labels

    Authors: Zixia Jia, Junpeng Li, Shichuan Zhang, Anji Liu, Zilong Zheng

    Abstract: Traditional supervised learning heavily relies on human-annotated datasets, especially in data-hungry neural approaches. However, various tasks, especially multi-label tasks like document-level relation extraction, pose challenges in fully manual annotation due to the specific domain knowledge and large class sets. Therefore, we address the multi-label positive-unlabelled learning (MLPUL) problem,…

    Submitted 23 June, 2024; originally announced June 2024.

  5. arXiv:2406.14401  [pdf, other]

    cs.LG cs.AI

    Fair Streaming Feature Selection

    Authors: Zhangling Duan, Tianci Li, Xingyu Wu, Zhaolong Ling, Jingye Yang, Zhaohong Jia

    Abstract: Streaming feature selection techniques have become essential in processing real-time data streams, as they facilitate the identification of the most relevant attributes from continuously updating information. Despite their performance, current algorithms to streaming feature selection frequently fall short in managing biases and avoiding discrimination that could be perpetuated by sensitive attrib…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 30 pages, 10 figures

  6. arXiv:2406.11567  [pdf, other]

    cs.CV cs.AI

    Quaternion Generative Adversarial Neural Networks and Applications to Color Image Inpainting

    Authors: Duan Wang, Dandan Zhu, Meixiang Zhao, Zhigang Jia

    Abstract: Color image inpainting is a challenging task in imaging science. The existing method is based on real operation, and the red, green and blue channels of the color image are processed separately, ignoring the correlation between each channel. In order to make full use of the correlation between each channel, this paper proposes a Quaternion Generative Adversarial Neural Network (QGAN) model and rel…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 19 pages, 6 figures

  7. arXiv:2406.10958  [pdf, other]

    math.OC cs.CL cs.MA

    City-LEO: Toward Transparent City Management Using LLM with End-to-End Optimization

    Authors: Zihao Jiao, Mengyi Sha, Haoyu Zhang, Xinyu Jiang, Wei Qi

    Abstract: Existing operations research (OR) models and tools play indispensable roles in smart-city operations, yet their practical implementation is limited by the complexity of modeling and deficiencies in optimization proficiency. To generate more relevant and accurate solutions to users' requirements, we propose a large language model (LLM)-based agent ("City-LEO") that enhances the efficiency and trans…

    Submitted 17 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: 26 pages, 8 figures, 5 tables

  8. arXiv:2406.03777  [pdf, other]

    cs.LG cs.AI

    Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices

    Authors: Ruiyang Qin, Dancheng Liu, Zheyu Yan, Zhaoxuan Tan, Zixuan Pan, Zhenge Jia, Meng Jiang, Ahmed Abbasi, Jinjun Xiong, Yiyu Shi

    Abstract: The scaling laws have become the de facto guidelines for designing large language models (LLMs), but they were studied under the assumption of unlimited computing resources for both training and inference. As LLMs are increasingly used as personalized intelligent assistants, their customization (i.e., learning through fine-tuning) and deployment onto resource-constrained edge devices will become m…

    Submitted 13 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Benchmarking paper

  9. arXiv:2406.02532  [pdf, other]

    cs.CL

    SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

    Authors: Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin

    Abstract: As large language models gain widespread adoption, running them efficiently becomes crucial. Recent works on LLM inference use speculative decoding to achieve extreme speedups. However, most of these works implicitly design their algorithms for high-end datacenter hardware. In this work, we ask the opposite question: how fast can we run LLMs on consumer machines? Consumer GPUs can no longer fit th…

    Submitted 25 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: preprint

  10. arXiv:2406.01566  [pdf, other]

    cs.DC cs.CL cs.LG

    Helix: Distributed Serving of Large Language Models via Max-Flow on Heterogeneous GPUs

    Authors: Yixuan Mei, Yonghao Zhuang, Xupeng Miao, Juncheng Yang, Zhihao Jia, Rashmi Vinayak

    Abstract: This paper introduces Helix, a distributed system for high-throughput, low-latency large language model (LLM) serving on heterogeneous GPU clusters. A key idea behind Helix is to formulate inference computation of LLMs over heterogeneous GPUs and network connections as a max-flow problem for a directed, weighted graph, whose nodes represent GPU instances and edges capture both GPU and network hete…

    Submitted 3 June, 2024; originally announced June 2024.

  11. arXiv:2406.01302  [pdf]

    cs.CV

    Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data

    Authors: Zhusi Zhong, Helen Zhang, Fayez H. Fayad, Andrew C. Lancaster, John Sollee, Shreyas Kulkarni, Cheng Ting Lin, Jie Li, Xinbo Gao, Scott Collins, Colin Greineder, Sun H. Ahn, Harrison X. Bai, Zhicheng Jiao, Michael K. Atalay

    Abstract: Purpose: Pulmonary embolism (PE) is a significant cause of mortality in the United States. The objective of this study is to implement deep learning (DL) models using Computed Tomography Pulmonary Angiography (CTPA), clinical data, and PE Severity Index (PESI) scores to predict PE mortality. Materials and Methods: 918 patients (median age 64 years, range 13-99 years, 52% female) with 3,978 CTPAs w…

    Submitted 5 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  12. arXiv:2405.18812  [pdf, other]

    cs.CV

    MindSemantix: Deciphering Brain Visual Experiences with a Brain-Language Model

    Authors: Ziqi Ren, Jie Li, Xuetong Xue, Xin Li, Fan Yang, Zhicheng Jiao, Xinbo Gao

    Abstract: Deciphering the human visual experience through brain activities captured by fMRI represents a compelling and cutting-edge challenge in the field of neuroscience research. Compared to merely predicting the viewed image itself, decoding brain activity into meaningful captions provides a higher-level interpretation and summarization of visual information, which naturally enhances the application fle…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 13 pages, 6 figures

  13. arXiv:2405.15377   

    cs.RO

    Dynamic Planning for Sequential Whole-body Mobile Manipulation

    Authors: Zhitian Li, Yida Niu, Yao Su, Hangxin Liu, Ziyuan Jiao

    Abstract: The dynamic Sequential Mobile Manipulation Planning (SMMP) framework is essential for the safe and robust operation of mobile manipulators in dynamic environments. Previous research has primarily focused on either motion-level or task-level dynamic planning, with limitations in handling state changes that have long-term effects or in generating responsive motions for diverse tasks, respectively. T…

    Submitted 20 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: technical issue, withdraw all the versions

  14. arXiv:2405.14905  [pdf, other]

    eess.IV cs.AI cs.CL

    Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation

    Authors: Kang Liu, Zhuoqi Ma, Xiaolu Kang, Zhusi Zhong, Zhicheng Jiao, Grayson Baird, Harrison Bai, Qiguang Miao

    Abstract: The automated generation of imaging reports proves invaluable in alleviating the workload of radiologists. A clinically applicable reports generation algorithm should demonstrate its effectiveness in producing reports that accurately describe radiology findings and attend to patient-specific indications. In this paper, we introduce a novel method, Structural Entities extraction a…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: The code is available at https://github.com/mk-runner/SEI-Temp or https://github.com/mk-runner/SEI

  15. arXiv:2405.14113  [pdf, other]

    eess.IV cs.CV

    Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation

    Authors: Zhusi Zhong, Jie Li, John Sollee, Scott Collins, Harrison Bai, Paul Zhang, Terrence Healey, Michael Atalay, Xinbo Gao, Zhicheng Jiao

    Abstract: In response to the worldwide COVID-19 pandemic, advanced automated technologies have emerged as valuable tools to aid healthcare professionals in managing an increased workload by improving radiology report generation and prognostic analysis. This study proposes Multi-modality Regional Alignment Network (MRANet), an explainable model for radiology report generation and survival prediction that foc…

    Submitted 22 May, 2024; originally announced May 2024.

  16. arXiv:2405.12114  [pdf, other]

    cs.CV math.NA

    A New Cross-Space Total Variation Regularization Model for Color Image Restoration with Quaternion Blur Operator

    Authors: Zhigang Jia, Yuelian Xiang, Meixiang Zhao, Tingting Wu, Michael K. Ng

    Abstract: The cross-channel deblurring problem in color image processing is difficult to solve due to the complex coupling and structural blurring of color pixels. Until now, there are few efficient algorithms that can reduce color infection in deblurring process. To solve this challenging problem, we present a novel cross-space total variation (CSTV) regularization model for color image deblurring by intro…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 10 figures

  17. arXiv:2405.11281  [pdf, other]

    cs.DC cs.AI

    Cooperative Cognitive Dynamic System in UAV Swarms: Reconfigurable Mechanism and Framework

    Authors: Ziye Jia, Jiahao You, Chao Dong, Qihui Wu, Fuhui Zhou, Dusit Niyato, Zhu Han

    Abstract: As the demands for immediate and effective responses increase in both civilian and military domains, the unmanned aerial vehicle (UAV) swarms emerge as effective solutions, in which multiple cooperative UAVs can work together to achieve specific goals. However, how to manage such complex systems to ensure real-time adaptability lack sufficient researches. Hence, in this paper, we propose the coope…

    Submitted 18 May, 2024; originally announced May 2024.

  18. arXiv:2405.09586  [pdf, other]

    eess.IV cs.AI cs.CV

    Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation

    Authors: Kang Liu, Zhuoqi Ma, Mengmeng Liu, Zhicheng Jiao, Xiaolu Kang, Qiguang Miao, Kun Xie

    Abstract: The automation of writing imaging reports is a valuable tool for alleviating the workload of radiologists. Crucial steps in this process involve the cross-modal alignment between medical images and reports, as well as the retrieval of similar historical cases. However, the presence of presentation-style vocabulary (e.g., sentence structure and grammar) in reports poses challenges for cross-modal a…

    Submitted 15 May, 2024; originally announced May 2024.

  19. arXiv:2405.07638  [pdf, other]

    cs.NI cs.AI cs.CR

    DoLLM: How Large Language Models Understanding Network Flow Data to Detect Carpet Bombing DDoS

    Authors: Qingyang Li, Yihang Zhang, Zhidong Jia, Yannan Hu, Lei Zhang, Jianrong Zhang, Yongming Xu, Yong Cui, Zongming Guo, Xinggong Zhang

    Abstract: It is an interesting question Can and How Large Language Models (LLMs) understand non-language network data, and help us detect unknown malicious flows. This paper takes Carpet Bombing as a case study and shows how to exploit LLMs' powerful capability in the networking area. Carpet Bombing is a new DDoS attack that has dramatically increased in recent years, significantly threatening network infra…

    Submitted 13 May, 2024; originally announced May 2024.

  20. arXiv:2405.06904  [pdf, other]

    cs.LG

    Generation of Granular-Balls for Clustering Based on the Principle of Justifiable Granularity

    Authors: Zihang Jia, Zhen Zhang, Witold Pedrycz

    Abstract: Efficient and robust data clustering remains a challenging task in the field of data analysis. Recent efforts have explored the integration of granular-ball (GB) computing with clustering algorithms to address this challenge, yielding promising results. However, existing methods for generating GBs often rely on single indicators to measure GB quality and employ threshold-based or greedy strategies…

    Submitted 15 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

  21. arXiv:2405.05751  [pdf, other]

    cs.LG cs.AI cs.PL

    A Multi-Level Superoptimizer for Tensor Programs

    Authors: Mengdi Wu, Xinhao Cheng, Oded Padon, Zhihao Jia

    Abstract: We introduce Mirage, the first multi-level superoptimizer for tensor programs. A key idea in Mirage is $μ$Graphs, a uniform representation of tensor programs at the kernel, thread block, and thread levels of the GPU compute hierarchy. $μ$Graphs enable Mirage to discover novel optimizations that combine algebraic transformations, schedule transformations, and generation of new custom kernels. To na…

    Submitted 9 May, 2024; originally announced May 2024.

  22. arXiv:2405.04700  [pdf, other]

    cs.LG cs.AI cs.DC cs.IR

    Robust Implementation of Retrieval-Augmented Generation on Edge-based Computing-in-Memory Architectures

    Authors: Ruiyang Qin, Zheyu Yan, Dewen Zeng, Zhenge Jia, Dancheng Liu, Jianbo Liu, Zhi Zheng, Ningyuan Cao, Kai Ni, Jinjun Xiong, Yiyu Shi

    Abstract: Large Language Models (LLMs) deployed on edge devices learn through fine-tuning and updating a certain portion of their parameters. Although such learning methods can be optimized to reduce resource utilization, the overall required resources remain a heavy burden on edge devices. Instead, Retrieval-Augmented Generation (RAG), a resource-efficient LLM learning method, can improve the quality of th…

    Submitted 7 May, 2024; originally announced May 2024.

  23. arXiv:2405.02815  [pdf, other]

    cs.CV cs.AI

    Region-specific Risk Quantification for Interpretable Prognosis of COVID-19

    Authors: Zhusi Zhong, Jie Li, Zhuoqi Ma, Scott Collins, Harrison Bai, Paul Zhang, Terrance Healey, Xinbo Gao, Michael K. Atalay, Zhicheng Jiao

    Abstract: The COVID-19 pandemic has strained global public health, necessitating accurate diagnosis and intervention to control disease spread and reduce mortality rates. This paper introduces an interpretable deep survival prediction model designed specifically for improved understanding and trust in COVID-19 prognosis using chest X-ray (CXR) images. By integrating a large-scale pretrained image encoder, R…

    Submitted 5 May, 2024; originally announced May 2024.

  24. arXiv:2405.01248  [pdf, other]

    cs.DC

    DiffusionPipe: Training Large Diffusion Models with Efficient Pipelines

    Authors: Ye Tian, Zhen Jia, Ziyue Luo, Yida Wang, Chuan Wu

    Abstract: Diffusion models have emerged as dominant performers for image generation. To support training large diffusion models, this paper studies pipeline parallel training of diffusion models and proposes DiffusionPipe, a synchronous pipeline training system that advocates innovative pipeline bubble filling technique, catering to structural characteristics of diffusion models. State-of-the-art diffusion…

    Submitted 2 May, 2024; originally announced May 2024.

    Journal ref: MLSys 2024

  25. arXiv:2405.01186  [pdf, other]

    cs.LG cs.AI

    Potential Energy based Mixture Model for Noisy Label Learning

    Authors: Zijia Wang, Wenbin Yang, Zhisong Liu, Zhen Jia

    Abstract: Training deep neural networks (DNNs) from noisy labels is an important and challenging task. However, most existing approaches focus on the corrupted labels and ignore the importance of inherent data structure. To bridge the gap between noisy labels and data, inspired by the concept of potential energy in physics, we propose a novel Potential Energy based Mixture Model (PEMM) for noise-labels lear…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  26. arXiv:2405.01175  [pdf, other]

    cs.CV cs.AI

    Uncertainty-aware self-training with expectation maximization basis transformation

    Authors: Zijia Wang, Wenbin Yang, Zhisong Liu, Zhen Jia

    Abstract: Self-training is a powerful approach to deep learning. The key process is to find a pseudo-label for modeling. However, previous self-training algorithms suffer from the over-confidence issue brought by the hard labels, even some confidence-related regularizers cannot comprehensively catch the uncertainty. Therefore, we propose a new self-training framework to combine uncertainty information of bo…

    Submitted 2 May, 2024; originally announced May 2024.

    Journal ref: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  27. arXiv:2405.01066  [pdf, other]

    cs.CV cs.AI cs.HC

    HandS3C: 3D Hand Mesh Reconstruction with State Space Spatial Channel Attention from RGB images

    Authors: Zixun Jiao, Xihan Wang, Zhaoqiang Xia, Lianhe Shao, Quanli Gao

    Abstract: Reconstructing the hand mesh from one single RGB image is a challenging task because hands are often occluded by other objects. Most previous works attempt to explore more additional information and adopt attention mechanisms for improving 3D reconstruction performance, while it would increase computational complexity simultaneously. To achieve a performance-reserving architecture with high comput…

    Submitted 14 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures

  28. arXiv:2404.19429  [pdf, other]

    cs.DC cs.LG

    Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping

    Authors: Chenyu Jiang, Ye Tian, Zhen Jia, Shuai Zheng, Chuan Wu, Yida Wang

    Abstract: The Mixture-of-Expert (MoE) technique plays a crucial role in expanding the size of DNN model parameters. However, it faces the challenge of extended all-to-all communication latency during the training process. Existing methods attempt to mitigate this issue by overlapping all-to-all with expert computation. Yet, these methods frequently fall short of achieving sufficient overlap, consequently re…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 11 pages, 16 figures. Published in MLSys'24

  29. arXiv:2404.14822  [pdf, other]

    cs.CV cs.AI

    CNN2GNN: How to Bridge CNN with GNN

    Authors: Ziheng Jiao, Hongyuan Zhang, Xuelong Li

    Abstract: Although the convolutional neural network (CNN) has achieved excellent performance in vision tasks by extracting the intra-sample representation, it will take a higher training expense because of stacking numerous convolutional layers. Recently, as the bilinear models, graph neural networks (GNN) have succeeded in exploring the underlying topological relationship among the graph data with a few gr…

    Submitted 23 April, 2024; originally announced April 2024.

  30. arXiv:2404.13600  [pdf, other]

    cs.RO

    Are We Ready for Planetary Exploration Robots? The TAIL-Plus Dataset for SLAM in Granular Environments

    Authors: Zirui Wang, Chen Yao, Yangtao Ge, Guowei Shi, Ningbo Yang, Zheng Zhu, Kewei Dong, Hexiang Wei, Zhenzhong Jia, Jing Wu

    Abstract: So far, planetary surface exploration depends on various mobile robot platforms. The autonomous navigation and decision-making of these mobile robots in complex terrains largely rely on their terrain-aware perception, localization and mapping capabilities. In this paper we release the TAIL-Plus dataset, a new challenging dataset in deformable granular environments for planetary exploration robots,…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024

  31. arXiv:2404.10220  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

    Authors: Peiyuan Zhi, Zhiyuan Zhang, Muzhi Han, Zeyu Zhang, Zhitian Li, Ziyuan Jiao, Baoxiong Jia, Siyuan Huang

    Abstract: Autonomous robot navigation and manipulation in open environments require reasoning and replanning with closed-loop feedback. We present COME-robot, the first closed-loop framework utilizing the GPT-4V vision-language foundation model for open-ended reasoning and adaptive planning in real-world scenarios. We meticulously construct a library of action primitives for robot exploration, navigation, a…

    Submitted 15 April, 2024; originally announced April 2024.

  32. arXiv:2403.17091  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data

    Authors: Zeyu Jia, Alexander Rakhlin, Ayush Sekhari, Chen-Yu Wei

    Abstract: We revisit the problem of offline reinforcement learning with value function realizability but without Bellman completeness. Previous work by Xie and Jiang (2021) and Foster et al. (2022) left open the question whether a bounded concentrability coefficient along with trajectory-based offline data admits a polynomial sample complexity. In this work, we provide a negative answer to this question for…

    Submitted 25 March, 2024; originally announced March 2024.

  33. arXiv:2403.16875  [pdf, other]

    cs.RO

    TAIL: A Terrain-Aware Multi-Modal SLAM Dataset for Robot Locomotion in Deformable Granular Environments

    Authors: Chen Yao, Yangtao Ge, Guowei Shi, Zirui Wang, Ningbo Yang, Zheng Zhu, Hexiang Wei, Yuntian Zhao, Jing Wu, Zhenzhong Jia

    Abstract: Terrain-aware perception holds the potential to improve the robustness and accuracy of autonomous robot navigation in the wilds, thereby facilitating effective off-road traversals. However, the lack of multi-modal perception across various motion patterns hinders the solutions of Simultaneous Localization And Mapping (SLAM), especially when confronting non-geometric hazards in demanding landscapes…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Submitted to IEEE Robotics and Automation Letters

  34. arXiv:2403.14097  [pdf, other]

    cs.DC

    Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances

    Authors: Jiangfei Duan, Ziang Song, Xupeng Miao, Xiaoli Xi, Dahua Lin, Harry Xu, Minjia Zhang, Zhihao Jia

    Abstract: Deep neural networks (DNNs) are becoming progressively large and costly to train. This paper aims to reduce DNN training costs by leveraging preemptible instances on modern clouds, which can be allocated at a much lower price when idle but may be preempted by the cloud provider at any time. Prior work that supports DNN training on preemptive instances employs a reactive approach to handling instan…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: NSDI '24

  35. arXiv:2403.11552  [pdf, other]

    cs.RO cs.AI

    LLM3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

    Authors: Shu Wang, Muzhi Han, Ziyuan Jiao, Zeyu Zhang, Ying Nian Wu, Song-Chun Zhu, Hangxin Liu

    Abstract: Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Sp…

    Submitted 20 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Submitted to IROS 2024. Codes available: https://github.com/AssassinWS/LLM-TAMP

  36. arXiv:2403.07031  [pdf, other]

    cs.LG stat.CO stat.ME stat.ML

    The Cram Method for Efficient Simultaneous Learning and Evaluation

    Authors: Zeyang Jia, Kosuke Imai, Michael Lingzhi Li

    Abstract: We introduce the "cram" method, a general and efficient approach to simultaneous learning and evaluation using a generic machine learning (ML) algorithm. In a single pass of batched data, the proposed method repeatedly trains an ML algorithm and tests its empirical performance. Because it utilizes the entire sample for both learning and evaluation, cramming is significantly more data-efficient tha…

    Submitted 11 March, 2024; originally announced March 2024.

  37. arXiv:2402.18789  [pdf, other]

    cs.DC cs.CL cs.LG

    FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning

    Authors: Xupeng Miao, Gabriele Oliaro, Xinhao Cheng, Mengdi Wu, Colin Unger, Zhihao Jia

    Abstract: Parameter-efficient finetuning (PEFT) is a widely used technique to adapt large language models for different tasks. Service providers typically create separate systems for users to perform PEFT model finetuning and inference tasks. This is because existing systems cannot handle workloads that include a mix of inference and PEFT finetuning requests. As a result, shared GPU resources are underutili…

    Submitted 28 February, 2024; originally announced February 2024.

  38. arXiv:2402.15592  [pdf, other]

    math.OC cs.LG

    Neural optimal controller for stochastic systems via pathwise HJB operator

    Authors: Zhe Jiao, Xiaoyan Luo, Xinlei Yi

    Abstract: The aim of this work is to develop deep learning-based algorithms for high-dimensional stochastic control problems based on physics-informed learning and dynamic programming. Unlike classical deep learning-based methods relying on a probabilistic representation of the solution to the Hamilton--Jacobi--Bellman (HJB) equation, we introduce a pathwise operator associated with the HJB equation so that…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 20 pages

  39. arXiv:2402.15400  [pdf, other]

    cs.IR cs.CL

    Faithful Temporal Question Answering over Heterogeneous Sources

    Authors: Zhen Jia, Philipp Christmann, Gerhard Weikum

    Abstract: Temporal question answering (QA) involves time constraints, with phrases such as "... in 2019" or "... before COVID". In the former, time is an explicit condition, in the latter it is implicit. State-of-the-art methods have limitations along three dimensions. First, with neural inference, time constraints are merely soft-matched, giving room to invalid or inexplicable answers. Second, questions wi…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted at WWW 2024

  40. arXiv:2402.12374  [pdf, other]

    cs.CL

    Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

    Authors: Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen

    Abstract: As the usage of large language models (LLMs) grows, performing efficient inference with these models becomes increasingly important. While speculative decoding has recently emerged as a promising direction for speeding up inference, existing methods are limited in their ability to scale to larger speculation budgets, and adapt to different hyperparameters and hardware. This paper introduces Sequoi…

    Submitted 29 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  41. arXiv:2402.01382  [pdf, other]

    stat.ML cs.LG

    Emergence of heavy tails in homogenized stochastic gradient descent

    Authors: Zhe Jiao, Martin Keller-Ressel

    Abstract: It has repeatedly been observed that loss minimization by stochastic gradient descent (SGD) leads to heavy-tailed distributions of neural network parameters. Here, we analyze a continuous diffusion approximation of SGD, called homogenized stochastic gradient descent, show that it behaves asymptotically heavy-tailed, and give explicit upper and lower bounds on its tail-index. We validate these boun…

    Submitted 2 February, 2024; originally announced February 2024.

    MSC Class: 60H30; 68Txx ACM Class: G.3; I.2.6

  42. arXiv:2402.01138  [pdf, other]

    eess.SP cs.LG

    Graph Neural Networks in EEG-based Emotion Recognition: A Survey

    Authors: Chenyu Liu, Xinliang Zhou, Yihao Wu, Ruizhi Yang, Liming Zhai, Ziyu Jia, Yang Liu

    Abstract: Compared to other modalities, EEG-based emotion recognition can intuitively respond to the emotional patterns in the human brain and, therefore, has become one of the most concerning tasks in the brain-computer interfaces field. Since dependencies within brain regions are closely related to emotion, a significant trend is to develop Graph Neural Networks (GNNs) for EEG-based emotion recognition. H…

    Submitted 1 February, 2024; originally announced February 2024.

  43. arXiv:2401.15132  [pdf, other]

    cs.HC cs.AI

    On the Emergence of Symmetrical Reality

    Authors: Zhenliang Zhang, Zeyu Zhang, Ziyuan Jiao, Yao Su, Hangxin Liu, Wei Wang, Song-Chun Zhu

    Abstract: Artificial intelligence (AI) has revolutionized human cognitive abilities and facilitated the development of new AI entities capable of interacting with humans in both physical and virtual environments. Despite the existence of virtual reality, mixed reality, and augmented reality for several years, integrating these technical fields remains a formidable challenge due to their disparate applicatio…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: IEEE VR 2024

  44. arXiv:2401.14021  [pdf, other]

    cs.LG cs.CL cs.IR

    Accelerating Retrieval-Augmented Language Model Serving with Speculation

    Authors: Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, Zhihao Jia

    Abstract: Retrieval-augmented language models (RaLM) have demonstrated the potential to solve knowledge-intensive natural language processing (NLP) tasks by combining a non-parametric knowledge base with a parametric language model. Instead of fine-tuning a fully parametric model, RaLM excels at its low-cost adaptation to the latest data and better source attribution mechanisms. Among various RaLM approache…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: Preprint

  45. arXiv:2401.07159  [pdf, other]

    cs.LG cs.AI

    Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models

    Authors: Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Qing Li, Yong Jiang, Zhihao Jia

    Abstract: Finetuning large language models (LLMs) has been empirically effective on a variety of downstream tasks. Existing approaches to finetuning an LLM either focus on parameter-efficient finetuning, which only updates a small number of trainable parameters, or attempt to reduce the memory footprint during the training phase of the finetuning. Typically, the memory footprint during finetuning stems from…

    Submitted 13 January, 2024; originally announced January 2024.

    ACM Class: I.2.7

  46. arXiv:2401.01065  [pdf, other]

    cs.CV cs.AI

    BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

    Authors: Tao Tang, Dafeng Wei, Zhengyu Jia, Tian Gao, Changwei Cai, Chengkai Hou, Peng Jia, Kun Zhan, Haiyang Sun, Jingchen Fan, Yixing Zhao, Fu Liu, Xiaodan Liang, Xianpeng Lang, Yang Wang

    Abstract: The rapid development of the autonomous driving industry has led to a significant accumulation of autonomous driving data. Consequently, there comes a growing demand for retrieving data to provide specialized optimization. However, directly applying previous image retrieval methods faces several challenges, such as the lack of global feature representation and inadequate text retrieval ability for…

    Submitted 18 June, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

  47. arXiv:2312.15633  [pdf, other]

    cs.CV eess.IV

    MuLA-GAN: Multi-Level Attention GAN for Enhanced Underwater Visibility

    Authors: Ahsan Baidar Bakht, Zikai Jia, Muhayy ud Din, Waseem Akram, Lyes Saad Soud, Lakmal Seneviratne, Defu Lin, Shaoming He, Irfan Hussain

    Abstract: The underwater environment presents unique challenges, including color distortions, reduced contrast, and blurriness, hindering accurate analysis. In this work, we introduce MuLA-GAN, a novel approach that leverages the synergistic power of Generative Adversarial Networks (GANs) and Multi-Level Attention mechanisms for comprehensive underwater image enhancement. The integration of Multi-Level Atte…

    Submitted 25 December, 2023; originally announced December 2023.

  48. arXiv:2312.15234  [pdf, other]

    cs.LG cs.AI cs.DC cs.PF

    Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems

    Authors: Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia

    Abstract: In the rapidly evolving landscape of artificial intelligence (AI), generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However, the computational intensity and memory consumption of deploying these models present substantial challenges in terms of serving efficiency, particularly in scenarios demanding low latency and high throughput. This…

    Submitted 23 December, 2023; originally announced December 2023.

  49. arXiv:2312.13173  [pdf, other]

    cs.LG math.OC

    Learning Fair Policies for Multi-stage Selection Problems from Observational Data

    Authors: Zhuangzhuang Jia, Grani A. Hanasusanto, Phebe Vayanos, Weijun Xie

    Abstract: We consider the problem of learning fair policies for multi-stage selection problems from observational data. This problem arises in several high-stakes domains such as company hiring, loan approval, or bail decisions where outcomes (e.g., career success, loan repayment, recidivism) are only observed for those selected. We propose a multi-stage framework that can be augmented with various fairness…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: 38th Annual AAAI Conference on Artificial Intelligence, 2024

  50. arXiv:2312.10885   

    cs.IR

    A novel diffusion recommendation algorithm based on multi-scale cnn and residual lstm

    Authors: Yong Niu, Xing Xing, Zhichun Jia, Ruidi Liu, Mindong Xin

    Abstract: Sequential recommendation aims to infer user preferences from historical interaction sequences and predict the next item that users may be interested in the future. The current mainstream design approach is to represent items as fixed vectors, capturing the underlying relationships between items and user preferences based on the order of interactions. However, relying on a single fixed-item embedd…

    Submitted 20 December, 2023; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: This paper needs to be further modified, including the ablation experiment, model framework and other information in Chapter 5. There are some inaccuracies in the presentation of this paper. Two datasets are used instead of three, and there are many inaccuracies in the presentation, which need to be further corrected