Showing 1–50 of 400 results for author: Zeng, Z

Searching in archive cs.
  1. arXiv:2407.01104  [pdf, other]

    cs.CV

    Semantic-guided Adversarial Diffusion Model for Self-supervised Shadow Removal

    Authors: Ziqi Zeng, Chen Zhao, Weiling Cai, Chenyu Dong

    Abstract: Existing unsupervised methods have addressed the challenges of inconsistent paired data and tedious acquisition of ground-truth labels in shadow removal tasks. However, GAN-based training often faces issues such as mode collapse and unstable optimization. Furthermore, due to the complex mapping between shadow and shadow-free domains, merely relying on adversarial learning is not enough to capture…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.00474  [pdf, other]

    cs.LG cs.AI

    MH-pFLGB: Model Heterogeneous personalized Federated Learning via Global Bypass for Medical Image Analysis

    Authors: Luyuan Xie, Manqing Lin, ChenMing Xu, Tianyu Luan, Zhipeng Zeng, Wenjun Qian, Cong Li, Yuejian Fang, Qingni Shen, Zhonghai Wu

    Abstract: In the evolving application of medical artificial intelligence, federated learning is notable for its ability to protect training data privacy. Federated learning facilitates collaborative model development without the need to share local data from healthcare institutions. Yet, the statistical and system heterogeneity among these institutions poses substantial challenges, which affects the effecti…

    Submitted 29 June, 2024; originally announced July 2024.

  3. arXiv:2406.13975  [pdf, other]

    cs.CL cs.AI

    MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models

    Authors: Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, Pengguang Chen, Jianbo Dai, Yuxuan Yao, Rongwu Xu, Zehan Qi, Wanru Zhao, Linling Shen, Jianqiao Lu, Haochen Tan, Yukang Chen, Hao Zhang, Zhan Shi, Bailin Wang, Zhijiang Guo, Jiaya Jia

    Abstract: Large language models (LLMs) have shown increasing capability in problem-solving and decision-making, largely based on the step-by-step chain-of-thought reasoning processes. However, it has been increasingly challenging to evaluate the reasoning capability of LLMs. Concretely, existing outcome-based benchmarks begin to saturate and become less sufficient to monitor the progress. To this end, we pr…

    Submitted 19 June, 2024; originally announced June 2024.

  4. arXiv:2406.13925  [pdf, other]

    cs.CL cs.AI

    GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models

    Authors: Tao Zhang, Ziqian Zeng, Yuxiang Xiao, Huiping Zhuang, Cen Chen, James Foulds, Shimei Pan

    Abstract: Large Language Models (LLMs) are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to better align with desired behaviors, is recognized as an effective approach to mitigate gender biases. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicl…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.13233  [pdf, other]

    cs.AI

    AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models

    Authors: Zihao Zeng, Yibo Miao, Hongcheng Gao, Hao Zhang, Zhijie Deng

    Abstract: Mixture of experts (MoE) has become the standard for constructing production-level large language models (LLMs) due to its promise to boost model capacity without causing significant overheads. Nevertheless, existing MoE methods usually enforce a constant top-k routing for all tokens, which is arguably restrictive because various tokens (e.g., "<EOS>" vs. "apple") may require various numbers of ex…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.13225  [pdf, other]

    cs.LG cs.AI cs.IR

    Communication-Efficient Federated Knowledge Graph Embedding with Entity-Wise Top-K Sparsification

    Authors: Xiaoxiong Zhang, Zhiwei Zeng, Xin Zhou, Dusit Niyato, Zhiqi Shen

    Abstract: Federated Knowledge Graphs Embedding learning (FKGE) encounters challenges in communication efficiency stemming from the considerable size of parameters and extensive communication rounds. However, existing FKGE methods only focus on reducing communication rounds by conducting multiple rounds of local training in each communication round, and ignore reducing the size of parameters transmitted with…

    Submitted 19 June, 2024; originally announced June 2024.

  7. arXiv:2406.12908  [pdf, other]

    cs.LG cs.AI stat.ME stat.ML

    Rating Multi-Modal Time-Series Forecasting Models (MM-TSFM) for Robustness Through a Causal Lens

    Authors: Kausik Lakkaraju, Rachneet Kaur, Zhen Zeng, Parisa Zehtabi, Sunandita Patra, Biplav Srivastava, Marco Valtorta

    Abstract: AI systems are notorious for their fragility; minor input changes can potentially cause major output swings. When such systems are deployed in critical areas like finance, the consequences of their uncertain behavior could be severe. In this paper, we focus on multi-modal time-series forecasting, where imprecision due to noisy or incorrect data can lead to erroneous predictions, impacting stakehol…

    Submitted 12 June, 2024; originally announced June 2024.

  8. arXiv:2406.11943  [pdf, other]

    cs.IR cs.AI

    Personalized Federated Knowledge Graph Embedding with Client-Wise Relation Graph

    Authors: Xiaoxiong Zhang, Zhiwei Zeng, Xin Zhou, Dusit Niyato, Zhiqi Shen

    Abstract: Federated Knowledge Graph Embedding (FKGE) has recently garnered considerable interest due to its capacity to extract expressive representations from distributed knowledge graphs, while concurrently safeguarding the privacy of individual clients. Existing FKGE methods typically harness the arithmetic mean of entity embeddings from all clients as the global supplementary knowledge, and learn a repl…

    Submitted 17 June, 2024; originally announced June 2024.

  9. arXiv:2406.10196  [pdf, other]

    cs.AI

    TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners

    Authors: Tomas de la Rosa, Sriram Gopalakrishnan, Alberto Pozanco, Zhen Zeng, Daniel Borrajo

    Abstract: Travel planning is a complex task that involves generating a sequence of actions related to visiting places subject to constraints and maximizing some user satisfaction criteria. Traditional approaches rely on problem formulation in a given formal language, extracting relevant travel information from web sources, and use an adequate problem solver to generate a valid solution. As an alternative, r…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages, 5 figures

  10. arXiv:2406.09416  [pdf, other]

    cs.CV

    Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

    Authors: Qihao Liu, Zhanpeng Zeng, Ju He, Qihang Yu, Xiaohui Shen, Liang-Chieh Chen

    Abstract: This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. Diffusion models have gained prominence for their effectiveness in high-fidelity image generation. While conventional approaches rely on convolutional U-Net architectures, recent Transformer-based designs have demonstrated superior performance and…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Introducing DiMR, a new diffusion backbone that surpasses all existing image generation models of various sizes on ImageNet 256 with only 505M parameters. Project page: https://qihao067.github.io/projects/DiMR

  11. arXiv:2406.08819  [pdf, other]

    cs.LG cs.AI stat.ML

    AIM: Attributing, Interpreting, Mitigating Data Unfairness

    Authors: Zhining Liu, Ruizhong Qiu, Zhichen Zeng, Yada Zhu, Hendrik Hamann, Hanghang Tong

    Abstract: Data collected in the real world often encapsulates historical discrimination against disadvantaged groups and individuals. Existing fair machine learning (FairML) research has predominantly focused on mitigating discriminative bias in the model prediction, with far less effort dedicated towards exploring how to trace biases present in the data, despite its importance for the transparency and inte…

    Submitted 18 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 12 pages, 6 figures, accepted by ACM SIGKDD 2024. Webpage: https://github.com/ZhiningLiu1998/AIM

  12. arXiv:2406.06633  [pdf, other]

    cs.LG

    PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning

    Authors: Xiaoqi Qiu, Yongjie Wang, Xu Guo, Zhiwei Zeng, Yue Yu, Yuhong Feng, Chunyan Miao

    Abstract: Counterfactually Augmented Data (CAD) involves creating new data samples by applying minimal yet sufficient modifications to flip the label of existing data samples to other classes. Training with CAD enhances model robustness against spurious features that happen to correlate with labels by spreading the causal relationships across different classes. Yet, recent research reveals that training wit…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024 main conference

    MSC Class: 68T50 ACM Class: I.2; I.2.7

  13. arXiv:2406.03582  [pdf, other]

    cs.CV cs.AI

    Understanding the Limitations of Diffusion Concept Algebra Through Food

    Authors: E. Zhixuan Zeng, Yuhao Chen, Alexander Wong

    Abstract: Image generation techniques, particularly latent diffusion models, have exploded in popularity in recent years. Many techniques have been developed to manipulate and clarify the semantic concepts these large-scale models learn, offering crucial insights into biases and concept relationships. However, these techniques are often only validated in conventional realms of human or animal faces and arti…

    Submitted 5 June, 2024; originally announced June 2024.

  14. arXiv:2406.02428  [pdf, other]

    cs.LG

    Harnessing Neural Unit Dynamics for Effective and Scalable Class-Incremental Learning

    Authors: Depeng Li, Tianqi Wang, Junwei Chen, Wei Dai, Zhigang Zeng

    Abstract: Class-incremental learning (CIL) aims to train a model to learn new classes from non-stationary data streams without forgetting old ones. In this paper, we propose a new kind of connectionist model by tailoring neural unit dynamics that adapt the behavior of neural networks for CIL. In each training session, it introduces a supervisory mechanism to guide network expansion whose growth size is comp…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  15. arXiv:2406.01394  [pdf, other]

    cs.CR cs.AI

    PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration

    Authors: Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Huiping Zhuang, Cen Chen

    Abstract: The widespread usage of online Large Language Models (LLMs) inference services has raised significant privacy concerns about the potential exposure of private information in user inputs to eavesdroppers or untrustworthy service providers. Existing privacy protection methods for LLMs suffer from insufficient privacy protection, performance degradation, or severe inference time overhead. In this pap…

    Submitted 3 June, 2024; originally announced June 2024.

  16. arXiv:2406.00954  [pdf, other]

    cs.CL cs.AI

    Annotation Guidelines-Based Knowledge Augmentation: Towards Enhancing Large Language Models for Educational Text Classification

    Authors: Shiqi Liu, Sannyuya Liu, Lele Sha, Zijie Zeng, Dragan Gasevic, Zhi Liu

    Abstract: Various machine learning approaches have gained significant popularity for the automated classification of educational text to identify indicators of learning engagement -- i.e. learning engagement classification (LEC). LEC can offer comprehensive insights into human learning processes, attracting significant interest from diverse research communities, including Natural Language Processing (NLP),…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: The manuscript has been submitted for peer review to the IEEE Transactions on Learning Technologies

  17. arXiv:2405.18860  [pdf, other]

    cs.RO

    Empowering Embodied Manipulation: A Bimanual-Mobile Robot Manipulation Dataset for Household Tasks

    Authors: Tianle Zhang, Dongjiang Li, Yihang Li, Zecui Zeng, Lin Zhao, Lei Sun, Yue Chen, Xuelong Wei, Yibing Zhan, Lusong Li, Xiaodong He

    Abstract: The advancements in embodied AI are increasingly enabling robots to tackle complex real-world tasks, such as household manipulation. However, the deployment of robots in these environments remains constrained by the lack of comprehensive bimanual-mobile robot manipulation data that can be learned. Existing datasets predominantly focus on single-arm manipulation tasks, while the few dual-arm datase…

    Submitted 6 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  18. arXiv:2405.18840  [pdf, other]

    cs.CV

    Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation

    Authors: Zelin Peng, Zhengqin Xu, Zhilin Zeng, Yaoming Wang, Lingxi Xie, Qi Tian, Wei Shen

    Abstract: Open-vocabulary semantic segmentation seeks to label each pixel in an image with arbitrary text descriptions. Vision-language foundation models, especially CLIP, have recently emerged as powerful tools for acquiring open-vocabulary capabilities. However, fine-tuning CLIP to equip it with pixel-level prediction ability often suffers three issues: 1) high computational cost, 2) misalignment between…

    Submitted 29 May, 2024; originally announced May 2024.

  19. Information Dynamics in Evolving Networks Based on the Birth-Death Process: Random Drift and Natural Selection Perspective

    Authors: Minyu Feng, Ziyan Zeng, Qin Li, Matjaž Perc, Jürgen Kurths

    Abstract: Dynamic processes in complex networks are crucial for better understanding collective behavior in human societies, biological systems, and the internet. In this paper, we first focus on the continuous Markov-based modeling of evolving networks with the birth-death of individuals. A new individual arrives at the group by the Poisson process, while new links are established in the network through ei…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 14 pages, 9 figures

  20. arXiv:2405.18729  [pdf, other]

    cs.LG cs.AI

    Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning

    Authors: Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li, Zecui Zeng, Lei Sun, Yue Chen, Xuelong Wei, Lusong Li, Xiaodong He

    Abstract: Offline reinforcement learning (RL) aims to learn optimal policies from previously collected datasets. Recently, due to their powerful representational capabilities, diffusion models have shown significant potential as policy models for offline RL issues. However, previous offline RL algorithms based on diffusion policies generally adopt weighted regression to improve the policy. This approach opt…

    Submitted 28 May, 2024; originally announced May 2024.

  21. arXiv:2405.17779  [pdf, other]

    cs.LG cs.RO

    Online Analytic Exemplar-Free Continual Learning with Large Models for Imbalanced Autonomous Driving Task

    Authors: Huiping Zhuang, Di Fang, Kai Tong, Yuchen Liu, Ziqian Zeng, Xu Zhou, Cen Chen

    Abstract: In the field of autonomous driving, even a meticulously trained model can encounter failures when faced with unfamiliar scenarios. One of these scenarios can be formulated as an online continual learning (OCL) problem. That is, data come in an online fashion, and models are updated according to these streaming data. Two major OCL challenges are catastrophic forgetting and data imbalance. To addres…

    Submitted 27 May, 2024; originally announced May 2024.

  22. arXiv:2405.16884  [pdf, other]

    cs.CL cs.DB

    Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching

    Authors: Tianshu Wang, Xiaoyang Chen, Hongyu Lin, Xuanang Chen, Xianpei Han, Hao Wang, Zhenyu Zeng, Le Sun

    Abstract: Entity matching (EM) is a critical step in entity resolution (ER). Recently, entity matching based on large language models (LLMs) has shown great promise. However, current LLM-based entity matching approaches typically follow a binary matching paradigm that ignores the global consistency between record relationships. In this paper, we investigate various methodologies for LLM-based entity matchin…

    Submitted 23 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Code is available at https://github.com/tshu-w/LLM4EM

  23. arXiv:2405.16785  [pdf, other]

    cs.CV

    PromptFix: You Prompt and We Fix the Photo

    Authors: Yongsheng Yu, Ziyun Zeng, Hang Hua, Jianlong Fu, Jiebo Luo

    Abstract: Diffusion models equipped with language models demonstrate excellent controllability in image generation tasks, allowing image processing to adhere to human instructions. However, the lack of diverse instruction-following data hampers the development of models that effectively recognize and execute user-customized instructions, particularly in low-level tasks. Moreover, the stochastic nature of th…

    Submitted 26 May, 2024; originally announced May 2024.

  24. arXiv:2405.16240  [pdf, other]

    cs.LG

    Analytic Federated Learning

    Authors: Huiping Zhuang, Run He, Kai Tong, Di Fang, Han Sun, Haoran Li, Tianyi Chen, Ziqian Zeng

    Abstract: In this paper, we introduce analytic federated learning (AFL), a new training paradigm that brings analytical (i.e., closed-form) solutions to the federated learning (FL) community. Our AFL draws inspiration from analytic learning -- a gradient-free technique that trains neural networks with analytical solutions in one epoch. In the local client training stage, the AFL facilitates a one-epoch trai…

    Submitted 25 May, 2024; originally announced May 2024.

  25. arXiv:2405.14212  [pdf, other]

    cs.CR cs.CL

    Federated Domain-Specific Knowledge Transfer on Large Language Models Using Synthetic Data

    Authors: Haoran Li, Xinyuan Zhao, Dadi Guo, Hanlin Gu, Ziqian Zeng, Yuxing Han, Yangqiu Song, Lixin Fan, Qiang Yang

    Abstract: As large language models (LLMs) demonstrate unparalleled performance and generalization ability, LLMs are widely used and integrated into various applications. When it comes to sensitive domains, as commonly described in federated learning scenarios, directly using external LLMs on private data is strictly prohibited by stringent data security and privacy regulations. For local clients, the utiliz…

    Submitted 23 May, 2024; originally announced May 2024.

  26. arXiv:2405.12939  [pdf, other]

    cs.CL

    Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

    Authors: Zhangyue Yin, Qiushi Sun, Qipeng Guo, Zhiyuan Zeng, Xiaonan Li, Tianxiang Sun, Cheng Chang, Qinyuan Cheng, Ding Wang, Xiaofeng Mou, Xipeng Qiu, Xuanjing Huang

    Abstract: Recent advancements in Chain-of-Thought prompting have facilitated significant breakthroughs for Large Language Models (LLMs) in complex reasoning tasks. Current research enhances the reasoning performance of LLMs by sampling multiple reasoning chains and ensembling based on the answer frequency. However, this approach fails in scenarios where the correct answers are in the minority. We identify t…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 17 pages, 14 figures, accepted by LREC-COLING 2024

  27. arXiv:2405.11416  [pdf, other]

    cs.LG

    Discrete-state Continuous-time Diffusion for Graph Generation

    Authors: Zhe Xu, Ruizhong Qiu, Yuzhong Chen, Huiyuan Chen, Xiran Fan, Menghai Pan, Zhichen Zeng, Mahashweta Das, Hanghang Tong

    Abstract: Graph is a prevalent discrete data structure, whose generation has wide applications such as drug discovery and circuit design. Diffusion generative models, as an emerging research focus, have been applied to graph generation tasks. Overall, according to the space of states and time steps, diffusion generative models can be categorized into discrete-/continuous-state discrete-/continuous-time fash…

    Submitted 18 May, 2024; originally announced May 2024.

  28. arXiv:2405.10497  [pdf, other]

    cs.MM cs.AI cs.CV cs.SI

    SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge

    Authors: Bo Wu, Peiye Liu, Wen-Huang Cheng, Bei Liu, Zhaoyang Zeng, Jia Wang, Qiushi Huang, Jiebo Luo

    Abstract: Social Media Popularity Prediction (SMPP) is a crucial task that involves automatically predicting future popularity values of online posts, leveraging vast amounts of multimodal data available on social media platforms. Studying and investigating social media popularity becomes central to various online applications and requires novel methods of comprehensive analysis, multimodal comprehension, a…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: ACM Multimedia. arXiv admin note: text overlap with arXiv:1910.01795

  29. arXiv:2405.10300  [pdf, other]

    cs.CV

    Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

    Authors: Tianhe Ren, Qing Jiang, Shilong Liu, Zhaoyang Zeng, Wenlong Liu, Han Gao, Hongjie Huang, Zhengyu Ma, Xiaoke Jiang, Yihao Chen, Yuda Xiong, Hao Zhang, Feng Li, Peijun Tang, Kent Yu, Lei Zhang

    Abstract: This paper introduces Grounding DINO 1.5, a suite of advanced open-set object detection models developed by IDEA Research, which aims to advance the "Edge" of open-set object detection. The suite encompasses two models: Grounding DINO 1.5 Pro, a high-performance model designed for stronger generalization capability across a wide range of scenarios, and Grounding DINO 1.5 Edge, an efficient model o…

    Submitted 31 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: homepage: https://deepdataspace.com/home

  30. arXiv:2405.09828  [pdf, other]

    cs.CV

    PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features

    Authors: Xusheng Li, Chengliang Wang, Shumao Wang, Zhuo Zeng, Ji Liu

    Abstract: The multi-line LiDAR is widely used in autonomous vehicles, so point cloud-based 3D detectors are essential for autonomous driving. Extracting rich multi-scale features is crucial for point cloud-based 3D detectors in autonomous driving due to significant differences in the size of different types of objects. However, because of the real-time requirements, large-size convolution kernels are rarely…

    Submitted 19 May, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  31. arXiv:2405.08668  [pdf, other]

    cs.CV cs.AI cs.LG stat.AP

    Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research

    Authors: Qinglong Cao, Yuntian Chen, Lu Lu, Hao Sun, Zhenzhong Zeng, Xiaokang Yang, Dongxiao Zhang

    Abstract: Large-scale Vision-Language Models (VLMs) have demonstrated exceptional performance in natural vision tasks, motivating researchers across domains to explore domain-specific VLMs. However, the construction of powerful domain-specific VLMs demands vast amounts of annotated data, substantial electrical energy, and computing resources, primarily accessible to industry, yet hindering VLM research in a…

    Submitted 14 May, 2024; originally announced May 2024.

  32. arXiv:2405.07801  [pdf, other]

    cs.CV

    Deep Learning-Based Object Pose Estimation: A Comprehensive Survey

    Authors: Jian Liu, Wei Sun, Hui Yang, Zhiwen Zeng, Chongpei Liu, Jin Zheng, Xingyu Liu, Hossein Rahmani, Nicu Sebe, Ajmal Mian

    Abstract: Object pose estimation is a fundamental computer vision problem with broad applications in augmented reality and robotics. Over the past decade, deep learning models, due to their superior accuracy and robustness, have increasingly supplanted conventional algorithms reliant on engineered point pair features. Nevertheless, several challenges persist in contemporary methods, including their dependen…

    Submitted 31 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

    Comments: 27 pages, 7 figures

  33. arXiv:2405.06925  [pdf, other]

    cs.LG cs.AI

    Semi-supervised Anomaly Detection via Adaptive Reinforcement Learning-Enabled Method with Causal Inference for Sensor Signals

    Authors: Xiangwei Chen, Ruliang Xiao, Zhixia Zeng, Zhipeng Qiu, Shi Zhang, Xin Du

    Abstract: Semi-supervised anomaly detection for sensor signals is critical in ensuring system reliability in smart manufacturing. However, existing methods rely heavily on data correlation, neglecting causality and leading to potential misinterpretations due to confounding factors. Moreover, while current reinforcement learning-based methods can effectively identify known and unknown anomalies with limited…

    Submitted 16 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

  34. RGB$\leftrightarrow$X: Image decomposition and synthesis using material- and lighting-aware diffusion models

    Authors: Zheng Zeng, Valentin Deschaintre, Iliyan Georgiev, Yannick Hold-Geoffroy, Yiwei Hu, Fujun Luan, Ling-Qi Yan, Miloš Hašan

    Abstract: The three areas of realistic forward rendering, per-pixel inverse rendering, and generative image synthesis may seem like separate and unrelated sub-fields of graphics and vision. However, recent work has demonstrated improved estimation of per-pixel intrinsic channels (albedo, roughness, metallicity) based on a diffusion architecture; we call this the RGB$\rightarrow$X problem. We further show th…

    Submitted 1 May, 2024; originally announced May 2024.

    Journal ref: SIGGRAPH Conference Papers '24, July 27-August 1, 2024, Denver, CO, USA

  35. arXiv:2404.17961  [pdf, other]

    cs.CV

    Random Walk on Pixel Manifolds for Anomaly Segmentation of Complex Driving Scenes

    Authors: Zelong Zeng, Kaname Tomite

    Abstract: In anomaly segmentation for complex driving scenes, state-of-the-art approaches utilize anomaly scoring functions to calculate anomaly scores. For these functions, accurately predicting the logits of inlier classes for each pixel is crucial for precisely inferring the anomaly score. However, in real-world driving scenarios, the diversity of scenes often results in distorted manifolds of pixel embe…

    Submitted 27 April, 2024; originally announced April 2024.

    Comments: 23 pages

  36. arXiv:2404.17199  [pdf, other]

    cs.CV

    Few-shot Calligraphy Style Learning

    Authors: Fangda Chen, Jiacheng Nie, Lichuan Jiang, Zhuoer Zeng

    Abstract: We introduced "Presidifussion," a novel approach to learning and replicating the unique style of calligraphy of President Xu, using a pretrained diffusion model adapted through a two-stage training process. Initially, our model is pretrained on a diverse dataset containing works from various calligraphers. This is followed by fine-tuning on a smaller, specialized dataset of President Xu's calligra…

    Submitted 26 April, 2024; originally announced April 2024.

  37. arXiv:2404.16563  [pdf, other]

    cs.CL

    Evaluating Large Language Models on Time Series Feature Understanding: A Comprehensive Taxonomy and Benchmark

    Authors: Elizabeth Fons, Rachneet Kaur, Soham Palande, Zhen Zeng, Svitlana Vyetrenko, Tucker Balch

    Abstract: Large Language Models (LLMs) offer the potential for automatic time series analysis and reporting, which is a critical task across many domains, spanning healthcare, finance, climate, energy, and many more. In this paper, we propose a framework for rigorously evaluating the capabilities of LLMs on time series understanding, encompassing both univariate and multivariate forms. We introduce a compre…

    Submitted 25 April, 2024; originally announced April 2024.

  38. arXiv:2404.16248  [pdf, other]

    cs.CL cs.AI

    URL: Universal Referential Knowledge Linking via Task-instructed Representation Compression

    Authors: Zhuoqun Li, Hongyu Lin, Tianshu Wang, Boxi Cao, Yaojie Lu, Weixiang Zhou, Hao Wang, Zhenyu Zeng, Le Sun, Xianpei Han

    Abstract: Linking a claim to grounded references is a critical ability to fulfill human demands for authentic and reliable information. Current studies are limited to specific tasks like information retrieval or semantic matching, where the claim-reference relationships are unique and fixed, while the referential knowledge linking (RKL) in real-world can be much more diverse and complex. In this paper, we p…

    Submitted 24 April, 2024; originally announced April 2024.

  39. arXiv:2404.13050  [pdf, other]

    cs.CL cs.AI

    FlowMind: Automatic Workflow Generation with LLMs

    Authors: Zhen Zeng, William Watson, Nicole Cho, Saba Rahimi, Shayleen Reynolds, Tucker Balch, Manuela Veloso

    Abstract: The rapidly evolving field of Robotic Process Automation (RPA) has made significant strides in automating repetitive processes, yet its effectiveness diminishes in scenarios requiring spontaneous or unpredictable tasks demanded by users. This paper introduces a novel approach, FlowMind, leveraging the capabilities of Large Language Models (LLMs) such as Generative Pretrained Transformer (GPT), to…

    Submitted 16 March, 2024; originally announced April 2024.

    Comments: Published in ACM ICAIF 2023

  40. arXiv:2404.11887  [pdf, other]

    cs.AR

    EN-TensorCore: Advancing TensorCores Performance through Encoder-Based Methodology

    Authors: Qizhe Wu, Yuchen Gui, Zhichen Zeng, Xiaotian Wang, Huawen Liang, Xi Jin

    Abstract: Tensor computations, with matrix multiplication being the primary operation, serve as the fundamental basis for data analysis, physics, machine learning, and deep learning. As the scale and complexity of data continue to grow rapidly, the demand for tensor computations has also increased significantly. To meet this demand, several research institutions have started developing dedicated hardware fo…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 7 pages, 6 figures

  41. arXiv:2404.11000  [pdf, other]

    cs.RO

    OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding

    Authors: Edmond Tong, Anthony Opipari, Stanley Lewis, Zhen Zeng, Odest Chadwicke Jenkins

    Abstract: In order for robots to interact with objects effectively, they must understand the form and function of each object they encounter. Essentially, robots need to understand which actions each object affords, and where those affordances can be acted on. Robots are ultimately expected to operate in unstructured human environments, where the set of objects and affordances is not known to the robot befo…

    Submitted 25 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted to Vision-Language Models for Navigation and Manipulation (VLMNM) Workshop (ICRA 2024)

  42. arXiv:2404.10450  [pdf, ps, other]

    cs.LG

    Graph Neural Networks for Protein-Protein Interactions -- A Short Survey

    Authors: Mingda Xu, Peisheng Qian, Ziyuan Zhao, Zeng Zeng, Jianguo Chen, Weide Liu, Xulei Yang

    Abstract: Protein-protein interactions (PPIs) play key roles in a broad range of biological processes. Numerous strategies have been proposed for predicting PPIs, and among them, graph-based methods have demonstrated promising outcomes owing to the inherent graph structure of PPI networks. This paper reviews various graph-based methodologies, and discusses their applications in PPI prediction. We classify t…

    Submitted 16 April, 2024; originally announced April 2024.

  43. arXiv:2404.09931  [pdf, other]

    cs.CV cs.AI

    Zero-shot detection of buildings in mobile LiDAR using Language Vision Model

    Authors: June Moh Goo, Zichao Zeng, Jan Boehm

    Abstract: Recent advances have demonstrated that Language Vision Models (LVMs) surpass the existing State-of-the-Art (SOTA) in two-dimensional (2D) computer vision tasks, motivating attempts to apply LVMs to three-dimensional (3D) data. While LVMs are efficient and effective in addressing various downstream 2D vision tasks without training, they face significant challenges when it comes to point clouds, a r…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 7 pages, 6 figures, conference

  44. arXiv:2404.09921  [pdf, other]

    cs.CV cs.AI

    Zero-shot Building Age Classification from Facade Image Using GPT-4

    Authors: Zichao Zeng, June Moh Goo, Xinglei Wang, Bin Chi, Meihui Wang, Jan Boehm

    Abstract: A building's age of construction is crucial for supporting many geospatial applications. Much current research focuses on estimating building age from facade images using deep learning. However, building an accurate deep learning model requires a considerable amount of labelled training data, and the trained models often have geographical constraints. Recently, large pre-trained vision language mo…

    Submitted 15 April, 2024; originally announced April 2024.

  45. arXiv:2404.06003  [pdf, other]

    cs.CL cs.AI

    FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models

    Authors: Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang

    Abstract: The rapid development of large language model (LLM) evaluation methodologies and datasets has led to a profound challenge: integrating state-of-the-art evaluation techniques cost-effectively while ensuring reliability, reproducibility, and efficiency. Currently, there is a notable absence of a unified and adaptable framework that seamlessly integrates various evaluation approaches. Moreover, the r…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: We open-source all our code at: https://github.com/WisdomShell/FreeEval

  46. arXiv:2404.05880  [pdf, other]

    cs.CL

    Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge

    Authors: Weikai Lu, Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Zelin Chen, Huiping Zhuang, Cen Chen

    Abstract: Jailbreaking attacks can enable Large Language Models (LLMs) to bypass the safeguard and generate harmful content. Existing jailbreaking defense methods have failed to address the fundamental issue that harmful knowledge resides within the model, leading to potential jailbreak risks for LLMs. In this paper, we propose a novel defense method called Eraser, which mainly includes three goals: unlearn…

    Submitted 8 April, 2024; originally announced April 2024.

  47. arXiv:2404.04935  [pdf, other]

    cs.CV

    Anomaly Detection in Electrocardiograms: Advancing Clinical Diagnosis Through Self-Supervised Learning

    Authors: Aofan Jiang, Chaoqin Huang, Qing Cao, Yuchen Xu, Zi Zeng, Kang Chen, Ya Zhang, Yanfeng Wang

    Abstract: The electrocardiogram (ECG) is an essential tool for diagnosing heart disease, with computer-aided systems improving diagnostic accuracy and reducing healthcare costs. Despite advancements, existing systems often miss rare cardiac anomalies that could be precursors to serious, life-threatening issues or alterations in the cardiac macro/microstructure. We address this gap by focusing on self-superv…

    Submitted 7 April, 2024; originally announced April 2024.

  48. arXiv:2404.04815  [pdf, other]

    cs.PL cs.AR cs.LG

    Allo: A Programming Model for Composable Accelerator Design

    Authors: Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang

    Abstract: Special-purpose hardware accelerators are increasingly pivotal for sustaining performance improvements in emerging applications, especially as the benefits of technology scaling continue to diminish. However, designers currently lack effective tools and methodologies to construct complex, high-performance accelerator architectures in a productive manner. Existing high-level synthesis (HLS) tools o…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted to PLDI'24

  49. arXiv:2404.03929  [pdf, other]

    cs.DB

    SLSM : An Efficient Strategy for Lazy Schema Migration on Shared-Nothing Databases

    Authors: Zhilin Zeng, Hui Li, Xiyue Gao, Hui Zhang, Huiquan Zhang, Jiangtao Cui

    Abstract: By introducing intermediate states for metadata changes and ensuring that at most two versions of metadata exist in the cluster at the same time, shared-nothing databases are capable of making online, asynchronous schema changes. However, this method leads to delays in the deployment of new schemas since it requires waiting for massive data backfill. To shorten the service vacuum period before the…

    Submitted 5 April, 2024; originally announced April 2024.

  50. arXiv:2404.03635  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.MM

    WorDepth: Variational Language Prior for Monocular Depth Estimation

    Authors: Ziyao Zeng, Daniel Wang, Fengyu Yang, Hyoungseob Park, Yangchao Wu, Stefano Soatto, Byung-Woo Hong, Dong Lao, Alex Wong

    Abstract: Three-dimensional (3D) reconstruction from a single image is an ill-posed problem with inherent ambiguities, i.e. scale. Predicting a 3D scene from text description(s) is similarly ill-posed, i.e. spatial arrangements of objects described. We investigate the question of whether two inherently ambiguous modalities can be used in conjunction to produce metric-scaled reconstructions. To test this, we…

    Submitted 2 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.