Showing 1–50 of 346 results for author: Deng, Z

Searching in archive cs.
  1. arXiv:2407.00129  [pdf]

    eess.IV cs.AI cs.HC

    Multimodal Learning and Cognitive Processes in Radiology: MedGaze for Chest X-ray Scanpath Prediction

    Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Rishi Agrawal, Carol C. Wu, Hien Van Nguyen

    Abstract: Predicting human gaze behavior within computer vision is integral for developing interactive systems that can anticipate user attention, address fundamental questions in cognitive science, and hold implications for fields like human-computer interaction (HCI) and augmented/virtual reality (AR/VR) systems. Despite methodologies introduced for modeling human eye gaze behavior, applying these models…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Submitted to the Journal

  2. arXiv:2406.19686  [pdf]

    eess.IV cs.AI cs.CV cs.HC

    Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction

    Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Carol C. Wu, Hien Van Nguyen

    Abstract: Human-AI collaboration to identify and correct perceptual errors in chest radiographs has not been previously explored. This study aimed to develop a collaborative AI system, CoRaX, which integrates eye gaze data and radiology reports to enhance diagnostic accuracy in chest radiology by pinpointing perceptual errors and refining the decision-making process. Using public datasets REFLACX and EGD-CX…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Under Review in Journal

  3. arXiv:2406.17720  [pdf, other]

    cs.CV

    Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity

    Authors: Chih-Hsuan Yang, Benjamin Feuer, Zaki Jubery, Zi K. Deng, Andre Nakkab, Md Zahid Hasan, Shivani Chiranjeevi, Kelly Marshall, Nirmal Baishnab, Asheesh K Singh, Arti Singh, Soumik Sarkar, Nirav Merchant, Chinmay Hegde, Baskar Ganapathysubramanian

    Abstract: We introduce Arboretum, the largest publicly accessible dataset designed to advance AI for biodiversity applications. This dataset, curated from the iNaturalist community science platform and vetted by domain experts to ensure accuracy, includes 134.6 million images, surpassing existing datasets in scale by an order of magnitude. The dataset encompasses image-language paired data for a diverse set…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Preprint under review

  4. arXiv:2406.17100  [pdf, other]

    cs.CV

    Fine-tuning Diffusion Models for Enhancing Face Quality in Text-to-image Generation

    Authors: Zhenyi Liao, Qingsong Xie, Chen Chen, Hannan Lu, Zhijie Deng

    Abstract: Diffusion models (DMs) have achieved significant success in generating imaginative images given textual descriptions. However, they are likely to fall short when it comes to real-life scenarios with intricate details. The low-quality, unrealistic human faces in text-to-image generation are one of the most prominent issues, hindering the wide application of DMs in practice. Targeting addressing such…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Under review

  5. arXiv:2406.16253  [pdf, other]

    cs.CL

    LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

    Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

    Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as th…

    Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

  6. arXiv:2406.14066  [pdf, other]

    cs.AI cs.PF

    Optimizing Speculative Decoding for Serving Large Language Models Using Goodput

    Authors: Xiaoxuan Liu, Cade Daniel, Langxiang Hu, Woosuk Kwon, Zhuohan Li, Xiangxi Mo, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang

    Abstract: Reducing the inference latency of large language models (LLMs) is crucial, and speculative decoding (SD) stands out as one of the most effective techniques. Rather than letting the LLM generate all tokens directly, speculative decoding employs effective proxies to predict potential outputs, which are then verified by the LLM without compromising the generation quality. Yet, deploying SD in real on…

    Submitted 25 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.13233  [pdf, other]

    cs.AI

    AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models

    Authors: Zihao Zeng, Yibo Miao, Hongcheng Gao, Hao Zhang, Zhijie Deng

    Abstract: Mixture of experts (MoE) has become the standard for constructing production-level large language models (LLMs) due to its promise to boost model capacity without causing significant overheads. Nevertheless, existing MoE methods usually enforce a constant top-k routing for all tokens, which is arguably restrictive because various tokens (e.g., "<EOS>" vs. "apple") may require various numbers of ex…

    Submitted 19 June, 2024; originally announced June 2024.

  8. arXiv:2406.11310  [pdf]

    cs.CV cs.LG

    Federated Active Learning Framework for Efficient Annotation Strategy in Skin-lesion Classification

    Authors: Zhipeng Deng, Yuqiao Yang, Kenji Suzuki

    Abstract: Federated Learning (FL) enables multiple institutes to train models collaboratively without sharing private data. Current FL research focuses on communication efficiency, privacy protection, and personalization and assumes that the data of FL have already been ideally collected. In medical scenarios, however, data annotation demands both expertise and intensive labor, which is a critical problem i…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 14 pages, 3 figures

  9. arXiv:2406.11149  [pdf, other]

    cs.CL cs.CR

    GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory

    Authors: Wei Fan, Haoran Li, Zheye Deng, Weiqi Wang, Yangqiu Song

    Abstract: Privacy issues arise prominently during the inappropriate transmission of information between entities. Existing research primarily studies privacy by exploring various privacy attacks, defenses, and evaluations within narrowly predefined patterns, while neglecting that privacy is not an isolated, context-free concept limited to traditionally sensitive data (e.g., social security numbers), but int…

    Submitted 16 June, 2024; originally announced June 2024.

  10. arXiv:2406.10485  [pdf, other]

    cs.LG cs.CV

    A Label is Worth a Thousand Images in Dataset Distillation

    Authors: Tian Qin, Zhiwei Deng, David Alvarez-Melis

    Abstract: Data $\textit{quality}$ is a crucial factor in the performance of machine learning models, a principle that dataset distillation methods exploit by compressing training datasets into much smaller counterparts that maintain similar downstream performance. Understanding how and why data distillation methods work is vital not only for improving these methods but also for revealing fundamental charact…

    Submitted 14 June, 2024; originally announced June 2024.

  11. arXiv:2406.10237  [pdf]

    cs.IR cs.CE cs.CL cs.HC cs.LG

    Towards commands recommender system in BIM authoring tool using transformers

    Authors: Changyu Du, Zihan Deng, Stavros Nousias, André Borrmann

    Abstract: The complexity of BIM software presents significant barriers to the widespread adoption of BIM and model-based design within the Architecture, Engineering, and Construction (AEC) sector. End-users frequently express concerns regarding the additional effort required to create a sufficiently detailed BIM model when compared with conventional 2D drafting. This study explores the potential of sequenti…

    Submitted 2 June, 2024; originally announced June 2024.

  12. arXiv:2406.07327  [pdf, other]

    cs.AI cs.CL cs.LG

    3D-Properties: Identifying Challenges in DPO and Charting a Path Forward

    Authors: Yuzi Yan, Yibo Miao, Jialian Li, Yipin Zhang, Jian Xie, Zhijie Deng, Dong Yan

    Abstract: Aligning large language models (LLMs) with human preference has recently gained tremendous attention, with the canonical yet costly RLHF-PPO and the simple and straightforward Direct Preference Optimization (DPO) as two examples. Despite the efficiency, DPO has rarely been used in the state-of-the-art production-level LLMs, implying its potential pathologies. In this work, we revisit DPO with a comp…

    Submitted 11 June, 2024; originally announced June 2024.

  13. arXiv:2406.05768  [pdf, other]

    cs.CV cs.AI

    MLCM: Multistep Consistency Distillation of Latent Diffusion Model

    Authors: Qingsong Xie, Zhenyi Liao, Chen Chen, Zhijie Deng, Shixiang Tang, Haonan Lu

    Abstract: Distilling large latent diffusion models (LDMs) into ones that are fast to sample from is attracting growing research interest. However, the majority of existing methods face a dilemma where they either (i) depend on multiple individual distilled models for different sampling budgets, or (ii) sacrifice generation quality with limited (e.g., 2-4) and/or moderate (e.g., 5-8) sampling steps. To addre…

    Submitted 11 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  14. arXiv:2406.04284  [pdf, other]

    cs.LG

    What is Dataset Distillation Learning?

    Authors: William Yang, Ye Zhu, Zhiwei Deng, Olga Russakovsky

    Abstract: Dataset distillation has emerged as a strategy to overcome the hurdles associated with large datasets by learning a compact set of synthetic data that retains essential information from the original dataset. While distilled data can be used to train high performing models, little is understood about how the information is stored. In this study, we posit and answer three questions about the behavio…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  15. arXiv:2406.03470  [pdf, other]

    cs.NE cs.AI

    SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN

    Authors: Kang You, Zekai Xu, Chen Nie, Zhijie Deng, Qinghai Guo, Xiang Wang, Zhezhi He

    Abstract: Spiking neural network (SNN) has attracted great attention due to its characteristic of high efficiency and accuracy. Currently, the ANN-to-SNN conversion methods can obtain ANN on-par accuracy SNN with ultra-low latency (8 time-steps) in CNN structure on computer vision (CV) tasks. However, as Transformer-based networks have achieved prevailing precision on both CV and natural language processing…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: * These authors contributed equally to this work

  16. arXiv:2406.03239  [pdf, other]

    cs.CL

    Document-level Claim Extraction and Decontextualisation for Fact-Checking

    Authors: Zhenyun Deng, Michael Schlichtkrull, Andreas Vlachos

    Abstract: Selecting which claims to check is a time-consuming task for human fact-checkers, especially from documents consisting of multiple sentences and containing multiple claims. However, existing claim extraction approaches focus more on identifying and extracting claims from individual sentences, e.g., identifying whether a sentence contains a claim or the exact boundaries of the claim within a senten…

    Submitted 12 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024

  17. arXiv:2406.02903  [pdf, other]

    cs.CL

    Open Grounded Planning: Challenges and Benchmark Construction

    Authors: Shiguang Guo, Ziliang Deng, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun

    Abstract: The emergence of large language models (LLMs) has increasingly drawn attention to the use of LLMs for human-like planning. Existing work on LLM-based planning either focuses on leveraging the inherent language generation capabilities of LLMs to produce free-style plans, or employs reinforcement learning approaches to learn decision-making for a limited set of actions within restricted environments…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to the ACL 2024 main conference

  18. arXiv:2406.02630  [pdf, other]

    cs.CR cs.AI

    AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

    Authors: Zehang Deng, Yongjian Guo, Changzhou Han, Wanlun Ma, Junwu Xiong, Sheng Wen, Yang Xiang

    Abstract: An Artificial Intelligence (AI) agent is a software entity that autonomously performs tasks or makes decisions based on pre-defined objectives and data inputs. AI agents, capable of perceiving user inputs, reasoning and planning tasks, and executing actions, have seen remarkable advancements in algorithm development and task performance. However, the security challenges they pose remain under-expl…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: ACM Computing Survey

  19. arXiv:2405.17814  [pdf, other]

    cs.CV cs.AI

    FAIntbench: A Holistic and Precise Benchmark for Bias Evaluation in Text-to-Image Models

    Authors: Hanjun Luo, Ziye Deng, Ruizhe Chen, Zuozhu Liu

    Abstract: The rapid development and reduced barriers to entry for Text-to-Image (T2I) models have raised concerns about the biases in their outputs, but existing research lacks a holistic definition and evaluation framework of biases, limiting the enhancement of debiasing techniques. To address this issue, we introduce FAIntbench, a holistic and precise benchmark for biases in T2I models. In contrast to exi…

    Submitted 8 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  20. arXiv:2405.16334  [pdf, other]

    cs.AI

    Devil's Advocate: Anticipatory Reflection for LLM Agents

    Authors: Haoyu Wang, Tao Li, Zhiwei Deng, Dan Roth, Yang Li

    Abstract: In this work, we introduce a novel approach that equips LLM agents with introspection, enhancing consistency and adaptability in solving complex tasks. Our approach prompts LLM agents to decompose a given task into manageable subtasks (i.e., to make a plan), and to continuously introspect upon the suitability and results of their actions. %; and when necessary, to explore ``the road not taken.'' W…

    Submitted 20 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 13 pages, 6 figures

  21. arXiv:2405.15258  [pdf, other]

    cs.CR

    Leakage-Resilient and Carbon-Neutral Aggregation Featuring the Federated AI-enabled Critical Infrastructure

    Authors: Zehang Deng, Ruoxi Sun, Minhui Xue, Sheng Wen, Seyit Camtepe, Surya Nepal, Yang Xiang

    Abstract: AI-enabled critical infrastructures (ACIs) integrate artificial intelligence (AI) technologies into various essential systems and services that are vital to the functioning of society, offering significant implications for efficiency, security and resilience. While adopting decentralized AI approaches (such as federated learning technology) in ACIs is plausible, private and sensitive data are stil…

    Submitted 24 May, 2024; originally announced May 2024.

  22. arXiv:2405.13199  [pdf, ps, other]

    eess.IV cs.CV

    TauAD: MRI-free Tau Anomaly Detection in PET Imaging via Conditioned Diffusion Models

    Authors: Lujia Zhong, Shuo Huang, Jiaxin Yue, Jianwei Zhang, Zhiwei Deng, Wenhao Chi, Yonggang Shi

    Abstract: The emergence of tau PET imaging over the last decade has enabled Alzheimer's disease (AD) researchers to examine tau pathology in vivo and more effectively characterize the disease trajectories of AD. Current tau PET analysis methods, however, typically perform inferences on large cortical ROIs and are limited in the detection of localized tau pathology that varies across subjects. Furthermore, a…

    Submitted 21 May, 2024; originally announced May 2024.

  23. arXiv:2405.12843  [pdf, other]

    cs.CY cs.LG

    OpenCarbonEval: A Unified Carbon Emission Estimation Framework in Large-Scale AI Models

    Authors: Zhaojian Yu, Yinghao Wu, Zhuotao Deng, Yansong Tang, Xiao-Ping Zhang

    Abstract: In recent years, large-scale auto-regressive models have made significant progress in various tasks, such as text or video generation. However, the environmental impact of these models has been largely overlooked, with a lack of assessment and analysis of their carbon footprint. To address this gap, we introduce OpenCarbonEval, a unified framework for integrating large-scale models across diverse…

    Submitted 21 May, 2024; originally announced May 2024.

  24. arXiv:2405.11442  [pdf, other]

    cs.CV

    Unifying 3D Vision-Language Understanding via Promptable Queries

    Authors: Ziyu Zhu, Zhuofan Zhang, Xiaojian Ma, Xuesong Niu, Yixin Chen, Baoxiong Jia, Zhidong Deng, Siyuan Huang, Qing Li

    Abstract: A unified model for 3D vision-language (3D-VL) understanding is expected to take various scene representations and perform a wide range of tasks in a 3D scene. However, a considerable gap exists between existing methods and such a unified model, due to the independent application of representation and insufficient exploration of 3D multi-task training. In this paper, we introduce PQ3D, a unified m…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: Project page: https://pq3d.github.io

  25. arXiv:2405.10934  [pdf, other]

    cs.CV

    Reconstruction of Manipulated Garment with Guided Deformation Prior

    Authors: Ren Li, Corentin Dumery, Zhantao Deng, Pascal Fua

    Abstract: Modeling the shape of garments has received much attention, but most existing approaches assume the garments to be worn by someone, which constrains the range of shapes they can assume. In this work, we address shape recovery when garments are being manipulated instead of worn, which gives rise to an even larger range of possible shapes. To this end, we leverage the implicit sewing patterns (ISP)…

    Submitted 17 May, 2024; originally announced May 2024.

  26. arXiv:2405.08051  [pdf, ps, other]

    cs.CC

    P=NP

    Authors: Zikang Deng

    Abstract: This paper investigates an extremely classic NP-complete problem: How to determine if a graph G, where each vertex has a degree of at most 4, can be 3-colorable (The research in this paper focuses on graphs G that satisfy the condition where the degree of each vertex does not exceed 4. To conserve space, it is assumed throughout the paper that graph G meets this condition by default.). The author h…

    Submitted 18 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

  27. arXiv:2405.05615  [pdf, other]

    cs.CV cs.CL cs.LG

    Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning

    Authors: Shibo Jie, Yehui Tang, Ning Ding, Zhi-Hong Deng, Kai Han, Yunhe Wang

    Abstract: Current solutions for efficiently constructing large vision-language (VL) models follow a two-step paradigm: projecting the output of pre-trained vision encoders to the input space of pre-trained language models as visual prompts; and then transferring the models to downstream VL tasks via end-to-end parameter-efficient fine-tuning (PEFT). However, this paradigm still exhibits inefficiency since i…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024

  28. arXiv:2405.05567  [pdf, other]

    cs.IT

    Perfect Subset Privacy in Polynomial Computation

    Authors: Zirui Deng, Vinayak Ramkumar, Netanel Raviv

    Abstract: Delegating large-scale computations to service providers is a common practice which raises privacy concerns. This paper studies information-theoretic privacy-preserving delegation of data to a service provider, who may further delegate the computation to auxiliary worker nodes, in order to compute a polynomial over that data at a later point in time. We study techniques which are compatible with r…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to ISIT 2024

  29. arXiv:2405.03234  [pdf, other]

    cs.HC cs.LG

    A Reliable Framework for Human-in-the-Loop Anomaly Detection in Time Series

    Authors: Ziquan Deng, Xiwei Xuan, Kwan-Liu Ma, Zhaodan Kong

    Abstract: Time series anomaly detection is a critical machine learning task for numerous applications, such as finance, healthcare, and industrial systems. However, even high-performing models may exhibit potential issues such as biases, leading to unreliable outcomes and misplaced confidence. While model explanation techniques, particularly visual explanations, offer valuable insights to detect such issues…

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: The manuscript is currently under review

  30. arXiv:2404.14215  [pdf, other]

    cs.CL

    Text-Tuple-Table: Towards Information Integration in Text-to-Table Generation via Global Tuple Extraction

    Authors: Zheye Deng, Chunkit Chan, Weiqi Wang, Yuxi Sun, Wei Fan, Tianshi Zheng, Yauwai Yim, Yangqiu Song

    Abstract: The task of condensing large chunks of textual information into concise and structured tables has gained attention recently due to the emergence of Large Language Models (LLMs) and their potential benefit for downstream tasks, such as text summarization and text mining. Previous approaches often generate tables that directly replicate information from the text, limiting their applicability in broa…

    Submitted 22 April, 2024; originally announced April 2024.

  31. arXiv:2404.13964  [pdf, other]

    cs.LG econ.GN stat.ME

    An Economic Solution to Copyright Challenges of Generative AI

    Authors: Jiachen T. Wang, Zhun Deng, Hiroaki Chiba-Okabe, Boaz Barak, Weijie J. Su

    Abstract: Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media. There is growing concern that such systems may infringe on the copyright interests of training data contributors. To address the copyright challenges of generative AI, we propose a framework that compensates copyright owners proportionally to their cont…

    Submitted 24 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  32. arXiv:2404.13627  [pdf, other]

    cs.CL cs.AI

    NegotiationToM: A Benchmark for Stress-testing Machine Theory of Mind on Negotiation Surrounding

    Authors: Chunkit Chan, Cheng Jiayang, Yauwai Yim, Zheye Deng, Wei Fan, Haoran Li, Xin Liu, Hongming Zhang, Weiqi Wang, Yangqiu Song

    Abstract: Large Language Models (LLMs) have sparked substantial interest and debate concerning their potential emergence of Theory of Mind (ToM) ability. Theory of mind evaluations currently focus on testing models using machine-generated data or game settings prone to shortcuts and spurious correlations, which lacks evaluation of machine ToM ability in real-world human interaction scenarios. This poses a…

    Submitted 21 April, 2024; originally announced April 2024.

  33. arXiv:2404.11605  [pdf, other]

    cs.CV cs.AI cs.RO

    VG4D: Vision-Language Model Goes 4D Video Recognition

    Authors: Zhichao Deng, Xiangtai Li, Xia Li, Yunhai Tong, Shen Zhao, Mengyuan Liu

    Abstract: Understanding the real world through point cloud video is a crucial aspect of robotics and autonomous driving systems. However, prevailing methods for 4D point cloud recognition have limitations due to sensor resolution, which leads to a lack of detailed information. Recent advances have shown that Vision-Language Models (VLM) pre-trained on web-scale text-image datasets can learn fine-grained vis…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: ICRA 2024

  34. arXiv:2404.10942  [pdf, other]

    cs.LG cs.AI cs.CY stat.ME

    What Hides behind Unfairness? Exploring Dynamics Fairness in Reinforcement Learning

    Authors: Zhihong Deng, Jing Jiang, Guodong Long, Chengqi Zhang

    Abstract: In sequential decision-making problems involving sensitive attributes like race and gender, reinforcement learning (RL) agents must carefully consider long-term fairness while maximizing returns. Recent works have proposed many different types of fairness notions, but how unfairness arises in RL problems remains unclear. In this paper, we address this gap in the literature by investigating the sou…

    Submitted 28 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: 13 pages, 9 figures, accepted by IJCAI 2024

  35. arXiv:2404.06939  [pdf, other]

    cs.ET cs.AI

    Fast System Technology Co-Optimization Framework for Emerging Technology Based on Graph Neural Networks

    Authors: Tianliang Ma, Guangxi Fan, Xuguang Sun, Zhihui Deng, Kainlu Low, Leilai Shao

    Abstract: This paper proposes a fast system technology co-optimization (STCO) framework that optimizes power, performance, and area (PPA) for next-generation IC design, addressing the challenges and opportunities presented by novel materials and device architectures. We focus on accelerating the technology level of STCO using AI techniques, by employing graph neural network (GNN)-based approaches for both T…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Accepted by the 61st Design Automation Conference (DAC)

  36. arXiv:2404.04943  [pdf]

    cs.LG cs.AI cs.AR

    Chiplet Placement Order Exploration Based on Learning to Rank with Graph Representation

    Authors: Zhihui Deng, Yuanyuan Duan, Leilai Shao, Xiaolei Zhu

    Abstract: Chiplet-based systems, integrating various silicon dies manufactured at different integrated circuit technology nodes on a carrier interposer, have garnered significant attention in recent years due to their cost-effectiveness and competitive performance. The widespread adoption of reinforcement learning as a sequential placement method has introduced a new challenge in determining the optimal pla…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 6 pages, 8 figures and 6 tables, accepted by the Conference ISEDA

  37. arXiv:2404.04876  [pdf, other]

    cs.CV

    HiLo: Detailed and Robust 3D Clothed Human Reconstruction with High-and Low-Frequency Information of Parametric Models

    Authors: Yifan Yang, Dong Liu, Shuhai Zhang, Zeshuai Deng, Zixiong Huang, Mingkui Tan

    Abstract: Reconstructing 3D clothed human involves creating a detailed geometry of individuals in clothing, with applications ranging from virtual try-on, movies, to games. To enable practical and widespread applications, recent advances propose to generate a clothed human from an RGB image. However, they struggle to reconstruct detailed and robust avatars simultaneously. We empirically find that the high-f…

    Submitted 19 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Accepted Paper

  38. arXiv:2404.04140  [pdf, other]

    cs.CV cs.LG

    Improving Detection in Aerial Images by Capturing Inter-Object Relationships

    Authors: Botao Ren, Botian Xu, Yifan Pu, Jingyi Wang, Zhidong Deng

    Abstract: In many image domains, the spatial distribution of objects in a scene exhibits meaningful patterns governed by their semantic relationships. In most modern detection pipelines, however, the detection proposals are processed independently, overlooking the underlying relationships between objects. In this work, we introduce a transformer-based approach to capture these inter-object relationships to…

    Submitted 5 April, 2024; originally announced April 2024.

  39. arXiv:2404.02885  [pdf, other]

    cs.CV

    PoCo: Point Context Cluster for RGBD Indoor Place Recognition

    Authors: Jing Liang, Zhuo Deng, Zheming Zhou, Omid Ghasemalizadeh, Dinesh Manocha, Min Sun, Cheng-Hao Kuo, Arnie Sen

    Abstract: We present a novel end-to-end algorithm (PoCo) for the indoor RGB-D place recognition task, aimed at identifying the most likely match for a given query frame within a reference database. The task presents inherent challenges attributed to the constrained field of view and limited range of perception sensors. We propose a new network architecture, which generalizes the recent Context of Clusters (…

    Submitted 3 April, 2024; originally announced April 2024.

  40. arXiv:2404.01618  [pdf, other]

    cs.RO

    Multi-Robot Collaborative Navigation with Formation Adaptation

    Authors: Zihao Deng, Peng Gao, Williard Joshua Jose, Hao Zhang

    Abstract: Multi-robot collaborative navigation is an essential ability where teamwork and synchronization are keys. In complex and uncertain environments, adaptive formation is vital, as rigid formations prove to be inadequate. The ability of robots to dynamically adjust their formation enables navigation through unpredictable spaces, maintaining cohesion, and effectively responding to environmental challen…

    Submitted 1 April, 2024; originally announced April 2024.

  41. arXiv:2404.00312  [pdf, other]

    cs.CV cs.AI

    Bayesian Exploration of Pre-trained Models for Low-shot Image Classification

    Authors: Yibo Miao, Yu Lei, Feng Zhou, Zhijie Deng

    Abstract: Low-shot image classification is a fundamental task in computer vision, and the emergence of large-scale vision-language models such as CLIP has greatly advanced the forefront of research in this field. However, most existing CLIP-based methods lack the flexibility to effectively incorporate other pre-trained models that encompass knowledge distinct from CLIP. To bridge the gap, this work proposes…

    Submitted 30 March, 2024; originally announced April 2024.

  42. arXiv:2403.16361  [pdf, other]

    eess.IV cs.CV

    RSTAR: Rotational Streak Artifact Reduction in 4D CBCT using Separable and Circular Convolutions

    Authors: Ziheng Deng, Hua Chen, Haibo Hu, Zhiyong Xu, Tianling Lyu, Yan Xi, Yang Chen, Jun Zhao

    Abstract: Four-dimensional cone-beam computed tomography (4D CBCT) provides respiration-resolved images and can be used for image-guided radiation therapy. However, the ability to reveal respiratory motion comes at the cost of image artifacts. As raw projection data are sorted into multiple respiratory phases, there is a limited number of cone-beam projections available for image reconstruction. Consequentl…

    Submitted 24 March, 2024; originally announced March 2024.

  43. arXiv:2403.15088  [pdf, other]

    cs.CL

    CHisIEC: An Information Extraction Corpus for Ancient Chinese History

    Authors: Xuemei Tang, Zekun Deng, Qi Su, Hao Yang, Jun Wang

    Abstract: Natural Language Processing (NLP) plays a pivotal role in the realm of Digital Humanities (DH) and serves as the cornerstone for advancing the structural analysis of historical and cultural heritage texts. This is particularly true for the domains of named entity recognition (NER) and relation extraction (RE). In our commitment to expediting ancient history and culture, we present the ``Chinese Hi…

    Submitted 20 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 11 pages, 6 tables, 3 figures

  44. arXiv:2403.12719  [pdf, other]

    cs.LG

    Bilevel Hypergraph Networks for Multi-Modal Alzheimer's Diagnosis

    Authors: Angelica I. Aviles-Rivero, Chun-Wun Cheng, Zhongying Deng, Zoe Kourtzi, Carola-Bibiane Schönlieb

    Abstract: Early detection of Alzheimer's disease's precursor stages is imperative for significantly enhancing patient outcomes and quality of life. This challenge is tackled through a semi-supervised multi-modal diagnosis framework. In particular, we introduce a new hypergraph framework that enables higher-order relations between multi-modal data, while utilising minimal labels. We first introduce a bilevel…

    Submitted 19 March, 2024; originally announced March 2024.

  45. arXiv:2403.11586  [pdf, other]

    cs.CV

    DynoSurf: Neural Deformation-based Temporally Consistent Dynamic Surface Reconstruction

    Authors: Yuxin Yao, Siyu Ren, Junhui Hou, Zhi Deng, Juyong Zhang, Wenping Wang

    Abstract: This paper explores the problem of reconstructing temporally consistent surfaces from a 3D point cloud sequence without correspondence. To address this challenging task, we propose DynoSurf, an unsupervised learning framework integrating a template surface representation with a learnable deformation field. Specifically, we design a coarse-to-fine strategy for learning the template surface based on…

    Submitted 18 March, 2024; originally announced March 2024.

  46. arXiv:2403.11101  [pdf, other]

    cs.CV

    Hierarchical Generative Network for Face Morphing Attacks

    Authors: Zuyuan He, Zongyong Deng, Qiaoyun He, Qijun Zhao

    Abstract: Face morphing attacks circumvent face recognition systems (FRSs) by creating a morphed image that contains multiple identities. However, existing face morphing attack methods either sacrifice image quality or compromise the identity preservation capability. Consequently, these attacks fail to bypass FRSs verification well while still managing to deceive human observers. These methods typically rel…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by FG2024

  47. arXiv:2403.05006  [pdf, ps, other]

    cs.LG cs.AI stat.ME stat.ML

    Provable Multi-Party Reinforcement Learning with Diverse Human Feedback

    Authors: Huiying Zhong, Zhun Deng, Weijie J. Su, Zhiwei Steven Wu, Linjun Zhang

    Abstract: Reinforcement learning with human feedback (RLHF) is an emerging paradigm to align models with human preferences. Typically, RLHF aggregates preferences from multiple individuals who have diverse viewpoints that may conflict with each other. Our work initiates the theoretical study of multi-party RLHF that explicitly models the diverse preferences of multiple individuals. We show how trad…

    Submitted 7 March, 2024; originally announced March 2024.

  48. arXiv:2403.04770  [pdf, other]

    cs.CL cs.LG

    Social Orientation: A New Feature for Dialogue Analysis

    Authors: Todd Morrill, Zhaoyuan Deng, Yanda Chen, Amith Ananthram, Colin Wayne Leach, Kathleen McKeown

    Abstract: There are many settings where it is useful to predict and explain the success or failure of a dialogue. Circumplex theory from psychology models the social orientations (e.g., Warm-Agreeable, Arrogant-Calculating) of conversation participants and can be used to predict and explain the outcome of social interactions. Our work is novel in its systematic application of social orientation tags to mode…

    Submitted 25 February, 2024; originally announced March 2024.

    Comments: Accepted to LREC-COLING 2024

  49. arXiv:2403.01505  [pdf, other]

    cs.CV

    SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation

    Authors: Hongjian Liu, Qingsong Xie, Zhijie Deng, Chen Chen, Shixiang Tang, Fueyang Fu, Zheng-jun Zha, Haonan Lu

    Abstract: The iterative sampling procedure employed by diffusion models (DMs) often leads to significant inference latency. To address this, we propose Stochastic Consistency Distillation (SCott) to enable accelerated text-to-image generation, where high-quality generations can be achieved with just 1-2 sampling steps, and further improvements can be obtained by adding additional steps. In contrast to vanil…

    Submitted 15 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: 22 pages, 16 figures

  50. arXiv:2403.00999  [pdf, other]

    cs.LG

    Distributional Dataset Distillation with Subtask Decomposition

    Authors: Tian Qin, Zhiwei Deng, David Alvarez-Melis

    Abstract: What does a neural network learn when training from a task-specific dataset? Synthesizing this knowledge is the central idea behind Dataset Distillation, which recent work has shown can be used to compress large datasets into a small set of input-label pairs (prototypes) that capture essential aspects of the original dataset. In this paper, we make the key observation that existing meth…

    Submitted 1 March, 2024; originally announced March 2024.