Showing 1–50 of 351 results for author: Chang, H

  1. arXiv:2406.17442  [pdf, other]

    cs.CV

    Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model

    Authors: Zhuoyuan Li, Yubo Ai, Jiahao Lu, ChuXin Wang, Jiacheng Deng, Hanzhi Chang, Yanzhe Liang, Wenfei Yang, Shifeng Zhang, Tianzhu Zhang

    Abstract: Transformers have demonstrated impressive results for 3D point cloud semantic segmentation. However, the quadratic complexity of transformers makes computation costly, limiting the number of points that can be processed simultaneously and impeding the modeling of long-range dependencies. Drawing inspiration from the great potential of recent state space models (SSM) for long sequence modeling, w…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.17289  [pdf, other]

    cs.IR cs.AI

    Hyperbolic Knowledge Transfer in Cross-Domain Recommendation System

    Authors: Xin Yang, Heng Chang, Zhijian La, Jinze Yang, Xingrun Li, Yu Lu, Shuaiqiang Wang, Dawei Yin, Erxue Min

    Abstract: Cross-Domain Recommendation (CDR) seeks to utilize knowledge from different domains to alleviate the problem of data sparsity in the target recommendation domain, and it has been gaining more attention in recent years. Although there have been notable advancements in this area, most current methods represent users and items in Euclidean space, which is not ideal for handling long-tail distributed…

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2406.16357  [pdf, other]

    cs.LG cs.AI cs.SI

    Towards Lightweight Graph Neural Network Search with Curriculum Graph Sparsification

    Authors: Beini Xie, Heng Chang, Ziwei Zhang, Zeyang Zhang, Simin Wu, Xin Wang, Yuan Meng, Wenwu Zhu

    Abstract: Graph Neural Architecture Search (GNAS) has achieved superior performance on various graph-structured tasks. However, existing GNAS studies overlook the applications of GNAS in resource-constrained scenarios. This paper proposes to design a joint graph data and architecture mechanism, which identifies important sub-architectures via the valuable graph data. To search for optimal lightweight Graph N…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024. The two first authors made equal contributions

  4. arXiv:2406.11813  [pdf, other]

    cs.CL

    How Do Large Language Models Acquire Factual Knowledge During Pretraining?

    Authors: Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo

    Abstract: Despite the recent observation that large language models (LLMs) can store substantial factual knowledge, there is a limited understanding of the mechanisms of how they acquire factual knowledge through pretraining. This work addresses this gap by studying how LLMs acquire factual knowledge during pretraining. The findings reveal several important insights into the dynamics of factual knowledge ac…

    Submitted 17 June, 2024; originally announced June 2024.

    ACM Class: I.2.7

  5. arXiv:2406.07837  [pdf, other]

    cs.RO cs.AI

    Scaling Manipulation Learning with Visual Kinematic Chain Prediction

    Authors: Xinyu Zhang, Yuhan Liu, Haonan Chang, Abdeslam Boularias

    Abstract: Learning general-purpose models from diverse datasets has achieved great success in machine learning. In robotics, however, existing methods in multi-task learning are typically constrained to a single robot and workspace, while recent work such as RT-X requires a non-trivial action normalization procedure to manually bridge the gap between different action spaces in diverse environments. In this…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Submitted to CoRL 2024

  6. arXiv:2406.07735  [pdf, other]

    cs.CL cs.LG

    REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy

    Authors: Haw-Shiuan Chang, Nanyun Peng, Mohit Bansal, Anil Ramakrishna, Tagyoung Chung

    Abstract: Decoding methods for large language models (LLMs) usually struggle with the tradeoff between ensuring factuality and maintaining diversity. For example, a higher p threshold in the nucleus (top-p) sampling increases the diversity but decreases the factuality, and vice versa. In this paper, we propose REAL (Residual Entropy from Asymptotic Line) sampling, a decoding method that achieves improved fa…

    Submitted 11 June, 2024; originally announced June 2024.

  7. arXiv:2406.07549  [pdf, other]

    cs.RO

    A3VLM: Actionable Articulation-Aware Vision Language Model

    Authors: Siyuan Huang, Haonan Chang, Yuhan Liu, Yimeng Zhu, Hao Dong, Peng Gao, Abdeslam Boularias, Hongsheng Li

    Abstract: Vision Language Models (VLMs) have received significant attention in recent years in the robotics community. VLMs are shown to be able to perform complex visual reasoning and scene understanding tasks, which makes them regarded as a potential universal solution for general robotics problems such as manipulation and navigation. However, previous VLMs for robotics such as RT-1, RT-2, and ManipLLM ha…

    Submitted 13 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  8. arXiv:2406.06650  [pdf, other]

    eess.IV cs.CV

    Predicting the risk of early-stage breast cancer recurrence using H&E-stained tissue images

    Authors: Geongyu Lee, Joonho Lee, Tae-Yeong Kwak, Sun Woo Kim, Youngmee Kwon, Chungyeul Kim, Hyeyoon Chang

    Abstract: Accurate prediction of the likelihood of recurrence is important in the selection of postoperative treatment for patients with early-stage breast cancer. In this study, we investigated whether deep learning algorithms can predict patients' risk of recurrence by analyzing the pathology images of their cancer histology. A total of 125 hematoxylin and eosin stained breast cancer whole slide images la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

  9. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an…

    Submitted 8 June, 2024; originally announced June 2024.

  10. arXiv:2406.00276  [pdf]

    cs.LG cs.AI cs.CE physics.data-an

    Non-destructive Degradation Pattern Decoupling for Ultra-early Battery Prototype Verification Using Physics-informed Machine Learning

    Authors: Shengyu Tao, Mengtian Zhang, Zixi Zhao, Haoyang Li, Ruifei Ma, Yunhong Che, Xin Sun, Lin Su, Xiangyu Chen, Zihao Zhou, Heng Chang, Tingwei Cao, Xiao Xiao, Yaojun Liu, Wenjun Yu, Zhongling Xu, Yang Li, Han Hao, Xuan Zhang, Xiaosong Hu, Guangmin Zhou

    Abstract: Manufacturing complexities and uncertainties have impeded the transition from material prototypes to commercial batteries, making prototype verification critical to quality assessment. A fundamental challenge involves deciphering intertwined chemical processes to characterize degradation patterns and their quantitative relationship with battery performance. Here we show that a physics-informed mac…

    Submitted 31 May, 2024; originally announced June 2024.

    ACM Class: J.2; G.3

  11. arXiv:2405.20596  [pdf, other]

    cs.CV cs.LG

    Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation

    Authors: Jiachen Liang, Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Traditional semi-supervised learning (SSL) assumes that the feature distributions of labeled and unlabeled data are consistent, which rarely holds in realistic scenarios. In this paper, we propose a novel SSL setting, where unlabeled samples are drawn from a mixed distribution that deviates from the feature distribution of labeled samples. Under this setting, previous SSL methods tend to predict wr…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 10 pages; Accepted by NeurIPS 2023

  12. arXiv:2405.20202  [pdf, other]

    cs.AI

    One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments

    Authors: Ke Yi, Yuhui Xu, Heng Chang, Chen Tang, Yuan Meng, Tong Zhang, Jia Li

    Abstract: Large Language Models (LLMs) have advanced rapidly but face significant memory demands. While quantization has shown promise for LLMs, current methods typically require lengthy training to alleviate the performance degradation from quantization loss. However, deploying LLMs across diverse scenarios with different resource constraints, e.g., servers and personal computers, requires repeated trainin…

    Submitted 30 May, 2024; originally announced May 2024.

  13. arXiv:2405.17913  [pdf, other]

    cs.CV cs.AI

    OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

    Authors: Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang

    Abstract: Open-Vocabulary Detection (OVD) aims to detect objects from novel categories beyond the base categories on which the detector is trained. However, existing open-vocabulary detectors trained on known category data tend to assign higher confidence to trained categories and confuse novel categories with background. To resolve this, we propose OV-DQUO, an Open-Vocabulary DETR with …

    Submitted 28 May, 2024; originally announced May 2024.

  14. arXiv:2405.16273  [pdf, other]

    cs.CV

    M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation

    Authors: Mingshuang Luo, Ruibing Hou, Hong Chang, Zimo Liu, Yaowei Wang, Shiguang Shan

    Abstract: This paper presents M$^3$GPT, an advanced Multimodal, Multitask framework for Motion comprehension and generation. M$^3$GPT operates on three fundamental principles. The first focuses on creating a unified representation space for various motion-relevant modalities. We employ discrete vector quantization for multimodal control and generation signals, such as text,…

    Submitted 29 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 18 pages, 6 figures

  15. arXiv:2405.15304  [pdf, other]

    cs.LG cs.CV

    Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient

    Authors: Yongliang Wu, Shiji Zhou, Mingzhuo Yang, Lianzhe Wang, Wenbo Zhu, Heng Chang, Xiao Zhou, Xu Yang

    Abstract: Current text-to-image diffusion models have achieved groundbreaking results in image generation tasks. However, the unavoidable inclusion of sensitive information during pre-training introduces significant risks such as copyright infringement and privacy violations in the generated images. Machine Unlearning (MU) provides an effective way to the sensitive concepts captured by the model, has been sh…

    Submitted 24 May, 2024; originally announced May 2024.

  16. arXiv:2405.12656  [pdf, other]

    cs.CL cs.AI

    Retrieval-Augmented Language Model for Extreme Multi-Label Knowledge Graph Link Prediction

    Authors: Yu-Hsiang Lin, Huang-Ting Shieh, Chih-Yu Liu, Kuang-Ting Lee, Hsiao-Cheng Chang, Jing-Lun Yang, Yu-Sheng Lin

    Abstract: Extrapolation in Large language models (LLMs) for open-ended inquiry encounters two pivotal issues: (1) hallucination and (2) expensive training costs. These issues present challenges for LLMs in specialized domains and personalized data, requiring truthful responses and low fine-tuning costs. Existing works attempt to tackle the problem by augmenting the input of a smaller language model with inf…

    Submitted 21 May, 2024; originally announced May 2024.

  17. arXiv:2405.05248  [pdf, other]

    cs.CL cs.AI cs.MA

    LLMs with Personalities in Multi-issue Negotiation Games

    Authors: Sean Noh, Ho-Chun Herbert Chang

    Abstract: Powered by large language models (LLMs), AI agents have become capable of many human tasks. Using the most canonical definitions of the Big Five personality, we measure the ability of LLMs to negotiate within a game-theoretical framework, as well as methodological challenges to measuring notions of fairness and risk. Simulations (n=1,500) for both single-issue and multi-issue negotiation reveal in…

    Submitted 8 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  18. arXiv:2405.01610  [pdf, other]

    cs.CL cs.IR

    Automating the Analysis of Public Saliency and Attitudes towards Biodiversity from Digital Media

    Authors: Noah Giebink, Amrita Gupta, Diogo Verìssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett Dickson, Alex Bowmer, Jonathan Baillie

    Abstract: Measuring public attitudes toward wildlife provides crucial insights into our relationship with nature and helps monitor progress toward Global Biodiversity Framework targets. Yet, conducting such assessments at a global scale is challenging. Manually curating search terms for querying news and social media is tedious, costly, and can lead to biased results. Raw news and social media data returned…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: v0.1, 21 pages with 10 figures

  19. arXiv:2404.17486  [pdf, other]

    cs.CV

    TextGaze: Gaze-Controllable Face Generation with Natural Language

    Authors: Hengfei Wang, Zhongqun Zhang, Yihua Cheng, Hyung Jin Chang

    Abstract: Generating face images with specific gaze information has attracted considerable attention. Existing approaches typically input gaze values directly for face generation, which is unnatural and requires annotated gaze datasets for training, thereby limiting its application. In this paper, we present a novel gaze-controllable face generation task. Our approach inputs textual descriptions that describ…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Under review

  20. arXiv:2404.09696  [pdf, other]

    cs.CL cs.AI cs.ET

    Are Large Language Models Reliable Argument Quality Annotators?

    Authors: Nailia Mirzakhmedova, Marcel Gohsen, Chia Hao Chang, Benno Stein

    Abstract: Evaluating the quality of arguments is a crucial aspect of any system leveraging argument mining. However, it is a challenge to obtain reliable and consistent annotations regarding argument quality, as this usually requires domain-specific expertise of the annotators. Even among experts, the assessment of argument quality is often inconsistent due to the inherent subjectivity of this task. In this…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures, 5 tables

  21. arXiv:2404.09507  [pdf, other]

    cs.CV

    Clothes-Changing Person Re-Identification with Feasibility-Aware Intermediary Matching

    Authors: Jiahe Zhao, Ruibing Hou, Hong Chang, Xinqian Gu, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Current clothes-changing person re-identification (re-id) approaches usually perform retrieval based on clothes-irrelevant features, while neglecting the potential of clothes-relevant features. However, we observe that relying solely on clothes-irrelevant features for clothes-changing re-id is limited, since they often lack adequate identity information and suffer from large intra-class variations…

    Submitted 15 April, 2024; originally announced April 2024.

  22. arXiv:2404.09385  [pdf, other]

    eess.AS cs.CL eess.SP

    A Large-Scale Evaluation of Speech Foundation Models

    Authors: Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee

    Abstract: The foundation model paradigm leverages a shared foundation model to achieve state-of-the-art (SOTA) performance for various tasks, requiring minimal downstream-specific modeling and data annotation. This approach has proven crucial in the field of Natural Language Processing (NLP). However, the speech processing community lacks a similar setup to explore the paradigm systematically. In this work,…

    Submitted 29 May, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: The extended journal version for SUPERB and SUPERB-SG. Published in IEEE/ACM TASLP. The Arxiv version is preferred

  23. arXiv:2404.06903  [pdf, other]

    cs.CV cs.AI

    DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

    Authors: Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi

    Abstract: The increasing demand for virtual reality applications has highlighted the significance of crafting immersive 3D assets. We present a text-to-3D 360$^{\circ}$ scene generation pipeline that facilitates the creation of comprehensive 360$^{\circ}$ scenes for in-the-wild environments in a matter of minutes. Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement…

    Submitted 10 April, 2024; originally announced April 2024.

  24. arXiv:2404.04979  [pdf, other]

    econ.EM cs.LG

    CAVIAR: Categorical-Variable Embeddings for Accurate and Robust Inference

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: Social science research often hinges on the relationship between categorical variables and outcomes. We introduce CAVIAR, a novel method for embedding categorical variables that assume values in a high-dimensional ambient space but are sampled from an underlying manifold. Our theoretical and numerical analyses outline challenges posed by such categorical variables in causal inference. Specifically…

    Submitted 11 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

  25. arXiv:2404.04436  [pdf, other]

    cs.AI

    AI Knowledge and Reasoning: Emulating Expert Creativity in Scientific Research

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: We investigate whether modern AI can emulate expert creativity in complex scientific endeavors. We introduce a novel methodology that utilizes original research articles published after the AI's training cutoff, ensuring no prior exposure, mitigating concerns of rote memorization and prior training. The AI are tasked with redacting findings, predicting outcomes from redacted research, and assessing…

    Submitted 5 April, 2024; originally announced April 2024.

  26. arXiv:2403.17754  [pdf, other]

    cs.CG

    Optimal Euclidean Tree Covers

    Authors: Hsien-Chih Chang, Jonathan Conroy, Hung Le, Lazar Milenkovic, Shay Solomon, Cuong Than

    Abstract: A $(1+\varepsilon)$-stretch tree cover of a metric space is a collection of trees, where every pair of points has a $(1+\varepsilon)$-stretch path in one of the trees. The celebrated Dumbbell Theorem [Arya et al. STOC'95] states that any set of $n$ points in $d$-dimensional Euclidean space admits a $(1+\varepsilon)$-stretch tree cover with…

    Submitted 26 March, 2024; originally announced March 2024.

  27. arXiv:2403.16428  [pdf, other]

    cs.CV

    Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

    Authors: Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao

    Abstract: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from the…

    Submitted 25 March, 2024; originally announced March 2024.

  28. arXiv:2403.15664  [pdf, other]

    cs.CV

    What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation

    Authors: Yihua Cheng, Yaning Zhu, Zongji Wang, Hongquan Hao, Yongwei Liu, Shiqing Cheng, Xi Wang, Hyung Jin Chang

    Abstract: Driver's eye gaze holds a wealth of cognitive and intentional cues crucial for intelligent vehicles. Despite its significance, research on in-vehicle gaze estimation remains limited due to the scarcity of comprehensive and well-annotated datasets in real driving scenarios. In this paper, we present three novel elements to advance in-vehicle gaze research. Firstly, we introduce IVGaze, a pioneering…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: CVPR24

  29. arXiv:2403.13551  [pdf, other]

    cs.CV cs.LG

    Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing

    Authors: Hangeol Chang, Jinho Chang, Jong Chul Ye

    Abstract: Despite recent advancements in text-to-image diffusion models facilitating various image editing techniques, complex text prompts often lead to an oversight of some requests due to a bottleneck in processing text information. To tackle this challenge, we present Ground-A-Score, a simple yet powerful model-agnostic image editing method by incorporating grounding during score distillation. This appr…

    Submitted 20 March, 2024; originally announced March 2024.

  30. arXiv:2403.11163  [pdf, ps, other]

    stat.ME cs.LG math.ST stat.CO

    A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques

    Authors: Xuetong Li, Yuan Gao, Hong Chang, Danyang Huang, Yingying Ma, Rui Pan, Haobo Qi, Feifei Wang, Shuyuan Wu, Ke Xu, Jing Zhou, Xuening Zhu, Yingqiu Zhu, Hansheng Wang

    Abstract: This paper presents a selective review of statistical computation methods for massive data analysis. A large number of statistical methods for massive data computation have been rapidly developed in the past decades. In this work, we focus on three categories of statistical computation methods: (1) distributed computing, (2) subsampling methods, and (3) minibatch gradient techniques. The first clas…

    Submitted 17 March, 2024; originally announced March 2024.

  31. arXiv:2403.10036  [pdf, other]

    cs.CV

    SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception

    Authors: Yiheng Li, Hongyang Li, Zehao Huang, Hong Chang, Naiyan Wang

    Abstract: Multi-modal 3D object detection has exhibited significant progress in recent years. However, most existing methods can hardly scale to long-range scenarios due to their reliance on dense 3D features, which substantially escalate computational demands and memory usage. In this paper, we introduce SparseFusion, a novel multi-modal fusion framework fully built upon sparse 3D features to facilitate ef…

    Submitted 15 March, 2024; originally announced March 2024.

  32. arXiv:2403.09404  [pdf, other]

    cs.AI

    Heuristic Reasoning in AI: Instrumental Use and Mimetic Absorption

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: Deviating from conventional perspectives that frame artificial intelligence (AI) systems solely as logic emulators, we propose a novel program of heuristic reasoning. We distinguish between the 'instrumental' use of heuristics to match resources with objectives, and 'mimetic absorption,' whereby heuristics manifest randomly and universally. Through a series of innovative experiments, including var…

    Submitted 18 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  33. arXiv:2403.09289  [pdf, other]

    cs.AI

    Silico-centric Theory of Mind

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: Theory of Mind (ToM) refers to the ability to attribute mental states, such as beliefs, desires, intentions, and knowledge, to oneself and others, and to understand that these mental states can differ from one's own and from reality. We investigate ToM in environments with multiple, distinct, independent AI agents, each possessing unique internal states, information, and objectives. Inspired by hu…

    Submitted 14 March, 2024; originally announced March 2024.

  34. arXiv:2403.06225  [pdf, other]

    cs.CV cs.AI

    MoST: Motion Style Transformer between Diverse Action Contents

    Authors: Boeun Kim, Jungho Kim, Hyung Jin Chang, Jin Young Choi

    Abstract: While existing motion style transfer methods are effective between two motions with identical content, their performance significantly diminishes when transferring style between motions with different contents. This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangle…

    Submitted 20 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  35. arXiv:2403.03535  [pdf, other]

    cs.CV cs.LG

    Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications

    Authors: Minyang Hu, Hong Chang, Zong Guo, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Few-shot learning (FSL) aims to learn novel tasks with very few labeled samples by leveraging experience from related training tasks. In this paper, we try to understand FSL by delving into two key questions: (1) How to quantify the relationship between training and novel tasks? (2) How does the relationship affect the adaptation difficulty on novel tasks for different…

    Submitted 6 March, 2024; originally announced March 2024.

  36. arXiv:2403.01053  [pdf, other]

    cs.LG cs.AI cs.CV

    Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling

    Authors: Jianan Fan, Dongnan Liu, Hang Chang, Heng Huang, Mei Chen, Weidong Cai

    Abstract: Machine learning holds tremendous promise for transforming the fundamental practice of scientific discovery by virtue of its data-driven nature. With the ever-increasing stream of research data collection, it would be appealing to autonomously explore patterns and insights from observational data for discovering novel classes of phenotypes and concepts. However, in the biomedical domain, there are…

    Submitted 5 March, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  37. arXiv:2402.14320  [pdf, other]

    cs.CL cs.AI

    Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering

    Authors: Chang Zong, Yuchen Yan, Weiming Lu, Jian Shao, Eliot Huang, Heng Chang, Yueting Zhuang

    Abstract: Recent progress with LLM-based agents has shown promising results across various tasks. However, their use in answering questions from knowledge bases remains largely unexplored. Implementing a KBQA system using traditional methods is challenging due to the shortage of task-specific training data and the complexity of creating task-focused model structures. In this paper, we present Triad, a unifi…

    Submitted 15 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: 8 pages

    MSC Class: 68T50 ACM Class: I.2.7

  38. arXiv:2402.06959  [pdf, other]

    cs.CL cs.SD eess.AS

    SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

    Authors: Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David Harwath

    Abstract: The recently proposed visually grounded speech model SpeechCLIP is an innovative framework that bridges speech and text through images via CLIP without relying on text transcription. On this basis, this paper introduces two extensions to SpeechCLIP. First, we apply the Continuous Integrate-and-Fire (CIF) module to replace a fixed number of CLS tokens in the cascaded architecture. Second, we propos…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: Accepted to ICASSP 2024, Self-supervision in Audio, Speech, and Beyond (SASB) workshop

  39. arXiv:2402.05532  [pdf, other]

    cs.CV

    NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of Hand-Object Interaction

    Authors: Zhongqun Zhang, Jifei Song, Eduardo Pérez-Pellitero, Yiren Zhou, Hyung Jin Chang, Aleš Leonardis

    Abstract: Modeling hand-object interactions is a fundamentally challenging task in 3D computer vision. Despite remarkable progress that has been achieved in this field, existing methods still fail to synthesize the hand-object interaction photo-realistically, suffering from degraded rendering quality caused by the heavy mutual occlusions between the hand and the object, and inaccurate hand-object pose estim…

    Submitted 9 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted by 3DV 2024

  40. arXiv:2402.02453  [pdf, other]

    cs.CV

    AI Art Neural Constellation: Revealing the Collective and Contrastive State of AI-Generated and Human Art

    Authors: Faizan Farooq Khan, Diana Kim, Divyansh Jha, Youssef Mohamed, Hanna H Chang, Ahmed Elgammal, Luba Elliott, Mohamed Elhoseiny

    Abstract: Discovering the creative potentials of a random signal to various artistic expressions in aesthetic and conceptual richness is a ground for the recent success of generative machine learning as a way of art creation. To understand the new artistic medium better, we conduct a comprehensive analysis to position AI-generated art within the context of human art heritage. Our comparative analysis is bas…

    Submitted 4 February, 2024; originally announced February 2024.

  41. Wavelet-Decoupling Contrastive Enhancement Network for Fine-Grained Skeleton-Based Action Recognition

    Authors: Haochen Chang, Jing Chen, Yilin Li, Jixiang Chen, Xiaofeng Zhang

    Abstract: Skeleton-based action recognition has attracted much attention, benefiting from its succinctness and robustness. However, the minimal inter-class variation in similar action sequences often leads to confusion. The inherent spatiotemporal coupling characteristics make it challenging to mine the subtle differences in joint motion trajectories, which is critical for distinguishing confusing fine-grai…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted by ICASSP 2024

    Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2024, Seoul (Korea), South Korea

  42. arXiv:2401.12881  [pdf, other]

    cs.DS cs.CG

    Computing Diameter+2 in Truly Subquadratic Time for Unit-Disk Graphs

    Authors: Hsien-Chih Chang, Jie Gao, Hung Le

    Abstract: Finding the diameter of a graph in general cannot be done in truly subquadratic time assuming the Strong Exponential Time Hypothesis (SETH), even when the underlying graph is unweighted and sparse. When restricting to concrete classes of graphs and assuming SETH, planar graphs and minor-free graphs admit truly subquadratic algorithms, while geometric intersection graphs of unit balls, congruent equilat…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 28 pages, 7 figures

  43. arXiv:2401.12578  [pdf, other]

    cs.CR

    ToDA: Target-oriented Diffusion Attacker against Recommendation System

    Authors: Xiaohao Liu, Zhulin Tao, Ting Jiang, He Chang, Yunshan Ma, Xianglin Huang, Xiang Wang

    Abstract: Recommendation systems (RS) have become indispensable tools for web services to address information overload, thus enhancing user experiences and bolstering platforms' revenues. However, with their increasing ubiquity, security concerns have also emerged. Because RS are publicly accessible, they are susceptible to specific malicious attacks where adversaries can manipulate user profiles, leading to…

    Submitted 16 April, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

  44. arXiv:2401.10956  [pdf, other]

    cs.HC cs.AI cs.IR

    AI Revolution on Chat Bot: Evidence from a Randomized Controlled Experiment

    Authors: Sida Peng, Wojciech Swiatek, Allen Gao, Paul Cullivan, Haoge Chang

    Abstract: In recent years, generative AI has undergone major advancements, demonstrating significant promise in augmenting human productivity. Notably, large language models (LLM), with ChatGPT-4 as an example, have drawn considerable attention. Numerous articles have examined the impact of LLM-based tools on human productivity in lab settings and designed tasks or in observational studies. Despite recent a…

    Submitted 19 January, 2024; originally announced January 2024.

  45. arXiv:2401.09496  [pdf, other]

    cs.CV

    Learning to Generalize over Subpartitions for Heterogeneity-aware Domain Adaptive Nuclei Segmentation

    Authors: Jianan Fan, Dongnan Liu, Hang Chang, Weidong Cai

    Abstract: Annotation scarcity and cross-modality/stain data distribution shifts are two major obstacles hindering the application of deep learning models for nuclei analysis, which holds a broad spectrum of potential applications in digital pathology. Recently, unsupervised domain adaptation (UDA) methods have been proposed to mitigate the distributional gap between different imaging modalities for unsuperv…

    Submitted 21 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  46. arXiv:2401.04390  [pdf, other]

    cs.CV

    Learning with Noisy Labels: Interconnection of Two Expectation-Maximizations

    Authors: Heewon Kim, Hyun Sung Chang, Kiho Cho, Jaeyun Lee, Bohyung Han

    Abstract: Labor-intensive labeling becomes a bottleneck in developing computer vision algorithms based on deep learning. For this reason, dealing with imperfect labels has increasingly gained attention and has become an active field of study. We address the learning with noisy labels (LNL) problem, which is formalized as a task of finding a structured manifold in the midst of noisy data. In this framework, we p…

    Submitted 9 January, 2024; originally announced January 2024.

  47. arXiv:2401.02290  [pdf, other]

    cs.LG cs.AI cs.SI

    Path-based Explanation for Knowledge Graph Completion

    Authors: Heng Chang, Jiangnan Ye, Alejo Lopez Avila, Jinhua Du, Jia Li

    Abstract: In recent years, Graph Neural Networks (GNNs) have achieved great success in Knowledge Graph Completion (KGC) by modelling how entities and relations interact. However, the explanation of the predicted facts has not received the necessary attention. Proper explanations for the results of GNN-based KGC models increase model transparency and help researchers develop more reliable models. Existing pract…

    Submitted 4 January, 2024; originally announced January 2024.

  48. arXiv:2312.04035  [pdf, other]

    cs.CR

    Defense against ML-based Power Side-channel Attacks on DNN Accelerators with Adversarial Attacks

    Authors: Xiaobei Yan, Chip Hong Chang, Tianwei Zhang

    Abstract: Artificial Intelligence (AI) hardware accelerators have been widely adopted to enhance the efficiency of deep learning applications. However, they also raise security concerns regarding their vulnerability to power side-channel attacks (SCA). In these attacks, the adversary exploits unintended communication channels to infer sensitive information processed by the accelerator, posing significant pr…

    Submitted 6 December, 2023; originally announced December 2023.

  49. arXiv:2312.03203  [pdf, other]

    cs.CV

    Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields

    Authors: Shijie Zhou, Haoran Chang, Sicheng Jiang, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, Achuta Kadambi

    Abstract: 3D scene representations have gained immense popularity in recent years. Methods that use Neural Radiance Fields are versatile for traditional tasks such as novel view synthesis. In recent times, some work has emerged that aims to extend the functionality of NeRF beyond view synthesis, for semantically aware tasks such as editing and segmentation using 3D feature field distillation from 2D foundat…

    Submitted 8 April, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  50. arXiv:2312.01042  [pdf, ps, other]

    cs.IT eess.SP

    Covert Communications in STAR-RIS-Aided Rate-Splitting Multiple Access Systems

    Authors: Heng Chang, Hai Yang, Shuobo Xu, Xiyu Pang, Hongwu Liu

    Abstract: In this paper, we investigate covert communications in a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-aided rate-splitting multiple access (RSMA) system. Under the RSMA principles, the messages for the covert user (Bob) and public user (Grace) are converted to the common and private streams at the legitimate transmitter (Alice) to realize downlink transm…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 17 pages, submitted to journal