Showing 1–50 of 384 results for author: Jiang, C

Searching in archive cs.
  1. arXiv:2407.00100  [pdf, other]

    cs.LG cs.AI cs.CL

    Enhancing In-Context Learning via Implicit Demonstration Augmentation

    Authors: Xiaoling Zhou, Wei Ye, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie, Shikun Zhang

    Abstract: The emergence of in-context learning (ICL) enables large pre-trained language models (PLMs) to make predictions for unseen inputs without updating parameters. Despite its potential, ICL's effectiveness heavily relies on the quality, quantity, and permutation of demonstrations, commonly leading to suboptimal and unstable performance. In this paper, we tackle this challenge for the first time from t…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Accepted by ACL 2024 Main; 19 pages, 10 figures

    ACM Class: I.2.7

  2. arXiv:2406.19434  [pdf, other]

    cs.GR cs.AI

    Lightweight Predictive 3D Gaussian Splats

    Authors: Junli Cao, Vidit Goel, Chaoyang Wang, Anil Kag, Ju Hu, Sergei Korolev, Chenfanfu Jiang, Sergey Tulyakov, Jian Ren

    Abstract: Recent approaches representing 3D objects and scenes using Gaussian splats show increased rendering speed across a variety of platforms and devices. While rendering such representations is indeed extremely efficient, storing and transmitting them is often prohibitively expensive. To represent large-scale scenes, one often needs to store millions of 3D Gaussians, occupying gigabytes of disk space.…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Project Page: https://plumpuddings.github.io/LPGS//

  3. arXiv:2406.18914  [pdf, other]

    eess.SY cs.RO

    Verification and Synthesis of Compatible Control Lyapunov and Control Barrier Functions

    Authors: Hongkai Dai, Chuanrui Jiang, Hongchao Zhang, Andrew Clark

    Abstract: Safety and stability are essential properties of control systems. Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs) have been proposed to ensure safety and stability respectively. However, previous approaches typically verify and synthesize the CBFs and CLFs separately, satisfying their respective constraints, without proving that the CBFs and CLFs are compatible with each oth…

    Submitted 27 June, 2024; originally announced June 2024.

  4. arXiv:2406.18099  [pdf, other]

    cs.DB

    CompassDB: Pioneering High-Performance Key-Value Store with Perfect Hash

    Authors: Jin Jiang, Dongsheng He, Yu Hu, Dong Liu, Chenfan Xiao, Hongxiao Bi, Yusong Zhang, Chaoqu Jiang, Zhijun Fu

    Abstract: Modern mainstream persistent key-value storage engines utilize Log-Structured Merge tree (LSM-tree) based designs, optimizing read/write performance by leveraging sequential disk I/O. However, the advent of SSDs, with their significant improvements in bandwidth and IOPS, shifts the bottleneck from I/O to CPU. The high compaction cost and large read/write amplification associated with LSM trees hav…

    Submitted 26 June, 2024; originally announced June 2024.

  5. arXiv:2406.14962  [pdf, other]

    cs.CV

    Contextual Interaction via Primitive-based Adversarial Training For Compositional Zero-shot Learning

    Authors: Suyi Li, Chenyi Jiang, Shidong Wang, Yang Long, Zheng Zhang, Haofeng Zhang

    Abstract: Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs. The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object, consequently decreasing the classification performance towards novel compositions. Previous remarkable works primarily addr…

    Submitted 21 June, 2024; originally announced June 2024.

  6. arXiv:2406.13923  [pdf, other]

    cs.AI cs.CL cs.CV cs.MM

    PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents

    Authors: Junjie Wang, Yin Zhang, Yatai Ji, Yuxiang Zhang, Chunyang Jiang, Yubo Wang, Kang Zhu, Zekun Wang, Tiezhen Wang, Wenhao Huang, Jie Fu, Bei Chen, Qunshu Lin, Minghao Liu, Ge Zhang, Wenhu Chen

    Abstract: Recent advancements in Large Multimodal Models (LMMs) have leveraged extensive multimodal datasets to enhance capabilities in complex knowledge-driven tasks. However, persistent challenges in perceptual and reasoning errors limit their efficacy, particularly in interpreting intricate visual data and deducing multimodal relationships. Addressing these issues, we introduce a novel dataset format, PI…

    Submitted 19 June, 2024; originally announced June 2024.

  7. arXiv:2406.12227  [pdf, other]

    cs.AI

    Interpretable Catastrophic Forgetting of Large Language Model Fine-tuning via Instruction Vector

    Authors: Gangwei Jiang, Caigao Jiang, Zhaoyi Li, Siqiao Xue, Jun Zhou, Linqi Song, Defu Lian, Ying Wei

    Abstract: Fine-tuning large language models (LLMs) can cause them to lose their general capabilities. However, the intrinsic mechanisms behind such forgetting remain unexplored. In this paper, we begin by examining this phenomenon by focusing on knowledge understanding and instruction following, with the latter identified as the main contributor to forgetting during fine-tuning. Consequently, we propose the…

    Submitted 24 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  8. arXiv:2406.12019  [pdf]

    eess.SY cs.CR cs.ET eess.SP

    Hacking Encrypted Wireless Power: Cyber-Security of Dynamic Charging

    Authors: Hui Wang, Nima Tashakor, Wei Jiang, Wei Liu, C. Q. Jiang, Stefan M. Goetz

    Abstract: Recently, energy encryption for wireless power transfer has been developed for energy safety, which is important in public places to suppress unauthorized energy extraction. Most techniques vary the frequency so that unauthorized receivers cannot extract energy because of non-resonance. However, this strategy is unreliable. To stimulate the progress of energy encryption technology and point out se…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 10 pages, 17 figures

  9. arXiv:2406.11434  [pdf, other]

    cs.DB

    DB-GPT-Hub: Towards Open Benchmarking Text-to-SQL Empowered by Large Language Models

    Authors: Fan Zhou, Siqiao Xue, Danrui Qi, Wenhui Shi, Wang Zhao, Ganglin Wei, Hongyang Zhang, Caigai Jiang, Gangwei Jiang, Zhixuan Chu, Faqiang Chen

    Abstract: Large language models (LLMs) have become the dominant paradigm for the challenging task of text-to-SQL. LLM-empowered text-to-SQL methods are typically categorized into prompting-based and tuning approaches. Compared to prompting-based methods, benchmarking fine-tuned LLMs for text-to-SQL is important yet under-explored, partially attributed to the prohibitively high computational cost. In this paper,…

    Submitted 17 June, 2024; originally announced June 2024.

  10. arXiv:2406.07362  [pdf, other]

    cs.HC

    AI.vs.Clinician: Unveiling Intricate Interactions Between AI and Clinicians through an Open-Access Database

    Authors: Wanling Gao, Yuan Liu, Zhuoming Yu, Dandan Cui, Wenjing Liu, Xiaoshuang Liang, Jiahui Zhao, Jiyue Xie, Hao Li, Li Ma, Ning Ye, Yumiao Kang, Dingfeng Luo, Peng Pan, Wei Huang, Zhongmou Liu, Jizhong Hu, Fan Huang, Gangyuan Zhao, Chongrong Jiang, Tianyi Wei, Zhifei Zhang, Yunyou Huang, Jianfeng Zhan

    Abstract: Artificial Intelligence (AI) plays a crucial role in the medical field and has the potential to revolutionize healthcare practices. However, the success of AI models and their impacts hinge on the synergy between AI and medical specialists, with clinicians assuming a dominant role. Unfortunately, the intricate dynamics and interactions between AI and clinicians remain undiscovered and thus hinder AI f…

    Submitted 15 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: 12 pages

  11. arXiv:2406.06858  [pdf, other]

    cs.LG cs.DC

    FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion

    Authors: Li-Wen Chang, Wenlei Bao, Qi Hou, Chengquan Jiang, Ningxin Zheng, Yinmin Zhong, Xuanrun Zhang, Zuquan Song, Ziheng Jiang, Haibin Lin, Xin Jin, Xin Liu

    Abstract: Large deep learning models have demonstrated strong ability to solve many tasks across a wide range of applications. Those large models typically require training and inference to be distributed. Tensor parallelism is a common technique partitioning computation of an operation or layer across devices to overcome the memory capacity limitation of a single processor, and/or to accelerate computation…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  12. arXiv:2406.03520  [pdf, other]

    cs.CV cs.AI cs.LG

    VideoPhy: Evaluating Physical Commonsense for Video Generation

    Authors: Hritik Bansal, Zongyu Lin, Tianyi Xie, Zeshun Zong, Michal Yarom, Yonatan Bitton, Chenfanfu Jiang, Yizhou Sun, Kai-Wei Chang, Aditya Grover

    Abstract: Recent advances in internet-scale video data pretraining have led to the development of text-to-video generative models that can create high-quality videos across a broad range of visual concepts and styles. Due to their ability to synthesize realistic motions and render complex objects, these generative models have the potential to become general-purpose simulators of the physical world. However,…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 36 pages, 26 figures, 8 tables

  13. arXiv:2405.18515  [pdf, other]

    cs.LG

    Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication

    Authors: Yunuo Chen, Tianyi Xie, Zeshun Zong, Xuan Li, Feng Gao, Yin Yang, Ying Nian Wu, Chenfanfu Jiang

    Abstract: Existing diffusion-based text-to-3D generation methods primarily focus on producing visually realistic shapes and appearances, often neglecting the physical constraints necessary for downstream tasks. Generated models frequently fail to maintain balance when placed in physics-based simulations or 3D printed. This balance is crucial for satisfying user design intentions in interactive gaming, embod…

    Submitted 28 May, 2024; originally announced May 2024.

  14. arXiv:2405.17764  [pdf, other]

    cs.CL cs.AI math.ST

    On the Sequence Evaluation based on Stochastic Processes

    Authors: Tianhao Zhang, Zhexiao Lin, Zhecheng Sheng, Chen Jiang, Dongyeop Kang

    Abstract: Modeling and analyzing long sequences of text is an essential task for Natural Language Processing. Success in capturing long text dynamics using neural language models will facilitate many downstream tasks such as coherence evaluation, text generation, machine translation and so on. This paper presents a novel approach to model sequences through a stochastic process. We introduce a likelihood-bas…

    Submitted 15 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  15. arXiv:2405.16868  [pdf, other]

    cs.CV

    RCDN: Towards Robust Camera-Insensitivity Collaborative Perception via Dynamic Feature-based 3D Neural Modeling

    Authors: Tianhang Wang, Fan Lu, Zehan Zheng, Guang Chen, Changjun Jiang

    Abstract: Collaborative perception is dedicated to tackling the constraints of single-agent perception, such as occlusions, based on the multiple agents' multi-view sensor inputs. However, most existing works assume an ideal condition that all agents' multi-view cameras are continuously available. In reality, cameras may be highly noisy, obscured or even failed during the collaboration. In this work, we int…

    Submitted 27 May, 2024; originally announced May 2024.

  16. arXiv:2405.15056  [pdf, other]

    cs.LG cs.CV cs.GR

    ElastoGen: 4D Generative Elastodynamics

    Authors: Yutao Feng, Yintong Shang, Xiang Feng, Lei Lan, Shandian Zhe, Tianjia Shao, Hongzhi Wu, Kun Zhou, Hao Su, Chenfanfu Jiang, Yin Yang

    Abstract: We present ElastoGen, a knowledge-driven model that generates physically accurate and coherent 4D elastodynamics. Instead of relying on petabyte-scale data-driven learning, ElastoGen leverages the principles of physics-in-the-loop and learns from established physical knowledge, such as partial differential equations and their numerical solutions. The core idea of ElastoGen is converting the global…

    Submitted 23 May, 2024; originally announced May 2024.

  17. arXiv:2405.14595  [pdf, other]

    cs.GR

    Elastic Locomotion with Mixed Second-order Differentiation

    Authors: Siyuan Shen, Tianjia Shao, Kun Zhou, Chenfanfu Jiang, Sheldon Andrews, Victor Zordan, Yin Yang

    Abstract: We present a framework of elastic locomotion, which allows users to enliven an elastic body to produce interesting locomotion by prescribing its high-level kinematics. We formulate this problem as an inverse simulation problem and seek the optimal muscle activations to drive the body to complete the desired actions. We employ the interior-point method to model wide-area contacts between the body a…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 13 pages, 14 figures

  18. arXiv:2405.14338  [pdf, other]

    cs.CV

    MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models

    Authors: Jiuming Liu, Jinru Han, Lihao Liu, Angelica I. Aviles-Rivero, Chaokang Jiang, Zhe Liu, Hesheng Wang

    Abstract: Point cloud videos effectively capture real-world spatial geometries and temporal dynamics, which are essential for enabling intelligent agents to understand the dynamically changing 3D world we live in. Although static 3D point cloud processing has witnessed significant advancements, designing an effective 4D point cloud video backbone remains challenging, mainly due to the irregular and unordere…

    Submitted 23 May, 2024; originally announced May 2024.

  19. arXiv:2405.14241  [pdf, other]

    cs.CV

    NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation

    Authors: Chaokang Jiang, Dalong Du, Jiuming Liu, Siting Zhu, Zhenqiang Liu, Zhuang Ma, Zhujin Liang, Jie Zhou

    Abstract: Point Cloud Interpolation confronts challenges from point sparsity, complex spatiotemporal dynamics, and the difficulty of deriving complete 3D point clouds from sparse temporal information. This paper presents NeuroGauss4D-PCI, which excels at modeling complex non-rigid deformations across varied dynamic scenes. The method begins with an iterative Gaussian cloud soft clustering module, offering s…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Under review

  20. arXiv:2405.12484  [pdf, other]

    cs.GR

    Meta-Homogenization for Knitwear Simulation

    Authors: Chun Yuan, Kui Wu, Haoyang Shi, Lei Lan, Yuxing Qiu, Cem Yuksel, Huamin Wang, Chenfanfu Jiang, Yin Yang

    Abstract: This paper presents meta-homogenization, a spatially varying homogenization scheme for knitwear simulation. We are motivated by the observation that macro-scale fabric dynamics is strongly correlated with its underlying knitting patterns. Therefore, homogenization towards a single material is less effective when the knitting is complex and non-repetitive. Our method tackles this challenge by homog…

    Submitted 23 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  21. arXiv:2405.12420  [pdf, other]

    cs.CV

    GarmentDreamer: 3DGS Guided Garment Synthesis with Diverse Geometry and Texture Details

    Authors: Boqian Li, Xuan Li, Ying Jiang, Tianyi Xie, Feng Gao, Huamin Wang, Yin Yang, Chenfanfu Jiang

    Abstract: Traditional 3D garment creation is labor-intensive, involving sketching, modeling, UV mapping, and texturing, which are time-consuming and costly. Recent advances in diffusion-based generative models have enabled new possibilities for 3D garment generation from text prompts, images, and videos. However, existing methods either suffer from inconsistencies among multi-view images or require addition…

    Submitted 20 May, 2024; originally announced May 2024.

  22. arXiv:2405.11694  [pdf, other]

    cs.GR

    PBI: Position-Based Dynamics Handles Updated Lagrangian Inelasticity

    Authors: Chang Yu, Xuan Li, Lei Lan, Yin Yang, Chenfanfu Jiang

    Abstract: Position-based Dynamics (PBD) and its extension, eXtended Position-based Dynamics (XPBD), have been predominantly applied to compliant constrained dynamics, with their potential in finite strain inelasticity remaining underexplored. XPBD stands in contrast to other meshless methods, such as the Material Point Method (MPM). MPM is based on discretizing the weak form of governing partial differentia…

    Submitted 19 May, 2024; originally announced May 2024.

  23. arXiv:2405.09431  [pdf, other]

    cs.CV cs.GR

    A Survey On Text-to-3D Contents Generation In The Wild

    Authors: Chenhan Jiang

    Abstract: 3D content creation plays a vital role in various applications, such as gaming, robotics simulation, and virtual reality. However, the process is labor-intensive and time-consuming, requiring skilled designers to invest considerable effort in creating a single 3D asset. To address this challenge, text-to-3D generation technologies have emerged as a promising solution for automating 3D creation. Le…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 11 pages, 10 figures, 4 tables. arXiv admin note: text overlap with arXiv:2401.17807 by other authors

  24. arXiv:2405.08036  [pdf, other]

    cs.LG cs.AI

    POWQMIX: Weighted Value Factorization with Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning

    Authors: Chang Huang, Junqiao Zhao, Shatong Zhu, Hongtu Zhou, Chen Ye, Tiantian Feng, Changjun Jiang

    Abstract: Value function factorization methods are commonly used in cooperative multi-agent reinforcement learning, with QMIX receiving significant attention. Many QMIX-based methods introduce monotonicity constraints between the joint action value and individual action values to achieve decentralized execution. However, such constraints limit the representation capacity of value factorization, restricting…

    Submitted 15 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

    Comments: change reference format

  25. arXiv:2405.06944  [pdf, other]

    cs.CV

    Learning Monocular Depth from Focus with Event Focal Stack

    Authors: Chenxu Jiang, Mingyuan Lin, Chi Zhang, Zhenghai Wang, Lei Yu

    Abstract: Depth from Focus estimates depth by determining the moment of maximum focus from multiple shots at different focal distances, i.e. the Focal Stack. However, the limited sampling rate of conventional optical cameras makes it difficult to obtain sufficient focus cues during the focal sweep. Inspired by biological vision, the event camera records intensity changes over time in extremely low latency,…

    Submitted 11 May, 2024; originally announced May 2024.

  26. arXiv:2405.06918  [pdf, other]

    cs.CV

    Super-Resolving Blurry Images with Events

    Authors: Chi Zhang, Mingyuan Lin, Xiang Zhang, Chenxu Jiang, Lei Yu

    Abstract: Super-resolution from motion-blurred images poses a significant challenge due to the combined effects of motion blur and low spatial resolution. To address this challenge, this paper introduces an Event-based Blurry Super Resolution Network (EBSR-Net), which leverages the high temporal resolution of events to mitigate motion blur and improve high-resolution image prediction. Specifically, we propo…

    Submitted 11 May, 2024; originally announced May 2024.

  27. X-SLAM: Scalable Dense SLAM for Task-aware Optimization using CSFD

    Authors: Zhexi Peng, Yin Yang, Tianjia Shao, Chenfanfu Jiang, Kun Zhou

    Abstract: We present X-SLAM, a real-time dense differentiable SLAM system that leverages the complex-step finite difference (CSFD) method for efficient calculation of numerical derivatives, bypassing the need for a large-scale computational graph. The key to our approach is treating the SLAM process as a differentiable function, enabling the calculation of the derivatives of important SLAM parameters throug…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: To be published in ACM SIGGRAPH 2024

  28. arXiv:2405.02144  [pdf, other]

    cs.CL

    MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain

    Authors: Chao Jiang, Wei Xu

    Abstract: Medical texts are notoriously challenging to read. Properly measuring their readability is the first step towards making them more accessible. In this paper, we present a systematic study on fine-grained readability measurements in the medical domain at both sentence-level and span-level. We introduce a new dataset MedReadMe, which consists of manually annotated readability ratings and fine-graine…

    Submitted 3 May, 2024; originally announced May 2024.

  29. arXiv:2405.01306  [pdf, other]

    cs.LG

    Graph is all you need? Lightweight data-agnostic neural architecture search without training

    Authors: Zhenhan Huang, Tejaswini Pedapati, Pin-Yu Chen, Chunhen Jiang, Jianxi Gao

    Abstract: Neural architecture search (NAS) enables the automatic design of neural network models. However, training the candidates generated by the search algorithm for performance evaluation incurs considerable computational overhead. Our method, dubbed nasgraph, remarkably reduces the computational costs by converting neural architectures to graphs and using the average degree, a graph measure, as the pro…

    Submitted 2 May, 2024; originally announced May 2024.

  30. arXiv:2404.19429  [pdf, other]

    cs.DC cs.LG

    Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping

    Authors: Chenyu Jiang, Ye Tian, Zhen Jia, Shuai Zheng, Chuan Wu, Yida Wang

    Abstract: The Mixture-of-Expert (MoE) technique plays a crucial role in expanding the size of DNN model parameters. However, it faces the challenge of extended all-to-all communication latency during the training process. Existing methods attempt to mitigate this issue by overlapping all-to-all with expert computation. Yet, these methods frequently fall short of achieving sufficient overlap, consequently re…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 11 pages, 16 figures. Published in MLSys'24

  31. arXiv:2404.18219  [pdf, other]

    physics.ins-det cs.LG hep-ex hep-ph physics.data-an

    BUFF: Boosted Decision Tree based Ultra-Fast Flow matching

    Authors: Cheng Jiang, Sitian Qian, Huilin Qu

    Abstract: Tabular data stands out as one of the most frequently encountered types in high energy physics. Unlike commonly homogeneous data such as pixelated images, simulating high-dimensional tabular data and accurately capturing their correlations are often quite challenging, even with the most advanced architectures. Based on the findings that tree-based models surpass the performance of deep learning mo…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 9 pages, 10 figures, 1 additional figure in appendix

  32. arXiv:2404.14066  [pdf, other]

    cs.CV cs.IR

    SHE-Net: Syntax-Hierarchy-Enhanced Text-Video Retrieval

    Authors: Xuzheng Yu, Chen Jiang, Xingning Dong, Tian Gan, Ming Yang, Qingpei Guo

    Abstract: The user base of short video apps has experienced unprecedented growth in recent years, resulting in a significant demand for video content analysis. In particular, text-video retrieval, which aims to find the top matching videos given text descriptions from a vast video corpus, is an essential function, the primary challenge of which is to bridge the modality gap. Nevertheless, most existing appr…

    Submitted 6 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  33. arXiv:2404.10358  [pdf, other]

    cs.CV

    Improving Bracket Image Restoration and Enhancement with Flow-guided Alignment and Enhanced Feature Aggregation

    Authors: Wenjie Lin, Zhen Liu, Chengzhi Jiang, Mingyan Han, Ting Jiang, Shuaicheng Liu

    Abstract: In this paper, we address the Bracket Image Restoration and Enhancement (BracketIRE) task using a novel framework, which requires restoring a high-quality high dynamic range (HDR) image from a sequence of noisy, blurred, and low dynamic range (LDR) multi-exposure RAW inputs. To overcome this challenge, we present the IREANet, which improves the multiple exposure alignment and aggregation with a Fl…

    Submitted 16 April, 2024; originally announced April 2024.

  34. arXiv:2404.10209  [pdf, other]

    cs.AI cs.LG

    Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models

    Authors: Siqiao Xue, Danrui Qi, Caigao Jiang, Wenhui Shi, Fangyin Cheng, Keting Chen, Hongjun Yang, Zhiping Zhang, Jianshan He, Hongyang Zhang, Ganglin Wei, Wang Zhao, Fan Zhou, Hong Yi, Shaodong Liu, Hongjun Yang, Faqiang Chen

    Abstract: The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. The technologies of interacting with data particularly have an important entanglement with LLMs as efficient and intuitive data interactions are paramount. In this paper, we present DB-GPT, a revolutionary and product-ready Python library that integrates LLMs into traditional data interact…

    Submitted 24 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  35. arXiv:2404.09425  [pdf, other]

    eess.IV cs.CV

    Super-resolution of biomedical volumes with 2D supervision

    Authors: Cheng Jiang, Alexander Gedeon, Yiwei Lyu, Eric Landgraf, Yufeng Zhang, Xinhai Hou, Akhil Kondepudi, Asadur Chowdury, Honglak Lee, Todd Hollon

    Abstract: Volumetric biomedical microscopy has the potential to increase the diagnostic information extracted from clinical tissue specimens and improve the diagnostic accuracy of both human pathologists and computational pathology models. Unfortunately, barriers to integrating 3-dimensional (3D) volumetric microscopy into clinical medicine include long imaging times, poor depth / z-axis resolution, and an…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: CVPR Workshop on Computer Vision for Microscopy Image Analysis 2024

  36. arXiv:2404.06780  [pdf, other]

    cs.CV

    Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior

    Authors: Fan Lu, Kwan-Yee Lin, Yan Xu, Hongsheng Li, Guang Chen, Changjun Jiang

    Abstract: Text-to-3D generation has achieved remarkable success via large-scale text-to-image diffusion models. Nevertheless, there is no paradigm for scaling up the methodology to urban scale. Urban scenes, characterized by numerous elements, intricate arrangement relationships, and vast scale, present a formidable barrier to the interpretability of ambiguous textual descriptions for effective model optimi…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: Project page: https://urbanarchitect.github.io/

  37. arXiv:2404.05449  [pdf, other]

    cs.CL

    RoT: Enhancing Large Language Models with Reflection on Search Trees

    Authors: Wenyang Hui, Chengyue Jiang, Yan Wang, Kewei Tu

    Abstract: Large language models (LLMs) have demonstrated impressive capability in reasoning and planning when integrated with tree-search-based prompting methods. However, since these methods ignore the previous search experiences, they often make the same mistakes in the search process. To address this issue, we introduce Reflection on search Trees (RoT), an LLM reflection framework designed to improve the…

    Submitted 11 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 9 pages main

  38. arXiv:2404.04900  [pdf, other]

    cs.CL

    Radial Networks: Dynamic Layer Routing for High-Performance Large Language Models

    Authors: Jordan Dotzel, Yash Akhauri, Ahmed S. AbouElhamayed, Carly Jiang, Mohamed Abdelfattah, Zhiru Zhang

    Abstract: Large language models (LLMs) often struggle with strict memory, latency, and power demands. To meet these demands, various forms of dynamic sparsity have been proposed that reduce compute on an input-by-input basis. These methods improve over static methods by exploiting the variance across individual inputs, which has steadily grown with the exponential increase in training data. Yet, the increas…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: First two authors have equal contribution

  39. arXiv:2404.02742  [pdf, other]

    cs.CV

    LiDAR4D: Dynamic Neural Fields for Novel Space-time View LiDAR Synthesis

    Authors: Zehan Zheng, Fan Lu, Weiyi Xue, Guang Chen, Changjun Jiang

    Abstract: Although neural radiance fields (NeRFs) have achieved triumphs in image novel view synthesis (NVS), LiDAR NVS remains largely unexplored. Previous LiDAR NVS methods employ a simple shift from image NVS methods while ignoring the dynamic nature and the large-scale reconstruction problem of LiDAR point clouds. In light of this, we propose LiDAR4D, a differentiable LiDAR-only framework for novel spac…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024. Project Page: https://dyfcalid.github.io/LiDAR4D

  40. arXiv:2404.02068  [pdf, other]

    cs.CL

    Using Interpretation Methods for Model Enhancement

    Authors: Zhuo Chen, Chengyue Jiang, Kewei Tu

    Abstract: In the age of neural natural language processing, there are plenty of works trying to derive interpretations of neural models. Intuitively, when gold rationales exist during training, one can additionally train the model to match its interpretation with the rationales. However, this intuitive idea has not been fully explored. In this paper, we propose a framework of utilizing interpretation method…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: EMNLP 2023

    Journal ref: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 424-438

  41. arXiv:2404.01954  [pdf, other]

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  42. arXiv:2404.01875  [pdf, other]

    eess.SP cs.DC cs.IT cs.LG

    Satellite Federated Edge Learning: Architecture Design and Convergence Analysis

    Authors: Yuanming Shi, Li Zeng, Jingyang Zhu, Yong Zhou, Chunxiao Jiang, Khaled B. Letaief

    Abstract: The proliferation of low-earth-orbit (LEO) satellite networks leads to the generation of vast volumes of remote sensing data which is traditionally transferred to the ground server for centralized processing, raising privacy and bandwidth concerns. Federated edge learning (FEEL), as a distributed machine learning approach, has the potential to address these challenges by sharing only model paramet…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 16 pages, 15 figures

  43. arXiv:2404.01643  [pdf, other]

    eess.IV cs.CV cs.LG

    A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection

    Authors: Chih-Chung Hsu, Chia-Ming Lee, Yang Fan Chiang, Yi-Shiuan Chou, Chih-Yu Jiang, Shen-Chieh Tai, Chi-Han Tsai

    Abstract: Conventional Computed Tomography (CT) imaging recognition faces two significant challenges: (1) There is often considerable variability in the resolution and size of each CT scan, necessitating strict requirements for the input size and adaptability of models. (2) CT-scan contains large number of out-of-distribution (OOD) slices. The crucial features may only be present in specific spatial regions…

    Submitted 20 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: Camera-ready version, accepted by DEF-AI-MIA workshop, in conjunction with CVPR 2024

  44. arXiv:2403.20009  [pdf, other]

    cs.CL cs.LG

    On Large Language Models' Hallucination with Regard to Known Facts

    Authors: Che Jiang, Biqing Qi, Xiangyu Hong, Dayuan Fu, Yang Cheng, Fandong Meng, Mo Yu, Bowen Zhou, Jie Zhou

    Abstract: Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics, an area not previously covered in studies on hallucinations. We are able to conduct this analysis via two key ideas. First, we identify the factual question…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted by NAACL 2024 Main Conference

  45. arXiv:2403.19272  [pdf, other]

    cs.GR

    Mil2: Efficient Cloth Simulation Using Non-distance Barriers and Subspace Reuse

    Authors: Lei Lan, Zixuan Lu, Jingyi Long, Chun Yuan, Xuan Li, Xiaowei He, Huamin Wang, Chenfanfu Jiang, Yin Yang

    Abstract: Mil2 pushes the performance of high-resolution cloth simulation, making the simulation interactive (in milliseconds) for models with one million degrees of freedom (DOFs) while keeping every triangle untangled. The guarantee of being penetration-free is inspired by the interior-point method, which converts the inequality constraints to barrier potentials. Nevertheless, we propose a major overhaul…

    Submitted 23 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  46. arXiv:2403.18317  [pdf, other]

    cs.IR

    A Situation-aware Enhancer for Personalized Recommendation

    Authors: Jiayu Li, Peijie Sun, Chumeng Jiang, Weizhi Ma, Qingyao Ai, Min Zhang

    Abstract: When users interact with Recommender Systems (RecSys), current situations, such as time, location, and environment, significantly influence their preferences. Situations serve as the background for interactions, where relationships between users and items evolve with situation changes. However, existing RecSys treat situations, users, and items on the same level. They can only model the relations…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted at the International Conference on Database Systems for Advanced Applications (DASFAA 2024)

  47. arXiv:2403.17011  [pdf, other]

    cs.LG cs.AI cs.CY

    SUDO: a framework for evaluating clinical artificial intelligence systems without ground-truth annotations

    Authors: Dani Kiyasseh, Aaron Cohen, Chengsheng Jiang, Nicholas Altieri

    Abstract: A clinical artificial intelligence (AI) system is often validated on a held-out set of data which it has not been exposed to before (e.g., data from a different hospital with a distinct electronic health record system). This evaluation process is meant to mimic the deployment of an AI system on data in the wild; those which are currently unseen by the system yet are expected to be encountered in a…

    Submitted 2 January, 2024; originally announced March 2024.

  48. arXiv:2403.14410  [pdf, other]

    cs.CV cs.AI cs.LG

    GLC++: Source-Free Universal Domain Adaptation through Global-Local Clustering and Contrastive Affinity Learning

    Authors: Sanqing Qu, Tianpei Zou, Florian Röhrbein, Cewu Lu, Guang Chen, Dacheng Tao, Changjun Jiang

    Abstract: Deep neural networks often exhibit sub-optimal performance under covariate and category shifts. Source-Free Domain Adaptation (SFDA) presents a promising solution to this dilemma, yet most SFDA approaches are restricted to closed-set scenarios. In this paper, we explore Source-Free Universal Domain Adaptation (SF-UniDA) aiming to accurately classify "known" data belonging to common categories and…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: This is a substantial extension of the CVPR 2023 paper "Upcycling Models under Domain and Category Shift"

  49. arXiv:2403.14401  [pdf, other]

    cs.CV

    Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination

    Authors: Dingchen Yang, Bowen Cao, Guang Chen, Changjun Jiang

    Abstract: Multi-modal Large Language Models (MLLMs) demonstrate remarkable success across various vision-language tasks. However, they suffer from visual hallucination, where the generated responses diverge from the provided image. Are MLLMs completely oblivious to accurate visual cues when they hallucinate? Our investigation reveals that the visual branch may simultaneously advocate both accurate and non-e…

    Submitted 21 March, 2024; originally announced March 2024.

  50. arXiv:2403.13783  [pdf, other]

    cs.RO

    A Convex Formulation of Frictional Contact for the Material Point Method and Rigid Bodies

    Authors: Zeshun Zong, Chenfanfu Jiang, Xuchen Han

    Abstract: In this paper, we introduce a novel convex formulation that seamlessly integrates the Material Point Method (MPM) with articulated rigid body dynamics in frictional contact scenarios. We extend the linear corotational hyperelastic model into the realm of elastoplasticity and include an efficient return mapping algorithm. This approach is particularly effective for MPM simulations involving signifi…

    Submitted 22 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: The supplemental video is available at https://youtu.be/5jrQtF5D0DA