
Showing 101–150 of 1,041 results for author: Zha, D

  1. arXiv:2401.12917  [pdf, other]

    cs.AI

    Active Inference as a Model of Agency

    Authors: Lancelot Da Costa, Samuel Tenka, Dominic Zhao, Noor Sajid

    Abstract: Is there a canonical way to think of agency beyond reward maximisation? In this paper, we show that any type of behaviour complying with physically sound assumptions about how macroscopic biological agents interact with the world canonically integrates exploration and exploitation in the sense of minimising risk and ambiguity about states of the world. This description, known as active inference,… ▽ More

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted in RLDM2022 for the workshop 'RL as a model of agency'

  2. arXiv:2401.11997  [pdf, other]

    astro-ph.GA

    PAC.V. The Roles of Mass and Environment in the Quenching of Galaxies

    Authors: Yun Zheng, Kun Xu, Y. P. Jing, Donghai Zhao, Hongyu Gao, Xiaolin Luo, Jianxin Han, Yu Yu, Ming Li

    Abstract: The roles that mass and environment play in the galaxy quenching are still under debate. Leveraging the Photometric objects Around Cosmic webs (PAC) method, we analyze the excess surface distribution $\bar{n}_2w_{\rm{p}}(r_{\rm{p}})$ of photometric galaxies in different color (rest-frame $u-r$) within the stellar mass range of $10^{9.0}M_{\odot}\sim10^{11.0}M_{\odot}$ around spectroscopic massive… ▽ More

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 23 pages, 14 figures. Submitted to ApJ. Comments welcome :-)

  3. arXiv:2401.11687  [pdf, other]

    cs.NE cs.CV cs.LG

    TIM: An Efficient Temporal Interaction Module for Spiking Transformer

    Authors: Sicheng Shen, Dongcheng Zhao, Guobin Shen, Yi Zeng

    Abstract: Spiking Neural Networks (SNNs), as the third generation of neural networks, have gained prominence for their biological plausibility and computational efficiency, especially in processing diverse datasets. The integration of attention mechanisms, inspired by advancements in neural network architectures, has led to the development of Spiking Transformers. These have shown promise in enhancing SNNs'… ▽ More

    Submitted 9 May, 2024; v1 submitted 21 January, 2024; originally announced January 2024.

    Comments: Accepted by the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024)

  4. arXiv:2401.10450  [pdf, other]

    physics.optics cond-mat.mes-hall

    Observation of tunable topological polaritons in a cavity waveguide

    Authors: Dong Zhao, Ziyao Wang, Linyun Yang, Yuxin Zhong, Xiang Xi, Zhenxiao Zhu, Maohua Gong, Qingan Tu, Yan Meng, Bei Yan, Ce Shang, Zhen Gao

    Abstract: Topological polaritons characterized by light-matter interactions have become a pivotal platform in exploring new topological phases of matter. Recent theoretical advances unveiled a novel mechanism for tuning topological phases of polaritons by modifying the surrounding photonic environment (light-matter interactions) without altering the lattice structure. Here, by embedding a dimerized chain of… ▽ More

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: 6 pages, 4 figures

  5. arXiv:2401.08819  [pdf, other]

    cs.LG cs.AI

    Learning from Sparse Offline Datasets via Conservative Density Estimation

    Authors: Zhepeng Cen, Zuxin Liu, Zitong Wang, Yihang Yao, Henry Lam, Ding Zhao

    Abstract: Offline reinforcement learning (RL) offers a promising direction for learning policies from pre-collected datasets without requiring further interactions with the environment. However, existing methods struggle to handle out-of-distribution (OOD) extrapolation errors, especially in sparse reward or scarce data settings. In this paper, we propose a novel training algorithm called Conservative Densi… ▽ More

    Submitted 11 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: ICLR 2024

  6. arXiv:2401.07159  [pdf, other]

    cs.LG cs.AI

    Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models

    Authors: Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Qing Li, Yong Jiang, Zhihao Jia

    Abstract: Finetuning large language models (LLMs) has been empirically effective on a variety of downstream tasks. Existing approaches to finetuning an LLM either focus on parameter-efficient finetuning, which only updates a small number of trainable parameters, or attempt to reduce the memory footprint during the training phase of the finetuning. Typically, the memory footprint during finetuning stems from… ▽ More

    Submitted 13 January, 2024; originally announced January 2024.

    ACM Class: I.2.7

  7. arXiv:2401.05709  [pdf, other]

    cs.NI eess.SP

    Probability-based Distance Estimation Model for 3D DV-Hop Localization in WSNs

    Authors: Penghong Wang, Hao Wang, Wenrui Li, Xiaopeng Fan, Debin Zhao

    Abstract: Localization is one of the pivotal issues in wireless sensor network applications. In 3D localization studies, most algorithms focus on enhancing the location prediction process, lacking theoretical derivation of the detection distance of an anchor node at the varying hops, engenders a localization performance bottleneck. To address this issue, we propose a probability-based average distance estim… ▽ More

    Submitted 11 January, 2024; originally announced January 2024.

  8. arXiv:2401.03901  [pdf, other]

    cs.CV cs.CL

    STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering

    Authors: Yueqian Wang, Yuxuan Wang, Kai Chen, Dongyan Zhao

    Abstract: Recently we have witnessed the rapid development of video question answering models. However, most models can only handle simple videos in terms of temporal reasoning, and their performance tends to drop when answering temporal-reasoning questions on long and informative videos. To tackle this problem we propose STAIR, a Spatial-Temporal Reasoning model with Auditable Intermediate Results for vide… ▽ More

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: To appear in AAAI 2024

  9. arXiv:2401.03141  [pdf, other]

    cs.RO

    Estimating the Lateral Motion States of an Underwater Robot by Propeller Wake Sensing Using an Artificial Lateral Line

    Authors: Jun Wang, Dexin Zhao, Youxi Zhao, Feitian Zhang, Tongsheng Shen

    Abstract: An artificial lateral line (ALL) is a bioinspired flow sensing system of an underwater robot that consists of distributed flow sensors. The ALL has achieved great success in sensing the motion states of bioinspired underwater robots, e.g., robotic fish, that are driven by body undulation and/or tail flapping. However, the ALL has not been systematically tested and studied in the sensing of underwa… ▽ More

    Submitted 6 January, 2024; originally announced January 2024.

    Comments: 10 pages, 8 figures

  10. arXiv:2401.02673  [pdf, other]

    eess.AS cs.AI cs.SD

    A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

    Authors: Dongdi Zhao, Jianbo Ma, Lu Lu, Jinke Li, Xuan Ji, Lei Zhu, Fuming Fang, Ming Liu, Feijun Jiang

    Abstract: Far-field speech recognition is a challenging task that conventionally uses signal processing beamforming to attack noise and interference problem. But the performance has been found usually limited due to heavy reliance on environmental assumption. In this paper, we propose a unified multichannel far-field speech recognition system that combines the neural beamforming and transformer-based Listen… ▽ More

    Submitted 5 January, 2024; originally announced January 2024.

  11. Photometric Objects Around Cosmic Webs (PAC). VI. High Satellite Fraction of Quasars

    Authors: Shanquan Gui, Kun Xu, Y. P. Jing, Donghai Zhao, Hongyu Gao

    Abstract: The Photometric objects Around Cosmic webs (PAC) approach developed in Xu et al. (2022b) has the advantage of making full use of spectroscopic and deeper photometric surveys. With the merits of PAC, the excess surface density $\bar{n}_2w_{\rm{p}}$ of neighboring galaxies can be measured down to stellar mass $10^{10.80}\,M_{\odot}$ around quasars at redshift $0.8<z_{\rm{s}}<1.0$, with the data from… ▽ More

    Submitted 15 May, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

    Comments: 15 pages, 11 figures, 2 tables, accepted for publication in the Astrophysical Journal

    Journal ref: The Astrophysical Journal, 967:17 (13pp), 2024 May 20

  12. arXiv:2401.00124  [pdf, other]

    eess.SP

    Generative AI-driven Semantic Communication Networks: Architecture, Technologies and Applications

    Authors: Chengsi Liang, Hongyang Du, Yao Sun, Dusit Niyato, Jiawen Kang, Dezong Zhao, Muhammad Ali Imran

    Abstract: Generative artificial intelligence (GAI) has emerged as a rapidly burgeoning field demonstrating significant potential in creating diverse contents intelligently and automatically. To support such artificial intelligence-generated content (AIGC) services, future communication systems should fulfill much more stringent requirements (including data rate, throughput, latency, etc.) with limited yet p… ▽ More

    Submitted 7 January, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

  13. arXiv:2312.17493  [pdf, other]

    cs.LG cs.CR

    Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning

    Authors: Xiao-Yang Liu, Rongyi Zhu, Daochen Zha, Jiechao Gao, Shan Zhong, Matt White, Meikang Qiu

    Abstract: The surge in interest and application of large language models (LLMs) has sparked a drive to fine-tune these models to suit specific applications, such as finance and medical science. However, concerns regarding data privacy have emerged, especially when multiple stakeholders aim to collaboratively enhance LLMs using sensitive data. In this scenario, federated learning becomes a natural choice, al… ▽ More

    Submitted 2 June, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

    Comments: 21 pages, 1 figure, 19 tables

  14. arXiv:2312.16352  [pdf, ps, other]

    cs.CR cs.LG cs.PF

    Smuche: Scalar-Multiplicative Caching in Homomorphic Encryption

    Authors: Dongfang Zhao

    Abstract: Addressing the challenge of balancing security and efficiency when deploying machine learning systems in untrusted environments, such as federated learning, remains a critical concern. A promising strategy to tackle this issue involves optimizing the performance of fully homomorphic encryption (HE). Recent research highlights the efficacy of advanced caching techniques, such as Rache, in significa… ▽ More

    Submitted 26 December, 2023; originally announced December 2023.

  15. arXiv:2312.15127  [pdf, other]

    cs.LG

    Gradient Shaping for Multi-Constraint Safe Reinforcement Learning

    Authors: Yihang Yao, Zuxin Liu, Zhepeng Cen, Peide Huang, Tingnan Zhang, Wenhao Yu, Ding Zhao

    Abstract: Online safe reinforcement learning (RL) involves training a policy that maximizes task efficiency while satisfying constraints via interacting with the environments. In this paper, our focus lies in addressing the complex challenges associated with solving multi-constraint (MC) safe RL problems. We approach the safe RL problem from the perspective of Multi-Objective Optimization (MOO) and propose… ▽ More

    Submitted 22 December, 2023; originally announced December 2023.

  16. arXiv:2312.13303  [pdf, other]

    cs.LG cs.AI

    RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios

    Authors: Wenhao Ding, Yulong Cao, Ding Zhao, Chaowei Xiao, Marco Pavone

    Abstract: Simulation plays a crucial role in the development of autonomous vehicles (AVs) due to the potential risks associated with real-world testing. Although significant progress has been made in the visual aspects of simulators, generating complex behavior among agents remains a formidable challenge. It is not only imperative to ensure realism in the scenarios generated but also essential to incorporat… ▽ More

    Submitted 19 December, 2023; originally announced December 2023.

  17. arXiv:2312.11945  [pdf, other]

    cs.CL

    Multi-Granularity Information Interaction Framework for Incomplete Utterance Rewriting

    Authors: Haowei Du, Dinghao Zhang, Chen Li, Yang Li, Dongyan Zhao

    Abstract: Recent approaches in Incomplete Utterance Rewriting (IUR) fail to capture the source of important words, which is crucial to edit the incomplete utterance, and introduce words from irrelevant utterances. We propose a novel and effective multi-task information interaction framework including context selection, edit matrix construction, and relevance merging to capture the multi-granularity of seman… ▽ More

    Submitted 8 January, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Findings of EMNLP2023 (short)

  18. arXiv:2312.11922  [pdf, other]

    cs.CL cs.AI

    Relation-Aware Question Answering for Heterogeneous Knowledge Graphs

    Authors: Haowei Du, Quzhe Huang, Chen Li, Chen Zhang, Yang Li, Dongyan Zhao

    Abstract: Multi-hop Knowledge Base Question Answering(KBQA) aims to find the answer entity in a knowledge graph (KG), which requires multiple steps of reasoning. Existing retrieval-based approaches solve this task by concentrating on the specific relation at different hops and predicting the intermediate entity within the reasoning path. During the reasoning process of these methods, the representation of r… ▽ More

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Findings of EMNLP2023 (Long)

  19. arXiv:2312.10825  [pdf, other]

    cs.CV cs.LG

    Latent Space Editing in Transformer-Based Flow Matching

    Authors: Vincent Tao Hu, David W Zhang, Pascal Mettes, Meng Tang, Deli Zhao, Cees G. M. Snoek

    Abstract: This paper strives for image editing via generative models. Flow Matching is an emerging generative modeling technique that offers the advantage of simple and efficient training. Simultaneously, a new transformer-based U-ViT has recently been proposed to replace the commonly used UNet for better scalability and performance in generative modeling. Hence, Flow Matching with a transformer backbone of… ▽ More

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: AAAI 2024 with Appendix

  20. PETDet: Proposal Enhancement for Two-Stage Fine-Grained Object Detection

    Authors: Wentao Li, Danpei Zhao, Bo Yuan, Yue Gao, Zhenwei Shi

    Abstract: Fine-grained object detection (FGOD) extends object detection with the capability of fine-grained recognition. In recent two-stage FGOD methods, the region proposal serves as a crucial link between detection and fine-grained recognition. However, current methods overlook that some proposal-related procedures inherited from general detection are not equally suitable for FGOD, limiting the multi-tas… ▽ More

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: IEEE TGRS 2023

  21. arXiv:2312.09785  [pdf, other]

    cs.CL

    RJUA-QA: A Comprehensive QA Dataset for Urology

    Authors: Shiwei Lyu, Chenfei Chi, Hongbo Cai, Lei Shi, Xiaoyan Yang, Lei Liu, Xiang Chen, Deng Zhao, Zhiqiang Zhang, Xianguo Lyu, Ming Zhang, Fangzhou Li, Xiaowei Ma, Yue Shen, Jinjie Gu, Wei Xue, Yiran Huang

    Abstract: We introduce RJUA-QA, a novel medical dataset for question answering (QA) and reasoning with clinical evidence, contributing to bridge the gap between general large language models (LLMs) and medical-specific LLM applications. RJUA-QA is derived from realistic clinical scenarios and aims to facilitate LLMs in generating reliable diagnostic and advice. The dataset contains 2,132 curated Question-Co… ▽ More

    Submitted 7 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: An initial version

  22. arXiv:2312.07625  [pdf, other]

    cs.NE cs.AI

    Astrocyte-Enabled Advancements in Spiking Neural Networks for Large Language Modeling

    Authors: Guobin Shen, Dongcheng Zhao, Yiting Dong, Yang Li, Jindong Li, Kang Sun, Yi Zeng

    Abstract: Within the complex neuroarchitecture of the brain, astrocytes play crucial roles in development, structure, and metabolism. These cells regulate neural activity through tripartite synapses, directly impacting cognitive processes such as learning and memory. Despite the growing recognition of astrocytes' significance, traditional Spiking Neural Network (SNN) models remain predominantly neuron-centr… ▽ More

    Submitted 25 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

  23. arXiv:2312.06964  [pdf, other]

    astro-ph.IM hep-ex physics.ins-det

    Ground Calibration Result of the Lobster Eye Imager for Astronomy

    Authors: Huaqing Cheng, Zhixing Ling, Chen Zhang, Xiaojin Sun, Shengli Sun, Yuan Liu, Yanfeng Dai, Zhenqing Jia, Haiwu Pan, Wenxin Wang, Donghua Zhao, Yifan Chen, Zhiwei Cheng, Wei Fu, Yixiao Han, Junfei Li, Zhengda Li, Xiaohao Ma, Yulong Xue, Ailiang Yan, Qiang Zhang, Yusa Wang, Xiongtao Yang, Zijian Zhao, Weimin Yuan

    Abstract: We report on results of the on-ground X-ray calibration of the Lobster Eye Imager for Astronomy (LEIA), an experimental space wide-field (18.6*18.6 square degrees) X-ray telescope built from novel lobster eye micro-pore optics. LEIA was successfully launched on July 27, 2022 onboard the SATech-01 satellite. To achieve full characterisation of its performance before launch, a series of tests and ca… ▽ More

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: 24 pages, 13 figures. Submitted to Experimental Astronomy

  24. arXiv:2312.06331  [pdf, other]

    cs.CV

    Semantic Connectivity-Driven Pseudo-labeling for Cross-domain Segmentation

    Authors: Dong Zhao, Ruizhi Yang, Shuang Wang, Qi Zang, Yang Hu, Licheng Jiao, Nicu Sebe, Zhun Zhong

    Abstract: Presently, self-training stands as a prevailing approach in cross-domain semantic segmentation, enhancing model efficacy by training with pixels assigned with reliable pseudo-labels. However, we find two critical limitations in this paradigm. (1) The majority of reliable pixels exhibit a speckle-shaped pattern and are primarily located in the central semantic region. This presents challenges for t… ▽ More

    Submitted 11 December, 2023; originally announced December 2023.

  25. arXiv:2312.06185  [pdf, other]

    cs.CL cs.AI

    KnowGPT: Knowledge Graph based Prompting for Large Language Models

    Authors: Qinggang Zhang, Junnan Dong, Hao Chen, Daochen Zha, Zailiang Yu, Xiao Huang

    Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in many real-world applications. Nonetheless, LLMs are often criticized for their tendency to produce hallucinations, wherein the models fabricate incorrect statements on tasks beyond their knowledge and perception. To alleviate this issue, researchers have explored leveraging the factual knowledge in knowledge graphs (KGs) to… ▽ More

    Submitted 4 June, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

  26. arXiv:2312.03868  [pdf, other]

    eess.SY econ.GN math.OC

    Uncertainty-Informed Renewable Energy Scheduling: A Scalable Bilevel Framework

    Authors: Dongwei Zhao, Vladimir Dvorkin, Stefanos Delikaraoglou, Alberto J. Lamadrid L., Audun Botterud

    Abstract: This work proposes an uncertainty-informed bid adjustment framework for integrating variable renewable energy sources (VRES) into electricity markets. This framework adopts a bilevel model to compute the optimal VRES day-ahead bids. It aims to minimize the expected system cost across day-ahead and real-time stages and approximate the cost efficiency of the stochastic market design. However, solvin… ▽ More

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: IEEE Transactions on Energy Markets, Policy, and Regulation

  27. arXiv:2312.00956  [pdf, other]

    eess.SY math.OC

    A Cyclic Small Phase Theorem

    Authors: Chao Chen, Wei Chen, Di Zhao, Jianqi Chen, Li Qiu

    Abstract: This paper introduces a brand-new phase definition called the segmental phase for multi-input multi-output linear time-invariant systems. The underpinning of the definition lies in the matrix segmental phase which, as its name implies, is graphically based on the smallest circular segment covering the matrix normalized numerical range in the unit disk. The matrix segmental phase has the crucial pr… ▽ More

    Submitted 1 December, 2023; originally announced December 2023.

  28. arXiv:2311.18829  [pdf, other]

    cs.CV

    MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation

    Authors: Yanhui Wang, Jianmin Bao, Wenming Weng, Ruoyu Feng, Dacheng Yin, Tao Yang, Jingxu Zhang, Qi Dai, Zhiyuan Zhao, Chunyu Wang, Kai Qiu, Yuhui Yuan, Chuanxin Tang, Xiaoyan Sun, Chong Luo, Baining Guo

    Abstract: We present MicroCinema, a straightforward yet effective framework for high-quality and coherent text-to-video generation. Unlike existing approaches that align text prompts with video directly, MicroCinema introduces a Divide-and-Conquer strategy which divides the text-to-video into a two-stage process: text-to-image generation and image&text-to-video generation. This strategy offers two signific… ▽ More

    Submitted 29 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Project page: https://wangyanhui666.github.io/MicroCinema.github.io/

  29. arXiv:2311.18166  [pdf, other]

    cs.CV

    A-Scan2BIM: Assistive Scan to Building Information Modeling

    Authors: Weilian Song, Jieliang Luo, Dale Zhao, Yan Fu, Chin-Yi Cheng, Yasutaka Furukawa

    Abstract: This paper proposes an assistive system for architects that converts a large-scale point cloud into a standardized digital representation of a building for Building Information Modeling (BIM) applications. The process is known as Scan-to-BIM, which requires many hours of manual work even for a single building floor by a professional architect. Given its challenging nature, the paper focuses on hel… ▽ More

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: BMVC 2023, order evaluation updated after fixing evaluation bug

  30. arXiv:2311.15649  [pdf, other]

    cs.RO cs.AI cs.LG

    RoboGPT: an intelligent agent of making embodied long-term decisions for daily instruction tasks

    Authors: Yaran Chen, Wenbo Cui, Yuanwen Chen, Mining Tan, Xinyao Zhang, Dongbin Zhao, He Wang

    Abstract: Robotic agents must master common sense and long-term sequential decisions to solve daily tasks through natural language instruction. The developments in Large Language Models (LLMs) in natural language processing have inspired efforts to use LLMs in complex robot planning. Despite LLMs' great generalization and comprehension of instruction tasks, LLMs-generated task plans sometimes lack feasibili… ▽ More

    Submitted 30 June, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  31. arXiv:2311.15542  [pdf]

    physics.optics

    Arbitrary Engineering of Spatial Caustics with 3D-printed Metasurfaces

    Authors: Xiaoyan Zhou, Hongtao Wang, Shuxi Liu, Hao Wang, John You En Chan, Cheng-Feng Pan, Daomu Zhao, Joel K. W. Yang, Cheng-Wei Qiu

    Abstract: Caustics occur in diverse physical systems, spanning the nano-scale in electron microscopy to astronomical-scale in gravitational lensing. As envelopes of rays, optical caustics result in sharp edges or extended networks. Caustics in structured light, characterized by complex-amplitude distributions, have innovated numerous applications including particle manipulation, high-resolution imaging tech… ▽ More

    Submitted 27 November, 2023; originally announced November 2023.

  32. arXiv:2311.14992  [pdf, ps, other]

    math.OC

    Model-free Reinforcement Learning for ${H_{2}/H_{\infty}}$ Control of Stochastic Discrete-time Systems

    Authors: Xiushan Jiang, Li Wang, Dongya Zhao, Ling Shi

    Abstract: This paper proposes a reinforcement learning (RL) algorithm for infinite horizon $\rm {H_{2}/H_{\infty}}$ problem in a class of stochastic discrete-time systems, rather than using a set of coupled generalized algebraic Riccati equations (GAREs). The algorithm is able to learn the optimal control policy for the system even when its parameters are unknown. Additionally, the paper explores the effect… ▽ More

    Submitted 25 November, 2023; originally announced November 2023.

  33. arXiv:2311.12626  [pdf]

    physics.app-ph

    Acoustic Vortex in Waveguide with Chiral Gradient Sawtooth Metasurface

    Authors: Zeliang Song, Shuhuan Xie, Yong Li, Hua Ding, Feiyan Cai, Yugui Peng, Xuefeng Zhu, Degang Zhao

    Abstract: The acoustic vortex states with spiral phase dislocation that can carry orbital angular moment (OAM) have aroused many research interests in recent years. The mainstream methods of generating acoustic vortex are based on Huygens-Fresnel principle to modulate the wavefront to create spatial spiral phase dislocation. In this work, we propose an entirely new scenario to generate acoustic vortex in a… ▽ More

    Submitted 14 January, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  34. arXiv:2311.12292  [pdf, other]

    astro-ph.EP astro-ph.IM cs.LG eess.IV

    Mapping "Brain Coral" Regions on Mars using Deep Learning

    Authors: Kyle A. Pearson, Eldar Noe, Daniel Zhao, Alphan Altinok, Alex Morgan

    Abstract: One of the main objectives of the Mars Exploration Program is to search for evidence of past or current life on the planet. To achieve this, Mars exploration has been focusing on regions that may have liquid or frozen water. A set of critical areas may have seen cycles of ice thawing in the relatively recent past in response to periodic changes in the obliquity of Mars. In this work, we use convol… ▽ More

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Submitted for publication, seeking comments from the community. Code available: https://github.com/pearsonkyle/Mars-Brain-Coral-Network

  35. arXiv:2311.10802  [pdf, other]

    cs.NE

    Is Conventional SNN Really Efficient? A Perspective from Network Quantization

    Authors: Guobin Shen, Dongcheng Zhao, Tenglong Li, Jindong Li, Yi Zeng

    Abstract: Spiking Neural Networks (SNNs) have been widely praised for their high energy efficiency and immense potential. However, comprehensive research that critically contrasts and correlates SNNs with quantized Artificial Neural Networks (ANNs) remains scant, often leading to skewed comparisons lacking fairness towards ANNs. This paper introduces a unified perspective, illustrating that the time steps i… ▽ More

    Submitted 17 November, 2023; originally announced November 2023.

  36. arXiv:2311.10747  [pdf, other]

    cs.RO cs.AI cs.LG

    Safety-aware Causal Representation for Trustworthy Offline Reinforcement Learning in Autonomous Driving

    Authors: Haohong Lin, Wenhao Ding, Zuxin Liu, Yaru Niu, Jiacheng Zhu, Yuming Niu, Ding Zhao

    Abstract: In the domain of autonomous driving, the offline Reinforcement Learning (RL) approaches exhibit notable efficacy in addressing sequential decision-making problems from offline datasets. However, maintaining safety in diverse safety-critical scenarios remains a significant challenge due to long-tailed and unforeseen scenarios absent from offline datasets. In this paper, we introduce the saFety-awar… ▽ More

    Submitted 12 March, 2024; v1 submitted 31 October, 2023; originally announced November 2023.

  37. arXiv:2311.08911  [pdf, other]

    cs.GT

    Connection Incentives in Cost Sharing Mechanisms with Budgets

    Authors: Tianyi Zhang, Dengji Zhao, Junyu Zhang, Sizhe Gu

    Abstract: In a cost sharing problem on a weighted undirected graph, all other nodes want to connect to the source node for some service. Each edge has a cost denoted by a weight and all the connected nodes should share the total cost for the connectivity. The goal of the existing solutions (e.g. folk solution and cycle-complete solution) is to design cost sharing rules with nice properties, e.g. budget bala… ▽ More

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2201.05976

  38. arXiv:2311.08903  [pdf, other]

    cs.GT

    Cost Sharing under Private Costs and Connection Control on Directed Acyclic Graphs

    Authors: Tianyi Zhang, Dengji Zhao, Junyu Zhang, Sizhe Gu

    Abstract: We consider a cost sharing problem on a weighted directed acyclic graph (DAG) with a source node to which all the other nodes want to connect. The cost (weight) of each edge is private information reported by multiple contractors, and among them, only one contractor is selected as the builder. All the nodes except for the source need to share the total cost of the used edges. However, they may blo… ▽ More

    Submitted 15 November, 2023; originally announced November 2023.

  39. arXiv:2311.07491  [pdf, other]

    cs.CL

    A Step Closer to Comprehensive Answers: Constrained Multi-Stage Question Decomposition with Large Language Models

    Authors: Hejing Cao, Zhenwei An, Jiazhan Feng, Kun Xu, Liwei Chen, Dongyan Zhao

    Abstract: While large language models exhibit remarkable performance in the Question Answering task, they are susceptible to hallucinations. Challenges arise when these models grapple with understanding multi-hop relations in complex questions or lack the necessary knowledge for a comprehensive response. To address this issue, we introduce the "Decompose-and-Query" framework (D&Q). This framework guides the… ▽ More

    Submitted 13 November, 2023; originally announced November 2023.

  40. arXiv:2311.06158  [pdf, other]

    cs.CL cs.AI

    Language Models can be Logical Solvers

    Authors: Jiazhan Feng, Ruochen Xu, Junheng Hao, Hiteshi Sharma, Yelong Shen, Dongyan Zhao, Weizhu Chen

    Abstract: Logical reasoning is a fundamental aspect of human intelligence and a key component of tasks like problem-solving and decision-making. Recent advancements have enabled Large Language Models (LLMs) to potentially exhibit reasoning capabilities, but complex logical reasoning remains a challenge. The state-of-the-art, solver-augmented language models, use LLMs to parse natural language logical questi… ▽ More

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: Preprint

  41. arXiv:2311.04145  [pdf, other]

    cs.CV

    I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

    Authors: Shiwei Zhang, Jiayu Wang, Yingya Zhang, Kang Zhao, Hangjie Yuan, Zhiwu Qin, Xiang Wang, Deli Zhao, Jingren Zhou

    Abstract: Video synthesis has recently made remarkable strides benefiting from the rapid development of diffusion models. However, it still encounters challenges in terms of semantic accuracy, clarity and spatio-temporal continuity. They primarily arise from the scarcity of well-aligned text-video data and the complex inherent structure of videos, making it difficult for the model to simultaneously ensure s… ▽ More

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Project page: https://i2vgen-xl.github.io

  42. arXiv:2311.01767  [pdf, other]

    cs.CL

    PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

    Authors: Yiduo Guo, Zekai Zhang, Yaobo Liang, Dongyan Zhao, Nan Duan

    Abstract: Recent evaluations of Large Language Models (LLMs) have centered around testing their zero-shot/few-shot capabilities for basic natural language tasks and their ability to translate instructions into tool APIs. However, the evaluation of LLMs utilizing complex tools to finish multi-turn, multi-modal instructions in a complex multi-modal environment has not been investigated. To address this gap, w… ▽ More

    Submitted 7 November, 2023; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: LLM evaluation, PPT task completion

  43. arXiv:2311.00426  [pdf, other]

    cs.LG cs.AI

    Enhanced Generalization through Prioritization and Diversity in Self-Imitation Reinforcement Learning over Procedural Environments with Sparse Rewards

    Authors: Alain Andres, Daochen Zha, Javier Del Ser

    Abstract: Exploration poses a fundamental challenge in Reinforcement Learning (RL) with sparse rewards, limiting an agent's ability to learn optimal decision-making due to a lack of informative feedback signals. Self-Imitation Learning (self-IL) has emerged as a promising approach for exploration, leveraging a replay buffer to store and reproduce successful behaviors. However, traditional self-IL methods, w…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: 7 pages, 5 figures

  44. arXiv:2311.00134  [pdf, other]

    cs.CV

    Joint Depth Prediction and Semantic Segmentation with Multi-View SAM

    Authors: Mykhailo Shvets, Dongxu Zhao, Marc Niethammer, Roni Sengupta, Alexander C. Berg

    Abstract: Multi-task approaches to joint depth and segmentation prediction are well-studied for monocular images. Yet, predictions from a single-view are inherently limited, while multiple views are available in many robotics applications. On the other end of the spectrum, video-based and full 3D methods require numerous frames to perform reconstruction and segmentation. With this work we propose a Multi-Vi…

    Submitted 31 October, 2023; originally announced November 2023.

    Comments: To appear in the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision

  45. arXiv:2310.20669  [pdf, other]

    cs.RO

    Modeling multi-legged robot locomotion with slipping and its experimental validation

    Authors: Ziyou Wu, Dan Zhao, Shai Revzen

    Abstract: Multi-legged robots with six or more legs are not in common use, despite designs with superior stability, maneuverability, and a low number of actuators being available for over 20 years. This may be in part due to the difficulty in modeling multi-legged motion with slipping and producing reliable predictions of body velocity. Here we present a detailed measurement of the foot contact forces in a…

    Submitted 3 January, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

  46. arXiv:2310.20198  [pdf, ps, other]

    eess.SP

    Structured Two-Stage True-Time-Delay Array Codebook Design for Multi-User Data Communication

    Authors: Aditya Wadaskar, Ding Zhao, Ibrahim Pehlivan, Danijela Cabric

    Abstract: Wideband millimeter-wave and terahertz (THz) systems can facilitate simultaneous data communication with multiple spatially separated users. It is desirable to orthogonalize users across sub-bands by deploying frequency-dependent beams with a sub-band-specific spatial response. True-Time-Delay (TTD) antenna arrays are a promising wideband architecture to implement sub-band-specific dispersion of b…

    Submitted 15 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

  47. arXiv:2310.19859  [pdf, other]

    cs.CV cs.AI

    Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone

    Authors: Zeyinzi Jiang, Chaojie Mao, Ziyuan Huang, Ao Ma, Yiliang Lv, Yujun Shen, Deli Zhao, Jingren Zhou

    Abstract: Parameter-efficient tuning has become a trend in transferring large-scale foundation models to downstream applications. Existing methods typically embed some light-weight tuners into the backbone, where both the design and the learning of the tuners are highly dependent on the base model. This work offers a new tuning paradigm, dubbed Res-Tuning, which intentionally unbinds tuners from the backbon…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  48. arXiv:2310.19572  [pdf, other]

    cs.CL

    Improving Input-label Mapping with Demonstration Replay for In-context Learning

    Authors: Zhuocheng Gong, Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai, Dongyan Zhao, Rui Yan

    Abstract: In-context learning (ICL) is an emerging capability of large autoregressive language models where a few input-label demonstrations are appended to the input to enhance the model's understanding of downstream NLP tasks, without directly adjusting the model parameters. The effectiveness of ICL can be attributed to the strong language modeling capabilities of large language models (LLMs), which enabl…

    Submitted 30 October, 2023; originally announced October 2023.

  49. arXiv:2310.19070  [pdf, other]

    cs.CV

    Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection

    Authors: Yuanze Li, Haolin Wang, Shihao Yuan, Ming Liu, Debin Zhao, Yiwen Guo, Chen Xu, Guangming Shi, Wangmeng Zuo

    Abstract: Existing industrial anomaly detection (IAD) methods predict anomaly scores for both anomaly detection and localization. However, they struggle to perform a multi-turn dialog and detailed descriptions for anomaly regions, e.g., color, shape, and categories of industrial anomalies. Recently, large multimodal (i.e., vision and language) models (LMMs) have shown eminent perception abilities on multipl…

    Submitted 31 October, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: 8 pages, 7 figures

  50. arXiv:2310.18257  [pdf, other]

    cs.LG

    MIM-GAN-based Anomaly Detection for Multivariate Time Series Data

    Authors: Shan Lu, Zhicheng Dong, Donghong Cai, Fang Fang, Dongcai Zhao

    Abstract: The loss function of a generative adversarial network (GAN) is an important factor that affects the quality and diversity of the generated samples for anomaly detection. In this paper, we propose an unsupervised multiple time series anomaly detection algorithm based on the GAN with message importance measure (MIM-GAN). In particular, the time series data is divided into subsequences using a sliding wi…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 7 pages, 6 figures