Showing 101–150 of 1,070 results for author: Zhao, T

  1. arXiv:2401.02117  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG eess.SY

    Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

    Authors: Zipeng Fu, Tony Z. Zhao, Chelsea Finn

    Abstract: Imitation learning from human demonstrations has shown impressive performance in robotics. However, most results focus on table-top manipulation, lacking the mobility and dexterity necessary for generally useful tasks. In this work, we develop a system for imitating mobile manipulation tasks that are bimanual and require whole-body control. We first present Mobile ALOHA, a low-cost and whole-body…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: Project website: https://mobile-aloha.github.io (Zipeng Fu and Tony Z. Zhao are project co-leads, Chelsea Finn is the advisor)

  2. arXiv:2401.01097  [pdf, other]

    cs.CV

    Robust single-particle cryo-EM image denoising and restoration

    Authors: Jing Zhang, Tengfei Zhao, ShiYu Hu, Xin Zhao

    Abstract: Cryo-electron microscopy (cryo-EM) has achieved near-atomic level resolution of biomolecules by reconstructing 2D micrographs. However, the resolution and accuracy of the reconstructed particles are significantly reduced due to the extremely low signal-to-noise ratio (SNR) and complex noise structure of cryo-EM images. In this paper, we introduce a diffusion model with post-processing framework to…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: This paper is accepted to ICASSP 2024

  3. arXiv:2312.16571  [pdf, other]

    cs.CV

    GRSDet: Learning to Generate Local Reverse Samples for Few-shot Object Detection

    Authors: Hefei Mei, Taijin Zhao, Shiyuan Tang, Heqian Qiu, Lanxiao Wang, Minjian Zhang, Fanman Meng, Hongliang Li

    Abstract: Few-shot object detection (FSOD) aims to achieve object detection only using a few novel class training data. Most of the existing methods usually adopt a transfer-learning strategy to construct the novel class distribution by transferring the base class knowledge. However, this direct way easily results in confusion between the novel class and other similar categories in the decision space. To ad…

    Submitted 29 December, 2023; v1 submitted 27 December, 2023; originally announced December 2023.

  4. arXiv:2312.16426  [pdf, ps, other]

    math.NA

    Spectral approximation of $ψ$-fractional differential equation based on mapped Jacobi functions

    Authors: Tinggang Zhao, Zhenyu Zhao, Changpin Li, Dongxia Li

    Abstract: Fractional calculus with respect to function $ψ$, also named as $ψ$-fractional calculus, generalizes the Hadamard and the Riemann-Liouville fractional calculi, which causes challenge in numerical treatment. In this paper we study spectral-type methods using mapped Jacobi functions (MJFs) as basis functions and obtain efficient algorithms to solve $ψ$-fractional differential equations. In particula…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: This is a full length version of a submission to TWMS

    MSC Class: 65F60; 65D32; 65M12; 35K55 ACM Class: G.1.2; G.1.9

  5. arXiv:2312.16246  [pdf, other]

    cs.CV

    Nighttime Person Re-Identification via Collaborative Enhancement Network with Multi-domain Learning

    Authors: Andong Lu, Tianrui Zha, Chenglong Li, Jin Tang, Xiaofeng Wang, Bin Luo

    Abstract: Prevalent nighttime ReID methods typically combine relighting networks and ReID networks in a sequential manner, which not only restricts the ReID performance by the quality of relighting images, but also neglects the effective collaborative modeling between image relighting and person ReID tasks. To handle these problems, we propose a novel Collaborative Enhancement Network called CENet, which pe…

    Submitted 25 December, 2023; originally announced December 2023.

  6. arXiv:2312.15219  [pdf, other]

    cs.CV

    Scale Optimization Using Evolutionary Reinforcement Learning for Object Detection on Drone Imagery

    Authors: Jialu Zhang, Xiaoying Yang, Wentao He, Jianfeng Ren, Qian Zhang, Titian Zhao, Ruibin Bai, Xiangjian He, Jiang Liu

    Abstract: Object detection in aerial imagery presents a significant challenge due to large scale variations among objects. This paper proposes an evolutionary reinforcement learning agent, integrated within a coarse-to-fine object detection framework, to optimize the scale for more effective detection of objects in such images. Specifically, a set of patches potentially containing objects are first generate…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  7. arXiv:2312.15043  [pdf, other]

    cs.CV

    GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection

    Authors: Haozhan Shen, Tiancheng Zhao, Mingwei Zhu, Jianwei Yin

    Abstract: Visual grounding, a crucial vision-language task involving the understanding of the visual context based on the query expression, necessitates the model to capture the interactions between objects, as well as various spatial and attribute information. However, the annotation data of visual grounding task is limited due to its time-consuming and labor-intensive annotation process, resulting in the…

    Submitted 22 December, 2023; originally announced December 2023.

  8. arXiv:2312.11109  [pdf, other]

    cs.LG

    Graph Transformers for Large Graphs

    Authors: Vijay Prakash Dwivedi, Yozen Liu, Anh Tuan Luu, Xavier Bresson, Neil Shah, Tong Zhao

    Abstract: Transformers have recently emerged as powerful neural networks for graph learning, showcasing state-of-the-art performance on several graph property prediction tasks. However, these results have been limited to small-scale graphs, where the computational feasibility of the global attention mechanism is possible. The next goal is to scale up these architectures to handle very large graphs on the sc…

    Submitted 18 December, 2023; originally announced December 2023.

  9. arXiv:2312.03668  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG

    Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition

    Authors: Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, Kei Sawada

    Abstract: Advances in machine learning have made it possible to perform various text and speech processing tasks, such as automatic speech recognition (ASR), in an end-to-end (E2E) manner. E2E approaches utilizing pre-trained models are gaining attention for conserving training data and resources. However, most of their applications in ASR involve only one of either a pre-trained speech or a language model.…

    Submitted 6 June, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: 17 pages, 4 figures, 9 tables, accepted for Findings of ACL 2024. The model is available at https://huggingface.co/rinna/nue-asr

  10. arXiv:2312.03256  [pdf, other]

    cs.LG

    CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation Models

    Authors: Hailin Zhang, Zirui Liu, Boxuan Chen, Yikai Zhao, Tong Zhao, Tong Yang, Bin Cui

    Abstract: Recently, the growing memory demands of embedding tables in Deep Learning Recommendation Models (DLRMs) pose great challenges for model training and deployment. Existing embedding compression solutions cannot simultaneously meet three key design requirements: memory efficiency, low latency, and adaptability to dynamic data distribution. This paper presents CAFE, a Compact, Adaptive, and Fast Embed…

    Submitted 26 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  11. arXiv:2312.01616  [pdf, other]

    cs.CV cs.RO

    SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

    Authors: Yunfei Fan, Tianyu Zhao, Guidong Wang

    Abstract: Accuracy and computational efficiency are the most important metrics to Visual Inertial Navigation System (VINS). The existing VINS algorithms with either high accuracy or low computational complexity, are difficult to provide the high precision localization in resource-constrained devices. To this end, we propose a novel filter-based VINS framework named SchurVINS, which could guarantee both high…

    Submitted 6 June, 2024; v1 submitted 3 December, 2023; originally announced December 2023.

    Comments: Accepted by CVPR 2024

  12. arXiv:2312.01263  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Gate-Tunable Berry Curvature Dipole Polarizability in Dirac Semimetal Cd3As2

    Authors: Tong-Yang Zhao, An-Qi Wang, Xing-Guo Ye, Xing-Yu Liu, Xin Liao, Zhi-Min Liao

    Abstract: We reveal the gate-tunable Berry curvature dipole polarizability in Dirac semimetal Cd3As2 nanoplates through measurements of the third-order nonlinear Hall effect. Under an applied electric field, the Berry curvature exhibits an asymmetric distribution, forming a field-induced Berry curvature dipole, resulting in a measurable third-order Hall voltage with a cubic relationship to the longitudinal…

    Submitted 2 December, 2023; originally announced December 2023.

    Journal ref: Phys. Rev. Lett. 131, 186302 (2023)

  13. arXiv:2312.01175  [pdf]

    physics.acc-ph

    High Q and high gradient performance of the first medium-temperature baking 1.3 GHz cryomodule

    Authors: Jiyuan Zhai, Weimin Pan, Feisi He, Rui Ge, Zhenghui Mi, Peng Sha, Song Jin, Ruixiong Han, Qunyao Wang, Haiying Lin, Guangwei Wang, Mei Li, Minjing Sang, Liangrui Sun, Rui Ye, Tongxian Zhao, Shaopeng Li, Keyu Zhu, Baiqi Liu, Xiaolong Wang, Xiangchen Yang, Xiaojuan Bian, Xiangzhen Zhang, Huizhou Ma, Xuwen Dai, et al. (14 additional authors not shown)

    Abstract: World's first 1.3 GHz cryomodule containing eight 9-cell superconducting radio-frequency (RF) cavities treated by medium-temperature furnace baking (mid-T bake) was developed, assembled and tested at IHEP for the Dalian Advanced Light Source (DALS) and CEPC R&D. The 9-cell cavities in the cryomodule achieved an unprecedented highest average Q0 of 3.8E10 at 16 MV/m and 3.6E10 at 21 MV/m in the hori…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: 5 pages, 6 figures

  14. arXiv:2312.01073  [pdf]

    physics.optics physics.bio-ph

    High-speed image reconstruction for nonlinear structured illumination microscopy

    Authors: Jingxiang Zhang, Tianyu Zhao, Xiangda Fu, Manming Shu, Jiajing Yan, Jinxiao Chen, Yansheng Liang, Shaowei Wang, Ming Lei

    Abstract: By exploiting the nonlinear responses of the fluorescent probes, the spatial resolution of structured illumination microscopy (SIM) can be further increased. However, due to the complex reconstruction process, the traditional reconstruction method of nonlinear structured illumination microscopy (NL-SIM) is relatively slow, which brings a great challenge to realizing real-time display of super-resol…

    Submitted 2 December, 2023; originally announced December 2023.

  15. arXiv:2311.15588  [pdf]

    physics.optics

    Versatile manipulation of low-refractive-index particles using customized optical building blocks

    Authors: Minru He, Yansheng Liang, Xue Yun, Linquan Guo, Tianyu Zhao, Ming Lei

    Abstract: Low-refractive-index (LRI) particles play significant roles in physics, drug delivery, biomedical science, and other fields. However, they have not attained sufficient utilization in active manipulation due to the repulsive effect of light. Here, we demonstrate the establishment of optical building blocks (OBBs) to fulfill the demands of versatile manipulation of LRI particles. The OBBs are genera…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 13 pages, 5 figures, corresponding authors: Yansheng Liang and Ming Lei

  16. arXiv:2311.15585  [pdf, other]

    gr-qc astro-ph.HE astro-ph.IM

    Dawning of a New Era in Gravitational Wave Data Analysis: Unveiling Cosmic Mysteries via Artificial Intelligence -- A Systematic Review

    Authors: Tianyu Zhao, Ruijun Shi, Yue Zhou, Zhoujian Cao, Zhixiang Ren

    Abstract: Background: Artificial intelligence (AI), with its vast capabilities, has become an integral part of our daily interactions, particularly with the rise of sophisticated models like Large Language Models. These advancements have not only transformed human-machine interactions but have also paved the way for significant breakthroughs in various scientific domains. Aim of review: This review is cente…

    Submitted 27 November, 2023; originally announced November 2023.

  17. arXiv:2311.14736  [pdf, other]

    cs.CL cs.LG

    Data Diversity Matters for Robust Instruction Tuning

    Authors: Alexander Bukharin, Tuo Zhao

    Abstract: Recent works have shown that by curating high quality and diverse instruction tuning datasets, we can significantly improve instruction-following capabilities. However, creating such datasets is difficult and most works rely on manual curation or proprietary language models. Automatic data curation is difficult as it is still not clear how we can define diversity for instruction tuning, how divers…

    Submitted 5 February, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: 22 pages, 18 figures

  18. arXiv:2311.14612  [pdf, other]

    quant-ph physics.optics

    Phase estimation via multi-photon subtraction inside the SU(1,1) interferometer

    Authors: Q. Q. Kang, Z. K. Zhao, Y. K. Xu, T. Zhao, C. J. Liu, L. Y. Hu

    Abstract: To improve the phase sensitivity, multi-photon subtraction schemes within the SU(1,1) interferometer are proposed. The input states are the coherent state and the vacuum state, and the detection method is homodyne detection. The effects of multi-photon subtraction on phase sensitivity, quantum Fisher information, and quantum Cramer-Rao bound are analyzed under both ideal and photon losses situatio…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 13 pages

  19. arXiv:2311.13864  [pdf, other]

    cs.LG

    Which Matters Most in Making Fund Investment Decisions? A Multi-granularity Graph Disentangled Learning Framework

    Authors: Chunjing Gan, Binbin Hu, Bo Huang, Tianyu Zhao, Yingru Lin, Wenliang Zhong, Zhiqiang Zhang, Jun Zhou, Chuan Shi

    Abstract: In this paper, we highlight that both conformity and risk preference matter in making fund investment decisions beyond personal interest and seek to jointly characterize these aspects in a disentangled manner. Consequently, we develop a novel Multi-granularity Graph Disentangled Learning framework named MGDL to effectively perform intelligent matching of fund investment products. Benefiting from…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: Accepted by SIGIR 2023

  20. arXiv:2311.12608  [pdf, other]

    cs.CV

    Density-Guided Dense Pseudo Label Selection For Semi-supervised Oriented Object Detection

    Authors: Tong Zhao, Qiang Fang, Shuohao Shi, Xin Xu

    Abstract: Recently, dense pseudo-label, which directly selects pseudo labels from the original output of the teacher model without any complicated post-processing steps, has received considerable attention in semi-supervised object detection (SSOD). However, for the multi-oriented and dense objects that are common in aerial scenes, existing dense pseudo-label selection methods are inefficient because they i…

    Submitted 14 May, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: 9 pages, 6 figures

  21. arXiv:2311.08747  [pdf, other]

    cs.CV

    Improved Dense Nested Attention Network Based on Transformer for Infrared Small Target Detection

    Authors: Chun Bao, Jie Cao, Yaqian Ning, Tianhua Zhao, Zhijun Li, Zechen Wang, Li Zhang, Qun Hao

    Abstract: Infrared small target detection based on deep learning offers unique advantages in separating small targets from complex and dynamic backgrounds. However, the features of infrared small targets gradually weaken as the depth of convolutional neural network (CNN) increases. To address this issue, we propose a novel method for detecting infrared small targets called improved dense nested attention ne…

    Submitted 17 January, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  22. arXiv:2311.07998  [pdf, other]

    math.CA math.AP

    Fractional Leibniz rule on the torus

    Authors: Árpád Bényi, Tadahiro Oh, Tengfei Zhao

    Abstract: We discuss the fractional Leibniz rule for periodic functions on the $d$-dimensional torus, including the endpoint cases.

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 10 pages

    MSC Class: 42B15; 42B25; 46E35

  23. arXiv:2311.06126  [pdf, other]

    astro-ph.HE astro-ph.GA

    A centi-pc-scale compact radio core in the nearby galaxy M60

    Authors: Xiaofeng Li, Jun Yang, Xiaopeng Cheng, Mai Liao, Xiaoyu Hong, Liming Dou, Tianle Zhao, Zhongying Fan, Fupeng Zhang, Weirong Huang

    Abstract: M60, an elliptical galaxy located 16.5 Mpc away, has an active nucleus with a very low luminosity and an extremely low accretion rate. Its central supermassive black hole has a mass of $M_{\rm BH}\sim4.5\times10^{9}\, M_{\odot}$ and a Schwarzschild radius corresponding to $R_{\rm S}\sim5.4\,μ\mathrm{as}$. To investigate the nature of its innermost radio nucleus, data from the Very Long Baseline Arr…

    Submitted 10 November, 2023; originally announced November 2023.

    Comments: 15 pages, 5 figures, 3 tables, accepted for publication in Astrophysical Journal

  24. arXiv:2311.03751  [pdf, other]

    physics.geo-ph

    Seismic traveltime simulation for variable velocity models using physics-informed Fourier neural operator

    Authors: Chao Song, Tianshuo Zhao, Umair bin Waheed, Cai Liu, Tian You

    Abstract: Seismic traveltime is critical information conveyed by seismic waves, widely utilized in various geophysical applications. Conventionally, the simulation of seismic traveltime involves solving the eikonal equation. However, the efficiency of traditional numerical solvers is hindered, as they are typically capable of simulating seismic traveltime for only a single source at a time. Recently, deep l…

    Submitted 8 April, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: 13 pages, 12 figures, submitted to IEEE TGRS

  25. arXiv:2311.03468  [pdf, other]

    cs.AI cs.CY cs.HC cs.LG math.NA

    FinA: Fairness of Adverse Effects in Decision-Making of Human-Cyber-Physical-System

    Authors: Tianyu Zhao, Salma Elmalaki

    Abstract: Ensuring fairness in decision-making systems within Human-Cyber-Physical-Systems (HCPS) is a pressing concern, particularly when diverse individuals, each with varying behaviors and expectations, coexist within the same application space, influenced by a shared set of control actions in the system. The long-term adverse effects of these actions further pose the challenge, as historical experiences…

    Submitted 6 November, 2023; originally announced November 2023.

  26. arXiv:2311.02262  [pdf, other]

    cs.CL cs.LG

    Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs

    Authors: Qingru Zhang, Chandan Singh, Liyuan Liu, Xiaodong Liu, Bin Yu, Jianfeng Gao, Tuo Zhao

    Abstract: In human-written articles, we often leverage the subtleties of text style, such as bold and italics, to guide the attention of readers. These textual emphases are vital for the readers to grasp the conveyed information. When interacting with large language models (LLMs), we have a similar need - steering the model to pay closer attention to user-specified information, e.g., an instruction. Existin…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: 16 pages

  27. arXiv:2311.02228  [pdf, other]

    cs.CY cs.HC

    Towards Fairness-aware Crowd Management System and Surge Prevention in Smart Cities

    Authors: Yixin Zhang, Tianyu Zhao, Salma Elmalaki

    Abstract: Instances of casualties resulting from large crowds persist, highlighting the existing limitations of current crowd management practices in Smart Cities. One notable drawback is the insufficient provision for disadvantaged individuals who may require additional time to evacuate due to their slower running speed. Moreover, the existing escape strategies may fall short of ensuring the safety of all…

    Submitted 22 April, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

  28. arXiv:2311.01403  [pdf, other]

    cs.RO

    REAL: Resilience and Adaptation using Large Language Models on Autonomous Aerial Robots

    Authors: Andrea Tagliabue, Kota Kondo, Tong Zhao, Mason Peterson, Claudius T. Tewari, Jonathan P. How

    Abstract: Large Language Models (LLMs) pre-trained on internet-scale datasets have shown impressive capabilities in code understanding, synthesis, and general purpose question-and-answering. Key to their performance is the substantial prior knowledge acquired during training and their ability to reason over extended sequences of symbols, often presented in natural language. In this work, we aim to harness t…

    Submitted 2 November, 2023; originally announced November 2023.

    Comments: 13 pages, 5 figures, conference workshop

  29. arXiv:2311.00652  [pdf, other]

    q-bio.TO physics.bio-ph

    The physical origin of aneurysm growth, dissection, and rupture

    Authors: Tom Y. Zhao, Jin-Tae Kim, Min Cho, Akhil Narang, John A. Rogers, Neelesh A. Patankar

    Abstract: Rupture of aortic aneurysms is by far the most fatal heart disease, with a mortality rate exceeding 80%. There are no reliable clinical protocols to predict growth, dissection, and rupture because the fundamental physics driving aneurysm progression is unknown. Here, via in-vitro experiments, we show that a blood-wall, fluttering instability manifests in synthetic arteries under pulsatile forcing.…

    Submitted 1 November, 2023; originally announced November 2023.

  30. arXiv:2310.20172  [pdf, other]

    gr-qc astro-ph.IM cs.LG

    Compact Binary Systems Waveform Generation with Generative Pre-trained Transformer

    Authors: Ruijun Shi, Yue Zhou, Tianyu Zhao, Zhoujian Cao, Zhixiang Ren

    Abstract: Space-based gravitational wave (GW) detection is one of the most anticipated GW detection projects in the next decade, which promises to detect abundant compact binary systems. At present, deep learning methods have not been widely explored for GW waveform generation and extrapolation. To solve the data processing difficulty and the increasing waveform complexity caused by the detector's response…

    Submitted 5 March, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

  31. arXiv:2310.19927  [pdf, other]

    cs.LG cs.AI

    Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms

    Authors: Shenao Zhang, Boyi Liu, Zhaoran Wang, Tuo Zhao

    Abstract: ReParameterization (RP) Policy Gradient Methods (PGMs) have been widely adopted for continuous control tasks in robotics and computer graphics. However, recent studies have revealed that, when applied to long-term reinforcement learning problems, model-based RP PGMs may experience chaotic and non-smooth optimization landscapes with exploding gradient variance, which leads to slow convergence. This…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Published at NeurIPS 2023

  32. arXiv:2310.17087  [pdf, other]

    cs.LG math.DS math.OC stat.ML

    Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult

    Authors: Yuqing Wang, Zhenghao Xu, Tuo Zhao, Molei Tao

    Abstract: Large learning rates, when applied to gradient descent for nonconvex optimization, yield various implicit biases including the edge of stability (Cohen et al., 2021), balancing (Wang et al., 2022), and catapult (Lewkowycz et al., 2020). These phenomena cannot be well explained by classical optimization theory. Though significant theoretical progress has been made in understanding these implicit bi…

    Submitted 11 December, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

  33. arXiv:2310.16336  [pdf, other]

    cs.LG stat.ML

    SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

    Authors: Zichong Li, Yanbo Xu, Simiao Zuo, Haoming Jiang, Chao Zhang, Tuo Zhao, Hongyuan Zha

    Abstract: Transformer Hawkes process models have been shown to be successful in modeling event sequence data. However, most of the existing training methods rely on maximizing the likelihood of event sequences, which involves calculating some intractable integral. Moreover, the existing methods fail to provide uncertainty quantification for model predictions, e.g., confidence intervals for the predicted event's…

    Submitted 24 October, 2023; originally announced October 2023.

  34. arXiv:2310.16310  [pdf, other]

    cs.LG

    Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification

    Authors: Zichong Li, Qunzhi Xu, Zhenghao Xu, Yajun Mei, Tuo Zhao, Hongyuan Zha

    Abstract: Spatio-temporal point processes (STPPs) are potent mathematical tools for modeling and predicting events with both temporal and spatial features. Despite their versatility, most existing methods for learning STPPs either assume a restricted form of the spatio-temporal distribution, or suffer from inaccurate approximations of the intractable integral in the likelihood training objective. These issu…

    Submitted 24 October, 2023; originally announced October 2023.

  35. arXiv:2310.14517  [pdf, ps, other]

    math.AP

    Global well-posedness of the energy-critical stochastic Hartree nonlinear wave equation

    Authors: Guopeng Li, Liying Tao, Tengfei Zhao

    Abstract: We consider the Cauchy problem for the stochastic Hartree nonlinear wave equations (SHNLW) with a cubic convolution nonlinearity and an additive stochastic forcing on the Euclidean space. Our goal in this paper is two-fold. (i) We study the defocusing energy-critical SHNLW on $\mathbb{R}^d$, for $d \geq 5$, and prove that they are globally well-posed with deterministic initial data in the energy s…

    Submitted 28 February, 2024; v1 submitted 22 October, 2023; originally announced October 2023.

    Comments: We consider the stochastic case. 33 pages

    MSC Class: 35L71; 35R60; 60H15

  36. arXiv:2310.13473  [pdf, other]

    cs.CV

    Benchmarking Sequential Visual Input Reasoning and Prediction in Multimodal Large Language Models

    Authors: Mingwei Zhu, Leigang Sha, Yu Shu, Kangjia Zhao, Tiancheng Zhao, Jianwei Yin

    Abstract: Multimodal large language models (MLLMs) have shown great potential in perception and interpretation tasks, but their capabilities in predictive reasoning remain under-explored. To address this gap, we introduce a novel benchmark that assesses the predictive reasoning capabilities of MLLMs across diverse scenarios. Our benchmark targets three important domains: abstract pattern reasoning, human ac…

    Submitted 20 October, 2023; originally announced October 2023.

  37. arXiv:2310.12442  [pdf, other]

    cs.CL cs.LG

    Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer

    Authors: Qingru Zhang, Dhananjay Ram, Cole Hawkins, Sheng Zha, Tuo Zhao

    Abstract: Pretrained transformer models have demonstrated remarkable performance across various natural language processing tasks. These models leverage the attention mechanism to capture long- and short-range dependencies in the sequence. However, the (full) attention mechanism incurs high computational cost - quadratic in the sequence length, which is not affordable in tasks with long sequences, e.g., inp…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings)

  38. arXiv:2310.10810  [pdf, other]

    cs.LG

    Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms

    Authors: Alexander Bukharin, Yan Li, Yue Yu, Qingru Zhang, Zhehui Chen, Simiao Zuo, Chao Zhang, Songan Zhang, Tuo Zhao

    Abstract: Multi-Agent Reinforcement Learning (MARL) has shown promising results across several domains. Despite this promise, MARL policies often lack robustness and are therefore sensitive to small changes in their environment. This presents a serious concern for the real world deployment of MARL algorithms, where the testing environment may slightly differ from the training environment. In this work we sh…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 33 pages, 10 figures

  39. arXiv:2310.09521  [pdf]

    cond-mat.mtrl-sci physics.app-ph

    Effective electrical manipulation of topological antiferromagnet by orbital Hall effect

    Authors: Zhenyi Zheng, Tao Zeng, Tieyang Zhao, Shu Shi, Lizhu Ren, Tongtong Zhang, Lanxin Jia, Youdi Gu, Rui Xiao, Hengan Zhou, Qihan Zhang, Jiaqi Lu, Guilei Wang, Chao Zhao, Huihui Li, Beng Kang Tay, Jingsheng Chen

    Abstract: Electrical control of the non-trivial topology in Weyl antiferromagnets is of great interest for developing next-generation spintronic devices. Recent works suggest that the spin Hall effect can switch the topological antiferromagnetic order. However, the switching efficiency remains relatively low. Here, we demonstrate effective manipulation of antiferromagnetic order in Weyl semimetal Mn3Sn by orbital H…

    Submitted 14 October, 2023; originally announced October 2023.

    Comments: 13 pages, 4 figures

  40. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie, et al. (267 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 1 June, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  41. arXiv:2310.08659  [pdf, other]

    cs.CL cs.AI cs.LG

    LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

    Authors: Yixiao Li, Yifan Yu, Chen Liang, Pengcheng He, Nikos Karampatziakis, Weizhu Chen, Tuo Zhao

    Abstract: Quantization is an indispensable technique for serving Large Language Models (LLMs) and has recently found its way into LoRA fine-tuning. In this work we focus on the scenario where quantization and LoRA fine-tuning are applied together on a pre-trained model. In such cases it is common to observe a consistent gap in the performance on downstream tasks between full fine-tuning and quantization plu…

    Submitted 28 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

  42. arXiv:2310.08016  [pdf, other]

    cond-mat.quant-gas cond-mat.mes-hall

    What can we learn from the experiment of electrostatic conveyor belt for excitons?

    Authors: T. T. Zhao, Rui Li, C. S. Liu

    Abstract: Motivated by the experiment of electrostatic conveyor belt for indirect excitons [A. G. Winbow et al., Phys. Rev. Lett. 106, 196806 (2011)], we study the exciton patterns for understanding the exciton dynamics. By analyzing the exciton diffusion, we find that the patterns mainly come from the photoluminescence of two kinds of excitons. The patterns near the laser spot come from…

    Submitted 17 June, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: 12 pages, 9 figures

  43. arXiv:2310.07326  [pdf]

    econ.GN

    Empirical Analysis of the Impact of Legal Tender Digital Currency on Monetary Policy -Based on China's Data

    Authors: Ruimin Song, TIntian Zhao, Chunhui Zhou

    Abstract: This paper takes the development of China's Central bank digital currencies as a perspective, theoretically analyses the impact mechanism of the issuance and circulation of Central bank digital currencies on China's monetary policy and various variables of the money multiplier; at the same time, it selects the quarterly data from 2010 to 2022, and examines the impact of the Central bank digital cu…

    Submitted 11 October, 2023; originally announced October 2023.

  44. arXiv:2310.04612  [pdf, other]

    cs.LG cs.SI

    A Topological Perspective on Demystifying GNN-Based Link Prediction Performance

    Authors: Yu Wang, Tong Zhao, Yuying Zhao, Yunchao Liu, Xueqi Cheng, Neil Shah, Tyler Derr

    Abstract: Graph Neural Networks (GNNs) have shown great promise in learning node embeddings for link prediction (LP). While numerous studies aim to improve the overall LP performance of GNNs, none have explored its varying performance across different nodes and its underlying reasons. To this end, we aim to demystify which nodes will perform better from the perspective of their local topology. Despite the w…

    Submitted 6 October, 2023; originally announced October 2023.

  45. arXiv:2310.04550  [pdf, other]

    cs.CV cs.CL cs.LG

    Module-wise Adaptive Distillation for Multimodality Foundation Models

    Authors: Chen Liang, Jiahui Yu, Ming-Hsuan Yang, Matthew Brown, Yin Cui, Tuo Zhao, Boqing Gong, Tianyi Zhou

    Abstract: Pre-trained multimodal foundation models have demonstrated remarkable generalizability but pose challenges for deployment due to their large sizes. One effective approach to reducing their sizes is layerwise distillation, wherein small student models are trained to match the hidden representations of large teacher models at each layer. Motivated by our observation that certain architecture compone…

    Submitted 6 October, 2023; originally announced October 2023.

  46. arXiv:2310.02262  [pdf, other]

    cs.CV cs.GR cs.RO

    RSRD: A Road Surface Reconstruction Dataset and Benchmark for Safe and Comfortable Autonomous Driving

    Authors: Tong Zhao, Chenfeng Xu, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Yintao Wei

    Abstract: This paper addresses the growing demands for safety and comfort in intelligent robot systems, particularly autonomous vehicles, where road conditions play a pivotal role in overall driving performance. For example, reconstructing road surfaces helps to enhance the analysis and prediction of vehicle responses for motion planning and control systems. We introduce the Road Surface Reconstruction Data…

    Submitted 3 October, 2023; originally announced October 2023.

  47. arXiv:2310.00800  [pdf, other]

    cs.LG cs.AI

    GraphPatcher: Mitigating Degree Bias for Graph Neural Networks via Test-time Augmentation

    Authors: Mingxuan Ju, Tong Zhao, Wenhao Yu, Neil Shah, Yanfang Ye

    Abstract: Recent studies have shown that graph neural networks (GNNs) exhibit strong biases towards the node degree: they usually perform satisfactorily on high-degree nodes with rich neighbor information but struggle with low-degree nodes. Existing works tackle this problem by deriving either designated GNN architectures or training strategies specifically for low-degree nodes. Though effective, these appr…

    Submitted 1 October, 2023; originally announced October 2023.

    Comments: NeurIPS'23

  48. arXiv:2310.00793  [pdf, other]

    cs.SI cs.AI

    Revisiting Link Prediction: A Data Perspective

    Authors: Haitao Mao, Juanhui Li, Harry Shomer, Bingheng Li, Wenqi Fan, Yao Ma, Tong Zhao, Neil Shah, Jiliang Tang

    Abstract: Link prediction, a fundamental task on graphs, has proven indispensable in various applications, e.g., friend recommendation, protein analysis, and drug interaction prediction. However, since datasets span a multitude of domains, they could have distinct underlying mechanisms of link formation. Evidence in existing literature underscores the absence of a universally best algorithm suitable for all…

    Submitted 6 February, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: 36 pages, 12 figures

  49. Interpretable Imitation Learning with Dynamic Causal Relations

    Authors: Tianxiang Zhao, Wenchao Yu, Suhang Wang, Lu Wang, Xiang Zhang, Yuncong Chen, Yanchi Liu, Wei Cheng, Haifeng Chen

    Abstract: Imitation learning, which learns agent policy by mimicking expert demonstration, has shown promising results in many applications such as medical treatment regimes and self-driving vehicles. However, it remains a difficult task to interpret control policies learned by the agent. Difficulties mainly come from two aspects: 1) agents in imitation learning are usually implemented as deep neural networ…

    Submitted 30 January, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: Accepted by WSDM 2024 as an oral paper

  50. Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds

    Authors: Zhenghao Xu, Xiang Ji, Minshuo Chen, Mengdi Wang, Tuo Zhao

    Abstract: Policy gradient methods equipped with deep neural networks have achieved great success in solving high-dimensional reinforcement learning (RL) problems. However, current analyses cannot explain why they are resistant to the curse of dimensionality. In this work, we study the sample complexity of the neural policy mirror descent (NPMD) algorithm with deep convolutional neural networks (CNN). Motiva…

    Submitted 14 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.