Showing 1–50 of 56 results for author: Tian, B

  1. arXiv:2405.15286  [pdf, other]

    cs.CV

    3D Unsupervised Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving

    Authors: Boyi Sun, Yuhang Liu, Xingxia Wang, Bin Tian, Long Chen, Fei-Yue Wang

    Abstract: Point cloud data labeling is considered a time-consuming and expensive task in autonomous driving, whereas unsupervised learning can avoid it by learning point cloud representations from unannotated data. In this paper, we propose UOV, a novel 3D Unsupervised framework assisted by 2D Open-Vocabulary segmentation models. It consists of two stages: In the first stage, we innovatively integrate high-…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: 25 pages, 6 figures, codes are available at https://github.com/sbysbysbys/UOV

  2. arXiv:2404.13820  [pdf, other]

    cs.CC cs.NE

    Prove Symbolic Regression is NP-hard by Symbol Graph

    Authors: Jinglu Song, Qiang Lu, Bozhou Tian, Jingwen Zhang, Jake Luo, Zhiguang Wang

    Abstract: Symbolic regression (SR) is the task of discovering a symbolic expression that fits a given data set from the space of mathematical expressions. Despite the abundance of research surrounding the SR problem, there's a scarcity of works that confirm its NP-hard nature. Therefore, this paper introduces the concept of a symbol graph as a comprehensive representation of the entire mathematical expressi…

    Submitted 21 April, 2024; originally announced April 2024.

  3. arXiv:2404.00622  [pdf, other]

    cs.MA eess.SY

    OpenMines: A Light and Comprehensive Mining Simulation Environment for Truck Dispatching

    Authors: Shi Meng, Bin Tian, Xiaotong Zhang, Shuangying Qi, Caiji Zhang, Qiang Zhang

    Abstract: Mine fleet management algorithms can significantly reduce operational costs and enhance productivity in mining systems. Most current fleet management algorithms are evaluated based on self-implemented or proprietary simulation environments, posing challenges for replication and comparison. This paper models the simulation environment for mine fleet management from a complex systems perspective. Bu…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: Accepted at the 2024 35th IEEE Intelligent Vehicles Symposium (IV); 4 figures, 1 table

  4. arXiv:2402.17933  [pdf, other]

    cs.RO

    ICAT: An Indoor Connected and Autonomous Testbed for Vehicle Computing

    Authors: Zhaofeng Tian, William He, Boyang Tian, Ren Zhong, Erfan Foorginejad, Weisong Shi

    Abstract: Indoor autonomous driving testbeds have emerged to complement expensive outdoor testbeds and virtual simulations, offering scalable and cost-effective solutions for research in navigation, traffic optimization, and swarm intelligence. However, they often lack the robust sensing and computing infrastructure for advanced research. Addressing these limitations, we introduce the Indoor Connected Auton…

    Submitted 5 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  5. arXiv:2402.16343  [pdf, other]

    cs.AR

    Trimma: Trimming Metadata Storage and Latency for Hybrid Memory Systems

    Authors: Yiwei Li, Boyu Tian, Mingyu Gao

    Abstract: Hybrid main memory systems combine both performance and capacity advantages from heterogeneous memory technologies. With larger capacities, higher associativities, and finer granularities, hybrid memory systems currently exhibit significant metadata storage and lookup overheads for flexibly remapping data blocks between the two memory tiers. To alleviate the inefficiencies of existing designs, we…

    Submitted 26 February, 2024; originally announced February 2024.

  6. arXiv:2402.16123  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    InstructEdit: Instruction-based Knowledge Editing for Large Language Models

    Authors: Ningyu Zhang, Bozhong Tian, Siyuan Cheng, Xiaozhuan Liang, Yi Hu, Kouying Xue, Yanjie Gou, Xi Chen, Huajun Chen

    Abstract: Knowledge editing for large language models can offer an efficient solution to alter a model's behavior without negatively impacting the overall performance. However, the current approaches encounter issues with limited generalizability across tasks, necessitating one distinct editor for each task, significantly hindering the broader applications. To address this, we take the first step to analyze…

    Submitted 28 April, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

    Comments: IJCAI 2024; the project website is at https://www.zjukg.org/project/InstructEdit/

  7. arXiv:2402.14835  [pdf, other]

    cs.CL cs.AI cs.LG

    MIKE: A New Benchmark for Fine-grained Multimodal Entity Knowledge Editing

    Authors: Jiaqi Li, Miaozeng Du, Chuanyi Zhang, Yongrui Chen, Nan Hu, Guilin Qi, Haiyun Jiang, Siyuan Cheng, Bozhong Tian

    Abstract: Multimodal knowledge editing represents a critical advancement in enhancing the capabilities of Multimodal Large Language Models (MLLMs). Despite its potential, current benchmarks predominantly focus on coarse-grained knowledge, leaving the intricacies of fine-grained (FG) multimodal entity knowledge largely unexplored. This gap presents a notable challenge, as FG entity recognition is pivotal for…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 8 pages

  8. arXiv:2402.11458  [pdf, other]

    cs.CV

    Key Patch Proposer: Key Patches Contain Rich Information

    Authors: Jing Xu, Beiwen Tian, Hao Zhao

    Abstract: In this paper, we introduce a novel algorithm named Key Patch Proposer (KPP) designed to select key patches in an image without additional training. Our experiments showcase KPP's robust capacity to capture semantic information by both reconstruction and classification tasks. The efficacy of KPP suggests its potential application in active learning for semantic segmentation. Our source code is pub…

    Submitted 17 February, 2024; originally announced February 2024.

    Comments: Accepted by ICLR 2024 Tiny Papers (notable)

  9. arXiv:2402.05869  [pdf, other]

    cs.CV

    Adaptive Surface Normal Constraint for Geometric Estimation from Monocular Images

    Authors: Xiaoxiao Long, Yuhang Zheng, Yupeng Zheng, Beiwen Tian, Cheng Lin, Lingjie Liu, Hao Zhao, Guyue Zhou, Wenping Wang

    Abstract: We introduce a novel approach to learn geometries such as depth and surface normal from images while incorporating geometric context. The difficulty of reliably capturing geometric context in existing methods impedes their ability to accurately enforce the consistency between the different geometric properties, thereby leading to a bottleneck of geometric estimation quality. We therefore propose t…

    Submitted 31 March, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted by TPAMI. arXiv admin note: substantial text overlap with arXiv:2103.15483

  10. arXiv:2401.04942  [pdf, other]

    cs.CV

    Latency-aware Road Anomaly Segmentation in Videos: A Photorealistic Dataset and New Metrics

    Authors: Beiwen Tian, Huan-ang Gao, Leiyao Cui, Yupeng Zheng, Lan Luo, Baofeng Wang, Rong Zhi, Guyue Zhou, Hao Zhao

    Abstract: In the past several years, road anomaly segmentation has been actively explored in academia and has drawn growing attention in industry. The rationale behind this is straightforward: if the autonomous car can brake before hitting an anomalous object, safety is promoted. However, this rationale naturally calls for a temporally informed setting while existing methods and benchmarks are designed in an unr…

    Submitted 10 January, 2024; originally announced January 2024.

  11. arXiv:2401.01286  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    A Comprehensive Study of Knowledge Editing for Large Language Models

    Authors: Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen

    Abstract: Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in the significant computational demands during training, arising from their extensive parameterization. This challenge is further intensified by the dynamic nature of the world, necessitating frequent updates to LLMs t…

    Submitted 28 March, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

    Comments: Ongoing work; 52 pages, 282 citations; benchmark is available at https://huggingface.co/datasets/zjunlp/KnowEdit; code is available at https://github.com/zjunlp/EasyEdit; paper list is available at https://github.com/zjunlp/KnowledgeEditingPapers

  12. arXiv:2311.01010  [pdf, other]

    cs.LG cs.CV

    Fast Shapley Value Estimation: A Unified Approach

    Authors: Borui Zhang, Baotong Tian, Wenzhao Zheng, Jie Zhou, Jiwen Lu

    Abstract: Shapley values have emerged as a widely accepted and trustworthy tool, grounded in theoretical axioms, for addressing challenges posed by black-box models like deep neural networks. However, computing Shapley values encounters exponential complexity as the number of features increases. Various approaches, including ApproSemivalue, KernelSHAP, and FastSHAP, have been explored to expedite the comput…

    Submitted 23 May, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  13. arXiv:2310.12035  [pdf]

    cs.HC q-bio.NC

    Tracking dynamic flow: Decoding flow fluctuations through performance in a fine motor control task

    Authors: Bohao Tian, Shijun Zhang, Sirui Chen, Yuru Zhang, Kaiping Peng, Hongxing Zhang, Dangxiao Wang

    Abstract: Flow, an optimal mental state merging action and awareness, significantly impacts our emotion, performance, and well-being. However, capturing its swift fluctuations on a fine timescale is challenging due to the sparsity of the existing flow detecting tools. Here we present a fine fingertip force control (F3C) task to induce flow, wherein the task challenge is set at a compatible level with person…

    Submitted 28 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  14. arXiv:2310.08475  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.MM

    Can We Edit Multimodal Large Language Models?

    Authors: Siyuan Cheng, Bozhong Tian, Qingbin Liu, Xi Chen, Yongheng Wang, Huajun Chen, Ningyu Zhang

    Abstract: In this paper, we focus on editing Multimodal Large Language Models (MLLMs). Compared to editing single-modal LLMs, multimodal model editing is more challenging, which demands a higher level of scrutiny and careful consideration in the editing process. To facilitate research in this area, we construct a new benchmark, dubbed MMEdit, for editing multimodal LLMs and establishing a suite of innovativ…

    Submitted 18 April, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023. Add the Exact Match/Accuracy results of Reliability and T-Generality

  15. arXiv:2308.07269  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR cs.LG

    EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models

    Authors: Peng Wang, Ningyu Zhang, Bozhong Tian, Zekun Xi, Yunzhi Yao, Ziwen Xu, Mengru Wang, Shengyu Mao, Xiaohan Wang, Siyuan Cheng, Kangwei Liu, Yuansheng Ni, Guozhou Zheng, Huajun Chen

    Abstract: Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy issues, which means they are unaware of unseen events or generate text with incorrect facts owing to outdated/noisy data. To this end, many knowledge editing approaches for LLMs have emerged -- aiming to subtly inject/edit updated knowledge or adjust undesired behavior while minimizing the impact on unrelated inputs. Neve…

    Submitted 23 June, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: ACL 2024 System Demonstrations; Code: https://github.com/zjunlp/EasyEdit; HF Demo: https://huggingface.co/spaces/zjunlp/EasyEdit; Video: https://youtu.be/Gm6T0QaaskU; Docs: https://zjunlp.gitbook.io/easyedit

  16. arXiv:2308.05756  [pdf, other]

    eess.SP cs.LG

    WeldMon: A Cost-effective Ultrasonic Welding Machine Condition Monitoring System

    Authors: Beitong Tian, Kuan-Chieh Lu, Ahmadreza Eslaminia, Yaohui Wang, Chenhui Shao, Klara Nahrstedt

    Abstract: Ultrasonic welding machines play a critical role in the lithium battery industry, facilitating the bonding of batteries with conductors. Ensuring high-quality welding is vital, making tool condition monitoring systems essential for early-stage quality control. However, existing monitoring methods face challenges in cost, downtime, and adaptability. In this paper, we present WeldMon, an affordable…

    Submitted 4 August, 2023; originally announced August 2023.

    Comments: 9 pages, 5 figures

  17. arXiv:2306.15129  [pdf, other]

    cs.NI

    DeepStream: Bandwidth Efficient Multi-Camera Video Streaming for Deep Learning Analytics

    Authors: Hongpeng Guo, Beitong Tian, Zhe Yang, Bo Chen, Qian Zhou, Shengzhong Liu, Klara Nahrstedt, Claudiu Danilov

    Abstract: Deep learning video analytic systems process live video feeds from multiple cameras with computer vision models deployed on edge or cloud. To optimize utility for these systems, which usually corresponds to query accuracy, efficient bandwidth management for the cameras competing for the fluctuating network resources is crucial. We propose DeepStream, a bandwidth efficient multi-camera video stream…

    Submitted 26 June, 2023; originally announced June 2023.

  18. arXiv:2305.13172  [pdf, other]

    cs.CL cs.AI cs.CV cs.IR cs.LG

    Editing Large Language Models: Problems, Methods, and Opportunities

    Authors: Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang

    Abstract: Despite the ability to train capable LLMs, the methodology for maintaining their relevancy and rectifying errors remains elusive. To this end, the past few years have witnessed a surge in techniques for editing LLMs, the objective of which is to efficiently alter the behavior of LLMs within a specific domain without negatively impacting performance across other inputs. This paper embarks on a deep…

    Submitted 30 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023. Updated with new experiments

  19. arXiv:2304.13031  [pdf, other]

    cs.CV

    DQS3D: Densely-matched Quantization-aware Semi-supervised 3D Detection

    Authors: Huan-ang Gao, Beiwen Tian, Pengfei Li, Hao Zhao, Guyue Zhou

    Abstract: In this paper, we study the problem of semi-supervised 3D object detection, which is of great importance considering the high annotation cost for cluttered 3D indoor scenes. We resort to the robust and principled framework of self-teaching, which has triggered notable progress for semi-supervised learning recently. While this paradigm is natural for image-level or pixel-level prediction, adapting it…

    Submitted 11 August, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted to ICCV 2023. Code: https://github.com/AIR-DISCOVER/DQS3D

  20. arXiv:2304.09058  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    Revisiting k-NN for Fine-tuning Pre-trained Language Models

    Authors: Lei Li, Jing Chen, Bozhong Tian, Ningyu Zhang

    Abstract: Pre-trained Language Models (PLMs), as parametric-based eager learners, have become the de-facto choice for current paradigms of Natural Language Processing (NLP). In contrast, k-Nearest-Neighbor (kNN) classifiers, as the lazy learning paradigm, tend to mitigate over-fitting and isolated noise. In this paper, we revisit kNN classifiers for augmenting the PLMs-based classifiers. From the methodolog…

    Submitted 17 June, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: CCL 2023

  21. arXiv:2304.08491  [pdf, other]

    cs.CV

    Delving into Shape-aware Zero-shot Semantic Segmentation

    Authors: Xinyu Liu, Beiwen Tian, Zhen Wang, Rui Wang, Kehua Sheng, Bo Zhang, Hao Zhao, Guyue Zhou

    Abstract: Thanks to the impressive progress of large-scale vision-language pretraining, recent recognition models can classify arbitrary objects in a zero-shot and open-set manner, with a surprisingly high accuracy. However, translating this success to semantic segmentation is not trivial, because this dense prediction task requires not only accurate semantic understanding but also fine shape delineation an…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023, code: https://github.com/Liuxinyv/SAZS

  22. Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space

    Authors: Yahui Liu, Bin Tian, Yisheng Lv, Lingxi Li, Feiyue Wang

    Abstract: Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention, but ignore their content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called Poi…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: This paper is accepted to IEEE/CAA Journal of Automatica Sinica (JAS)

  23. arXiv:2302.13699  [pdf, other]

    cs.CV

    MPS-AMS: Masked Patches Selection and Adaptive Masking Strategy Based Self-Supervised Medical Image Segmentation

    Authors: Xiangtao Wang, Ruizhi Wang, Biao Tian, Jiaojiao Zhang, Shuo Zhang, Junyang Chen, Thomas Lukasiewicz, Zhenghua Xu

    Abstract: Existing self-supervised learning methods based on contrastive learning and masked image modeling have demonstrated impressive performances. However, current masked image modeling methods are mainly utilized in natural images, and their applications in medical images are relatively lacking. Besides, their fixed high masking strategy limits the upper bound of conditional mutual information, and the…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 6 pages, 3 figures, received by ICASSP 2023

  24. arXiv:2302.04469  [pdf, other]

    cs.SD eess.AS

    Joint Acoustic Echo Cancellation and Speech Dereverberation Using Kalman filters

    Authors: Ziteng Wang, Yueyue Na, Biao Tian, Qiang Fu

    Abstract: This paper proposes a joint acoustic echo cancellation (AEC) and speech dereverberation (DR) algorithm in the short-time Fourier transform domain. The reverberant microphone signals are described using an auto-regressive (AR) model. The AR coefficients and the loudspeaker-to-microphone acoustic transfer functions (ATFs) are considered time-varying and are modeled simultaneously using a first-order…

    Submitted 9 February, 2023; originally announced February 2023.

  25. arXiv:2302.01722  [pdf, other]

    cs.LG

    Leveraging Contaminated Datasets to Learn Clean-Data Distribution with Purified Generative Adversarial Networks

    Authors: Bowen Tian, Qinliang Su, Jianxing Yu

    Abstract: Generative adversarial networks (GANs) are known for their strong ability to capture the underlying distribution of training instances. Since the seminal work of GAN, many variants of GAN have been proposed. However, almost all existing GANs are established on the assumption that the training dataset is clean. But in many real-world applications, this may not hold, that is, the training dataset ma…

    Submitted 3 February, 2023; originally announced February 2023.

  26. arXiv:2301.13865  [pdf, other]

    cs.CV

    From Semi-supervised to Omni-supervised Room Layout Estimation Using Point Clouds

    Authors: Huan-ang Gao, Beiwen Tian, Pengfei Li, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Yurong Chen, Hongbin Zha

    Abstract: Room layout estimation is a long-existing robotic vision task that benefits both environment sensing and motion planning. However, layout estimation using point clouds (PCs) still suffers from data scarcity due to annotation difficulty. As such, we address the semi-supervised setting of this task based upon the idea of model exponential moving averaging. But adapting this scheme to the state-of-th…

    Submitted 31 January, 2023; originally announced January 2023.

    Comments: Accepted to ICRA2023. Code: https://github.com/AIR-DISCOVER/Omni-PQ

  27. arXiv:2301.10405  [pdf, other]

    cs.CL cs.AI cs.DB cs.IR cs.LG

    Editing Language Model-based Knowledge Graph Embeddings

    Authors: Siyuan Cheng, Ningyu Zhang, Bozhong Tian, Xi Chen, Qingbing Liu, Huajun Chen

    Abstract: Recent decades have witnessed the empirical success of framing Knowledge Graph (KG) embeddings via language models. However, language model-based KG embeddings are usually deployed as static artifacts, making them difficult to modify post-deployment without re-training. To address this issue, we propose a new task of editing language model-based KG embeddings in this paper. This…

    Submitted 19 December, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

    Comments: AAAI 2024. The project website is https://zjunlp.github.io/project/KGE_Editing/

  28. arXiv:2210.11472  [pdf, other]

    cs.CV cs.AI

    VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling

    Authors: Beiwen Tian, Liyi Luo, Hao Zhao, Guyue Zhou

    Abstract: Recently, 3D scene parsing with deep learning approaches has been a hot topic. However, current methods with fully-supervised models require manually annotated point-wise supervision which is extremely user-unfriendly and time-consuming to obtain. As such, training 3D scene parsing models with sparse supervision is an intriguing alternative. We term this task as data-efficient 3D scene parsin…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted to ISPRS Journal of Photogrammetry and Remote Sensing, Code: https://github.com/AIR-DISCOVER/VIBUS

  29. arXiv:2210.10775  [pdf, other]

    cs.CV cs.AI

    TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation

    Authors: Pengfei Li, Beiwen Tian, Yongliang Shi, Xiaoxue Chen, Hao Zhao, Guyue Zhou, Ya-Qin Zhang

    Abstract: Current referring expression comprehension algorithms can effectively detect or segment objects indicated by nouns, but how to understand verb reference is still under-explored. As such, we study the challenging problem of task oriented detection, which aims to find objects that best afford an action indicated by verbs like sit comfortably on. Towards a finer localization that better serves downst…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted by NeurIPS 2022. Codes are available at https://github.com/AIR-DISCOVER/TOIST

  30. arXiv:2209.05708  [pdf, other]

    cs.RO

    InTEn-LOAM: Intensity and Temporal Enhanced LiDAR Odometry and Mapping

    Authors: Shuaixin Li, Bin Tian, Zhu Xiaozhou, Gui Jianjun, Yao Wen, Guangyun Li

    Abstract: Traditional LiDAR odometry (LO) systems mainly leverage geometric information obtained from the traversed surroundings to register laser scans and estimate LiDAR ego-motion, while it may be unreliable in dynamic or unstructured environments. This paper proposes InTEn-LOAM, a low-drift and robust LiDAR odometry and mapping method that fully exploits implicit information of laser sweeps (i.e., geome…

    Submitted 12 September, 2022; originally announced September 2022.

  31. arXiv:2208.07870  [pdf, other]

    cs.CV cs.AI

    Language-guided Semantic Style Transfer of 3D Indoor Scenes

    Authors: Bu Jin, Beiwen Tian, Hao Zhao, Guyue Zhou

    Abstract: We address the new problem of language-guided semantic style transfer of 3D indoor scenes. The input is a 3D indoor scene mesh and several phrases that describe the target scene. Firstly, 3D vertex coordinates are mapped to RGB residues by a multi-layer perceptron. Secondly, colored 3D meshes are differentiably rendered into 2D images, via a viewpoint sampling strategy tailored for indoor scenes.…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: Accepted to ACM Multimedia PIES-ME 2022. Code: https://github.com/AIR-DISCOVER/LASST

  32. Gauge-equivariant flow models for sampling in lattice field theories with pseudofermions

    Authors: Ryan Abbott, Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Betsy Tian, Julian M. Urban

    Abstract: This work presents gauge-equivariant architectures for flow-based sampling in fermionic lattice field theories using pseudofermions as stochastic estimators for the fermionic determinant. This is the default approach in state-of-the-art lattice field theory calculations, making this development critical to the practical application of flow models to theories such as QCD. Methods by which flow-base…

    Submitted 16 October, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: 15 pages, 7 figures. v3: accepted version for publication. New appendix C

    Report number: MIT-CTP/5446, INT-PUB-22-017

    Journal ref: Phys.Rev.D 106 (2022) 7, 074506

  33. arXiv:2204.13335  [pdf, other]

    cs.LG

    Anomaly Detection by Leveraging Incomplete Anomalous Knowledge with Anomaly-Aware Bidirectional GANs

    Authors: Bowen Tian, Qinliang Su, Jian Yin

    Abstract: The goal of anomaly detection is to identify anomalous samples from normal ones. In this paper, a small number of anomalies are assumed to be available at the training stage, but they are assumed to be collected only from several anomaly types, leaving the majority of anomaly types not represented in the collected anomaly dataset at all. To effectively leverage this kind of incomplete anomalous kn…

    Submitted 1 May, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

  34. arXiv:2204.07937  [pdf, other]

    cs.CL cs.AI cs.LG

    Unsupervised Cross-Task Generalization via Retrieval Augmentation

    Authors: Bill Yuchen Lin, Kangmin Tan, Chris Miller, Beiwen Tian, Xiang Ren

    Abstract: Humans can perform unseen tasks by recalling relevant skills acquired previously and then generalizing them to the target tasks, even if there is no supervision at all. In this paper, we aim to improve this kind of cross-task generalization ability of massive multi-task language models, such as T0 and FLAN, in an unsupervised setting. We propose a retrieval-augmentation method named ReCross that t…

    Submitted 17 October, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

    Comments: Accepted to NeurIPS 2022. Website: https://inklab.usc.edu/ReCross/

  35. arXiv:2204.05445  [pdf, other]

    cs.SD eess.AS

    Small Footprint Multi-channel ConvMixer for Keyword Spotting with Centroid Based Awareness

    Authors: Dianwen Ng, Jin Hui Pang, Yang Xiao, Biao Tian, Qiang Fu, Eng Siong Chng

    Abstract: It is critical for a keyword spotting model to have a small footprint as it typically runs on-device with low computational resources. However, maintaining the previous SOTA performance with reduced model size is challenging. In addition, a far-field and noisy environment with multiple signals interference aggravates the problem causing the accuracy to degrade significantly. In this paper, we pres…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: submitted to INTERSPEECH 2022

  36. Multi-Task Deep Residual Echo Suppression with Echo-aware Loss

    Authors: Shimin Zhang, Ziteng Wang, Jiayao Sun, Yihui Fu, Biao Tian, Qiang Fu, Lei Xie

    Abstract: This paper introduces the NWPU Team's entry to the ICASSP 2022 AEC Challenge. We take a hybrid approach that cascades a linear AEC with a neural post-filter. The former is used to deal with the linear echo components while the latter suppresses the residual non-linear echo components. We use gated convolutional F-T-LSTM neural network (GFTNN) as the backbone and shape the post-filter by a multi-ta…

    Submitted 20 February, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: ICASSP 2022

  37. ConvMixer: Feature Interactive Convolution with Curriculum Learning for Small Footprint and Noisy Far-field Keyword Spotting

    Authors: Dianwen Ng, Yunqi Chen, Biao Tian, Qiang Fu, Eng Siong Chng

    Abstract: Building efficient architecture in neural speech processing is paramount to success in keyword spotting deployment. However, it is very challenging for lightweight models to achieve noise robustness with concise neural operations. In a real-world application, the user environment is typically noisy and may also contain reverberations. We proposed a novel feature interactive convolutional model wit…

    Submitted 15 January, 2022; originally announced January 2022.

    Comments: submitted to ICASSP 2022

  38. arXiv:2110.08439  [pdf, other]

    cs.SD eess.AS

    Controllable Multichannel Speech Dereverberation based on Deep Neural Networks

    Authors: Ziteng Wang, Yueyue Na, Biao Tian, Qiang Fu

    Abstract: Neural network based speech dereverberation has achieved promising results in recent studies. Nevertheless, many are focused on recovery of only the direct path sound and early reflections, which could be beneficial to speech perception, are discarded. The performance of a model trained to recover clean speech degrades when evaluated on early reverberation targets, and vice versa. This paper propo…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: submitted to ICASSP2022

  39. arXiv:2110.08437  [pdf, other]

    cs.SD eess.AS

    NN3A: Neural Network supported Acoustic Echo Cancellation, Noise Suppression and Automatic Gain Control for Real-Time Communications

    Authors: Ziteng Wang, Yueyue Na, Biao Tian, Qiang Fu

    Abstract: Acoustic echo cancellation (AEC), noise suppression (NS) and automatic gain control (AGC) are three often required modules for real-time communications (RTC). This paper proposes a neural network supported algorithm for RTC, namely NN3A, which incorporates an adaptive filter and a multi-task model for residual echo suppression, noise reduction and near-end speech activity detection. The proposed a…

    Submitted 15 October, 2021; originally announced October 2021.

    Comments: submitted to ICASSP2022

  40. arXiv:2110.01663  [pdf, ps, other]

    cs.LG math.OC

    Global Convergence and Stability of Stochastic Gradient Descent

    Authors: Vivak Patel, Shushu Zhang, Bowen Tian

    Abstract: In machine learning, stochastic gradient descent (SGD) is widely deployed to train models using highly non-convex objectives with equally complex noise models. Unfortunately, SGD theory often makes restrictive assumptions that fail to capture the non-convexity of real problems, and almost entirely ignore the complex noise models that exist in practice. In this work, we make substantial progress on…

    Submitted 10 October, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

    MSC Class: 65K05; 68Q25; 90C06; 90C30; 68T05

  41. arXiv:2109.13765  [pdf]

    cs.SI cs.CY

    Exploring the spatiotemporal heterogeneity in the relationship between human mobility and COVID-19 prevalence using dynamic time warping

    Authors: Hoeyun Kwon, Kaitlyn Hom, Mark Rifkin, Beichen Tian, Caglar Koylu

    Abstract: Understanding where and when human mobility is associated with disease infection is crucial for implementing location-based health care policy and interventions. Previous studies on COVID-19 have revealed the correlation between human mobility and COVID-19 cases. However, the spatiotemporal heterogeneity of such correlation is not yet fully understood. In this study, we aim to identify the spatiot…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: GIScience 2021 Workshop on Advancing Movement Data Science (AMD21)

  42. arXiv:2109.08553  [pdf, other]

    cs.CV

    Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck

    Authors: Liyi Luo, Beiwen Tian, Hao Zhao, Guyue Zhou

    Abstract: Semantic understanding of 3D point clouds is important for various robotics applications. Given that point-wise semantic annotation is expensive, in this paper, we address the challenge of learning models with extremely sparse labels. The core problem is how to leverage numerous unlabeled points. To this end, we propose a self-supervised 3D representation learning framework named viewpoint bottlen…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: Code: https://github.com/OPEN-AIR-SUN/Viewpoint-Bottleneck

  43. arXiv:2105.05558  [pdf, other]

    eess.IV cs.CV

    AVA: Adversarial Vignetting Attack against Visual Recognition

    Authors: Binyu Tian, Felix Juefei-Xu, Qing Guo, Xiaofei Xie, Xiaohong Li, Yang Liu

    Abstract: Vignetting is an inherited imaging phenomenon within almost all optical systems, showing as a radial intensity darkening toward the corners of an image. Since it is a common effect for photography and usually appears as a slight intensity variation, people usually regard it as a part of a photo and would not even want to post-process it. Due to this natural advantage, in this work, we study vignet…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: This work has been accepted to IJCAI2021

  44. arXiv:2104.04325  [pdf, other]

    cs.SD eess.AS

    Joint Online Multichannel Acoustic Echo Cancellation, Speech Dereverberation and Source Separation

    Authors: Yueyue Na, Ziteng Wang, Zhang Liu, Biao Tian, Qiang Fu

    Abstract: This paper presents a joint source separation algorithm that simultaneously reduces acoustic echo, reverberation and interfering sources. Target speeches are separated from the mixture by maximizing independence with respect to the other sources. It is shown that the separation process can be decomposed into cascading sub-processes that separately relate to acoustic echo cancellation, speech derev…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: submitted to INTERSPEECH 2021

  45. arXiv:2102.08551  [pdf, other]

    cs.SD eess.AS

    Weighted Recursive Least Square Filter and Neural Network based Residual Echo Suppression for the AEC-Challenge

    Authors: Ziteng Wang, Yueyue Na, Zhang Liu, Biao Tian, Qiang Fu

    Abstract: This paper presents a real-time Acoustic Echo Cancellation (AEC) algorithm submitted to the AEC-Challenge. The algorithm consists of three modules: Generalized Cross-Correlation with PHAse Transform (GCC-PHAT) based time delay compensation, weighted Recursive Least Square (wRLS) based linear adaptive filtering and neural network based residual echo suppression. The wRLS filter is derived from a no…

    Submitted 18 February, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: 5 pages, 2 figures, accepted by ICASSP 2021

  46. arXiv:2012.01353  [pdf, other]

    cs.AR

    Eudoxus: Characterizing and Accelerating Localization in Autonomous Machines

    Authors: Yiming Gan, Bo Yu, Boyuan Tian, Leimeng Xu, Wei Hu, Shaoshan Liu, Qiang Liu, Yanjun Zhang, Jie Tang, Yuhao Zhu

    Abstract: We develop and commercialize autonomous machines, such as logistic robots and self-driving cars, around the globe. A critical challenge to our -- and any -- autonomous machine is accurate and efficient localization under resource constraints, which has fueled specialized localization accelerators recently. Prior acceleration efforts are point solutions in that they each specialize for a specific l…

    Submitted 29 April, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

  47. arXiv:2011.00652  [pdf, other]

    cs.CV

    Multi-View Adaptive Fusion Network for 3D Object Detection

    Authors: Guojun Wang, Bin Tian, Yachen Zhang, Long Chen, Dongpu Cao, Jian Wu

    Abstract: 3D object detection based on LiDAR-camera fusion is becoming an emerging research theme for autonomous driving. However, it has been surprisingly difficult to effectively fuse both modalities without information loss and interference. To solve this issue, we propose a single-stage multi-view fusion framework that takes LiDAR bird's-eye view, LiDAR range view and camera view images as inputs for 3D…

    Submitted 7 December, 2020; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: 11 pages, 9 figures. We change the CV backbone and the details of network to improve performance. Submitted to IEEE Transactions on Intelligent Transportation Systems

  48. arXiv:2009.09247  [pdf, other]

    eess.IV cs.CV cs.LG

    Bias Field Poses a Threat to DNN-based X-Ray Recognition

    Authors: Binyu Tian, Qing Guo, Felix Juefei-Xu, Wen Le Chan, Yupeng Cheng, Xiaohong Li, Xiaofei Xie, Shengchao Qin

    Abstract: The chest X-ray plays a key role in screening and diagnosis of many lung diseases including the COVID-19. More recently, many works construct deep neural networks (DNNs) for chest X-ray images to realize automated and efficient diagnosis of lung diseases. However, bias field caused by the improper medical image acquisition process widely exists in the chest X-ray images while the robustness of DNN…

    Submitted 3 May, 2021; v1 submitted 19 September, 2020; originally announced September 2020.

    Comments: 6 pages, 5 figures; This work has been accepted to ICME 2021 as the oral presentation

  49. arXiv:2008.06967  [pdf, other]

    cs.CV cs.AR

    Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation

    Authors: Yu Feng, Boyuan Tian, Tiancheng Xu, Paul Whatmough, Yuhao Zhu

    Abstract: Point cloud analytics is poised to become a key workload on battery-powered embedded and mobile platforms in a wide range of emerging application domains, such as autonomous driving, robotics, and augmented reality, where efficiency is paramount. This paper proposes Mesorasi, an algorithm-architecture co-designed system that simultaneously improves the performance and energy efficiency of point cl…

    Submitted 16 August, 2020; originally announced August 2020.

    Journal ref: Proceedings of the 53rd (2020) Annual IEEE/ACM International Symposium on Microarchitecture

  50. arXiv:2007.10786  [pdf]

    cs.LG cs.AI eess.SP

    Comparison of Different Methods for Time Sequence Prediction in Autonomous Vehicles

    Authors: Teng Liu, Bin Tian, Yunfeng Ai, Long Chen, Fei Liu, Dongpu Cao

    Abstract: As a combination of various kinds of technologies, autonomous vehicles could complete a series of driving tasks by itself, such as perception, decision-making, planning, and control. Since there is no human driver to handle the emergency situation, future transportation information is significant for automated vehicles. This paper proposes different methods to forecast the time series for autonomo…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: 6 pages, 11 figures