Showing 1–50 of 144 results for author: Tang, T

Searching in archive cs.
  1. arXiv:2406.13123  [pdf, other]

    cs.AI cs.CV

    ViLCo-Bench: VIdeo Language COntinual learning Benchmark

    Authors: Tianqi Tang, Shohreh Deldari, Hao Xue, Celso De Melo, Flora D. Salim

    Abstract: Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 14 pages, 4 figures, 8 tables, under review

  2. arXiv:2406.04702  [pdf, other]

    cs.LG

    Marking the Pace: A Blockchain-Enhanced Privacy-Traceable Strategy for Federated Recommender Systems

    Authors: Zhen Cai, Tao Tang, Shuo Yu, Yunpeng Xiao, Feng Xia

    Abstract: Federated recommender systems have been crucially enhanced through data sharing and continuous model updates, attributed to the pervasive connectivity and distributed computing capabilities of Internet of Things (IoT) devices. Given the sensitivity of IoT data, transparent data processing in data sharing and model updates is paramount. However, existing methods fall short in tracing the flow of sh…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2406.02147  [pdf, other]

    cs.CV

    UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking

    Authors: Lijun Zhou, Tao Tang, Pengkun Hao, Zihang He, Kalok Ho, Shuo Gu, Wenbo Hou, Zhihui Hao, Haiyang Sun, Kun Zhan, Peng Jia, Xianpeng Lang, Xiaodan Liang

    Abstract: 3D multiple object tracking (MOT) plays a crucial role in autonomous driving perception. Recent end-to-end query-based trackers simultaneously detect and track objects, which have shown promising potential for the 3D MOT task. However, existing methods overlook the uncertainty issue, which refers to the lack of precise confidence about the state and location of tracked objects. Uncertainty arises…

    Submitted 4 June, 2024; originally announced June 2024.

  4. arXiv:2406.01349  [pdf, other]

    cs.CV

    Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

    Authors: Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

    Abstract: Using generative models to synthesize new data has become a de-facto standard in autonomous driving to address the data scarcity issue. Though existing approaches are able to boost perception models, we discover that these approaches fail to improve the performance of planning of end-to-end autonomous driving models as the generated videos are usually less than 8 frames and the spatial and tempora…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: Project Page: https://westlake-autolab.github.io/delphi.github.io/, 8 figures

  5. arXiv:2405.09543  [pdf, other]

    cs.CY cs.AI cs.IR cs.LG

    Algorithmic Fairness: A Tolerance Perspective

    Authors: Renqiang Luo, Tao Tang, Feng Xia, Jiaying Liu, Chengpei Xu, Leo Yu Zhang, Wei Xiang, Chengqi Zhang

    Abstract: Recent advancements in machine learning and deep learning have brought algorithmic fairness into sharp focus, illuminating concerns over discriminatory decision making that negatively impacts certain individuals or groups. These concerns have manifested in legal, ethical, and societal challenges, including the erosion of trust in intelligent systems. In response, this survey delves into the existi…

    Submitted 26 April, 2024; originally announced May 2024.

    Comments: 33 pages, 4 figures

    MSC Class: 68T01; 68W40 ACM Class: I.2.6; K.4.2; H.1.2

  6. arXiv:2404.11502  [pdf, other]

    cs.CL cs.AI

    Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models

    Authors: Yushuo Chen, Tianyi Tang, Erge Xiang, Linjiang Li, Wayne Xin Zhao, Jing Wang, Yunpeng Chai, Ji-Rong Wen

    Abstract: In real world, large language models (LLMs) can serve as the assistant to help users accomplish their jobs, and also support the development of advanced applications. For the wide application of LLMs, the inference efficiency is an essential concern, which has been widely studied in existing work, and numerous optimization algorithms and code libraries have been proposed to improve it. Nonetheless…

    Submitted 17 April, 2024; originally announced April 2024.

  7. arXiv:2404.10227  [pdf, other]

    cs.CV cs.RO

    MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

    Authors: Pengfei Xie, Wenqiang Xu, Tutian Tang, Zhenjun Yu, Cewu Lu

    Abstract: This work proposes a novel learning framework for visual hand dynamics analysis that takes into account the physiological aspects of hand motion. The existing models, which are simplified joint-actuated systems, often produce unnatural motions. To address this, we integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create a new model, MS-MANO. This model emulates th…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 11 pages, 5 figures; CVPR 2024

  8. arXiv:2404.08559  [pdf, other]

    cs.CL

    MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking

    Authors: Tianwen Tang, Tong Zhu, Haodong Liu, Yin Bai, Jia Cheng, Wenliang Chen

    Abstract: Zero-shot dialogue state tracking (DST) transfers knowledge to unseen domains, reducing the cost of annotating new datasets. Previous zero-shot DST models mainly suffer from domain transferring and partial prediction problems. To address these challenges, we propose Mixture of Prefix Experts (MoPE) to establish connections between similar slots in different domains, which strengthens the model tra…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Accepted to LREC-COLING 2024

  9. arXiv:2404.05657  [pdf, other]

    cs.CV

    MLP Can Be A Good Transformer Learner

    Authors: Sihao Lin, Pumeng Lyu, Dongrui Liu, Tao Tang, Xiaodan Liang, Andy Song, Xiaojun Chang

    Abstract: Self-attention mechanism is the key of the Transformer but often criticized for its computation demands. Previous token pruning works motivate their methods from the view of computation redundancy but still need to load the full network and require same memory costs. This paper introduces a novel strategy that simplifies vision transformers and reduces computational load through the selective remo…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: efficient transformer

  10. arXiv:2403.19446  [pdf, other]

    cs.LO

    EDA-Driven Preprocessing for SAT Solving

    Authors: Zhengyuan Shi, Tiebing Tang, Sadaf Khan, Hui-Ling Zhen, Mingxuan Yuan, Zhufei Chu, Qiang Xu

    Abstract: Effective formulation of problems into Conjunctive Normal Form (CNF) is critical in modern Boolean Satisfiability (SAT) solving for optimizing solver performance. Addressing the limitations of existing methods, our Electronic Design Automation (EDA)-driven preprocessing framework introduces a novel methodology for preparing SAT instances, leveraging both circuit and CNF formats for enhanced flexib…

    Submitted 28 March, 2024; originally announced March 2024.

  11. arXiv:2403.13311  [pdf, other]

    cs.AI cs.MA cs.RO

    Multi-Robot Connected Fermat Spiral Coverage

    Authors: Jingtao Tang, Hang Ma

    Abstract: We introduce the Multi-Robot Connected Fermat Spiral (MCFS), a novel algorithmic framework for Multi-Robot Coverage Path Planning (MCPP) that adapts Connected Fermat Spiral (CFS) from the computer graphics community to multi-robot coordination for the first time. MCFS uniquely enables the orchestration of multiple robots to generate coverage paths that contour around arbitrarily shaped obstacles,…

    Submitted 16 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: accepted to ICAPS24

  12. arXiv:2403.10995  [pdf, other]

    cs.LG cs.AI cs.CR cs.SI

    Edge Private Graph Neural Networks with Singular Value Perturbation

    Authors: Tingting Tang, Yue Niu, Salman Avestimehr, Murali Annavaram

    Abstract: Graph neural networks (GNNs) play a key role in learning representations from graph-structured data and are demonstrated to be useful in many applications. However, the GNN training pipeline has been shown to be vulnerable to node feature leakage and edge extraction attacks. This paper investigates a scenario where an attacker aims to recover private edge information from a trained GNN model. Prev…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Accepted at Privacy Enhancing Technologies Symposium (PETS) 2024

  13. arXiv:2403.08994  [pdf, other]

    cs.CL

    Ethos: Rectifying Language Models in Orthogonal Parameter Space

    Authors: Lei Gao, Yue Niu, Tingting Tang, Salman Avestimehr, Murali Annavaram

    Abstract: Language models (LMs) have greatly propelled the research on natural language processing. However, LMs also raise concerns regarding the generation of biased or toxic content and the potential disclosure of private information from the training dataset. In this work, we present a new efficient approach, Ethos, that rectifies LMs to mitigate toxicity and bias in outputs and avoid privacy leakage. E…

    Submitted 1 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  14. arXiv:2403.05851  [pdf, other]

    cs.MM cs.ET

    Interest-Aware Joint Caching, Computing, and Communication Optimization for Mobile VR Delivery in MEC Networks

    Authors: Baojie Fu, Tong Tang, Dapeng Wu, Ruyan Wang

    Abstract: In the upcoming B5G/6G era, virtual reality (VR) over wireless has become a typical application, which is an inevitable trend in the development of video. However, in immersive and interactive VR experiences, VR services typically exhibit high delay, while simultaneously posing challenges for the energy consumption of local devices. To address these issues, this paper aims to improve the performan…

    Submitted 9 March, 2024; originally announced March 2024.

  15. arXiv:2403.03544  [pdf, other]

    cs.AI cs.CL

    Prompt Mining for Language-based Human Mobility Forecasting

    Authors: Hao Xue, Tianye Tang, Ali Payani, Flora D. Salim

    Abstract: With the advancement of large language models, language-based forecasting has recently emerged as an innovative approach for predicting human mobility patterns. The core idea is to use prompts to transform the raw mobility data given as numerical values into natural language sentences so that the language models can be leveraged to generate the description for future observations. However, previou…

    Submitted 6 March, 2024; originally announced March 2024.

  16. arXiv:2402.17483  [pdf, other]

    cs.CV

    AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

    Authors: Tao Tang, Guangrun Wang, Yixing Lao, Peng Chen, Jie Liu, Liang Lin, Kaicheng Yu, Xiaodan Liang

    Abstract: Neural implicit fields have been a de facto standard in novel view synthesis. Recently, there exist some methods exploring fusing multiple modalities within a single field, aiming to share implicit features from different modalities to enhance reconstruction performance. However, these modalities often exhibit misaligned behaviors: optimizing for one modality, such as LiDAR, can adversely affect a…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: CVPR2024

  17. arXiv:2402.16438  [pdf, other]

    cs.CL

    Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models

    Authors: Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, Ji-Rong Wen

    Abstract: Large language models (LLMs) demonstrate remarkable multilingual capabilities without being pre-trained on specially curated multilingual parallel corpora. It remains a challenging problem to explain the underlying mechanisms by which LLMs process multilingual texts. In this paper, we delve into the composition of Transformer architectures in LLMs to pinpoint language-specific regions. Specially,…

    Submitted 6 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024

  18. Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach

    Authors: Xin Chen, Mingliang Hou, Tao Tang, Achhardeep Kaur, Feng Xia

    Abstract: With the arrival of the big data era, mobility profiling has become a viable method of utilizing enormous amounts of mobility data to create an intelligent transportation system. Mobility profiling can extract potential patterns in urban traffic from mobility data and is critical for a variety of traffic-related applications. However, due to the high level of complexity and the huge amount of data…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 10 pages, 7 figures

    MSC Class: 68T09; 68T30; 68U35 ACM Class: I.2.6; I.2.4; H.1.2

    Journal ref: The 7th IEEE International Conference on Data Science and Systems (DSS), Dec 20 - 22, 2021, Haikou, China

  19. arXiv:2401.07312  [pdf, other]

    cs.HC

    Understanding Nonlinear Collaboration between Human and AI Agents: A Co-design Framework for Creative Design

    Authors: Jiayi Zhou, Renzhong Li, Junxiu Tang, Tan Tang, Haotian Li, Weiwei Cui, Yingcai Wu

    Abstract: Creative design is a nonlinear process where designers generate diverse ideas in the pursuit of an open-ended goal and converge towards consensus through iterative remixing. In contrast, AI-powered design tools often employ a linear sequence of incremental and precise instructions to approximate design objectives. Such operations violate customary creative design practices and thus hinder AI agent…

    Submitted 20 March, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

    Comments: to be published in CHI 2024

  20. arXiv:2401.01065  [pdf, other]

    cs.CV cs.AI

    BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

    Authors: Tao Tang, Dafeng Wei, Zhengyu Jia, Tian Gao, Changwei Cai, Chengkai Hou, Peng Jia, Kun Zhan, Haiyang Sun, Jingchen Fan, Yixing Zhao, Fu Liu, Xiaodan Liang, Xianpeng Lang, Yang Wang

    Abstract: The rapid development of the autonomous driving industry has led to a significant accumulation of autonomous driving data. Consequently, there comes a growing demand for retrieving data to provide specialized optimization. However, directly applying previous image retrieval methods faces several challenges, such as the lack of global feature representation and inadequate text retrieval ability for…

    Submitted 18 June, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

  21. arXiv:2312.17016  [pdf, other]

    cs.CV cs.AI

    On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications

    Authors: Chenjiao Tan, Qian Cao, Yiwei Li, Jielu Zhang, Xiao Yang, Huaqin Zhao, Zihao Wu, Zhengliang Liu, Hao Yang, Nemin Wu, Tao Tang, Xinyue Ye, Lilong Chai, Ninghao Liu, Changying Li, Lan Mu, Tianming Liu, Gengchen Mai

    Abstract: The advent of large language models (LLMs) has heightened interest in their potential for multimodal applications that integrate language and vision. This paper explores the capabilities of GPT-4V in the realms of geography, environmental science, agriculture, and urban planning by evaluating its performance across a variety of tasks. Data sources comprise satellite imagery, aerial photos, ground-…

    Submitted 23 December, 2023; originally announced December 2023.

    Comments: 110 Pages; 61 Figures

    ACM Class: I.2.7; I.2.10; I.4.6; I.4.8; J.2

  22. arXiv:2312.10429  [pdf, other]

    physics.geo-ph cs.AI

    ResoNet: Robust and Explainable ENSO Forecasts with Hybrid Convolution and Transformer Networks

    Authors: Pumeng Lyu, Tao Tang, Fenghua Ling, Jing-Jia Luo, Niklas Boers, Wanli Ouyang, Lei Bai

    Abstract: Recent studies have shown that deep learning (DL) models can skillfully predict the El Niño-Southern Oscillation (ENSO) forecasts over 1.5 years ahead. However, concerns regarding the reliability of predictions made by DL methods persist, including potential overfitting issues and lack of interpretability. Here, we propose ResoNet, a DL model that combines convolutional neural network (CNN) and Tr…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 32 pages, 5 main figures and 12 supplementary figures

  23. arXiv:2312.09911  [pdf, other]

    cs.SD eess.AS

    Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

    Authors: Xueyao Zhang, Liumeng Xue, Yicheng Gu, Yuancheng Wang, Haorui He, Chaoren Wang, Xi Chen, Zihao Fang, Haopeng Chen, Junan Zhang, Tze Ying Tang, Lexiao Zou, Mingxuan Wang, Jun Han, Kai Chen, Haizhou Li, Zhizheng Wu

    Abstract: Amphion is an open-source toolkit for Audio, Music, and Speech Generation, targeting to ease the way for junior researchers and engineers into these fields. It presents a unified framework that is inclusive of diverse generation tasks and models, with the added bonus of being easily extendable for new incorporation. The toolkit is designed with beginner-friendly workflows and pre-trained models, a…

    Submitted 22 February, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Amphion Website: https://github.com/open-mmlab/Amphion

  24. arXiv:2312.08876  [pdf, other]

    cs.CV

    OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

    Authors: Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu

    Abstract: Traditional LiDAR-based object detection research primarily focuses on closed-set scenarios, which falls short in complex real-world applications. Directly transferring existing 2D open-vocabulary models with some known LiDAR classes for open-vocabulary ability, however, tends to suffer from over-fitting problems: The obtained model will detect the known objects, even presented with a novel catego…

    Submitted 12 December, 2023; originally announced December 2023.

  25. arXiv:2311.13036  [pdf, other]

    cs.LG stat.ML

    Favour: FAst Variance Operator for Uncertainty Rating

    Authors: Thomas D. Ahle, Sahar Karimi, Peter Tak Peter Tang

    Abstract: Bayesian Neural Networks (BNN) have emerged as a crucial approach for interpreting ML predictions. By sampling from the posterior distribution, data scientists may estimate the uncertainty of an inference. Unfortunately many inference samples are often needed, the overhead of which greatly hinder BNN's wide adoption. To mitigate this, previous work proposed propagating the first and second moments…

    Submitted 21 November, 2023; originally announced November 2023.

  26. RFTrans: Leveraging Refractive Flow of Transparent Objects for Surface Normal Estimation and Manipulation

    Authors: Tutian Tang, Jiyu Liu, Jieyi Zhang, Haoyuan Fu, Wenqiang Xu, Cewu Lu

    Abstract: Transparent objects are widely used in our daily lives, making it important to teach robots to interact with them. However, it's not easy because the reflective and refractive effects can make depth cameras fail to give accurate geometry measurements. To solve this problem, this paper introduces RFTrans, an RGB-D-based method for surface normal estimation and manipulation of transparent objects. B…

    Submitted 7 February, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  27. arXiv:2311.08687  [pdf, other]

    cs.CL cs.AI cs.LG

    An Eye on Clinical BERT: Investigating Language Model Generalization for Diabetic Eye Disease Phenotyping

    Authors: Keith Harrigian, Tina Tang, Anthony Gonzales, Cindy X. Cai, Mark Dredze

    Abstract: Diabetic eye disease is a major cause of blindness worldwide. The ability to monitor relevant clinical trajectories and detect lapses in care is critical to managing the disease and preventing blindness. Alas, much of the information necessary to support these goals is found only in the free text of the electronic medical record. To fill this information gap, we introduce a system for extracting e…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 24 pages

  28. arXiv:2311.04072  [pdf, other]

    cs.CL

    Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment

    Authors: Geyang Guo, Ranchi Zhao, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Alignment with human preference is a desired property of large language models (LLMs). Currently, the main alignment approach is based on reinforcement learning from human feedback (RLHF). Despite the effectiveness of RLHF, it is intricate to implement and train, thus recent studies explore how to develop alternative alignment approaches based on supervised fine-tuning (SFT). A major limitation of…

    Submitted 15 April, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

  29. arXiv:2311.02396  [pdf, other]

    cs.RO

    Precise Robotic Needle-Threading with Tactile Perception and Reinforcement Learning

    Authors: Zhenjun Yu, Wenqiang Xu, Siqiong Yao, Jieji Ren, Tutian Tang, Yutong Li, Guoying Gu, Cewu Lu

    Abstract: This work presents a novel tactile perception-based method, named T-NT, for performing the needle-threading task, an application of deformable linear object (DLO) manipulation. This task is divided into two main stages: Tail-end Finding and Tail-end Insertion. In the first stage, the agent traces the contour of the thread twice using vision-based tactile sensors mounted on the gripper fingers. The…

    Submitted 4 November, 2023; originally announced November 2023.

  30. arXiv:2309.16179  [pdf, other]

    cs.CV

    BEVHeight++: Toward Robust Visual Centric 3D Object Detection

    Authors: Lei Yang, Tao Tang, Jun Li, Peng Chen, Kun Yuan, Li Wang, Yi Huang, Xinyu Zhang, Kaicheng Yu

    Abstract: While most recent autonomous driving system focuses on developing perception methods on ego-vehicle sensors, people tend to overlook an alternative approach to leverage intelligent roadside cameras to extend the perception ability beyond the visual range. We discover that the state-of-the-art vision-centric bird's eye view detection methods have inferior performances on roadside cameras. This is b…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.08498

  31. arXiv:2309.13345  [pdf, other]

    cs.CL

    BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models

    Authors: Zican Dong, Tianyi Tang, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: Large language models (LLMs) have achieved dramatic proficiency over NLP tasks with normal length. Recently, multiple studies have committed to extending the context length and enhancing the long text modeling capabilities of LLMs. To comprehensively evaluate the long context ability of LLMs, we propose BAMBOO, a multi-task long context benchmark. BAMBOO has been designed with four principles: com…

    Submitted 19 March, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted for the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING) 2024

  32. arXiv:2308.00240  [pdf, other]

    cs.CL

    Towards Effective Ancient Chinese Translation: Dataset, Model, and Evaluation

    Authors: Geyang Guo, Jiarong Yang, Fengyuan Lu, Jiaxin Qin, Tianyi Tang, Wayne Xin Zhao

    Abstract: Interpreting ancient Chinese has been the key to comprehending vast Chinese literature, tradition, and civilization. In this paper, we propose Erya for ancient Chinese translation. From a dataset perspective, we collect, clean, and classify ancient Chinese materials from various sources, forming the most extensive ancient Chinese resource to date. From a model perspective, we devise Erya training…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: Accepted by NLPCC 2023

  33. arXiv:2307.01932  [pdf, other]

    stat.ME cs.AI cs.LG stat.ML

    MDI+: A Flexible Random Forest-Based Feature Importance Framework

    Authors: Abhineet Agarwal, Ana M. Kenney, Yan Shuo Tan, Tiffany M. Tang, Bin Yu

    Abstract: Mean decrease in impurity (MDI) is a popular feature importance measure for random forests (RFs). We show that the MDI for a feature $X_k$ in each tree in an RF is equivalent to the unnormalized $R^2$ value in a linear regression of the response on the collection of decision stumps that split on $X_k$. We use this interpretation to propose a flexible feature importance framework called MDI+. Speci…

    Submitted 4 July, 2023; originally announced July 2023.

  34. arXiv:2306.17609  [pdf, other]

    cs.RO cs.AI

    Mixed Integer Programming for Time-Optimal Multi-Robot Coverage Path Planning with Efficient Heuristics

    Authors: Jingtao Tang, Hang Ma

    Abstract: We investigate time-optimal Multi-Robot Coverage Path Planning (MCPP) for both unweighted and weighted terrains, which aims to minimize the coverage time, defined as the maximum travel time of all robots. Specifically, we focus on a reduction from MCPP to Min-Max Rooted Tree Cover (MMRTC). For the first time, we propose a Mixed Integer Programming (MIP) model to optimally solve MMRTC, resulting in…

    Submitted 11 August, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

    Comments: Accepted to RA-L

  35. arXiv:2306.00043  [pdf, other]

    cs.AI cs.NE

    Space Net Optimization

    Authors: Chun-Wei Tsai, Yi-Cheng Yang, Tzu-Chieh Tang, Che-Wei Hsu

    Abstract: Most metaheuristic algorithms rely on a few searched solutions to guide later searches during the convergence process for a simple reason: the limited computing resource of a computer makes it impossible to retain all the searched solutions. This also reveals that each search of most metaheuristic algorithms is just like a ballpark guess. To help address this issue, we present a novel metaheuristi…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: 12 pages, 6 figures

  36. arXiv:2305.17006  [pdf, other]

    cs.CV cs.CL

    Zero-shot Visual Question Answering with Language Model Feedback

    Authors: Yifan Du, Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: In this paper, we propose a novel language model guided captioning approach, LAMOC, for knowledge-based visual question answering (VQA). Our approach employs the generated captions by a captioning model as the context of an answer prediction model, which is a Pre-trained Language model (PLM). As the major contribution, we leverage the guidance and feedback of the prediction model to improve the ca…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL2023 findings

  37. arXiv:2305.16944  [pdf, other]

    cs.CL

    Learning to Imagine: Visually-Augmented Natural Language Generation

    Authors: Tianyi Tang, Yushuo Chen, Yifan Du, Junyi Li, Wayne Xin Zhao, Ji-Rong Wen

    Abstract: People often imagine relevant scenes to aid in the writing process. In this work, we aim to utilize visual information for composition in the same manner as humans. We propose a method, LIVE, that makes pre-trained language models (PLMs) Learn to Imagine for Visually-augmented natural language gEneration. First, we imagine the scene based on the text: we use a diffusion model to synthesize high-qua…

    Submitted 15 June, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023

  38. arXiv:2305.15067  [pdf, other]

    cs.CL

    Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References

    Authors: Tianyi Tang, Hongyuan Lu, Yuchen Eleanor Jiang, Haoyang Huang, Dongdong Zhang, Wayne Xin Zhao, Tom Kocmi, Furu Wei

    Abstract: Most research about natural language generation (NLG) relies on evaluation benchmarks with limited references for a sample, which may result in poor correlations with human judgements. The underlying reason is that one semantic meaning can actually be expressed in different forms, and the evaluation with a single or few references may not accurately reflect the quality of the model's hypotheses. T…

    Submitted 24 May, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted by NAACL 2024

  39. arXiv:2305.10998  [pdf, other]

    cs.CL

    The Web Can Be Your Oyster for Improving Large Language Models

    Authors: Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, Jian-Yun Nie, Ji-Rong Wen

    Abstract: Large language models (LLMs) encode a large amount of world knowledge. However, as such knowledge is frozen at the time of model training, the models become static and limited by the training data at that time. In order to further improve the capacity of LLMs for knowledge-intensive tasks, we consider augmenting LLMs with the large-scale web using search engine. Unlike previous augmentation source…

    Submitted 24 May, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: This work has been accepted by ACL 2023 (findings), while we slightly revise the title and the content based on the conference version

  40. arXiv:2305.10949  [pdf, other]

    cs.OH cs.AR

    Toward Platform-based Building Design

    Authors: Yu-Wen Lin, Tsz Ling Elaine Tang, Stefano Schiavon, Costas J. Spanos

    Abstract: The electronic design industry has undergone a significant transformation, transitioning from traditional hand-drawn designs to modern automated design processes. While Computer-Aided Design (CAD) tools emerged alongside the electronic industry, the current building design process has little to no automation. There is a need for a unified platform to address the complexity of building design and p…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: 17 pages, 3 figures

  41. arXiv:2305.07004  [pdf, other]

    cs.CL

    Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting

    Authors: Haoyang Huang, Tianyi Tang, Dongdong Zhang, Wayne Xin Zhao, Ting Song, Yan Xia, Furu Wei

    Abstract: Large language models (LLMs) demonstrate impressive multilingual capability, but their performance varies substantially across different languages. In this work, we introduce a simple yet effective method, called cross-lingual-thought prompting (XLT), to systematically improve the multilingual capability of LLMs. Specifically, XLT is a generic template prompt that stimulates cross-lingual and logi…

    Submitted 22 October, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted by EMNLP 2023 Findings

  42. arXiv:2305.06380  [pdf, other]

    cs.OH

    From Electronic Design Automation to Building Design Automation: Challenges and Opportunities

    Authors: Yu-Wen Lin, Tsz Ling Elaine Tang, Alberto L. Sangiovanni-Vincentelli, Stefano Schiavon, Costas J. Spanos

    Abstract: Design automation, which involves the use of software tools and technologies to streamline the design process, has been widely adopted in the electronics industry, resulting in significant advancements in product development and manufacturing. However, building design, which involves the creation of complex structures and systems, has traditionally lagged behind in leveraging design automation tec…

    Submitted 10 May, 2023; originally announced May 2023.

  43. arXiv:2304.10406  [pdf, other]

    cs.CV

    LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields

    Authors: Tang Tao, Longfei Gao, Guangrun Wang, Yixing Lao, Peng Chen, Hengshuang Zhao, Dayang Hao, Xiaodan Liang, Mathieu Salzmann, Kaicheng Yu

    Abstract: We introduce a new task, novel view synthesis for LiDAR sensors. While traditional model-based LiDAR simulators with style-transfer neural networks can be applied to render novel views, they fall short of producing accurate and realistic LiDAR patterns because the renderers rely on explicit 3D reconstruction and exploit game engines, that ignore important attributes of LiDAR points. We address thi…

    Submitted 14 July, 2023; v1 submitted 20 April, 2023; originally announced April 2023.

    Comments: This paper introduces a new task of novel LiDAR view synthesis, and proposes a differentiable framework called LiDAR-NeRF with a structural regularization, as well as an object-centric multi-view LiDAR dataset called NeRF-MVL

  44. arXiv:2303.18223  [pdf, other]

    cs.CL cs.AI

    A Survey of Large Language Models

    Authors: Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, Ji-Rong Wen

    Abstract: Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI algorithms for comprehending and grasping a language. As a major approach, language modeling has been widely studied for language understanding and generation in the past two decades, evolving from statistical language models to neural langu…

    Submitted 24 November, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: ongoing work; 124 pages, 946 citations

  45. arXiv:2303.13913  [pdf, other]

    cs.CV cs.RO

    GarmentTracking: Category-Level Garment Pose Tracking

    Authors: Han Xue, Wenqiang Xu, Jieyi Zhang, Tutian Tang, Yutong Li, Wenxin Du, Ruolin Ye, Cewu Lu

    Abstract: Garments are important to humans. A visual system that can estimate and track the complete garment pose can be useful for many downstream tasks and real-world applications. In this work, we present a complete package to address the category-level garment pose tracking task: (1) A recording system VR-Garment, with which users can manipulate virtual garment models in simulation through a VR interfac…

    Submitted 24 March, 2023; originally announced March 2023.

  46. arXiv:2303.09055  [pdf, other]

    cs.CV

    TemporalMaxer: Maximize Temporal Context with only Max Pooling for Temporal Action Localization

    Authors: Tuan N. Tang, Kwonyoung Kim, Kwanghoon Sohn

    Abstract: Temporal Action Localization (TAL) is a challenging task in video understanding that aims to identify and localize actions within a video sequence. Recent studies have emphasized the importance of applying long-term temporal context modeling (TCM) blocks to the extracted video clip features such as employing complex self-attention mechanisms. In this paper, we present the simplest method ever to a…

    Submitted 15 March, 2023; originally announced March 2023.

  47. arXiv:2303.08498  [pdf, other]

    cs.CV

    BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection

    Authors: Lei Yang, Kaicheng Yu, Tao Tang, Jun Li, Kun Yuan, Li Wang, Xinyu Zhang, Peng Chen

    Abstract: While most recent autonomous driving system focuses on developing perception methods on ego-vehicle sensors, people tend to overlook an alternative approach to leverage intelligent roadside cameras to extend the perception ability beyond the visual range. We discover that the state-of-the-art vision-centric bird's eye view detection methods have inferior performances on roadside cameras. This is b…

    Submitted 11 April, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted by CVPR 2023

  48. arXiv:2302.14502  [pdf, other]

    cs.CL

    A Survey on Long Text Modeling with Transformers

    Authors: Zican Dong, Tianyi Tang, Lunyi Li, Wayne Xin Zhao

    Abstract: Modeling long texts has been an essential technique in the field of natural language processing (NLP). With the ever-growing number of long documents, it is important to develop effective modeling methods that can process and analyze such texts. However, long texts pose important research challenges for existing text models, with more complex semantics and special characteristics. In this paper, w…

    Submitted 28 February, 2023; originally announced February 2023.

  49. arXiv:2302.00755  [pdf, other]

    stat.ML cs.LG stat.ME

    Hierarchical shrinkage Gaussian processes: applications to computer code emulation and dynamical system recovery

    Authors: Tao Tang, Simon Mak, David Dunson

    Abstract: In many areas of science and engineering, computer simulations are widely used as proxies for physical experiments, which can be infeasible or unethical. Such simulations can often be computationally expensive, and an emulator can be trained to efficiently predict the desired response surface. A widely-used emulator is the Gaussian process (GP), which provides a flexible framework for efficient pr…

    Submitted 1 February, 2023; originally announced February 2023.

  50. arXiv:2212.13005  [pdf, other]

    cs.CL

    TextBox 2.0: A Text Generation Library with Pre-trained Language Models

    Authors: Tianyi Tang, Junyi Li, Zhipeng Chen, Yiwen Hu, Zhuohao Yu, Wenxun Dai, Zican Dong, Xiaoxue Cheng, Yuhao Wang, Wayne Xin Zhao, Jian-Yun Nie, Ji-Rong Wen

    Abstract: To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers $13$ common text generation tasks and their corresponding $83$ datasets and further incorporates $45$ PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompt…

    Submitted 25 December, 2022; originally announced December 2022.

    Comments: Accepted by EMNLP 2022