Showing 1–50 of 600 results for author: Yan, J

Searching in archive cs.
  1. arXiv:2406.17470  [pdf, other]

    cs.LG cs.AI cs.DC cs.IT

    Dynamic Scheduling for Vehicle-to-Vehicle Communications Enhanced Federated Learning

    Authors: Jintao Yan, Tan Chen, Yuxuan Sun, Zhaojun Nan, Sheng Zhou, Zhisheng Niu

    Abstract: Leveraging the computing and sensing capabilities of vehicles, vehicular federated learning (VFL) has been applied to edge training for connected vehicles. The dynamic and interconnected nature of vehicular networks presents unique opportunities to harness direct vehicle-to-vehicle (V2V) communications, enhancing VFL training efficiency. In this paper, we formulate a stochastic optimization proble…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Submitted to IEEE for possible publication

  2. arXiv:2406.16949  [pdf, other]

    cs.LG

    Fair Differentiable Neural Network Architecture Search for Long-Tailed Data with Self-Supervised Learning

    Authors: Jiaming Yan

    Abstract: Recent advancements in artificial intelligence (AI) have positioned deep learning (DL) as a pivotal technology in fields like computer vision, data mining, and natural language processing. A critical factor in DL performance is the selection of neural network architecture. Traditional predefined architectures often fail to adapt to different data distributions, making it challenging to achieve opt…

    Submitted 19 June, 2024; originally announced June 2024.

  3. arXiv:2406.16367  [pdf, other]

    cs.IR

    On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models

    Authors: Dongyang Li, Junbing Yan, Taolin Zhang, Chengyu Wang, Xiaofeng He, Longtao Huang, Hui Xue, Jun Huang

    Abstract: Retrieval augmented generation (RAG) exhibits outstanding performance in promoting the knowledge capabilities of large language models (LLMs) with retrieved documents related to user queries. However, RAG only focuses on improving the response quality of LLMs via enhancing queries indiscriminately with retrieved information, paying little attention to what type of knowledge LLMs really need to ans…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.15836  [pdf, other]

    cs.LG cs.AI cs.MA

    Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models

    Authors: Yang Zhang, Chenjia Bai, Bin Zhao, Junchi Yan, Xiu Li, Xuelong Li

    Abstract: Learning a world model for model-free Reinforcement Learning (RL) agents can significantly improve the sample efficiency by learning policies in imagination. However, building a world model for Multi-Agent RL (MARL) can be particularly challenging due to the scalability issue in a centralized architecture arising from a large number of agents, and also the non-stationarity issue in a decentralized…

    Submitted 22 June, 2024; originally announced June 2024.

  5. arXiv:2406.13945  [pdf, other]

    cs.AI cs.CL cs.LG

    CityBench: Evaluating the Capabilities of Large Language Model as World Model

    Authors: Jie Feng, Jun Zhang, Junbo Yan, Xin Zhang, Tianjian Ouyang, Tianhui Liu, Yuwei Du, Siqi Guo, Yong Li

    Abstract: Large language models (LLMs) with powerful generalization ability have been widely used in many domains. A systematic and reliable evaluation of LLMs is a crucial step in their development and applications, especially for specific professional fields. In the urban domain, there have been some early explorations about the usability of LLMs, but a systematic and scalable evaluation benchmark is still…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.13358  [pdf, other]

    cs.CV eess.IV

    Multi-scale Restoration of Missing Data in Optical Time-series Images with Masked Spatial-Temporal Attention Network

    Authors: Zaiyan Zhang, Jining Yan, Yuanqi Liang, Jiaxin Feng, Haixu He, Wei Han

    Abstract: Due to factors such as thick cloud cover and sensor limitations, remote sensing images often suffer from significant missing data, resulting in incomplete time-series information. Existing methods for imputing missing values in remote sensing images do not fully exploit spatio-temporal auxiliary information, leading to limited accuracy in restoration. Therefore, this paper proposes a novel deep le…

    Submitted 19 June, 2024; originally announced June 2024.

  7. arXiv:2406.11633  [pdf, other]

    cs.CV

    DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

    Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

    Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extract…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page; 22 pages, 11 figures

  8. arXiv:2406.10661  [pdf, other]

    cs.AI cs.LG

    A GPU-accelerated Large-scale Simulator for Transportation System Optimization Benchmarking

    Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Depeng Jin, Yong Li

    Abstract: With the development of artificial intelligence techniques, transportation system optimization is evolving from traditional methods relying on expert experience to simulation and learning-based decision optimization methods. Learning-based optimization methods require extensive interaction with highly realistic microscopic traffic simulators for optimization. However, existing microscopic traffic…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Datasets and Benchmarks Track

  9. arXiv:2406.09410  [pdf, other]

    cs.CV cs.AI

    Scene Graph Generation in Large-Size VHR Satellite Imagery: A Large-Scale Dataset and A Context-Aware Approach

    Authors: Yansheng Li, Linlin Wang, Tingzhu Wang, Xue Yang, Junwei Luo, Qi Wang, Youming Deng, Wenbin Wang, Xian Sun, Haifeng Li, Bo Dang, Yongjun Zhang, Yi Yu, Junchi Yan

    Abstract: Scene graph generation (SGG) in satellite imagery (SAI) benefits promoting intelligent understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it necessary to holistically conduct SGG in large-size very-high-reso…

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: This paper releases a SAI-oriented SGG toolkit with about 30 OBD methods and 10 SGG methods, and develops a benchmark based on RSG where our HOD-Net and RPCM significantly outperform the state-of-the-art methods in both OBD and SGG tasks. The RSG dataset and SAI-oriented toolkit will be made publicly available at https://linlin-dev.github.io/project/RSG

  10. arXiv:2406.09385  [pdf, other]

    cs.CV

    Towards Vision-Language Geo-Foundation Model: A Survey

    Authors: Yue Zhou, Litong Feng, Yiping Ke, Xue Jiang, Junchi Yan, Xue Yang, Wayne Zhang

    Abstract: Vision-Language Foundation Models (VLFMs) have made remarkable progress on various multimodal tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding. However, most methods rely on training with general image datasets, and the lack of geospatial data leads to poor performance on earth observation. Numerous geospatial image-text pair datasets and VLFMs…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 18 pages, 4 figures

  11. arXiv:2406.04963  [pdf, other]

    cs.LG cs.AI

    Learning Divergence Fields for Shift-Robust Graph Representations

    Authors: Qitian Wu, Fan Nie, Chenxiao Yang, Junchi Yan

    Abstract: Real-world data generation often involves certain geometries (e.g., graphs) that induce instance-level interdependence. This characteristic makes the generalization of learning models more difficult due to the intricate interdependent patterns that impact data-generative distributions and can vary from training to testing. In this work, we propose a geometric diffusion model with learnable diverge…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024. Source codes at https://github.com/fannie1208/GLIND

  12. arXiv:2406.03877  [pdf, other]

    cs.RO cs.CV

    Bench2Drive: Towards Multi-Ability Benchmarking of Closed-Loop End-To-End Autonomous Driving

    Authors: Xiaosong Jia, Zhenjie Yang, Qifeng Li, Zhiyuan Zhang, Junchi Yan

    Abstract: In an era marked by the rapid scaling of foundation models, autonomous driving technologies are approaching a transformative threshold where end-to-end autonomous driving (E2E-AD) emerges due to its potential of scaling up in the data-driven manner. However, existing E2E-AD methods are mostly evaluated under the open-loop log-replay manner with L2 errors and collision rate as metrics (e.g., in nuS…

    Submitted 11 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: Fix typos in text and Table 4. More reference

  13. arXiv:2405.20583  [pdf, other]

    cs.CG math.AT

    The Gestalt Computational Model

    Authors: Yu Chen, Hongwei Lin, Jiacong Yan

    Abstract: Widely employed in cognitive psychology, Gestalt theory elucidates basic principles in visual perception, but meanwhile presents significant challenges for computation. The advancement of artificial intelligence requires the emulation of human cognitive behavior, for which Gestalt theory serves as a fundamental framework describing human visual cognitive behavior. In this paper, we utilize persist…

    Submitted 30 May, 2024; originally announced May 2024.

  14. arXiv:2405.18132  [pdf, other]

    cs.CV

    EG4D: Explicit Generation of 4D Object without Score Distillation

    Authors: Qi Sun, Zhiyang Guo, Ziyu Wan, Jing Nathan Yan, Shengming Yin, Wengang Zhou, Jing Liao, Houqiang Li

    Abstract: In recent years, the increasing demand for dynamic 3D assets in design and gaming applications has given rise to powerful generative pipelines capable of synthesizing high-quality 4D objects. Previous methods generally rely on score distillation sampling (SDS) algorithm to infer the unseen views and motion of 4D objects, thus leading to unsatisfactory results with defects like over-saturation and…

    Submitted 28 May, 2024; originally announced May 2024.

  15. arXiv:2405.16759  [pdf, other]

    cs.CV cs.LG

    Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

    Authors: Cristina N. Vasconcelos, Abdullah Rashwan, Austin Waters, Trevor Walker, Keyang Xu, Jimmy Yan, Rui Qian, Shixin Luo, Zarana Parekh, Andrew Bunner, Hongliang Fei, Roopal Garg, Mandy Guo, Ivana Kajic, Yeqing Li, Henna Nandwani, Jordi Pont-Tuset, Yasumasa Onoe, Sarah Rosston, Su Wang, Wenlei Zhou, Kevin Swersky, David J. Fleet, Jason M. Baldridge, Oliver Wang

    Abstract: We address the long-standing problem of how to learn effective pixel-based image diffusion models at scale, introducing a remarkably simple greedy growing method for stable training of large-scale, high-resolution models, without the need for cascaded super-resolution components. The key insight stems from careful pre-training of core components, namely, those responsible for text-to-image alignm…

    Submitted 26 May, 2024; originally announced May 2024.

  16. arXiv:2405.15908  [pdf, other]

    cs.AI cs.CR cs.LG

    Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine

    Authors: Yuanliang Li, Hanzheng Dai, Jun Yan

    Abstract: Automated penetration testing (AutoPT) based on reinforcement learning (RL) has proven its ability to improve the efficiency of vulnerability identification in information systems. However, RL-based PT encounters several challenges, including poor sampling efficiency, intricate reward specification, and limited interpretability. To address these issues, we propose a knowledge-informed AutoPT frame…

    Submitted 24 May, 2024; originally announced May 2024.

  17. arXiv:2405.14854  [pdf, other]

    cs.CV cs.LG

    TerDiT: Ternary Diffusion Models with Transformers

    Authors: Xudong Lu, Aojun Zhou, Ziyi Lin, Qi Liu, Yuhui Xu, Renrui Zhang, Yafei Wen, Shuai Ren, Peng Gao, Junchi Yan, Hongsheng Li

    Abstract: Recent developments in large-scale pre-trained text-to-image diffusion models have significantly improved the generation of high-fidelity images, particularly with the emergence of diffusion models based on transformer architecture (DiTs). Among these diffusion models, diffusion transformers have demonstrated superior image generation capabilities, boasting lower FID scores and higher scalability.…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 18 pages, 13 figures

  18. arXiv:2405.12788  [pdf, other]

    cs.CL

    What Have We Achieved on Non-autoregressive Translation?

    Authors: Yafu Li, Huajian Zhang, Jianhao Yan, Yongjing Yin, Yue Zhang

    Abstract: Recent advances have made non-autoregressive (NAT) translation comparable to autoregressive methods (AT). However, their evaluation using BLEU has been shown to weakly correlate with human annotations. Limited research compares non-autoregressive translation and autoregressive translation comprehensively, leaving uncertainty about the true proximity of NAT to AT. To address this gap, we systematic…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: ACL 2024 Findings

  19. arXiv:2405.12520  [pdf, other]

    cs.DC

    MOSS: A Large-scale Open Microscopic Traffic Simulation System

    Authors: Jun Zhang, Wenxuan Ao, Junbo Yan, Can Rong, Depeng Jin, Wei Wu, Yong Li

    Abstract: In the research of Intelligent Transportation Systems (ITS), traffic simulation is a key procedure for the evaluation of new methods and optimization of strategies. However, existing traffic simulation systems face two challenges. First, how to balance simulation scale with realism is a dilemma. Second, it is hard to simulate realistic results, which requires realistic travel demand data and simul…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE ITSC 2024

  20. arXiv:2404.16952  [pdf, other]

    cs.RO

    Simultaneous Estimation of Shape and Force along Highly Deformable Surgical Manipulators Using Sparse FBG Measurement

    Authors: Yiang Lu, Bin Li, Wei Chen, Junyan Yan, Shing Shin Cheng, Jiangliu Wang, Jianshu Zhou, Qi Dou, Yun-hui Liu

    Abstract: Recently, fiber optic sensors such as fiber Bragg gratings (FBGs) have been widely investigated for shape reconstruction and force estimation of flexible surgical robots. However, most existing approaches need precise model parameters of FBGs inside the fiber and their alignments with the flexible robots for accurate sensing results. Another challenge lies in online acquiring external forces at ar…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted to ICRA 2024

  21. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  22. arXiv:2404.13299  [pdf, other]

    cs.CV

    PCQA: A Strong Baseline for AIGC Quality Assessment Based on Prompt Condition

    Authors: Xi Fang, Weigang Wang, Xiaoxin Lv, Jun Yan

    Abstract: The development of Large Language Models (LLM) and Diffusion Models brings the boom of Artificial Intelligence Generated Content (AIGC). It is essential to build an effective quality assessment framework to provide a quantifiable evaluation of different images or videos based on the AIGC technologies. The content generated by AIGC methods is driven by the crafted prompts. Therefore, it is intuitiv…

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: Published in CVPR-2024's NTIRE: New Trends in Image Restoration and Enhancement workshop and challenges

  23. arXiv:2404.12612  [pdf, other]

    cs.LG cs.CV

    SA-Attack: Speed-adaptive stealthy adversarial attack on trajectory prediction

    Authors: Huilin Yin, Jiaxiang Li, Pengju Zhen, Jun Yan

    Abstract: Trajectory prediction is critical for the safe planning and navigation of automated vehicles. The trajectory prediction models based on the neural networks are vulnerable to adversarial attacks. Previous attack methods have achieved high attack success rates but overlook the adaptability to realistic scenarios and the concealment of the deceits. To address this problem, we propose a speed-adaptive…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: This work is published in IEEE IV Symposium

  24. arXiv:2404.12097  [pdf, other]

    eess.SY cs.LG

    MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models

    Authors: Jiaqi Yan, Ankush Chakrabarty, Alisa Rupenyan, John Lygeros

    Abstract: In this paper, we consider the problem of reference tracking in uncertain nonlinear systems. A neural State-Space Model (NSSM) is used to approximate the nonlinear system, where a deep encoder network learns the nonlinearity from data, and a state-space component captures the temporal relationship. This transforms the nonlinear system into a linear system in a latent space, enabling the applicatio…

    Submitted 18 April, 2024; originally announced April 2024.

  25. arXiv:2404.11171  [pdf, other]

    cs.LG cs.AI eess.SP

    Personalized Heart Disease Detection via ECG Digital Twin Generation

    Authors: Yaojun Hu, Jintai Chen, Lianting Hu, Dantong Li, Jiahuan Yan, Haochao Ying, Huiying Liang, Jian Wu

    Abstract: Heart diseases rank among the leading causes of global mortality, demonstrating a crucial need for early diagnosis and intervention. Most traditional electrocardiogram (ECG) based automated diagnosis methods are trained at population level, neglecting the customization of personalized ECGs to enhance individual healthcare management. A potential solution to address this limitation is to employ dig…

    Submitted 11 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  26. A Phone-based Distributed Ambient Temperature Measurement System with An Efficient Label-free Automated Training Strategy

    Authors: Dayin Chen, Xiaodan Shi, Haoran Zhang, Xuan Song, Dongxiao Zhang, Yuntian Chen, Jinyue Yan

    Abstract: Enhancing the energy efficiency of buildings significantly relies on monitoring indoor ambient temperature. The potential limitations of conventional temperature measurement techniques, together with the omnipresence of smartphones, have redirected researchers' attention towards the exploration of phone-based ambient temperature estimation methods. However, existing phone-based methods face challen…

    Submitted 17 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Journal ref: IEEE Transactions on Mobile Computing, 13 May 2024, 1-13

  27. arXiv:2404.07882  [pdf, other]

    cs.AR quant-ph

    On Reducing the Execution Latency of Superconducting Quantum Processors via Quantum Program Scheduling

    Authors: Wenjie Wu, Yiquan Wang, Ge Yan, Yuming Zhao, Junchi Yan

    Abstract: Quantum computing has gained considerable attention, especially after the arrival of the Noisy Intermediate-Scale Quantum (NISQ) era. Quantum processors and cloud services have been made increasingly available worldwide. Unfortunately, programs on existing quantum processors are often executed in series, and the workload could be heavy to the processor. Typically, one has to wait for hours or eve…

    Submitted 11 April, 2024; originally announced April 2024.

  28. arXiv:2404.06119  [pdf, other]

    cs.CV

    DreamView: Injecting View-specific Text Guidance into Text-to-3D Generation

    Authors: Junkai Yan, Yipeng Gao, Qize Yang, Xihan Wei, Xuansong Xie, Ancong Wu, Wei-Shi Zheng

    Abstract: Text-to-3D generation, which synthesizes 3D assets according to an overall text description, has significantly progressed. However, a challenge arises when the specific appearances need customizing at designated viewpoints but referring solely to the overall description for generating 3D objects. For instance, ambiguity easily occurs when producing a T-shirt with distinct patterns on its front and…

    Submitted 9 April, 2024; originally announced April 2024.

  29. arXiv:2403.20002  [pdf, other]

    cs.CV

    Grounding and Enhancing Grid-based Models for Neural Fields

    Authors: Zelin Zhao, Fenglei Fan, Wenlong Liao, Junchi Yan

    Abstract: Many contemporary studies utilize grid-based models for neural field representation, but a systematic analysis of grid-based models is still missing, hindering the improvement of those models. Therefore, this paper introduces a theoretical framework for grid-based models. This framework points out that these models' approximation and generalization behaviors are determined by grid tangent kernels…

    Submitted 6 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: CVPR24 Oral & Best Paper Award Candidate. Pre-rebuttal scores: 555. Post-rebuttal scores: 555

  30. arXiv:2403.14660  [pdf]

    cs.CY cs.AI

    Machina Economicus: A New Paradigm for Prosumers in the Energy Internet of Smart Cities

    Authors: Luyang Hou, Jun Yan, Yuankai Wu, Chun Wang, Tie Qiu

    Abstract: Energy Internet (EI) is emerging as a new sharing-economy platform for flexible local energy supplies in smart cities. Empowered by the Internet-of-Things (IoT) and Artificial Intelligence (AI), EI aims to unlock peer-to-peer energy trading and sharing among prosumers, who can adeptly switch roles between providers and consumers in localized energy markets with rooftop photovoltaic panels, vehicle-to-…

    Submitted 27 February, 2024; originally announced March 2024.

    Comments: 25 pages, 1 figure

  31. arXiv:2403.13846  [pdf, other]

    cs.LG cs.AI

    A Clustering Method with Graph Maximum Decoding Information

    Authors: Xinrun Xu, Manying Lv, Zhanbiao Lian, Yurong Wu, Jin Yan, Shan Jiang, Zhiming Ding

    Abstract: The clustering method based on graph models has garnered increased attention for its widespread applicability across various knowledge domains. Its adaptability to integrate seamlessly with other relevant applications endows the graph model-based clustering analysis with the ability to robustly extract "natural associations" or "graph structures" within datasets, facilitating the modelling of rela…

    Submitted 18 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: 9 pages, 9 figures, IJCNN 2024

  32. arXiv:2403.13647  [pdf, other]

    cs.CV

    Meta-Point Learning and Refining for Category-Agnostic Pose Estimation

    Authors: Junjie Chen, Jiebin Yan, Yuming Fang, Li Niu

    Abstract: Category-agnostic pose estimation (CAPE) aims to predict keypoints for arbitrary classes given a few support images annotated with keypoints. Existing methods only rely on the features extracted at support keypoints to predict or refine the keypoints on the query image, but a few support feature vectors are local and inadequate for CAPE. Considering that humans can quickly perceive potential keypoints…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Published in CVPR 2024

  33. arXiv:2403.13331  [pdf, other]

    cs.CV cs.RO

    AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving

    Authors: Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan

    Abstract: As an essential task in autonomous driving (AD), motion prediction aims to predict the future states of surrounding objects for navigation. One natural solution is to estimate the position of other agents in a step-by-step manner where each predicted time-step is conditioned on both observed time-steps and previously predicted time-steps, i.e., autoregressive prediction. Pioneering works like SocialL…

    Submitted 21 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  34. arXiv:2403.11380  [pdf, other]

    cs.CV

    Boosting Order-Preserving and Transferability for Neural Architecture Search: a Joint Architecture Refined Search and Fine-tuning Approach

    Authors: Beichen Zhang, Xiaoxing Wang, Xiaohan Qin, Junchi Yan

    Abstract: Supernet is a core component in many recent Neural Architecture Search (NAS) methods. It not only helps embody the search space but also provides a (relative) estimation of the final performance of candidate architectures. Thus, it is critical that the top architectures ranked by a supernet should be consistent with those ranked by true performance, which is known as the order-preserving ability.…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  35. arXiv:2403.11203  [pdf, other]

    cs.CL

    TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models

    Authors: Junbing Yan, Chengyu Wang, Taolin Zhang, Xiaofeng He, Jun Huang, Longtao Huang, Hui Xue, Wei Zhang

    Abstract: KEPLMs are pre-trained models that utilize external knowledge to enhance language understanding. Previous language models facilitated knowledge acquisition by incorporating knowledge-related pre-training tasks learned from relation triples in knowledge graphs. However, these models do not prioritize learning embeddings for entity-related tokens. Moreover, updating the entire set of parameters in K…

    Submitted 17 March, 2024; originally announced March 2024.

  36. arXiv:2403.11121  [pdf, other]

    cs.CV

    A Versatile Framework for Multi-scene Person Re-identification

    Authors: Wei-Shi Zheng, Junkai Yan, Yi-Xing Peng

    Abstract: Person Re-identification (ReID) has been extensively developed for a decade in order to learn the association of images of the same person across non-overlapping camera views. To overcome significant variations between images across camera views, mountains of variants of ReID models were developed for solving a number of challenges, such as resolution change, clothing change, occlusion, modality c…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: To appear in TPAMI

  37. arXiv:2403.11090  [pdf, other]

    cs.NI cs.LG

    Brain-on-Switch: Towards Advanced Intelligent Network Data Plane via NN-Driven Traffic Analysis at Line-Speed

    Authors: Jinzhu Yan, Haotian Xu, Zhuotao Liu, Qi Li, Ke Xu, Mingwei Xu, Jianping Wu

    Abstract: The emerging programmable networks sparked significant research on Intelligent Network Data Plane (INDP), which achieves learning-based traffic analysis at line-speed. Prior art in INDP focuses on deploying tree/forest models on the data plane. We observe a fundamental limitation in tree-based INDP approaches: although it is possible to represent even larger tree/forest tables on the data plane, the…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 12 pages body, 22 pages total, 14 figures, accepted by the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI'24)

  38. arXiv:2403.10299  [pdf, other]

    cs.AI

    A Multi-constraint and Multi-objective Allocation Model for Emergency Rescue in IoT Environment

    Authors: Xinrun Xu, Zhanbiao Lian, Yurong Wu, Manying Lv, Zhiming Ding, Jian Yan, Shang Jiang

    Abstract: Emergency relief operations are essential in disaster aftermaths, necessitating effective resource allocation to minimize negative impacts and maximize benefits. In prolonged crises or extensive disasters, a systematic, multi-cycle approach is key for timely and informed decision-making. Leveraging advancements in IoT and spatio-temporal data analytics, we've developed the Multi-Objective Shuffled…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 5 pages, 5 figures, ISCAS 2024

  39. arXiv:2403.10069  [pdf, other]

    cs.CV cs.AI

    Boundary Matters: A Bi-Level Active Finetuning Framework

    Authors: Han Lu, Yichen Xie, Xiaokang Yang, Junchi Yan

    Abstract: The pretraining-finetuning paradigm has gained widespread adoption in vision tasks and other fields, yet it faces the significant challenge of high sample annotation costs. To mitigate this, the concept of active finetuning has emerged, aiming to select the most appropriate samples for model finetuning within a limited budget. Traditional active learning methods often struggle in this setting due…

    Submitted 15 March, 2024; originally announced March 2024.

  40. arXiv:2403.07865  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG cs.SE

    CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion

    Authors: Qibing Ren, Chang Gao, Jing Shao, Junchi Yan, Xin Tan, Wai Lam, Lizhuang Ma

    Abstract: The rapid advancement of Large Language Models (LLMs) has brought about remarkable generative capabilities but also raised concerns about their potential misuse. While strategies like supervised fine-tuning and reinforcement learning from human feedback have enhanced their safety, these methods primarily focus on natural languages, which may not generalize to other domains. This paper introduces C…

    Submitted 9 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: ACL Findings 2024, Code is available at https://github.com/renqibing/CodeAttack

  41. arXiv:2403.07257  [pdf, other]

    cs.AR cs.ET

    The Dawn of AI-Native EDA: Opportunities and Challenges of Large Circuit Models

    Authors: Lei Chen, Yiqi Chen, Zhufei Chu, Wenji Fang, Tsung-Yi Ho, Ru Huang, Yu Huang, Sadaf Khan, Min Li, Xingquan Li, Yu Li, Yun Liang, Jinwei Liu, Yi Liu, Yibo Lin, Guojie Luo, Zhengyuan Shi, Guangyu Sun, Dimitrios Tsaras, Runsheng Wang, Ziyi Wang, Xinming Wei, Zhiyao Xie, Qiang Xu, Chenhao Xue, et al. (14 additional authors not shown)

    Abstract: Within the Electronic Design Automation (EDA) domain, AI-driven solutions have emerged as formidable tools, yet they typically augment rather than redefine existing methodologies. These solutions often repurpose deep learning models from other domains, such as vision, text, and graph analytics, applying them to circuit design without tailoring to the unique complexities of electronic circuits. Suc…

    Submitted 1 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

    Comments: The authors are ordered alphabetically. Contact: qxu@cse[dot]cuhk[dot]edu[dot]hk, gluo@pku[dot]edu[dot]cn, yuan.mingxuan@huawei[dot]com

  42. arXiv:2403.06087  [pdf, other]

    cs.LG eess.IV

    Learning the irreversible progression trajectory of Alzheimer's disease

    Authors: Yipei Wang, Bing He, Shannon Risacher, Andrew Saykin, Jingwen Yan, Xiaoqian Wang

    Abstract: Alzheimer's disease (AD) is a progressive and irreversible brain disorder that unfolds over the course of 30 years. Therefore, it is critical to capture the disease progression in an early stage such that intervention can be applied before the onset of symptoms. Machine learning (ML) models have been shown effective in predicting the onset of AD. Yet for subjects with follow-up visits, existing te…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted by ISBI 2024

  43. Effect of turbulent diffusion in modeling anaerobic digestion

    Authors: Jeremy Z. Yan, Prashant Kumar, Wolfgang Rauch

    Abstract: In this study, the impact of turbulent diffusion on mixing of biochemical reaction models is explored by implementing and validating different models. An original codebase called CHAD (Coupled Hydrodynamics and Anaerobic Digestion) is extended to incorporate turbulent diffusion and validate it against results from OpenFOAM with 2D Rayleigh-Taylor Instability and lid-driven cavity simulations. The…

    Submitted 7 March, 2024; originally announced March 2024.

  44. arXiv:2403.02877  [pdf, other]

    cs.CV cs.AI cs.RO

    ActiveAD: Planning-Oriented Active Learning for End-to-End Autonomous Driving

    Authors: Han Lu, Xiaosong Jia, Yichen Xie, Wenlong Liao, Xiaokang Yang, Junchi Yan

    Abstract: End-to-end differentiable learning for autonomous driving (AD) has recently become a prominent paradigm. One main bottleneck lies in its voracious appetite for high-quality labeled data e.g. 3D bounding boxes and semantic segmentation, which are notoriously expensive to manually annotate. The difficulty is further pronounced due to the prominent fact that the behaviors within samples in AD often s…

    Submitted 5 March, 2024; originally announced March 2024.

  45. arXiv:2403.01841  [pdf, other]

    cs.CL cs.LG

    Making Pre-trained Language Models Great on Tabular Prediction

    Authors: Jiahuan Yan, Bo Zheng, Hongxia Xu, Yiheng Zhu, Danny Z. Chen, Jimeng Sun, Jian Wu, Jintai Chen

    Abstract: The transferability of deep neural networks (DNNs) has made significant progress in image and language processing. However, due to the heterogeneity among tables, such DNN bonus is still far from being well exploited on tabular data prediction (e.g., regression or classification tasks). Condensing knowledge from diverse domains, language models (LMs) possess the capability to comprehend feature na…

    Submitted 12 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to ICLR 2024 as spotlight presentation (Notable Top 5%). OpenReview link is https://openreview.net/forum?id=anzIzGZuLi, codes will be available at https://github.com/jyansir/tp-berta

  46. arXiv:2403.01570  [pdf, other]

    cs.CL cs.LG

    SERVAL: Synergy Learning between Vertical Models and LLMs towards Oracle-Level Zero-shot Medical Prediction

    Authors: Jiahuan Yan, Jintai Chen, Chaowen Hu, Bo Zheng, Yaojun Hu, Jimeng Sun, Jian Wu

    Abstract: Recent development of large language models (LLMs) has exhibited impressive zero-shot proficiency on generic and common sense questions. However, LLMs' application on domain-specific vertical questions still lags behind, primarily due to the hallucination problems and deficiencies in vertical knowledge. Furthermore, the vertical data annotation process often requires labor-intensive expert involveme…

    Submitted 16 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  47. arXiv:2403.00250  [pdf, other]

    cs.CV cs.AI

    Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach

    Authors: Han Lu, Siyu Sun, Yichen Xie, Liqing Zhang, Xiaokang Yang, Junchi Yan

    Abstract: In the long-tailed recognition field, the Decoupled Training paradigm has demonstrated remarkable capabilities among various methods. This paradigm decouples the training process into separate representation learning and classifier re-training. Previous works have attempted to improve both stages simultaneously, making it difficult to isolate the effect of classifier re-training. Furthermore, rece…

    Submitted 29 February, 2024; originally announced March 2024.

  48. arXiv:2403.00012  [pdf, other]

    cs.LG cs.AR

    PreRoutGNN for Timing Prediction with Order Preserving Partition: Global Circuit Pre-training, Local Delay Learning and Attentional Cell Modeling

    Authors: Ruizhe Zhong, Junjie Ye, Zhentao Tang, Shixiong Kai, Mingxuan Yuan, Jianye Hao, Junchi Yan

    Abstract: Pre-routing timing prediction has been recently studied for evaluating the quality of a candidate cell placement in chip design. It involves directly estimating the timing metrics for both pin-level (slack, slew) and edge-level (net delay, cell delay), without time-consuming routing. However, it often suffers from signal decay and error accumulation due to the long timing paths in large-scale indu…

    Submitted 12 March, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

    Comments: 13 pages, 5 figures, The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  49. arXiv:2402.18975  [pdf, other]

    cs.CV cs.AI

    Theoretically Achieving Continuous Representation of Oriented Bounding Boxes

    Authors: Zi-Kai Xiao, Guo-Ye Yang, Xue Yang, Tai-Jiang Mu, Junchi Yan, Shi-min Hu

    Abstract: Considerable efforts have been devoted to Oriented Object Detection (OOD). However, one lasting issue regarding the discontinuity in Oriented Bounding Box (OBB) representation remains unresolved, which is an inherent bottleneck for extant OOD methods. This paper endeavors to completely solve this issue in a theoretically guaranteed manner and puts an end to the ad-hoc efforts in this direction. Pr…

    Submitted 16 April, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: 17 pages, 12 tables, 8 figures. Accepted by CVPR'24. Code: https://github.com/514flowey/JDet-COBB

  50. arXiv:2402.18008  [pdf, other]

    cs.CV

    Fast and Interpretable 2D Homography Decomposition: Similarity-Kernel-Similarity and Affine-Core-Affine Transformations

    Authors: Shen Cai, Zhanhao Wu, Lingxi Guo, Jiachun Wang, Siyu Zhang, Junchi Yan, Shuhan Shen

    Abstract: In this paper, we present two fast and interpretable decomposition methods for 2D homography, which are named Similarity-Kernel-Similarity (SKS) and Affine-Core-Affine (ACA) transformations respectively. Under the minimal $4$-point configuration, the first and the last similarity transformations in SKS are computed by two anchor points on target and source planes, respectively. Then, the other two…

    Submitted 27 February, 2024; originally announced February 2024.