
Showing 1–25 of 25 results for author: Leng, S

Searching in archive cs.
  1. arXiv:2406.12718  [pdf, other]

    cs.CV cs.AI cs.CL

    AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

    Authors: Wenbin An, Feng Tian, Sicong Leng, Jiahao Nie, Haonan Lin, QianYing Wang, Guang Dai, Ping Chen, Shijian Lu

    Abstract: Despite their great success across various multimodal tasks, Large Vision-Language Models (LVLMs) are facing a prevalent problem with object hallucinations, where the generated textual responses are inconsistent with ground-truth objects in the given image. This paper investigates various LVLMs and pinpoints attention deficiency toward discriminative local image features as one root cause of objec…

    Submitted 21 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.07476  [pdf, other]

    cs.CV cs.CL

    VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

    Authors: Zesen Cheng, Sicong Leng, Hang Zhang, Yifei Xin, Xin Li, Guanzheng Chen, Yongxin Zhu, Wenqi Zhang, Ziyang Luo, Deli Zhao, Lidong Bing

    Abstract: In this paper, we present the VideoLLaMA 2, a set of Video Large Language Models (Video-LLMs) designed to enhance spatial-temporal modeling and audio understanding in video and audio-oriented tasks. Building upon its predecessor, VideoLLaMA 2 incorporates a tailor-made Spatial-Temporal Convolution (STC) connector, which effectively captures the intricate spatial and temporal dynamics of video data…

    Submitted 17 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: ZC, SL, HZ, YX, and XL contributed equally to this project

  3. arXiv:2405.00181  [pdf, other]

    cs.CV cs.AI

    Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly

    Authors: Hang Du, Sicheng Zhang, Binzhu Xie, Guoshun Nan, Jiayang Zhang, Junrui Xu, Hangyu Liu, Sicong Leng, Jiangming Liu, Hehe Fan, Dajiu Huang, Jing Feng, Linli Chen, Can Zhang, Xuhuan Li, Hao Zhang, Jianhang Chen, Qimei Cui, Xiaofeng Tao

    Abstract: Video anomaly understanding (VAU) aims to automatically comprehend unusual occurrences in videos, thereby enabling various applications such as traffic surveillance and industrial manufacturing. While existing VAU benchmarks primarily concentrate on anomaly detection and localization, our focus is on more practicality, prompting us to raise the following crucial questions: "what anomaly occurred?"…

    Submitted 6 May, 2024; v1 submitted 30 April, 2024; originally announced May 2024.

    Comments: Accepted in CVPR2024, Codebase: https://github.com/fesvhtr/CUVA

  4. arXiv:2404.00385  [pdf, other]

    cs.CV cs.AI cs.LG

    Constrained Layout Generation with Factor Graphs

    Authors: Mohammed Haroon Dupty, Yanfei Dong, Sicong Leng, Guoji Fu, Yong Liang Goh, Wei Lu, Wee Sun Lee

    Abstract: This paper addresses the challenge of object-centric layout generation under spatial constraints, seen in multiple domains including floorplan design process. The design process typically involves specifying a set of spatial constraints that include object attributes like size and inter-object relations such as relative positioning. Existing works, which typically represent objects as single nodes…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: To be published at IEEE/CVF CVPR 2024

  5. arXiv:2403.05133  [pdf, other]

    cs.IT cs.LG cs.NI

    RIS-empowered Topology Control for Distributed Learning in Urban Air Mobility

    Authors: Kai Xiong, Rui Wang, Supeng Leng, Wenyang Che, Chongwen Huang, Chau Yuen

    Abstract: Urban Air Mobility (UAM) expands vehicles from the ground to the near-ground space, envisioned as a revolution for transportation systems. Comprehensive scene perception is the foundation for autonomous aerial driving. However, UAM encounters the intelligent perception challenge: high perception learning requirements conflict with the limited sensors and computing chips of flying cars. To overcome…

    Submitted 8 March, 2024; originally announced March 2024.

  6. arXiv:2402.05940  [pdf]

    cs.LG cs.AI stat.ME

    Causal Relationship Network of Risk Factors Impacting Workday Loss in Underground Coal Mines

    Authors: Shangsi Ren, Cameron A. Beeche, Zhiyi Shi, Maria Acevedo Garcia, Katherine Zychowski, Shuguang Leng, Pedram Roghanchi, Jiantao Pu

    Abstract: This study aims to establish the causal relationship network between various factors leading to workday loss in underground coal mines using a novel causal artificial intelligence (AI) method. The analysis utilizes data obtained from the National Institute for Occupational Safety and Health (NIOSH). A total of 101,010 injury records from 3,982 unique underground coal mines spanning the years from…

    Submitted 24 January, 2024; originally announced February 2024.

    Comments: 5 figures 5 tables

  7. arXiv:2311.16922  [pdf, other]

    cs.CV cs.AI cs.CL

    Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding

    Authors: Sicong Leng, Hang Zhang, Guanzheng Chen, Xin Li, Shijian Lu, Chunyan Miao, Lidong Bing

    Abstract: Large Vision-Language Models (LVLMs) have advanced considerably, intertwining visual recognition and language understanding to generate content that is not only coherent but also contextually attuned. Despite their success, LVLMs still suffer from the issue of object hallucinations, where models generate plausible yet incorrect outputs that include objects that do not exist in the images. To mitig…

    Submitted 28 November, 2023; originally announced November 2023.

  8. arXiv:2311.15941  [pdf, other]

    cs.CL cs.CV

    Tell2Design: A Dataset for Language-Guided Floor Plan Generation

    Authors: Sicong Leng, Yang Zhou, Mohammed Haroon Dupty, Wee Sun Lee, Sam Conrad Joyce, Wei Lu

    Abstract: We consider the task of generating designs directly from natural language descriptions, and consider floor plan generation as the initial research area. Language conditional generative models have recently been very successful in generating high-quality artistic images. However, designs must satisfy different constraints that are not present in generating artistic images, particularly spatial and…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Paper published in ACL2023; Area Chair Award; Best Paper Nomination

  9. UniG-Encoder: A Universal Feature Encoder for Graph and Hypergraph Node Classification

    Authors: Minhao Zou, Zhongxue Gan, Yutong Wang, Junheng Zhang, Dongyan Sui, Chun Guan, Siyang Leng

    Abstract: Graph and hypergraph representation learning has attracted increasing attention from various research fields. Despite the decent performance and fruitful applications of Graph Neural Networks (GNNs), Hypergraph Neural Networks (HGNNs), and their well-designed variants, on some commonly used benchmark graphs and hypergraphs, they are outperformed by even a simple Multi-Layer Perceptron. This observ…

    Submitted 3 August, 2023; originally announced August 2023.

  10. arXiv:2305.02214  [pdf, other]

    cs.IT cs.RO eess.SY

    A Digital Twin Empowered Lightweight Model Sharing Scheme for Multi-Robot Systems

    Authors: Kai Xiong, Zhihong Wang, Supeng Leng, Jianhua He

    Abstract: Multi-robot system for manufacturing is an Industry Internet of Things (IIoT) paradigm with significant operational cost savings and productivity improvement, where Unmanned Aerial Vehicles (UAVs) are employed to control and implement collaborative productions without human intervention. This mission-critical system relies on 3-Dimension (3-D) scene recognition to improve operation accuracy in the…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: 16 pages, 12 figures, journal

  11. arXiv:2303.09042  [pdf, other]

    cs.LG math.DS

    Embedding Theory of Reservoir Computing and Reducing Reservoir Network Using Time Delays

    Authors: Xing-Yue Duan, Xiong Ying, Si-Yang Leng, Jürgen Kurths, Wei Lin, Huan-Fei Ma

    Abstract: Reservoir computing (RC), a particular form of recurrent neural network, is under explosive development due to its exceptional efficacy and high performance in reconstruction or/and prediction of complex physical systems. However, the mechanism triggering such effective applications of RC is still unclear, awaiting deep and systematic exploration. Here, combining the delayed embedding theory with…

    Submitted 8 May, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

  12. arXiv:2109.05182   

    cs.CL

    Speaker-Oriented Latent Structures for Dialogue-Based Relation Extraction

    Authors: Guoshun Nan, Guoqing Luo, Sicong Leng, Yao Xiao, Wei Lu

    Abstract: Dialogue-based relation extraction (DiaRE) aims to detect the structural information from unstructured utterances in dialogues. Existing relation extraction models may be unsatisfactory under such a conversational setting, due to the entangled logic and information sparsity issues in utterances involving multiple speakers. To this end, we introduce SOLS, a novel model which can explicitly induce s…

    Submitted 30 October, 2021; v1 submitted 11 September, 2021; originally announced September 2021.

    Comments: The experiment part is insufficient, while we are not planning to improve it for now. To avoid potential confusion and to ensure the quality of arxiv papers, we would like to withdraw this submission

  13. arXiv:2108.01885  [pdf, other]

    cs.NI

    Intelligent Sensing Scheduling for Mobile Target Tracking Wireless Sensor Networks

    Authors: Longyu Zhou, Supeng Leng, Qiang Liu, Haoye Chai, Jihua Zhou

    Abstract: Edge computing has emerged as a prospective paradigm to meet ever-increasing computation demands in Mobile Target Tracking Wireless Sensor Networks (MTT-WSN). This paradigm can offload time-sensitive tasks to sink nodes to improve computing efficiency. Nevertheless, it is difficult to execute dynamic and critical tasks in the MTT-WSN network. Besides, the network cannot ensure consecutive tracking…

    Submitted 4 August, 2021; originally announced August 2021.

    Comments: 11 pages, 12 figures

  14. arXiv:2108.01598  [pdf, other]

    cs.NI

    Secure and Efficient Blockchain based Knowledge Sharing for Intelligent Connected Vehicles

    Authors: Haoye Chai, Supeng Leng, Fan Wu, Jianhua He

    Abstract: The emergence of Intelligent Connected Vehicles (ICVs) shows great potential for future intelligent traffic systems, enhancing both traffic safety and road efficiency. However, the ICVs relying on data driven perception and driving models face many challenges, including the lack of comprehensive knowledge to deal with complicated driving context. In this paper, we are motivated to investigate coop…

    Submitted 2 November, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: 12 pages, 13 figures

  15. arXiv:2106.11013  [pdf, other]

    cs.CV cs.CL

    Interventional Video Grounding with Dual Contrastive Learning

    Authors: Guoshun Nan, Rui Qiao, Yao Xiao, Jun Liu, Sicong Leng, Hao Zhang, Wei Lu

    Abstract: Video grounding aims to localize a moment from an untrimmed video for a given textual query. Existing approaches focus more on the alignment of visual and language stimuli with various likelihood-based matching or regression strategies, i.e., P(Y|X). Consequently, these models may suffer from spurious correlations between the language and video features due to the selection bias of the dataset. 1)…

    Submitted 7 July, 2021; v1 submitted 21 June, 2021; originally announced June 2021.

    Comments: Accepted in CVPR 2021

  16. arXiv:2104.14088  [pdf, other]

    cs.DC cs.AI cs.LG

    Connecting AI Learning and Blockchain Mining in 6G Systems

    Authors: Yunkai Wei, Zixian An, Supeng Leng, Kun Yang

    Abstract: The sixth generation (6G) systems are generally recognized to be established on ubiquitous Artificial Intelligence (AI) and distributed ledger such as blockchain. However, the AI training demands tremendous computing resource, which is limited in most 6G devices. Meanwhile, miners in Proof-of-Work (PoW) based blockchains devote massive computing power to block mining, and are widely criticized for…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: 7 pages, 6 figures, submitted to IEEE Communications Magazine

  17. arXiv:2012.13817  [pdf, other]

    cs.AI eess.SP

    Deep Learning Based Intelligent Inter-Vehicle Distance Control for 6G Enabled Cooperative Autonomous Driving

    Authors: Xiaosha Chen, Supeng Leng, Jianhua He, Longyu Zhou

    Abstract: Research on the sixth generation cellular networks (6G) is gaining huge momentum to achieve ubiquitous wireless connectivity. Connected autonomous driving (CAV) is a critical vertical envisioned for 6G, holding great potentials of improving road safety, road and energy efficiency. However the stringent service requirements of CAV applications on reliability, latency and high speed communications w…

    Submitted 26 December, 2020; originally announced December 2020.

  18. arXiv:2012.06992  [pdf, other]

    cs.NI

    Edge Intelligence for Autonomous Driving in 6G Wireless System: Design Challenges and Solutions

    Authors: Bo Yang, Xuelin Cao, Kai Xiong, Chau Yuen, Yong Liang Guan, Supeng Leng, Lijun Qian, Zhu Han

    Abstract: In a level-5 autonomous driving system, the autonomous driving vehicles (AVs) are expected to sense the surroundings via analyzing a large amount of data captured by a variety of onboard sensors in near-real-time. As a result, enormous computing costs will be introduced to the AVs for processing the tasks with the deployed machine learning (ML) model, while the inference accuracy may not be guaran…

    Submitted 13 December, 2020; originally announced December 2020.

  19. arXiv:2006.15875  [pdf, other]

    cs.IT cs.NI

    Communication and Computing Resource Optimization for Connected Autonomous Driving

    Authors: Kai Xiong, Supeng Leng, Xiaosha Chen, Chongwen Huang, Chau Yuen, Yong Liang Guan

    Abstract: Transportation system is facing a sharp disruption since the Connected Autonomous Vehicles (CAVs) can free people from driving and provide good driving experience with the aid of Vehicle-to-Vehicle (V2V) communications. Although CAVs bring benefits in terms of driving safety, vehicle string stability, and road traffic throughput, most existing work aims at improving only one of these performance m…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: 12 pages, 7 figures

  20. arXiv:2006.15855  [pdf, other]

    cs.NI cs.IT

    Intelligent Task Offloading for Heterogeneous V2X Communications

    Authors: Kai Xiong, Supeng Leng, Chongwen Huang, Chau Yuen, Liang Guan

    Abstract: With the rapid development of autonomous driving technologies, it becomes difficult to reconcile the conflict between ever-increasing demands for high process rate in the intelligent automotive tasks and resource-constrained on-board processors. Fortunately, vehicular edge computing (VEC) has been proposed to meet the pressing resource demands. Due to the delay-sensitive traits of automotive tasks…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: 12 pages, 7 figures

  21. arXiv:1605.03678  [pdf, other]

    cs.NI

    Energy-aware Traffic Engineering in Hybrid SDN/IP Backbone Networks

    Authors: Yunkai Wei, Xiaoning Zhang, Lei Xie, Supeng Leng

    Abstract: Software Defined Networking (SDN) can effectively improve the performance of traffic engineering and has promising application foreground in backbone networks. Therefore, new energy saving schemes must take SDN into account, which is extremely important considering the rapidly increasing energy consumption from Telecom and ISP networks. At the same time, the introduction of SDN in a current networ…

    Submitted 12 May, 2016; originally announced May 2016.

    Comments: 8 pages, 7 figures. Accepted by Journal of Communications and Networks

  22. Multi-Objective Resource Allocation in Full-Duplex SWIPT Systems

    Authors: Shiyang Leng, Derrick Wing Kwan Ng, Nikola Zlatanov, Robert Schober

    Abstract: In this paper, we investigate the resource allocation algorithm design for full-duplex simultaneous wireless information and power transfer (FD-SWIPT) systems. The considered system comprises a FD radio base station, multiple single-antenna half-duplex (HD) users, and multiple energy harvesters equipped with multiple antennas. We propose a multi-objective optimization framework to study the trade-…

    Submitted 31 January, 2016; v1 submitted 6 October, 2015; originally announced October 2015.

    Comments: accepted for presentation at the IEEE ICC 2016

  23. arXiv:1509.05959  [pdf, other]

    cs.IT

    Multi-Objective Beamforming for Energy-Efficient SWIPT Systems

    Authors: Shiyang Leng, Derrick Wing Kwan Ng, Nikola Zlatanov, Robert Schober

    Abstract: In this paper, we study the resource allocation algorithm design for energy-efficient simultaneous wireless information and power transfer (SWIPT) systems. The considered system comprises a transmitter, an information receiver, and multiple energy harvesting receivers equipped with multiple antennas. We propose a multi-objective optimization framework to study the trade-off between the maximizatio…

    Submitted 19 September, 2015; originally announced September 2015.

    Comments: accepted, 2016 International Conference on Computing, Networking and Communications, Wireless Communications Symposium

  24. arXiv:1504.02360  [pdf, other]

    cs.IT

    Multi-Objective Power Allocation for Energy Efficient Wireless Information and Power Transfer Systems

    Authors: Shiyang Leng

    Abstract: Simultaneous wireless information and power transfer (SWIPT) provides a promising solution for enabling perpetual wireless networks. As energy efficiency (EE) is an important evaluation of system performance, this thesis studies energy-efficient resource allocation algorithm designs in SWIPT systems. We first investigate the trade-off between the EE for information transmission, the EE for power…

    Submitted 9 April, 2015; originally announced April 2015.

    Comments: Master thesis, Institute for Digital Communications, Germany, http://www.idc.lnt.de/en/forschung/energy-harvesting/

  25. arXiv:1402.5730  [pdf, ps, other]

    cs.IT

    Power Efficient and Secure Multiuser Communication Systems with Wireless Information and Power Transfer

    Authors: Shiyang Leng, Derrick Wing Kwan Ng, Robert Schober

    Abstract: In this paper, we study resource allocation algorithm design for power efficient secure communication with simultaneous wireless information and power transfer (WIPT) in multiuser communication systems. In particular, we focus on power splitting receivers which are able to harvest energy and decode information from the received signals. The considered problem is modeled as an optimization problem…

    Submitted 24 February, 2014; originally announced February 2014.

    Comments: Accepted for presentation at the IEEE International Conference on Communications (ICC), Sydney, Australia, 2014