Showing 1–50 of 617 results for author: Xue, B

Searching in archive cs.
  1. arXiv:2407.01608  [pdf, other]

    cs.LG cs.AI cs.DB cs.HC cs.SE

    Deriva-ML: A Continuous FAIRness Approach to Reproducible Machine Learning Models

    Authors: Zhiwei Li, Carl Kesselman, Mike D'Arch, Michael Pazzani, Benjamin Yizing Xu

    Abstract: Increasingly, artificial intelligence (AI) and machine learning (ML) are used in eScience applications [9]. While these approaches have great potential, the literature has shown that ML-based approaches frequently suffer from results that are either incorrect or unreproducible due to mismanagement or misuse of data used for training and validating the models [12, 15]. Recognition of the necessity…

    Submitted 27 June, 2024; originally announced July 2024.

  2. arXiv:2406.18770  [pdf, other]

    cs.LG

    ADO-LLM: Analog Design Bayesian Optimization with In-Context Learning of Large Language Models

    Authors: Yuxuan Yin, Yu Wang, Boxun Xu, Peng Li

    Abstract: Analog circuit design requires substantial human expertise and involvement, which is a significant roadblock to design productivity. Bayesian Optimization (BO), a popular machine learning based optimization strategy, has been leveraged to automate analog design given its applicability across various circuit topologies and technologies. Traditional BO methods employ black box Gaussian Process surro…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 8 pages, 3 figures

  3. arXiv:2406.17586  [pdf, other]

    cs.RO

    Benchmarking SLAM Algorithms in the Cloud: The SLAM Hive System

    Authors: Xinzhe Liu, Yuanyuan Yang, Bowen Xu, Sören Schwertfeger

    Abstract: Evaluating the performance of Simultaneous Localization and Mapping (SLAM) algorithms is essential for scientists and users of robotic systems alike. But there are a multitude different permutations of possible options of hardware setups and algorithm configurations, as well as different datasets and algorithms, such that it is infeasible to thoroughly compare SLAM systems against the full state o…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2303.11854

  4. arXiv:2406.17325  [pdf, other]

    cs.SE

    AI Tool Use and Adoption in Software Development by Individuals and Organizations: A Grounded Theory Study

    Authors: Ze Shi Li, Nowshin Nawar Arony, Ahmed Musa Awon, Daniela Damian, Bowen Xu

    Abstract: AI assistance tools such as ChatGPT, Copilot, and Gemini have dramatically impacted the nature of software development in recent years. Numerous studies have studied the positive benefits that practitioners have achieved from using these tools in their work. While there is a growing body of knowledge regarding the usability aspects of leveraging AI tools, we still lack concrete details on the issu…

    Submitted 25 June, 2024; originally announced June 2024.

  5. arXiv:2406.16713  [pdf, other]

    cs.RO

    ShanghaiTech Mapping Robot is All You Need: Robot System for Collecting Universal Ground Vehicle Datasets

    Authors: Bowen Xu, Xiting Zhao, Delin Feng, Yuanyuan Yang, Sören Schwertfeger

    Abstract: This paper presents the ShanghaiTech Mapping Robot, a state-of-the-art unmanned ground vehicle (UGV) designed for collecting comprehensive multi-sensor datasets to support research in robotics, computer vision, and autonomous driving. The robot is equipped with a wide array of sensors including RGB cameras, RGB-D cameras, event-based cameras, IR cameras, LiDARs, mmWave radars, IMUs, ultrasonic ran…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Incomplete draft

  6. Residual path integrals for re-rendering

    Authors: Bing Xu, Tzu-Mao Li, Iliyan Georgiev, Trevor Hedstrom, Ravi Ramamoorthi

    Abstract: Conventional rendering techniques are primarily designed and optimized for single-frame rendering. In practical applications, such as scene editing and animation rendering, users frequently encounter scenes where only a small portion is modified between consecutive frames. In this paper, we develop a novel approach to incremental re-rendering of scenes with dynamic objects, where only a small part…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 14 pages, 13 figures

    ACM Class: I.3.0

  7. arXiv:2406.15501  [pdf]

    cs.CR

    Secure Combination of Untrusted Time information Based on Optimized Dempster-Shafer Theory

    Authors: Yang Li, Yujie Luo, Yichen Zhang, Ao Sun, Wei Huang, Shuai Zhang, Tao Zhang, Chuang Zhou, Li Ma, Jie Yang, Mei Wu, Heng Wang, Yan Pan, Yun Shao, Xing Chen, Ziyang Chen, Song Yu, Hong Guo, Bingjie Xu

    Abstract: Secure precision time synchronization is important for applications of Cyber-Physical Systems. However, several attacks, especially the Time Delay Attack (TDA), deteriorates the performance of time synchronization system seriously. Multiple paths scheme is thought as an effective security countermeasure to decrease the influence of TDA. However, the effective secure combination algorithm is still…

    Submitted 19 June, 2024; originally announced June 2024.

  8. arXiv:2406.14565  [pdf, other]

    cs.GR cs.CV

    ReflectanceFusion: Diffusion-based text to SVBRDF Generation

    Authors: Bowen Xue, Giuseppe Claudio Guarnera, Shuang Zhao, Zahra Montazeri

    Abstract: We introduce Reflectance Diffusion, a new neural text-to-texture model capable of generating high-fidelity SVBRDF maps from textual descriptions. Our method leverages a tandem neural approach, consisting of two modules, to accurately model the distribution of spatially varying reflectance as described by text prompts. Initially, we employ a pre-trained stable diffusion 2 model to generate a latent…

    Submitted 25 April, 2024; originally announced June 2024.

  9. arXiv:2406.13162  [pdf, other]

    cs.LG cs.AI q-bio.QM

    AntibodyFlow: Normalizing Flow Model for Designing Antibody Complementarity-Determining Regions

    Authors: Bohao Xu, Yanbo Wang, Wenyu Chen, Shimin Shan

    Abstract: Therapeutic antibodies have been extensively studied in drug discovery and development in the past decades. Antibodies are specialized protective proteins that bind to antigens in a lock-to-key manner. The binding strength/affinity between an antibody and a specific antigen is heavily determined by the complementarity-determining regions (CDRs) on the antibodies. Existing machine learning methods…

    Submitted 18 June, 2024; originally announced June 2024.

  10. arXiv:2406.12793  [pdf, other]

    cs.CL

    ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

    Authors: Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, et al. (32 additional authors not shown)

    Abstract: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained…

    Submitted 18 June, 2024; originally announced June 2024.

  11. arXiv:2406.11619  [pdf, other]

    eess.AS cs.LG

    AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

    Authors: Vahid Ahmadi Kalkhorani, Cheng Yu, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang

    Abstract: Adding visual cues to audio-based speech separation can improve separation performance. This paper introduces AV-CrossNet, an audiovisual (AV) system for speech enhancement, target speaker extraction, and multi-talker speaker separation. AV-CrossNet is extended from the CrossNet architecture, which is a recently proposed network that performs complex spectral mapping for speech separation by lever…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 Figures, and 4 Tables

  12. arXiv:2406.10885  [pdf, other]

    cs.CL

    On the Role of Entity and Event Level Conceptualization in Generalizable Reasoning: A Survey of Tasks, Methods, Applications, and Future Directions

    Authors: Weiqi Wang, Tianqing Fang, Haochen Shi, Baixuan Xu, Wenxuan Ding, Liyu Zhang, Wei Fan, Jiaxin Bai, Haoran Li, Xin Liu, Yangqiu Song

    Abstract: Entity- and event-level conceptualization, as fundamental elements of human cognition, plays a pivotal role in generalizable reasoning. This process involves abstracting specific instances into higher-level concepts and forming abstract knowledge that can be applied in unfamiliar or novel situations, which can enhance models' inferential capabilities and support the effective transfer of knowledge…

    Submitted 16 June, 2024; originally announced June 2024.

  13. arXiv:2406.10744   

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Jose Alvarez, Coert van Gemeren, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, et al. (77 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a…

    Submitted 27 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: The author list and contents need to be verified by all authors

  14. arXiv:2406.10701  [pdf, other]

    cs.CL

    MIND: Multimodal Shopping Intention Distillation from Large Vision-language Models for E-commerce Purchase Understanding

    Authors: Baixuan Xu, Weiqi Wang, Haochen Shi, Wenxuan Ding, Huihao Jing, Tianqing Fang, Jiaxin Bai, Long Chen, Yangqiu Song

    Abstract: Improving user experience and providing personalized search results in E-commerce platforms heavily rely on understanding purchase intention. However, existing methods for acquiring large-scale intentions bank on distilling large language models with human annotation for verification. Such an approach tends to generate product-centric intentions, overlook valuable visual information from product i…

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 8 pages, 5 figures

  15. arXiv:2406.08035  [pdf, other]

    cs.CV cs.AI

    LVBench: An Extreme Long Video Understanding Benchmark

    Authors: Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, Ji Qi, Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

    Abstract: Recent progress in multimodal large language models has markedly enhanced the understanding of short videos (typically under one minute), and several evaluation datasets have emerged accordingly. However, these advancements fall short of meeting the demands of real-world applications such as embodied intelligence for long-term decision-making, in-depth movie reviews and discussions, and live sport…

    Submitted 12 June, 2024; originally announced June 2024.

  16. arXiv:2406.07983  [pdf, other]

    cs.LG

    Meta-Learning Neural Procedural Biases

    Authors: Christian Raymond, Qi Chen, Bing Xue, Mengjie Zhang

    Abstract: The goal of few-shot learning is to generalize and achieve high performance on new unseen learning tasks, where each task has only a limited number of examples available. Gradient-based meta-learning attempts to address this challenging task by learning how to learn new tasks by embedding inductive biases informed by prior learning experiences into the components of the learning algorithm. In this…

    Submitted 12 June, 2024; originally announced June 2024.

  17. arXiv:2406.04508  [pdf, other]

    cs.CV cs.AI cs.LG

    OCCAM: Towards Cost-Efficient and Accuracy-Aware Image Classification Inference

    Authors: Dujian Ding, Bicheng Xu, Laks V. S. Lakshmanan

    Abstract: Image classification is a fundamental building block for a majority of computer vision applications. With the growing popularity and capacity of machine learning models, people can easily access trained image classifiers as a service online or offline. However, model use comes with a cost and classifiers of higher capacity usually incur higher inference costs. To harness the respective strengths o…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Under Review

  18. arXiv:2406.02902  [pdf, other]

    cs.CL

    S$^2$GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment Analysis

    Authors: Bingfeng Chen, Qihan Ouyang, Yongqi Luo, Boyan Xu, Ruichu Cai, Zhifeng Hao

    Abstract: Previous graph-based approaches in Aspect based Sentiment Analysis(ABSA) have demonstrated impressive performance by utilizing graph neural networks and attention mechanisms to learn structures of static dependency trees and dynamic latent trees. However, incorporating both semantic and syntactic information simultaneously within complex global structures can introduce irrelevant contexts and synt…

    Submitted 7 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: ACL2024(main)

  19. arXiv:2406.02728  [pdf]

    cs.HC

    Impacts of Illuminance and Correlated Color Temperature on Cognitive Performance: A VR-Lighting Study

    Authors: Armin Mostafavi, Milica Vujovic, Tong Bill Xu, Michael Hensel

    Abstract: This study contributes to the ongoing exploration of methods to enhance the environmental design, cognitive function, and overall wellbeing, primarily focusing on understanding the modulation of human cognitive performance by artificial lighting conditions. In this investigation, participants (N=35) engaged with two distinct architectural contexts, each featuring five different lighting conditions…

    Submitted 4 June, 2024; originally announced June 2024.

  20. arXiv:2406.00983  [pdf, other]

    cs.CL cs.AI

    Take its Essence, Discard its Dross! Debiasing for Toxic Language Detection via Counterfactual Causal Effect

    Authors: Junyu Lu, Bo Xu, Xiaokun Zhang, Kaiyuan Liu, Dongyu Zhang, Liang Yang, Hongfei Lin

    Abstract: Current methods of toxic language detection (TLD) typically rely on specific tokens to conduct decisions, which makes them suffer from lexical bias, leading to inferior performance and generalization. Lexical bias has both "useful" and "misleading" impacts on understanding toxicity. Unfortunately, instead of distinguishing between these impacts, current debiasing methods typically eliminate them i…

    Submitted 3 June, 2024; originally announced June 2024.

  21. arXiv:2406.00714  [pdf, other]

    cs.CV

    A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving

    Authors: Di Wu, Feng Yang, Benlian Xu, Pan Liao, Bo Liu

    Abstract: With the rapid advancement of autonomous driving technology, there is a growing need for enhanced safety and efficiency in the automatic environmental perception of vehicles during their operation. In modern vehicle setups, cameras and mmWave radar (radar), being the most extensively employed sensors, demonstrate complementary characteristics, inherently rendering them conducive to fusion and faci…

    Submitted 2 June, 2024; originally announced June 2024.

  22. arXiv:2406.00282  [pdf, other]

    cs.CV cs.CR

    Adversarial 3D Virtual Patches using Integrated Gradients

    Authors: Chengzeng You, Zhongyuan Hau, Binbin Xu, Soteris Demetriou

    Abstract: LiDAR sensors are widely used in autonomous vehicles to better perceive the environment. However, prior works have shown that LiDAR signals can be spoofed to hide real objects from 3D object detectors. This study explores the feasibility of reducing the required spoofing area through a novel object-hiding strategy based on virtual patches (VPs). We first manually design VPs (MVPs) and show that VP…

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: IEEE/ACM Workshop on the Internet of Safe Things, May 23rd, 2024

  23. arXiv:2405.18688  [pdf, other]

    cs.LG cs.AI cs.CL

    Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation

    Authors: Fengshuo Bai, Rui Zhao, Hongming Zhang, Sijia Cui, Ying Wen, Yaodong Yang, Bo Xu, Lei Han

    Abstract: Preference-based reinforcement learning (PbRL) has shown impressive capabilities in training agents without reward engineering. However, a notable limitation of PbRL is its dependency on substantial human feedback. This dependency stems from the learning loop, which entails accurate reward learning compounded with value/policy learning, necessitating a considerable number of samples. To boost the…

    Submitted 28 May, 2024; originally announced May 2024.

  24. arXiv:2405.18521  [pdf, other]

    econ.TH cs.GT

    Falsifiable Test Design in Coordination Games

    Authors: Yingkai Li, Boli Xu

    Abstract: A principal can propose a project to an agent, who then decides whether to accept. Their payoffs from launching the project depend on an unknown binary state. The principal can obtain more precise information about the state through a test at no cost, but crucially, it is common knowledge that she can falsify the test result. In the most interesting case where players have conflicted interests, th…

    Submitted 28 May, 2024; originally announced May 2024.

  25. arXiv:2405.17719  [pdf, other]

    cs.CV

    EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions?

    Authors: Boshen Xu, Ziheng Wang, Yang Du, Zhinan Song, Sipeng Zheng, Qin Jin

    Abstract: Egocentric video-language pretraining is a crucial paradigm to advance the learning of egocentric hand-object interactions (EgoHOI). Despite the great success on existing testbeds, these benchmarks focus more on closed-set visual concepts or limited scenarios. Due to the occurrence of diverse EgoHOIs in the real world, we propose an open-vocabulary benchmark named EgoHOIBench to reveal the diminis…

    Submitted 3 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

    Comments: Code: https://github.com/xuboshen/EgoNCEpp

  26. arXiv:2405.16466  [pdf, other]

    cs.NE

    High-Performance Temporal Reversible Spiking Neural Networks with $O(L)$ Training Memory and $O(1)$ Inference Cost

    Authors: JiaKui Hu, Man Yao, Xuerui Qiu, Yuhong Chou, Yuxuan Cai, Ning Qiao, Yonghong Tian, Bo XU, Guoqi Li

    Abstract: Multi-timestep simulation of brain-inspired Spiking Neural Networks (SNNs) boost memory requirements during training and increase inference energy cost. Current training methods cannot simultaneously solve both training and inference dilemmas. This work proposes a novel Temporal Reversible architecture for SNNs (T-RevSNN) to jointly address the training and inference challenges by altering the for…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML2024

  27. Negative as Positive: Enhancing Out-of-distribution Generalization for Graph Contrastive Learning

    Authors: Zixu Wang, Bingbing Xu, Yige Yuan, Huawei Shen, Xueqi Cheng

    Abstract: Graph contrastive learning (GCL), standing as the dominant paradigm in the realm of graph pre-training, has yielded considerable progress. Nonetheless, its capacity for out-of-distribution (OOD) generalization has been relatively underexplored. In this work, we point out that the traditional optimization of InfoNCE in GCL restricts the cross-domain pairs only to be negative samples, which inevitab…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 5 pages, 5 figures, In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '24), July 14-18, 2024, Washington, DC, USA

    ACM Class: I.2

  28. arXiv:2405.15690  [pdf, other]

    cs.SE

    A Case Study of LLM for Automated Vulnerability Repair: Assessing Impact of Reasoning and Patch Validation Feedback

    Authors: Ummay Kulsum, Haotian Zhu, Bowen Xu, Marcelo d'Amorim

    Abstract: Recent work in automated program repair (APR) proposes the use of reasoning and patch validation feedback to reduce the semantic gap between the LLMs and the code under analysis. The idea has been shown to perform well for general APR, but its effectiveness in other particular contexts remains underexplored. In this work, we assess the impact of reasoning and patch validation feedback to LLMs in t…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Code, data and artifacts are available: http://tinyurl.com/vrpilot-artifacts

  29. arXiv:2405.15266  [pdf, other]

    cs.RO

    Conditional Variational Auto Encoder Based Dynamic Motion for Multi-task Imitation Learning

    Authors: Binzhao Xu, Muhayy Ud Din, Irfan Hussain

    Abstract: The dynamic motion primitive-based (DMP) method is an effective method of learning from demonstrations. However, most of the current DMP-based methods focus on learning one task with one module. Although, some deep learning-based frameworks can learn to multi-task at the same time. However, those methods require a large number of training data and have limited generalization of the learned behavio…

    Submitted 24 May, 2024; originally announced May 2024.

  30. arXiv:2405.14504  [pdf, other]

    cs.CV cs.AI

    Enhanced Spatiotemporal Prediction Using Physical-guided And Frequency-enhanced Recurrent Neural Networks

    Authors: Xuanle Zhao, Yue Sun, Tielin Zhang, Bo Xu

    Abstract: Spatiotemporal prediction plays an important role in solving natural problems and processing video frames, especially in weather forecasting and human action recognition. Recent advances attempt to incorporate prior physical knowledge into the deep learning framework to estimate the unknown governing partial differential equations (PDEs), which have shown promising results in spatiotemporal predic…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 11 pages, 8 figures

  31. arXiv:2405.12089  [pdf, other]

    cs.AR

    Using Formal Verification to Evaluate Single Event Upsets in a RISC-V Core

    Authors: Bing Xue, Mark Zwolinski

    Abstract: Reliability has been a major concern in embedded systems. Higher transistor density and lower voltage supply increase the vulnerability of embedded systems to soft errors. A Single Event Upset (SEU), which is also called a soft error, can reverse a bit in a sequential element, resulting in a system failure. Simulation-based fault injection has been widely used to evaluate reliability, as suggested…

    Submitted 20 May, 2024; originally announced May 2024.

  32. arXiv:2405.11155  [pdf, other]

    eess.SY cs.CC

    Inner-approximate Reachability Computation via Zonotopic Boundary Analysis

    Authors: Dejin Ren, Zhen Liang, Chenyu Wu, Jianqiang Ding, Taoran Wu, Bai Xue

    Abstract: Inner-approximate reachability analysis involves calculating subsets of reachable sets, known as inner-approximations. This analysis is crucial in the fields of dynamic systems analysis and control theory as it provides a reliable estimation of the set of states that a system can reach from given initial states at a specific time instant. In this paper, we study the inner-approximate reachability…

    Submitted 21 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: the extended version of the paper accepted by CAV 2024

  33. arXiv:2405.10452  [pdf, other]

    cs.CL cs.LG

    Navigating Public Sentiment in the Circular Economy through Topic Modelling and Hyperparameter Optimisation

    Authors: Junhao Song, Yingfang Yuan, Kaiwen Chang, Bing Xu, Jin Xuan, Wei Pang

    Abstract: To advance the circular economy (CE), it is crucial to gain insights into the evolution of public sentiments, cognitive pathways of the masses concerning circular products and digital technology, and recognise the primary concerns. To achieve this, we collected data related to the CE from diverse platforms including Twitter, Reddit, and The Guardian. This comprehensive data collection spanned acro…

    Submitted 16 May, 2024; originally announced May 2024.

  34. arXiv:2405.10121  [pdf, other]

    cs.CL cs.MM

    Distilling Implicit Multimodal Knowledge into LLMs for Zero-Resource Dialogue Generation

    Authors: Bo Zhang, Hui Ma, Jian Ding, Jian Wang, Bo Xu, Hongfei Lin

    Abstract: Integrating multimodal knowledge into large language models (LLMs) represents a significant advancement in dialogue generation capabilities. However, the effective incorporation of such knowledge in zero-resource scenarios remains a substantial challenge due to the scarcity of diverse, high-quality dialogue datasets. To address this, we propose the Visual Implicit Knowledge Distillation Framework…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Under Review

  35. arXiv:2405.09556  [pdf, other]

    eess.SP cs.AI cs.IT

    Co-learning-aided Multi-modal-deep-learning Framework of Passive DOA Estimators for a Heterogeneous Hybrid Massive MIMO Receiver

    Authors: Jiatong Bai, Feng Shu, Qinghe Zheng, Bo Xu, Baihua Shi, Yiwen Chen, Weibin Zhang, Xianpeng Wang

    Abstract: Due to its excellent performance in rate and resolution, fully-digital (FD) massive multiple-input multiple-output (MIMO) antenna arrays has been widely applied in data transmission and direction of arrival (DOA) measurements, etc. But it confronts with two main challenges: high computational complexity and circuit cost. The two problems may be addressed well by hybrid analog-digital (HAD) structu…

    Submitted 12 June, 2024; v1 submitted 27 April, 2024; originally announced May 2024.

  36. arXiv:2405.06869  [pdf, other]

    cs.LG cs.NE

    Sharpness-Aware Minimization for Evolutionary Feature Construction in Regression

    Authors: Hengzhe Zhang, Qi Chen, Bing Xue, Wolfgang Banzhaf, Mengjie Zhang

    Abstract: In recent years, genetic programming (GP)-based evolutionary feature construction has achieved significant success. However, a primary challenge with evolutionary feature construction is its tendency to overfit the training data, resulting in poor generalization on unseen data. In this research, we draw inspiration from PAC-Bayesian theory and propose using sharpness-aware minimization in function…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence

  37. arXiv:2405.06806  [pdf, other]

    cs.SE

    An Empirical Study on the Effectiveness of Large Language Models for SATD Identification and Classification

    Authors: Mohammad Sadegh Sheikhaei, Yuan Tian, Shaowei Wang, Bowen Xu

    Abstract: Self-Admitted Technical Debt (SATD), a concept highlighting sub-optimal choices in software development documented in code comments or other project resources, poses challenges in the maintainability and evolution of software systems. Large language models (LLMs) have demonstrated significant effectiveness across a broad range of software tasks, especially in software text generation tasks. Noneth…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: This is the preprint version of a paper that has been submitted to Empirical Software Engineering

    ACM Class: D.2; I.2

  38. arXiv:2405.05008  [pdf, other]

    cs.CL

    ADELIE: Aligning Large Language Models on Information Extraction

    Authors: Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li

    Abstract: Large language models (LLMs) usually fall short on information extraction (IE) tasks and struggle to follow the complex instructions of IE tasks. This primarily arises from LLMs not being aligned with humans, as mainstream alignment datasets typically do not include IE data. In this paper, we introduce ADELIE (Aligning large language moDELs on Information Extraction), an aligned LLM that effective…

    Submitted 8 May, 2024; originally announced May 2024.

  39. arXiv:2405.03155  [pdf, other]

    cs.RO

    CushSense: Soft, Stretchable, and Comfortable Tactile-Sensing Skin for Physical Human-Robot Interaction

    Authors: Boxin Xu, Luoyan Zhong, Grace Zhang, Xiaoyu Liang, Diego Virtue, Rishabh Madan, Tapomayukh Bhattacharjee

    Abstract: Whole-arm tactile feedback is crucial for robots to ensure safe physical interaction with their surroundings. This paper introduces CushSense, a fabric-based soft and stretchable tactile-sensing skin designed for physical human-robot interaction (pHRI) tasks such as robotic caregiving. Using stretchable fabric and hyper-elastic polymer, CushSense identifies contacts by monitoring capacitive change…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 8 pages, 8 figures, ICRA2024

  40. arXiv:2405.02828  [pdf, other]

    cs.SE cs.LG

    Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy

    Authors: Aftab Hussain, Md Rafiqul Islam Rabin, Toufique Ahmed, Bowen Xu, Premkumar Devanbu, Mohammad Amin Alipour

    Abstract: Large language models (LLMs) have provided a lot of exciting new capabilities in software development. However, the opaque nature of these models makes them difficult to reason about and inspect. Their opacity gives rise to potential security risks, as adversaries can train and deploy compromised models to disrupt the software development process in the victims' organization. This work presents…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2305.03803

  41. arXiv:2405.00435  [pdf, other]

    cs.HC

    CultiVerse: Towards Cross-Cultural Understanding for Paintings with Large Language Model

    Authors: Wei Zhang, Wong Kam-Kwai, Biying Xu, Yiwen Ren, Yuhuai Li, Minfeng Zhu, Yingchaojie Feng, Wei Chen

    Abstract: The integration of new technology with cultural studies enhances our understanding of cultural heritage but often struggles to connect with diverse audiences. It is challenging to align personal interpretations with the intended meanings across different cultures. Our study investigates the important factors in appreciating art from a cross-cultural perspective. We explore the application of Large…

    Submitted 1 May, 2024; originally announced May 2024.

  42. arXiv:2405.00145  [pdf, other]

    cs.SE cs.CV

    GUing: A Mobile GUI Search Engine using a Vision-Language Model

    Authors: Jialiang Wei, Anne-Lise Courbis, Thomas Lambolais, Binbin Xu, Pierre Louis Bernard, Gérard Dray, Walid Maalej

    Abstract: App developers use the Graphical User Interface (GUI) of other apps as an important source of inspiration to design and improve their own apps. In recent years, research suggested various approaches to retrieve GUI designs that fit a certain text query from screenshot datasets acquired through automated GUI exploration. However, such text-to-GUI retrieval approaches only leverage the textual infor…

    Submitted 30 April, 2024; originally announced May 2024.

  43. arXiv:2404.18410  [pdf, other]

    cs.CL

    Mixture-of-Instructions: Comprehensive Alignment of a Large Language Model through the Mixture of Diverse System Prompting Instructions

    Authors: Bowen Xu, Shaoyu Wu, Kai Liu, Lulu Hu

    Abstract: With the proliferation of large language models (LLMs), the comprehensive alignment of such models across multiple tasks has emerged as a critical area of research. Existing alignment methodologies primarily address single task, such as multi-turn dialogue, coding, mathematical problem-solving, and tool usage. However, AI-driven products that leverage language models usually necessitate a fusion o…

    Submitted 28 April, 2024; originally announced April 2024.

  44. arXiv:2404.17683  [pdf, other]

    math.OC cs.GT cs.LG eess.SY

    Energy Storage Arbitrage in Two-settlement Markets: A Transformer-Based Approach

    Authors: Saud Alghumayjan, Jiajun Han, Ningkun Zheng, Ming Yi, Bolun Xu

    Abstract: This paper presents an integrated model for bidding energy storage in day-ahead and real-time markets to maximize profits. We show that in integrated two-stage bidding, the real-time bids are independent of day-ahead settlements, while the day-ahead bids should be based on predicted real-time prices. We utilize a transformer-based model for real-time price prediction, which captures complex dynami…

    Submitted 26 April, 2024; originally announced April 2024.

  45. arXiv:2404.15067  [pdf, other]

    cs.CL

    Enhancing Textual Personality Detection toward Social Media: Integrating Long-term and Short-term Perspectives

    Authors: Haohao Zhu, Xiaokun Zhang, Junyu Lu, Youlin Wu, Zewen Bai, Changrong Min, Liang Yang, Bo Xu, Dongyu Zhang, Hongfei Lin

    Abstract: Textual personality detection aims to identify personality characteristics by analyzing user-generated content toward social media platforms. Numerous psychological literature highlighted that personality encompasses both long-term stable traits and short-term dynamic states. However, existing studies often concentrate only on either long-term or short-term personality representations, without eff…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 11 pages, 9 figures

  46. FineRec:Exploring Fine-grained Sequential Recommendation

    Authors: Xiaokun Zhang, Bo Xu, Youlin Wu, Yuan Zhong, Hongfei Lin, Fenglong Ma

    Abstract: Sequential recommendation is dedicated to offering items of interest for users based on their history behaviors. The attribute-opinion pairs, expressed by users in their reviews for items, provide the potentials to capture user preferences and item characteristics at a fine-grained level. To this end, we propose a novel framework FineRec that explores the attribute-opinion pairs of reviews to fine…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: This work has been accepted by SIGIR24' as a full paper

  47. Disentangling ID and Modality Effects for Session-based Recommendation

    Authors: Xiaokun Zhang, Bo Xu, Zhaochun Ren, Xiaochen Wang, Hongfei Lin, Fenglong Ma

    Abstract: Session-based recommendation aims to predict intents of anonymous users based on their limited behaviors. Modeling user behaviors involves two distinct rationales: co-occurrence patterns reflected by item IDs, and fine-grained preferences represented by item modalities (e.g., text and images). However, existing methods typically entangle these causes, leading to their failure in achieving accurate…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: This work has been accepted by SIGIR24' as a full paper

  48. arXiv:2404.11070  [pdf]

    cs.CV eess.SP

    Sky-GVIO: an enhanced GNSS/INS/Vision navigation with FCN-based sky-segmentation in urban canyon

    Authors: Jingrong Wang, Bo Xu, Ronghe Jin, Shoujian Zhang, Kefu Gao, Jingnan Liu

    Abstract: Accurate, continuous, and reliable positioning is a critical component of achieving autonomous driving. However, in complex urban canyon environments, the vulnerability of a stand-alone sensor and non-line-of-sight (NLOS) caused by high buildings, trees, and elevated structures seriously affect positioning results. To address these challenges, a sky-view images segmentation algorithm based on Full…

    Submitted 17 April, 2024; originally announced April 2024.

  49. arXiv:2404.10731  [pdf, ps, other]

    cs.AI

    What is Meant by AGI? On the Definition of Artificial General Intelligence

    Authors: Bowen Xu

    Abstract: This paper aims to establish a consensus on AGI's definition. General intelligence refers to the adaptation to open environments according to certain principles using limited resources. It emphasizes that adaptation or learning is an indispensable property of intelligence, and places the controversial part within the principles of intelligence, which can be described from different perspectives.

    Submitted 16 April, 2024; originally announced April 2024.

  50. arXiv:2404.04140  [pdf, other]

    cs.CV cs.LG

    Improving Detection in Aerial Images by Capturing Inter-Object Relationships

    Authors: Botao Ren, Botian Xu, Yifan Pu, Jingyi Wang, Zhidong Deng

    Abstract: In many image domains, the spatial distribution of objects in a scene exhibits meaningful patterns governed by their semantic relationships. In most modern detection pipelines, however, the detection proposals are processed independently, overlooking the underlying relationships between objects. In this work, we introduce a transformer-based approach to capture these inter-object relationships to…

    Submitted 5 April, 2024; originally announced April 2024.