Showing 1–50 of 383 results for author: Xia, F

Searching in archive cs.
  1. arXiv:2407.01887  [pdf, other]

    cs.LG cs.AI cs.CL

    Beyond Numeric Awards: In-Context Dueling Bandits with LLM Agents

    Authors: Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li

    Abstract: In-context decision-making is an important capability of artificial general intelligence, which Large Language Models (LLMs) have effectively demonstrated in various scenarios. However, LLMs often face challenges when dealing with numerical contexts, and limited attention has been paid to evaluating their performance through preference feedback generated by the environment. This paper investigates…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.17739  [pdf, other]

    cs.CL cs.AI

    Find Parent then Label Children: A Two-stage Taxonomy Completion Method with Pre-trained Language Model

    Authors: Fei Xia, Yixuan Weng, Shizhu He, Kang Liu, Jun Zhao

    Abstract: Taxonomies, which organize domain concepts into hierarchical structures, are crucial for building knowledge systems and downstream applications. As domain knowledge evolves, taxonomies need to be continuously updated to include new concepts. Previous approaches have mainly focused on adding concepts to the leaf nodes of the existing hierarchical tree, which does not fully utilize the taxonomy's kn…

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2406.14132  [pdf, other]

    cs.AI

    Enhancing Monotonic Modeling with Spatio-Temporal Adaptive Awareness in Diverse Marketing

    Authors: Bin Li, Jiayan Pei, Feiyang Xiao, Yifan Zhao, Zhixing Zhang, Diwei Liu, HengXu He, Jia Jia

    Abstract: In the mobile internet era, the Online Food Ordering Service (OFOS) emerges as an integral component of inclusive finance owing to the convenience it brings to people. OFOS platforms offer dynamic allocation incentives to users and merchants through diverse marketing campaigns to encourage payments while maintaining the platforms' budget efficiency. Despite significant progress, the marketing doma…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 7 pages

  4. arXiv:2406.13626  [pdf, other]

    cs.CL cs.AI

    Fine-Tuning Gemma-7B for Enhanced Sentiment Analysis of Financial News Headlines

    Authors: Kangtong Mo, Wenyan Liu, Xuanzhen Xu, Chang Yu, Yuelin Zou, Fangqing Xia

    Abstract: In this study, we explore the application of sentiment analysis on financial news headlines to understand investor sentiment. By leveraging Natural Language Processing (NLP) and Large Language Models (LLM), we analyze sentiment from the perspective of retail investors. The FinancialPhraseBank dataset, which contains categorized sentiments of financial news headlines, serves as the basis for our an…

    Submitted 19 June, 2024; originally announced June 2024.

  5. arXiv:2406.12501  [pdf, other]

    cs.IR

    Improving Multi-modal Recommender Systems by Denoising and Aligning Multi-modal Content and User Feedback

    Authors: Guipeng Xv, Xinyu Li, Ruobing Xie, Chen Lin, Chong Liu, Feng Xia, Zhanhui Kang, Leyu Lin

    Abstract: Multi-modal recommender systems (MRSs) are pivotal in diverse online web platforms and have garnered considerable attention in recent years. However, previous studies overlook the challenges of (1) noisy multi-modal content, (2) noisy user feedback, and (3) aligning multi-modal content with user feedback. In order to tackle these challenges, we propose Denoising and Aligning Multi-modal Recommende…

    Submitted 18 June, 2024; originally announced June 2024.

  6. arXiv:2406.11138  [pdf, other]

    cs.CV cs.AI

    Diffusion Models in Low-Level Vision: A Survey

    Authors: Chunming He, Yuqi Shen, Chengyu Fang, Fengyang Xiao, Longxiang Tang, Yulun Zhang, Wangmeng Zuo, Zhenhua Guo, Xiu Li

    Abstract: Deep generative models have garnered significant attention in low-level vision tasks due to their generative capabilities. Among them, diffusion model-based solutions, characterized by a forward diffusion process and a reverse denoising process, have emerged as widely acclaimed for their ability to produce samples of superior quality and diversity. This ensures the generation of visually compellin…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 20 pages, 23 figures, 4 tables

  7. arXiv:2406.07966  [pdf, other]

    cs.CV

    Real-world Image Dehazing with Coherence-based Label Generator and Cooperative Unfolding Network

    Authors: Chengyu Fang, Chunming He, Fengyang Xiao, Yulun Zhang, Longxiang Tang, Yuelin Zhang, Kai Li, Xiu Li

    Abstract: Real-world Image Dehazing (RID) aims to alleviate haze-induced degradation in real-world settings. This task remains challenging due to the complexities in accurately modeling real haze distributions and the scarcity of paired real-world data. To address these challenges, we first introduce a cooperative unfolding network that jointly models atmospheric scattering and image scenes, effectively int…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 10 pages, 7 figures, 6 tables

  8. arXiv:2406.06618  [pdf, other]

    cs.SI cs.AI cs.CY cs.LG physics.soc-ph

    PANDORA: Deep graph learning based COVID-19 infection risk level forecasting

    Authors: Shuo Yu, Feng Xia, Yueru Wang, Shihao Li, Falih Febrinanto, Madhu Chetty

    Abstract: COVID-19 as a global pandemic causes a massive disruption to social stability that threatens human life and the economy. Policymakers and all elements of society must deliver measurable actions based on the pandemic's severity to minimize the detrimental impact of COVID-19. A proper forecasting system is arguably important to provide an early signal of the risk of COVID-19 infection so that the au…

    Submitted 7 June, 2024; originally announced June 2024.

  9. arXiv:2406.06617  [pdf, other]

    cs.SI cs.LG

    Collaborative Team Recognition: A Core Plus Extension Structure

    Authors: Shuo Yu, Fayez Alqahtani, Amr Tolba, Ivan Lee, Tao Jia, Feng Xia

    Abstract: Scientific collaboration is a significant behavior in knowledge creation and idea exchange. To tackle large and complex research questions, a trend of team formation has been observed in recent decades. In this study, we focus on recognizing collaborative teams and exploring inner patterns using scholarly big graph data. We propose a collaborative team recognition (CORE) model with a "core + exten…

    Submitted 7 June, 2024; originally announced June 2024.

  10. arXiv:2406.04702  [pdf, other]

    cs.LG

    Marking the Pace: A Blockchain-Enhanced Privacy-Traceable Strategy for Federated Recommender Systems

    Authors: Zhen Cai, Tao Tang, Shuo Yu, Yunpeng Xiao, Feng Xia

    Abstract: Federated recommender systems have been crucially enhanced through data sharing and continuous model updates, attributed to the pervasive connectivity and distributed computing capabilities of Internet of Things (IoT) devices. Given the sensitivity of IoT data, transparent data processing in data sharing and model updates is paramount. However, existing methods fall short in tracing the flow of sh…

    Submitted 7 June, 2024; originally announced June 2024.

  11. arXiv:2406.04690  [pdf, other]

    cs.LG stat.ML

    Higher-order Structure Based Anomaly Detection on Attributed Networks

    Authors: Xu Yuan, Na Zhou, Shuo Yu, Huafei Huang, Zhikui Chen, Feng Xia

    Abstract: Anomaly detection (such as telecom fraud detection and medical image detection) has attracted the increasing attention of people. The complex interaction between multiple entities widely exists in the network, which can reflect specific human behavior patterns. Such patterns can be modeled by higher-order network structures, thus benefiting anomaly detection on attributed networks. However, due to…

    Submitted 7 June, 2024; originally announced June 2024.

  12. arXiv:2405.17034  [pdf, other]

    cs.LG cs.AI

    FUGNN: Harmonizing Fairness and Utility in Graph Neural Networks

    Authors: Renqiang Luo, Huafei Huang, Shuo Yu, Zhuoyang Han, Estrid He, Xiuzhen Zhang, Feng Xia

    Abstract: Fairness-aware Graph Neural Networks (GNNs) often face a challenging trade-off, where prioritizing fairness may require compromising utility. In this work, we re-examine fairness through the lens of spectral graph theory, aiming to reconcile fairness and utility within the framework of spectral graph learning. We explore the correlation between sensitive features and spectrum in GNNs, using theore…

    Submitted 27 May, 2024; originally announced May 2024.

  13. arXiv:2405.16021  [pdf, other]

    cs.RO

    VADER: Visual Affordance Detection and Error Recovery for Multi Robot Human Collaboration

    Authors: Michael Ahn, Montserrat Gonzalez Arenas, Matthew Bennice, Noah Brown, Christine Chan, Byron David, Anthony Francis, Gavin Gonzalez, Rainer Hessmer, Tomas Jackson, Nikhil J Joshi, Daniel Lam, Tsang-Wei Edward Lee, Alex Luong, Sharath Maddineni, Harsh Patel, Jodilyn Peralta, Jornell Quiambao, Diego Reyes, Rosario M Jauregui Ruano, Dorsa Sadigh, Pannag Sanketi, Leila Takayama, Pavel Vodenski, Fei Xia

    Abstract: Robots today can exploit the rich world knowledge of large language models to chain simple behavioral skills into long-horizon tasks. However, robots often get interrupted during long-horizon tasks due to primitive skill failures and dynamic environments. We propose VADER, a plan, execute, detect framework with seeking help as a new skill that enables robots to recover and complete long-horizon ta…

    Submitted 30 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  14. arXiv:2405.14156  [pdf, other]

    cs.CV

    Unveiling the Tapestry of Consistency in Large Vision-Language Models

    Authors: Yuan Zhang, Fei Xiao, Tao Huang, Chun-Kai Fan, Hongyuan Dong, Jiawen Li, Jiacong Wang, Kuan Cheng, Shanghang Zhang, Haoyuan Guo

    Abstract: Large vision-language models (LVLMs) have recently achieved rapid progress, exhibiting great perception and reasoning abilities concerning visual information. However, when faced with prompts in different sizes of solution spaces, LVLMs fail to always give consistent answers regarding the same knowledge point. This inconsistency of answers between different solution spaces is prevalent in LVLMs an…

    Submitted 7 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: This project is available at https://github.com/foundation-multimodal-models/ConBench

  15. arXiv:2405.09543  [pdf, other]

    cs.CY cs.AI cs.IR cs.LG

    Algorithmic Fairness: A Tolerance Perspective

    Authors: Renqiang Luo, Tao Tang, Feng Xia, Jiaying Liu, Chengpei Xu, Leo Yu Zhang, Wei Xiang, Chengqi Zhang

    Abstract: Recent advancements in machine learning and deep learning have brought algorithmic fairness into sharp focus, illuminating concerns over discriminatory decision making that negatively impacts certain individuals or groups. These concerns have manifested in legal, ethical, and societal challenges, including the erosion of trust in intelligent systems. In response, this survey delves into the existi…

    Submitted 26 April, 2024; originally announced May 2024.

    Comments: 33 pages, 4 figures

    MSC Class: 68T01; 68W40 ACM Class: I.2.6; K.4.2; H.1.2

  16. arXiv:2405.04101  [pdf, other]

    cs.LG cs.AI

    Continual Learning in the Presence of Repetition

    Authors: Hamed Hemati, Lorenzo Pellegrini, Xiaotian Duan, Zixuan Zhao, Fangfang Xia, Marc Masana, Benedikt Tscheschner, Eduardo Veas, Yuxiang Zheng, Shiji Zhao, Shao-Yuan Li, Sheng-Jun Huang, Vincenzo Lomonaco, Gido M. van de Ven

    Abstract: Continual learning (CL) provides a framework for training models in ever-evolving environments. Although re-occurrence of previously seen objects or tasks is common in real-world problems, the concept of repetition in the data stream is not often considered in standard benchmarks for CL. Unlike with the rehearsal mechanism in buffer-based strategies, where sample repetition is controlled by the st…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Preprint; Challenge Report of the 4th Workshop on Continual Learning in Computer Vision at CVPR

  17. arXiv:2405.04029  [pdf, other]

    cs.CR

    Enabling Privacy-Preserving and Publicly Auditable Federated Learning

    Authors: Huang Zeng, Anjia Yang, Jian Weng, Min-Rong Chen, Fengjun Xiao, Yi Liu, Ye Yao

    Abstract: Federated learning (FL) has attracted widespread attention because it supports the joint training of models by multiple participants without moving private dataset. However, there are still many security issues in FL that deserve discussion. In this paper, we consider three major issues: 1) how to ensure that the training process can be publicly audited by any third party; 2) how to avoid the infl…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: ICC 2024 - 2024 IEEE International Conference on Communications Conference Program

    ACM Class: C.2.2; C.2.4; E.3

  18. arXiv:2405.01882  [pdf, other]

    cs.RO cs.AI eess.SP

    Millimeter Wave Radar-based Human Activity Recognition for Healthcare Monitoring Robot

    Authors: Zhanzhong Gu, Xiangjian He, Gengfa Fang, Chengpei Xu, Feng Xia, Wenjing Jia

    Abstract: Healthcare monitoring is crucial, especially for the daily care of elderly individuals living alone. It can detect dangerous occurrences, such as falls, and provide timely alerts to save lives. Non-invasive millimeter wave (mmWave) radar-based healthcare monitoring systems using advanced human activity recognition (HAR) models have recently gained significant attention. However, they encounter cha…

    Submitted 3 May, 2024; originally announced May 2024.

  19. arXiv:2405.00266  [pdf, other]

    cs.NI

    Robot-As-A-Sensor: Forming a Sensing Network with Robots for Underground Mining Missions

    Authors: Xiaoyu Ai, Chengpei Xu, Binghao Li, Feng Xia

    Abstract: Nowadays, robots are deployed as mobile platforms equipped with sensing, communication and computing capabilities, especially in the mining industry, where they perform tasks in hazardous and repetitive environments. Despite their potential, individual robots face significant limitations when completing complex tasks that require the collaboration of multiple robots. This collaboration requires a…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: Submitted to Special Issue on Neuro-Inspired Learning for Robotics for IEEE Transactions on Cognitive and Developmental Systems

  20. arXiv:2404.17169  [pdf, other]

    cs.LG cs.CY

    FairGT: A Fairness-aware Graph Transformer

    Authors: Renqiang Luo, Huafei Huang, Shuo Yu, Xiuzhen Zhang, Feng Xia

    Abstract: The design of Graph Transformers (GTs) generally neglects considerations for fairness, resulting in biased outcomes against certain sensitive subgroups. Since GTs encode graph information without relying on message-passing mechanisms, conventional fairness-aware graph learning methods cannot be directly applicable to address these issues. To tackle this challenge, we propose FairGT, a Fairness-awa…

    Submitted 26 April, 2024; originally announced April 2024.

    Journal ref: IJCAI2024

  21. arXiv:2404.08965  [pdf, other]

    cs.CV cs.MM

    Seeing Text in the Dark: Algorithm and Benchmark

    Authors: Chengpei Xu, Hao Fu, Long Ma, Wenjing Jia, Chengqi Zhang, Feng Xia, Xiaoyu Ai, Binghao Li, Wenjie Zhang

    Abstract: Localizing text in low-light environments is challenging due to visual degradations. Although a straightforward solution involves a two-stage pipeline with low-light image enhancement (LLE) as the initial step followed by detector, LLE is primarily designed for human vision instead of machine and can accumulate errors. In this work, we propose an efficient and effective single-stage approach for l…

    Submitted 23 April, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

  22. arXiv:2404.06645  [pdf, other]

    cs.RO cs.AI

    GenCHiP: Generating Robot Policy Code for High-Precision and Contact-Rich Manipulation Tasks

    Authors: Kaylee Burns, Ajinkya Jain, Keegan Go, Fei Xia, Michael Stark, Stefan Schaal, Karol Hausman

    Abstract: Large Language Models (LLMs) have been successful at generating robot policy code, but so far these results have been limited to high-level tasks that do not require precise movement. It is an open question how well such approaches work for tasks that require reasoning over contact forces and working within tight success tolerances. We find that, with the right action space, LLMs are capable of su…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 14 pages, 12 figures

    ACM Class: I.2.9

  23. arXiv:2404.00826  [pdf, other]

    cs.CL

    Extracting Social Determinants of Health from Pediatric Patient Notes Using Large Language Models: Novel Corpus and Methods

    Authors: Yujuan Fu, Giridhar Kaushik Ramachandran, Nicholas J Dobbins, Namu Park, Michael Leu, Abby R. Rosenberg, Kevin Lybarger, Fei Xia, Ozlem Uzuner, Meliha Yetisgen

    Abstract: Social determinants of health (SDoH) play a critical role in shaping health outcomes, particularly in pediatric populations where interventions can have long-term implications. SDoH are frequently studied in the Electronic Health Record (EHR), which provides a rich repository for diverse patient data. In this work, we present a novel annotated corpus, the Pediatric Social History Annotation Corpus…

    Submitted 4 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: 12 pages, 2 figures and 3 tables. Accepted by LREC-COLING 2024

  24. arXiv:2403.16519  [pdf, ps, other]

    cs.SC

    Two Algorithms for Computing Rational Univariate Representations of Zero-Dimensional Ideals with Parameters

    Authors: Dingkang Wang, Jingjing Wei, Fanghui Xiao, Xiaopeng Zheng

    Abstract: Two algorithms for computing the rational univariate representation of zero-dimensional ideals with parameters are presented in the paper. Different from the rational univariate representation of zero-dimensional ideals without parameters, the number of zeros of zero-dimensional ideals with parameters under various specializations is different, which leads to choosing and checking the separating e…

    Submitted 25 March, 2024; originally announced March 2024.

  25. arXiv:2403.15637  [pdf, other]

    cs.RO

    CoNVOI: Context-aware Navigation using Vision Language Models in Outdoor and Indoor Environments

    Authors: Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, Mohamed Elnoor, Anuj Zore, Brian Ichter, Fei Xia, Jie Tan, Wenhao Yu, Dinesh Manocha

    Abstract: We present ConVOI, a novel method for autonomous robot navigation in real-world indoor and outdoor environments using Vision Language Models (VLMs). We employ VLMs in two ways: first, we leverage their zero-shot image classification capability to identify the context or scenario (e.g., indoor corridor, outdoor terrain, crosswalk, etc) of the robot's surroundings, and formulate context-based naviga…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 9 pages, 4 figures

  26. arXiv:2403.11806  [pdf, other]

    cs.IT eess.SP

    Fluid Antenna for Mobile Edge Computing

    Authors: Yiping Zuo, Jiajia Guo, Biyun Sheng, Chen Dai, Fu Xiao, Shi Jin

    Abstract: In the evolving environment of mobile edge computing (MEC), optimizing system performance to meet the growing demand for low-latency computing services is a top priority. Integrating fluidic antenna (FA) technology into MEC networks provides a new approach to address this challenge. This letter proposes an FA-enabled MEC scheme that aims to minimize the total system delay by leveraging the mobilit…

    Submitted 18 March, 2024; originally announced March 2024.

  27. arXiv:2403.10815  [pdf, other]

    eess.IV cs.CV

    MicroDiffusion: Implicit Representation-Guided Diffusion for 3D Reconstruction from Limited 2D Microscopy Projections

    Authors: Mude Hui, Zihao Wei, Hongru Zhu, Fei Xia, Yuyin Zhou

    Abstract: Volumetric optical microscopy using non-diffracting beams enables rapid imaging of 3D volumes by projecting them axially to 2D images but lacks crucial depth information. Addressing this, we introduce MicroDiffusion, a pioneering tool facilitating high-quality, depth-resolved 3D volume reconstruction from limited 2D projections. While existing Implicit Neural Representation (INR) models often yiel…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  28. arXiv:2403.09227  [pdf, other]

    cs.RO cs.AI

    BEHAVIOR-1K: A Human-Centered, Embodied AI Benchmark with 1,000 Everyday Activities and Realistic Simulation

    Authors: Chengshu Li, Ruohan Zhang, Josiah Wong, Cem Gokmen, Sanjana Srivastava, Roberto Martín-Martín, Chen Wang, Gabrael Levine, Wensi Ai, Benjamin Martinez, Hang Yin, Michael Lingelbach, Minjune Hwang, Ayano Hiranaka, Sujay Garlanka, Arman Aydin, Sharon Lee, Jiankai Sun, Mona Anvari, Manasi Sharma, Dhruva Bansal, Samuel Hunter, Kyu-Young Kim, Alan Lou, Caleb R Matthews , et al. (10 additional authors not shown)

    Abstract: We present BEHAVIOR-1K, a comprehensive simulation benchmark for human-centered robotics. BEHAVIOR-1K includes two components, guided and motivated by the results of an extensive survey on "what do you want robots to do for you?". The first is the definition of 1,000 everyday activities, grounded in 50 scenes (houses, gardens, restaurants, offices, etc.) with more than 9,000 objects annotated with…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: A preliminary version was published at 6th Conference on Robot Learning (CoRL 2022)

  29. arXiv:2403.08310  [pdf, other]

    cs.CV

    StyleDyRF: Zero-shot 4D Style Transfer for Dynamic Neural Radiance Fields

    Authors: Hongbin Xu, Weitao Chen, Feng Xiao, Baigui Sun, Wenxiong Kang

    Abstract: 4D style transfer aims at transferring arbitrary visual style to the synthesized novel views of a dynamic 4D scene with varying viewpoints and times. Existing efforts on 3D style transfer can effectively combine the visual features of style images and neural radiance fields (NeRF) but fail to handle the 4D dynamic scenes limited by the static scene assumption. Consequently, we aim to handle the no…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: In submission. The code and model are released at: https://github.com/ToughStoneX/StyleDyRF

  30. arXiv:2403.08182  [pdf, other]

    cs.CV

    SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention

    Authors: Feng Xiao, Hongbin Xu, Qiuxia Wu, Wenxiong Kang

    Abstract: 3D visual grounding aims to automatically locate the 3D region of the specified object given the corresponding textual description. Existing works fail to distinguish similar objects especially when multiple referred objects are involved in the description. Experiments show that direct matching of language and visual modal has limited capacity to comprehend complex referential relationships in utt…

    Submitted 12 March, 2024; originally announced March 2024.

  31. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  32. arXiv:2402.17489  [pdf, other]

    cs.AR

    SSRESF: Sensitivity-aware Single-particle Radiation Effects Simulation Framework in SoC Platforms based on SVM Algorithm

    Authors: Meng Liu, Shuai Li, Fei Xiao, Ruijie Wang, Chunxue Liu, Liang Wang

    Abstract: The ever-expanding scale of integrated circuits has brought about a significant rise in the design risks associated with radiation-resistant integrated circuit chips. Traditional single-particle experimental methods, with their iterative design approach, are increasingly ill-suited for the challenges posed by large-scale integrated circuits. In response, this article introduces a novel sensitivity…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted to the 61st ACM/IEEE Design Automation Conference (DAC 2024)

  33. arXiv:2402.14254  [pdf, other]

    cs.LG stat.ML

    A hierarchical decomposition for explaining ML performance discrepancies

    Authors: Jean Feng, Harvineet Singh, Fan Xia, Adarsh Subbaswamy, Alexej Gossmann

    Abstract: Machine learning (ML) algorithms can often differ in performance across domains. Understanding $\textit{why}$ their performance differs is crucial for determining what types of interventions (e.g., algorithmic or operational) are most effective at closing the performance gaps. Existing methods focus on $\textit{aggregate decompositions}$ of the total performance gap into the impact of a shift in t…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 11 pages, 5 figures in main body; 14 pages and 2 figures in appendices

  34. arXiv:2402.11450  [pdf, other]

    cs.RO

    Learning to Learn Faster from Human Feedback with Language Model Predictive Control

    Authors: Jacky Liang, Fei Xia, Wenhao Yu, Andy Zeng, Montserrat Gonzalez Arenas, Maria Attarian, Maria Bauza, Matthew Bennice, Alex Bewley, Adil Dostmohamed, Chuyuan Kelly Fu, Nimrod Gileadi, Marissa Giustina, Keerthana Gopalakrishnan, Leonard Hasenclever, Jan Humplik, Jasmine Hsu, Nikhil Joshi, Ben Jyenis, Chase Kew, Sean Kirmani, Tsang-Wei Edward Lee, Kuang-Huei Lee, Assaf Hurwitz Michaely, Joss Moore , et al. (25 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to exhibit a wide range of capabilities, such as writing robot code from language commands -- enabling non-experts to direct robot behaviors, modify them based on feedback, or compose them to perform new tasks. However, these capabilities (driven by in-context learning) are limited to short-term interactions, where users' feedback remains relevant for o…

    Submitted 31 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  35. arXiv:2402.07872  [pdf, other]

    cs.RO cs.CL cs.CV cs.LG

    PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

    Authors: Soroush Nasiriany, Fei Xia, Wenhao Yu, Ted Xiao, Jacky Liang, Ishita Dasgupta, Annie Xie, Danny Driess, Ayzaan Wahid, Zhuo Xu, Quan Vuong, Tingnan Zhang, Tsang-Wei Edward Lee, Kuang-Huei Lee, Peng Xu, Sean Kirmani, Yuke Zhu, Andy Zeng, Karol Hausman, Nicolas Heess, Chelsea Finn, Sergey Levine, Brian Ichter

    Abstract: Vision language models (VLMs) have shown impressive capabilities across a variety of tasks, from logical reasoning to visual understanding. This opens the door to richer interaction with the world, for example robotic control. However, VLMs produce only textual outputs, while robotic control and other spatial tasks require outputting continuous coordinates, actions, or trajectories. How can we ena…

    Submitted 12 February, 2024; originally announced February 2024.

  36. arXiv:2402.06107  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG

    Multiple Instance Learning for Cheating Detection and Localization in Online Examinations

    Authors: Yemeng Liu, Jing Ren, Jianshuo Xu, Xiaomei Bai, Roopdeep Kaur, Feng Xia

    Abstract: The spread of the Coronavirus disease-2019 epidemic has caused many courses and exams to be conducted online. The cheating behavior detection model in examination invigilation systems plays a pivotal role in guaranteeing the equality of long-distance examinations. However, cheating behavior is rare, and most researchers do not comprehensively take into account features such as head posture, gaze a…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 12 pages, 7 figures

    MSC Class: 68T40; 68T45 ACM Class: I.2.10; I.5.4

    Journal ref: IEEE Transactions on Cognitive and Developmental Systems 2024

  37. arXiv:2402.05322  [pdf, other]

    cs.LG cs.AI cs.GR cs.SI

    Learning on Multimodal Graphs: A Survey

    Authors: Ciyuan Peng, Jiayuan He, Feng Xia

    Abstract: Multimodal data pervades various domains, including healthcare, social media, and transportation, where multimodal graphs play a pivotal role. Machine learning on multimodal graphs, referred to as multimodal graph learning (MGL), is essential for successful artificial intelligence (AI) applications. The burgeoning research in this field encompasses diverse graph data types and modalities, learning…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 9 pages, 1 figure

  38. arXiv:2402.04031  [pdf]

    cs.CV cs.LG

    Polyp-DDPM: Diffusion-Based Semantic Polyp Synthesis for Enhanced Segmentation

    Authors: Zolnamar Dorjsembe, Hsing-Kuo Pao, Furen Xiao

    Abstract: This study introduces Polyp-DDPM, a diffusion-based method for generating realistic images of polyps conditioned on masks, aimed at enhancing the segmentation of gastrointestinal (GI) tract polyps. Our approach addresses the challenges of data limitations, high annotation costs, and privacy concerns associated with medical images. By conditioning the diffusion model on segmentation masks-binary ma…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  39. Digital Twin Mobility Profiling: A Spatio-Temporal Graph Learning Approach

    Authors: Xin Chen, Mingliang Hou, Tao Tang, Achhardeep Kaur, Feng Xia

    Abstract: With the arrival of the big data era, mobility profiling has become a viable method of utilizing enormous amounts of mobility data to create an intelligent transportation system. Mobility profiling can extract potential patterns in urban traffic from mobility data and is critical for a variety of traffic-related applications. However, due to the high level of complexity and the huge amount of data…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 10 pages, 7 figures

    MSC Class: 68T09; 68T30; 68U35 ACM Class: I.2.6; I.2.4; H.1.2

    Journal ref: The 7th IEEE International Conference on Data Science and Systems (DSS), Dec 20 - 22, 2021, Haikou, China

  40. arXiv:2402.03732  [pdf, other]

    cs.AI cs.CL cs.DL cs.LG

    Deep Outdated Fact Detection in Knowledge Graphs

    Authors: Huiling Tu, Shuo Yu, Vidya Saikrishna, Feng Xia, Karin Verspoor

    Abstract: Knowledge graphs (KGs) have garnered significant attention for their vast potential across diverse domains. However, the issue of outdated facts poses a challenge to KGs, affecting their overall quality as real-world information evolves. Existing solutions for outdated fact detection often rely on manual recognition. In response, this paper presents DEAN (Deep outdatEd fAct detectioN), a novel dee…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 10 pages, 6 figures

    MSC Class: 68T09; 68T30; 68P20 ACM Class: I.2.6; I.2.4; H.3.7; H.3.3

    Journal ref: 2023 IEEE International Conference on Data Mining Workshops (ICDMW), December 1-4, 2023, Shanghai, China

  41. Generative Expressive Robot Behaviors using Large Language Models

    Authors: Karthik Mahadevan, Jonathan Chien, Noah Brown, Zhuo Xu, Carolina Parada, Fei Xia, Andy Zeng, Leila Takayama, Dorsa Sadigh

    Abstract: People employ expressive behaviors to effectively communicate and coordinate their actions with others, such as nodding to acknowledge a person glancing at them or saying "excuse me" to pass people in a busy corridor. We would like robots to also demonstrate expressive behaviors in human-robot interaction. Prior work proposes rule-based methods that struggle to scale to new communication modalitie…

    Submitted 30 January, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

  42. arXiv:2401.12963  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

    Authors: Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Sean Kirmani, Isabel Leal, Edward Lee, Sergey Levine, Yao Lu, Isabel Leal, Sharath Maddineni, Kanishka Rao, Dorsa Sadigh, Pannag Sanketi, Pierre Sermanet, Quan Vuong, Stefan Welker, Fei Xia, Ted Xiao, et al. (3 additional authors not shown)

    Abstract: Foundation models that incorporate language, vision, and more recently actions have revolutionized the ability to harness internet scale data to reason about useful tasks. However, one of the key challenges of training embodied foundation models is the lack of data grounded in the physical world. In this paper, we propose AutoRT, a system that leverages existing foundation models to scale up the d…

    Submitted 1 July, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: 26 pages, 9 figures, ICRA 2024 VLMNM Workshop

  43. arXiv:2401.12486  [pdf, ps, other]

    cs.IT

    Quaternary codes and their binary images

    Authors: Yansheng Wu, Chao Li, Lin Zhang, Fu Xiao

    Abstract: Recently, simplicial complexes have been used in constructions of several infinite families of minimal and optimal linear codes by Hyun et al. Building upon their research, in this paper more linear codes over the ring $\mathbb{Z}_4$ are constructed by simplicial complexes. Specifically, the Lee weight distributions of the resulting quaternary codes are determined and two infinite families of four…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 21 pages

  44. arXiv:2401.12168  [pdf, other]

    cs.CV cs.CL cs.LG cs.RO

    SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

    Authors: Boyuan Chen, Zhuo Xu, Sean Kirmani, Brian Ichter, Danny Driess, Pete Florence, Dorsa Sadigh, Leonidas Guibas, Fei Xia

    Abstract: Understanding and reasoning about spatial relationships is a fundamental capability for Visual Question Answering (VQA) and robotics. While Vision Language Models (VLM) have demonstrated remarkable performance in certain VQA benchmarks, they still lack capabilities in 3D spatial reasoning, such as recognizing quantitative relationships of physical objects like distances or size differences. We hyp…

    Submitted 22 January, 2024; originally announced January 2024.

  45. arXiv:2401.11767  [pdf, other]

    cs.CV

    Concealed Object Segmentation with Hierarchical Coherence Modeling

    Authors: Fengyang Xiao, Pan Zhang, Chunming He, Runze Hu, Yutao Liu

    Abstract: Concealed object segmentation (COS) is a challenging task that involves localizing and segmenting those concealed objects that are visually blended with their surrounding environments. Despite achieving remarkable success, existing COS segmenters still struggle to achieve complete segmentation results in extremely concealed scenarios. In this paper, we propose a Hierarchical Coherence Modeling (HC…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted to CICAI 2023. 13 pages, 6 figures, 4 tables

  46. arXiv:2401.00876  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Balanced Graph Structure Information for Brain Disease Detection

    Authors: Falih Gozi Febrinanto, Mujie Liu, Feng Xia

    Abstract: Analyzing connections between brain regions of interest (ROI) is vital to detect neurological disorders such as autism or schizophrenia. Recent advancements employ graph neural networks (GNNs) to utilize graph structures in brains, improving detection performances. Current methods use correlation measures between ROI's blood-oxygen-level-dependent (BOLD) signals to generate the graph structure. Ot…

    Submitted 30 December, 2023; originally announced January 2024.

    Comments: Presented at Pacific Rim Knowledge Acquisition Workshop (PKAW) 2023

  47. arXiv:2312.16080  [pdf, other]

    cs.IT

    A Fractal-based Complex Belief Entropy for Uncertainty Measure in Complex Evidence Theory

    Authors: Keming Wu, Fuyuan Xiao, Yi Zhang

    Abstract: Complex Evidence Theory (CET), an extension of the traditional D-S evidence theory, has garnered academic interest for its capacity to articulate uncertainty through Complex Basic Belief Assignment (CBBA) and to perform uncertainty reasoning using complex combination rules. Nonetheless, quantifying uncertainty within CET remains a subject of ongoing research. To enhance decision-making, a method f…

    Submitted 8 June, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

  48. arXiv:2312.09478  [pdf, other]

    cs.LG cs.AI

    Entropy Causal Graphs for Multivariate Time Series Anomaly Detection

    Authors: Falih Gozi Febrinanto, Kristen Moore, Chandra Thapa, Mujie Liu, Vidya Saikrishna, Jiangang Ma, Feng Xia

    Abstract: Many multivariate time series anomaly detection frameworks have been proposed and widely applied. However, most of these frameworks do not consider intrinsic relationships between variables in multivariate time series data, thus ignoring the causal relationship among variables and degrading anomaly detection performance. This work proposes a novel framework called CGAD, an entropy Causal Graph for…

    Submitted 14 December, 2023; originally announced December 2023.

  49. arXiv:2312.08782  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis

    Authors: Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao, Yu Quan Chong, Chen Wang, Katia Sycara, Matthew Johnson-Roberson, Dhruv Batra, Xiaolong Wang, Sebastian Scherer, Zsolt Kira, Fei Xia, Yonatan Bisk

    Abstract: Building general-purpose robots that can operate seamlessly, in any environment, with any object, and utilizing various skills to complete diverse tasks has been a long-standing goal in Artificial Intelligence. Unfortunately, however, most existing robotic systems have been constrained - having been designed for specific tasks, trained on specific datasets, and deployed within specific environment…

    Submitted 15 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

  50. arXiv:2312.04906  [pdf, other]

    cs.CL

    Ophtha-LLaMA2: A Large Language Model for Ophthalmology

    Authors: Huan Zhao, Qian Ling, Yi Pan, Tianyang Zhong, **-Yu Hu, Junjie Yao, Fengqian Xiao, Zhenxiang Xiao, Yutong Zhang, San-Hua Xu, Shi-Nan Wu, Min Kang, Zihao Wu, Zhengliang Liu, Xi Jiang, Tianming Liu, Yi Shao

    Abstract: In recent years, pre-trained large language models (LLMs) have achieved tremendous success in the field of Natural Language Processing (NLP). Prior studies have primarily focused on general and generic domains, with relatively less research on specialized LLMs in the medical field. The specialization and high accuracy requirements for diagnosis in the medical field, as well as the challenges in co…

    Submitted 8 December, 2023; originally announced December 2023.