Showing 1–50 of 145 results for author: shen, M

Searching in archive cs.
  1. arXiv:2406.17960  [pdf, other]

    cs.CV cs.AI

    MAGIC: Meta-Ability Guided Interactive Chain-of-Distillation for Effective-and-Efficient Vision-and-Language Navigation

    Authors: Liuyi Wang, Zongtao He, Mengjiao Shen, Jingwei Yang, Chengju Liu, Qijun Chen

    Abstract: Despite the remarkable developments of recent large models in Embodied Artificial Intelligence (E-AI), their integration into robotics is hampered by their excessive parameter sizes and computational demands. Towards the Vision-and-Language Navigation (VLN) task, a core task in E-AI, this paper reveals the great potential of using knowledge distillation for obtaining lightweight student models by…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.12079  [pdf, other]

    cs.CV cs.AI cs.LG

    Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with Latency Constraint

    Authors: Xinglong Sun, Barath Lakshmanan, Maying Shen, Shiyi Lan, Jingde Chen, Jose Alvarez

    Abstract: As we push the boundaries of performance in various vision tasks, the models grow in size correspondingly. To keep up with this growth, we need very aggressive pruning techniques for efficient inference and deployment on edge devices. Existing pruning approaches are limited to channel pruning and struggle with aggressive parameter reductions. In this paper, we propose a novel multi-dimensional pru…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Under Review

  3. arXiv:2406.04484  [pdf, ps, other]

    cs.CV

    Step Out and Seek Around: On Warm-Start Training with Incremental Data

    Authors: Maying Shen, Hongxu Yin, Pavlo Molchanov, Lei Mao, Jose M. Alvarez

    Abstract: Data often arrives in sequence over time in real-world deep learning applications such as autonomous driving. When new training data is available, training the model from scratch undermines the benefit of leveraging the learned knowledge, leading to significant training costs. Warm-starting from a previously trained checkpoint is the most intuitive way to retain knowledge and advance learning. How…

    Submitted 6 June, 2024; originally announced June 2024.

  4. arXiv:2406.01125  [pdf, other]

    cs.CV

    $Δ$-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers

    Authors: Pengtao Chen, Mingzhu Shen, Peng Ye, Jianjian Cao, Chongjun Tu, Christos-Savvas Bouganis, Yiren Zhao, Tao Chen

    Abstract: Diffusion models are widely recognized for generating high-quality and diverse images, but their poor real-time performance has led to numerous acceleration works, primarily focusing on UNet-based structures. With the more successful results achieved by diffusion transformers (DiT), there is still a lack of exploration regarding the impact of DiT structure on generation, as well as the absence of…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 12 pages, 6 figures, 6 tables

  5. arXiv:2405.19626  [pdf, other]

    cs.DC

    Position: CXL Shared Memory Programming: Barely Distributed and Almost Persistent

    Authors: Yi Xu, Suyash Mahar, Ziheng Liu, Mingyao Shen, Steven Swanson

    Abstract: While Compute Express Link (CXL) enables support for cache-coherent shared memory among multiple nodes, it also introduces new types of failures--processes can fail before data does, or data might fail before a process does. The lack of a failure model for CXL-based shared memory makes it challenging to understand and mitigate these failures. To solve these challenges, in this paper, we describe…

    Submitted 29 May, 2024; originally announced May 2024.

  6. arXiv:2405.16395  [pdf, other]

    cs.LG

    Daily Physical Activity Monitoring -- Adaptive Learning from Multi-source Motion Sensor Data

    Authors: Haoting Zhang, Donglin Zhan, Yunduan Lin, Jinghai He, Qing Zhu, Zuo-Jun Max Shen, Zeyu Zheng

    Abstract: In healthcare applications, there is a growing need to develop machine learning models that use data from a single source, such as that from a wrist wearable device, to monitor physical activities, assess health risks, and provide immediate health recommendations or interventions. However, the limitation of using single-source data often compromises the model's accuracy, as it fails to capture the…

    Submitted 25 May, 2024; originally announced May 2024.

  7. arXiv:2405.13380  [pdf, other]

    cs.CR

    The Illusion of Anonymity: Uncovering the Impact of User Actions on Privacy in Web3 Social Ecosystems

    Authors: Bin Wang, Tianjian Liu, Wenqi Wang, Yuan Weng, Chao Li, Guangquan Xu, Meng Shen, Sencun Zhu, Wei Wang

    Abstract: The rise of Web3 social ecosystems signifies the dawn of a new chapter in digital interaction, offering significant prospects for user engagement and financial advancement. Nonetheless, this progress is shadowed by potential privacy concessions, especially as these platforms frequently merge with existing Web2.0 social media accounts, amplifying data privacy risks for users. In this study, we in…

    Submitted 22 May, 2024; originally announced May 2024.

  8. arXiv:2404.18096  [pdf, other]

    eess.IV cs.CV

    Snake with Shifted Window: Learning to Adapt Vessel Pattern for OCTA Segmentation

    Authors: Xinrun Chen, Mei Shen, Haojian Ning, Mengzhan Zhang, Chengliang Wang, Shiying Li

    Abstract: Segmenting specific targets or structures in optical coherence tomography angiography (OCTA) images is fundamental for conducting further pathological studies. The retinal vascular layers are rich and intricate, and such vascular with complex shapes can be captured by the widely-studied OCTA images. In this paper, we thus study how to use OCTA images with projection vascular layers to segment reti…

    Submitted 28 April, 2024; originally announced April 2024.

  9. arXiv:2404.10241  [pdf, other]

    cs.CV cs.AI

    Vision-and-Language Navigation via Causal Learning

    Authors: Liuyi Wang, Zongtao He, Ronghao Dang, Mengjiao Shen, Chengju Liu, Qijun Chen

    Abstract: In the pursuit of robust and generalizable environment perception and language understanding, the ubiquitous challenge of dataset bias continues to plague vision-and-language navigation (VLN) agents, hindering their performance in unseen environments. This paper introduces the generalized cross-modal causal transformer (GOAT), a pioneering solution rooted in the paradigm of causal inference. By de…

    Submitted 15 April, 2024; originally announced April 2024.

  10. An Origami-Inspired Variable Friction Surface for Increasing the Dexterity of Robotic Grippers

    Authors: Qiujie Lu, Angus B. Clark, Matthew Shen, Nicolas Rojas

    Abstract: While the grasping capability of robotic grippers has shown significant development, the ability to manipulate objects within the hand is still limited. One explanation for this limitation is the lack of controlled contact variation between the grasped object and the gripper. For instance, human hands have the ability to firmly grip object surfaces, as well as slide over object faces, an aspect th…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 8 pages, 11 figures

    Journal ref: IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 2538-2545, April 2020

  11. arXiv:2404.04661  [pdf, other]

    cs.LG cs.AI

    Transform then Explore: a Simple and Effective Technique for Exploratory Combinatorial Optimization with Reinforcement Learning

    Authors: Tianle Pu, Changjun Fan, Mutian Shen, Yizhou Lu, Li Zeng, Zohar Nussinov, Chao Chen, Zhong Liu

    Abstract: Many complex problems encountered in both production and daily life can be conceptualized as combinatorial optimization problems (COPs) over graphs. Recent years, reinforcement learning (RL) based models have emerged as a promising direction, which treat the COPs solving as a heuristic learning problem. However, current finite-horizon-MDP based RL models have inherent limitations. They are not all…

    Submitted 6 April, 2024; originally announced April 2024.

  12. arXiv:2404.03604  [pdf, other]

    math.OC cs.DS

    A Unified Algorithmic Framework for Dynamic Assortment Optimization under MNL Choice

    Authors: Shuo Sun, Rajan Udwani, Zuo-Jun Max Shen

    Abstract: We consider assortment and inventory planning problems with dynamic stockout-based substitution effects and no replenishment. We consider two settings: 1. Customers can see all available products when they arrive, which is commonly seen in physical stores. 2. The seller can choose to offer a subset of available products to each customer, which is typical on online platforms. Both settings are know…

    Submitted 4 April, 2024; originally announced April 2024.

  13. arXiv:2403.15791  [pdf, other]

    cs.RO

    DriveEnv-NeRF: Exploration of A NeRF-Based Autonomous Driving Environment for Real-World Performance Validation

    Authors: Mu-Yi Shen, Chia-Chi Hsu, Hao-Yu Hou, Yu-Chen Huang, Wei-Fang Sun, Chia-Che Chang, Yu-Lun Liu, Chun-Yi Lee

    Abstract: In this study, we introduce the DriveEnv-NeRF framework, which leverages Neural Radiance Fields (NeRF) to enable the validation and faithful forecasting of the efficacy of autonomous driving agents in a targeted real-world scene. Standard simulator-based rendering often fails to accurately reflect real-world performance due to the sim-to-real gap, which represents the disparity between virtual sim…

    Submitted 30 May, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

    Comments: Project page: https://github.com/muyishen2040/DriveEnvNeRF

  14. arXiv:2403.12807  [pdf, ps, other]

    cs.GT

    Freshness-aware Block Propagation Optimization in 6G-based Web 3.0: An Evolutionary Game Approach

    Authors: Jinbo Wen, Jiawen Kang, Zehui Xiong, Hongyang Du, Zhaohui Yang, Dusit Niyato, Meng Shen, Yutao Jiao, Yang Zhang

    Abstract: Driven by the aspiration to establish a decentralized digital economy, Web 3.0 is emerging as the fundamental technology for digital transformation. Incorporating the promising sixth-generation (6G) technology with large bandwidth and space-air-ground integrated coverage, 6G-based Web 3.0 holds great potential in empowering users with enhanced data control and facilitating secure peer-to-peer tran…

    Submitted 19 March, 2024; originally announced March 2024.

  15. arXiv:2403.08819  [pdf, other]

    cs.LG cs.CL stat.ML

    Thermometer: Towards Universal Calibration for Large Language Models

    Authors: Maohao Shen, Subhro Das, Kristjan Greenewald, Prasanna Sattigeri, Gregory Wornell, Soumya Ghosh

    Abstract: We consider the issue of calibration in large language models (LLM). Recent studies have found that common interventions such as instruction tuning often result in poorly calibrated LLMs. Although calibration is well-explored in traditional applications, calibrating LLMs is uniquely challenging. These challenges stem as much from the severe computational requirements of LLMs as from their versatil…

    Submitted 27 June, 2024; v1 submitted 19 February, 2024; originally announced March 2024.

    Comments: Camera ready version for ICML 2024

  16. arXiv:2402.13033  [pdf, other]

    cs.LG cs.IR cs.SI

    Enhancing Real-World Complex Network Representations with Hyperedge Augmentation

    Authors: Xiangyu Zhao, Zehui Li, Mingzhu Shen, Guy-Bart Stan, Pietro Liò, Yiren Zhao

    Abstract: Graph augmentation methods play a crucial role in improving the performance and enhancing generalisation capabilities in Graph Neural Networks (GNNs). Existing graph augmentation methods mainly perturb the graph structures and are usually limited to pairwise node relations. These methods cannot fully address the complexities of real-world large-scale networks that often involve higher-order node r…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: Preprint. Under review. 17 pages, 4 figures, 14 tables. arXiv admin note: text overlap with arXiv:2306.05108

  17. arXiv:2402.06160  [pdf, other]

    cs.LG stat.ML

    Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?

    Authors: Maohao Shen, J. Jon Ryu, Soumya Ghosh, Yuheng Bu, Prasanna Sattigeri, Subhro Das, Gregory W. Wornell

    Abstract: This paper questions the effectiveness of a modern predictive uncertainty quantification approach, called evidential deep learning (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite their perceived strong empirical performance on downstream tasks, a line of recent studies…

    Submitted 12 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: 29 pages, 12 figures

  18. arXiv:2402.06094  [pdf, other]

    cs.CL

    Rethinking Data Selection for Supervised Fine-Tuning

    Authors: Ming Shen

    Abstract: Although supervised finetuning (SFT) has emerged as an essential technique to align large language models with humans, it is considered superficial, with style learning being its nature. At the same time, recent works indicate the importance of data selection for SFT, showing that finetuning with high-quality and diverse subsets of the original dataset leads to superior downstream performance. In…

    Submitted 8 February, 2024; originally announced February 2024.

  19. arXiv:2402.02381  [pdf, other]

    cs.NI cs.AI

    Empowering Computing and Networks Convergence System with Distributed Cooperative Routing

    Authors: Yujiao Hu, Qingmin Jia, Meng Shen, Renchao Xie, Tao Huang, F. Richard Yu

    Abstract: The emergence of intelligent applications and recent advances in the fields of computing and networks are driving the development of computing and networks convergence (CNC) system. However, existing researches failed to achieve comprehensive scheduling optimization of computing and network resources. This shortfall results in some requirements of computing requests unable to be guaranteed in an e…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Submitted to IEEE Network

  20. arXiv:2401.05746  [pdf, other]

    cs.MM

    Cross-Modality and Within-Modality Regularization for Audio-Visual DeepFake Detection

    Authors: Heqing Zou, Meng Shen, Yuchen Hu, Chen Chen, Eng Siong Chng, Deepu Rajan

    Abstract: Audio-visual deepfake detection scrutinizes manipulations in public video using complementary multimodal cues. Current methods, which train on fused multimodal data for multimodal targets face challenges due to uncertainties and inconsistencies in learned representations caused by independent modality manipulations in deepfake videos. To address this, we propose cross-modality and within-modality…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  21. arXiv:2312.16240  [pdf, other]

    cs.CV cs.AI

    Merging Vision Transformers from Different Tasks and Domains

    Authors: Peng Ye, Chenyu Huang, Mingzhu Shen, Tao Chen, Yongqi Huang, Yuning Zhang, Wanli Ouyang

    Abstract: This work targets to merge various Vision Transformers (ViTs) trained on different tasks (i.e., datasets with different object categories) or domains (i.e., datasets with the same categories but different environments) into one unified model, yielding still good performance on each task or domain. Previous model merging works focus on either CNNs or NLP models, leaving the ViTs merging research un…

    Submitted 25 December, 2023; originally announced December 2023.

  22. arXiv:2312.14249  [pdf, other]

    q-bio.GN cs.LG

    GenoCraft: A Comprehensive, User-Friendly Web-Based Platform for High-Throughput Omics Data Analysis and Visualization

    Authors: Yingzhou Lu, Minjie Shen, Yue Zhao, Chenhao Li, Fan Meng, Xiao Wang, David Herrington, Yue Wang, Tim Fu, Capucine Van Rechem

    Abstract: The surge in high-throughput omics data has reshaped the landscape of biological research, underlining the need for powerful, user-friendly data analysis and interpretation tools. This paper presents GenoCraft, a web-based comprehensive software solution designed to handle the entire pipeline of omics data processing. GenoCraft offers a unified platform featuring advanced bioinformatics tools, cov…

    Submitted 21 December, 2023; originally announced December 2023.

  23. arXiv:2312.07871  [pdf, other]

    cs.CV

    MLNet: Mutual Learning Network with Neighborhood Invariance for Universal Domain Adaptation

    Authors: Yanzuo Lu, Meng Shen, Andy J Ma, Xiaohua Xie, Jian-Huang Lai

    Abstract: Universal domain adaptation (UniDA) is a practical but challenging problem, in which information about the relation between the source and the target domains is not given for knowledge transfer. Existing UniDA methods may suffer from the problems of overlooking intra-domain variations in the target domain and difficulty in separating between the similar known and unknown class. To address these is…

    Submitted 27 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024 (Poster)

  24. arXiv:2311.02303  [pdf, other]

    cs.LG cs.AI

    MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning

    Authors: Bingchang Liu, Chaoyu Chen, Cong Liao, Zi Gong, Huan Wang, Zhichao Lei, Ming Liang, Dajun Chen, Min Shen, Hailian Zhou, Hang Yu, Jianguo Li

    Abstract: Code LLMs have emerged as a specialized research field, with remarkable studies dedicated to enhancing model's coding capabilities through fine-tuning on pre-trained models. Previous fine-tuning approaches were typically tailored to specific downstream tasks or scenarios, which meant separate fine-tuning for each task, requiring extensive training resources and posing challenges in terms of deploy…

    Submitted 3 November, 2023; originally announced November 2023.

  25. arXiv:2310.16300  [pdf, other]

    cs.DC cs.OS

    Snapshot: Fast, Userspace Crash Consistency for CXL and PM Using msync

    Authors: Suyash Mahar, Mingyao Shen, Terence Kelly, Steven Swanson

    Abstract: Crash consistency using persistent memory programming libraries requires programmers to use complex transactions and manual annotations. In contrast, the failure-atomic msync() (FAMS) interface is much simpler as it transparently tracks updates and guarantees that modified data is atomically durable on a call to the failure-atomic variant of msync(). However, FAMS suffers from several drawbacks, l…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: A shorter version of this paper appeared in the Proceedings of ICCD 2023

  26. arXiv:2310.07183  [pdf, other]

    cs.LG

    SAM-OCTA: Prompting Segment-Anything for OCTA Image Segmentation

    Authors: Xinrun Chen, Chengliang Wang, Haojian Ning, Shiying Li, Mei Shen

    Abstract: Segmenting specific targets or biomarkers is necessary to analyze optical coherence tomography angiography (OCTA) images. Previous methods typically segment all the targets in an OCTA sample, such as retinal vessels (RVs). Although these methods perform well in accuracy and precision, OCTA analyses often focusing local information within the images which has not been fulfilled. In this paper, we p…

    Submitted 20 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:2309.11758

  27. arXiv:2310.06266  [pdf, other]

    cs.SE cs.AI cs.LG

    CodeFuse-13B: A Pretrained Multi-lingual Code Large Language Model

    Authors: Peng Di, Jianguo Li, Hang Yu, Wei Jiang, Wenting Cai, Yang Cao, Chaoyu Chen, Dajun Chen, Hongwei Chen, Liang Chen, Gang Fan, Jie Gong, Zi Gong, Wen Hu, Tingting Guo, Zhichao Lei, Ting Li, Zheng Li, Ming Liang, Cong Liao, Bingchang Liu, Jiachen Liu, Zhiwei Liu, Shaojun Lu, Min Shen, et al. (13 additional authors not shown)

    Abstract: Code Large Language Models (Code LLMs) have gained significant attention in the industry due to their wide applications in the full lifecycle of software engineering. However, the effectiveness of existing models in understanding non-English inputs for multi-lingual code-related tasks is still far from well studied. This paper introduces CodeFuse-13B, an open-sourced pre-trained code LLM. It is sp…

    Submitted 10 January, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted by ICSE-SEIP 2024

  28. arXiv:2310.02183  [pdf, other]

    cs.DC

    Puddles: Application-Independent Recovery and Location-Independent Data for Persistent Memory

    Authors: Suyash Mahar, Mingyao Shen, TJ Smith, Joseph Izraelevitz, Steven Swanson

    Abstract: In this paper, we argue that current work has failed to provide a comprehensive and maintainable in-memory representation for persistent memory. PM data should be easily mappable into a process address space, shareable across processes, shippable between machines, consistent after a crash, and accessible to legacy code with fast, efficient pointers as first-class abstractions. While existing s…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: To appear in EuroSys 2024

  29. arXiv:2310.00836  [pdf, other]

    cs.CL cs.AI

    Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models

    Authors: Man Luo, Shrinidhi Kumbhar, Ming Shen, Mihir Parmar, Neeraj Varshney, Pratyay Banerjee, Somak Aditya, Chitta Baral

    Abstract: Logical reasoning is fundamental for humans yet presents a substantial challenge in the domain of Artificial Intelligence. Initially, researchers used Knowledge Representation and Reasoning (KR) systems that did not scale and required non-trivial manual effort. Recently, the emergence of large language models (LLMs) has demonstrated the ability to overcome various limitations of formal Knowledge R…

    Submitted 30 March, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: Work in progress

  30. arXiv:2310.00205  [pdf, other]

    cs.SE cs.CR

    An Empirical Study on the Use of Static Analysis Tools in Open Source Embedded Software

    Authors: Mingjie Shen, Akul Pillai, Brian A. Yuan, James C. Davis, Aravind Machiry

    Abstract: This paper performs the first study to understand the prevalence, challenges, and effectiveness of using Static Application Security Testing (SAST) tools on Open-Source Embedded Software (EMBOSS) repositories. We collect a corpus of 258 of the most popular EMBOSS projects, representing 13 distinct categories such as real-time operating systems, network stacks, and applications. To understand the c…

    Submitted 29 September, 2023; originally announced October 2023.

  31. GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption

    Authors: Kaustubh Shivdikar, Yuhui Bao, Rashmi Agrawal, Michael Shen, Gilbert Jonatan, Evelio Mora, Alexander Ingare, Neal Livesay, José L. Abellán, John Kim, Ajay Joshi, David Kaeli

    Abstract: Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without decrypting it. FHE has garnered significant attention over the past decade as it supports secure outsourcing of data processing to remote cloud services. Despite its promise of strong data privacy and security guarantees, FHE introduces a slowdown of up to five orders of magnitude as compared to the same computatio…

    Submitted 19 September, 2023; originally announced September 2023.

  32. arXiv:2309.09972  [pdf, other]

    cs.AI cs.CR

    Artificial Intelligence for Web 3.0: A Comprehensive Survey

    Authors: Meng Shen, Zhehui Tan, Dusit Niyato, Yuzhi Liu, Jiawen Kang, Zehui Xiong, Liehuang Zhu, Wei Wang, Xuemin Shen

    Abstract: Web 3.0 is the new generation of the Internet that is reconstructed with distributed technology, which focuses on data ownership and value expression. Also, it operates under the principle that data and digital assets should be owned and controlled by users rather than large corporations. In this survey, we explore the current development state of Web 3.0 and the application of AI Technology in We…

    Submitted 17 August, 2023; originally announced September 2023.

  33. arXiv:2309.01961  [pdf, other]

    cs.CV

    NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

    Authors: Taehoon Kim, Pyunghwan Ahn, Sangyun Kim, Sihaeng Lee, Mark Marsden, Alessandra Sala, Seung Hwan Kim, Bohyung Han, Kyoung Mu Lee, Honglak Lee, Kyounghoon Bae, Xiangyu Wu, Yi Gao, Hailiang Zhang, Yang Yang, Weili Guo, Jianfeng Lu, Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim, Wooyoung Kang, Won Young Jhoo, Byungseok Roh, et al. (17 additional authors not shown)

    Abstract: In this report, we introduce NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of 2023 challenge. This project is designed to challenge the computer vision community to develop robust image captioning models that advance the state-of-the-art both in terms of accuracy and fairness. Through the challenge, the image captioning models were tested…

    Submitted 10 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Tech report, project page https://nice.lgresearch.ai/

  34. arXiv:2308.16785  [pdf]

    cs.AI cs.HC

    Agent Teaming Situation Awareness (ATSA): A Situation Awareness Framework for Human-AI Teaming

    Authors: Qi Gao, Wei Xu, Mowei Shen, Zaifeng Gao

    Abstract: The rapid advancements in artificial intelligence (AI) have led to a growing trend of human-AI teaming (HAT) in various fields. As machines continue to evolve from mere automation to a state of autonomy, they are increasingly exhibiting unexpected behaviors and human-like cognitive/intelligent capabilities, including situation awareness (SA). This shift has the potential to enhance the performance…

    Submitted 4 September, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

    Comments: 52 pages, 5 figures, 1 table

  35. arXiv:2308.08025  [pdf, other]

    quant-ph cs.AI cs.ET cs.LG stat.ML

    Potential Energy Advantage of Quantum Economy

    Authors: Junyu Liu, Hansheng Jiang, Zuo-Jun Max Shen

    Abstract: Energy cost is increasingly crucial in the modern computing industry with the wide deployment of large-scale machine learning models and language models. For the firms that provide computing services, low energy consumption is important both from the perspective of their own market growth and the government's regulations. In this paper, we study the energy benefits of quantum computing vis-a-vis c…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: 23 pages, many figures

  36. arXiv:2308.06717  [pdf, other]

    cs.LG cs.AI cs.GT stat.ML

    Estimating and Incentivizing Imperfect-Knowledge Agents with Hidden Rewards

    Authors: Ilgin Dogan, Zuo-Jun Max Shen, Anil Aswani

    Abstract: In practice, incentive providers (i.e., principals) often cannot observe the reward realizations of incentivized agents, which is in contrast to many principal-agent models that have been previously studied. This information asymmetry challenges the principal to consistently estimate the agent's unknown rewards by solely watching the agent's decisions, which becomes even more challenging when the…

    Submitted 13 August, 2023; originally announced August 2023.

    Comments: 72 pages, 6 figures. arXiv admin note: text overlap with arXiv:2304.07407

  37. arXiv:2307.13339  [pdf, other]

    cs.CL cs.AI

    Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions

    Authors: Skyler Wu, Eric Meng Shen, Charumathi Badrinath, Jiaqi Ma, Himabindu Lakkaraju

    Abstract: Chain-of-thought (CoT) prompting has been shown to empirically improve the accuracy of large language models (LLMs) on various question answering tasks. While understanding why CoT prompting is effective is crucial to ensuring that this phenomenon is a consequence of desired model behavior, little work has addressed this; nonetheless, such an understanding is a critical prerequisite for responsibl…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Accepted to Workshop on Challenges in Deployable Generative AI at ICML 2023

  38. MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results

    Authors: Yuki Kondo, Norimichi Ukita, Takayuki Yamaguchi, Hao-Yu Hou, Mu-Yi Shen, Chia-Chi Hsu, En-Ming Huang, Yu-Chen Huang, Yu-Cheng Xia, Chien-Yao Wang, Chun-Yi Lee, Da Huo, Marc A. Kastner, Tingwei Liu, Yasutomo Kawanishi, Takatsugu Hirayama, Takahiro Komamizu, Ichiro Ide, Yosuke Shinya, Xinyao Liu, Guang Liang, Syusuke Yasui

    Abstract: Small Object Detection (SOD) is an important machine vision topic because (i) a variety of real-world applications require object detection for distant objects and (ii) SOD is a challenging task due to the noisy, blurred, and less-informative image appearances of small objects. This paper proposes a new SOD dataset consisting of 39,070 images including 137,121 bird instances, which is called the S…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: This paper is included in the proceedings of the 18th International Conference on Machine Vision Applications (MVA2023). It will be officially published at a later date. Project page: https://www.mva-org.jp/mva2023/challenge

    Journal ref: 2023 18th International Conference on Machine Vision and Applications (MVA)

  39. arXiv:2307.09002  [pdf, other]

    cs.CR

    CBSeq: A Channel-level Behavior Sequence For Encrypted Malware Traffic Detection

    Authors: Susu Cui, Cong Dong, Meng Shen, Yuling Liu, Bo Jiang, Zhigang Lu

    Abstract: Machine learning and neural networks have become increasingly popular solutions for encrypted malware traffic detection. They mine and learn complex traffic patterns, enabling detection by fitting boundaries between malware traffic and benign traffic. Compared with signature-based methods, they have higher scalability and flexibility. However, affected by the frequent variants and updates of malwa…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: Submitted to IEEE TIFS

  40. arXiv:2306.17020   

    cs.CL cs.AI

    Classifying Crime Types using Judgment Documents from Social Media

    Authors: Haoxuan Xu, Zeyu He, Mengfan Shen, Songning Lai, Ziqiang Han, Yifan Peng

    Abstract: The task of determining crime types based on criminal behavior facts has become a very important and meaningful task in social science. But the problem facing the field now is that the data samples themselves are unevenly distributed, due to the nature of the crime itself. At the same time, data sets in the judicial field are less publicly available, and it is not practical to produce large data s…

    Submitted 21 October, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: The paper has no errors; it just needs to be supplemented to become a new article

  41. arXiv:2306.14306  [pdf, other]

    cs.LG cs.CV

    Adaptive Sharpness-Aware Pruning for Robust Sparse Networks

    Authors: Anna Bair, Hongxu Yin, Maying Shen, Pavlo Molchanov, Jose Alvarez

    Abstract: Robustness and compactness are two essential attributes of deep learning models that are deployed in the real world. The goals of robustness and compactness may seem to be at odds, since robustness requires generalization across domains, while the process of compression exploits specificity in one domain. We introduce Adaptive Sharpness-Aware Pruning (AdaSAP), which unifies these goals through the…

    Submitted 13 March, 2024; v1 submitted 25 June, 2023; originally announced June 2023.

  42. Towards Balanced Active Learning for Multimodal Classification

    Authors: Meng Shen, Yizheng Huang, Jianxiong Yin, Heqing Zou, Deepu Rajan, Simon See

    Abstract: Training multimodal networks requires a vast amount of data due to their larger parameter space compared to unimodal networks. Active learning is a widely used technique for reducing data annotation costs by selecting only those samples that could contribute to improving model performance. However, current active learning strategies are mostly designed for unimodal tasks, and when applied to multi…

    Submitted 21 August, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 12 pages, accepted by ACMMM 2023

  43. arXiv:2306.05734  [pdf, other]

    cs.LG cs.CR cs.DS

    DP-HyPO: An Adaptive Private Hyperparameter Optimization Framework

    Authors: Hua Wang, Sheng Gao, Huanyu Zhang, Weijie J. Su, Milan Shen

    Abstract: Hyperparameter optimization, also known as hyperparameter tuning, is a widely recognized technique for improving model performance. Regrettably, when training private ML models, many practitioners often overlook the privacy risks associated with hyperparameter optimization, which could potentially expose sensitive information about the underlying dataset. Currently, the sole existing approach to a…

    Submitted 26 November, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

  44. arXiv:2306.05275  [pdf, ps, other]

    cs.LG cs.CR cs.IT stat.ML

    Federated Linear Contextual Bandits with User-level Differential Privacy

    Authors: Ruiquan Huang, Huanyu Zhang, Luca Melis, Milan Shen, Meisam Hajzinia, Jing Yang

    Abstract: This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP). We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting. We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework, and investigate the fundamenta…

    Submitted 9 June, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Accepted by ICML 2023

  45. arXiv:2306.05108  [pdf, other]

    cs.LG cs.SI

    Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs

    Authors: Zehui Li, Xiangyu Zhao, Mingzhu Shen, Guy-Bart Stan, Pietro Liò, Yiren Zhao

    Abstract: Graphs are widely used to encapsulate a variety of data formats, but real-world networks often involve complex node relations beyond only being pairwise. While hypergraphs and hierarchical graphs have been developed and employed to account for the complex node relations, they cannot fully represent these complexities in practice. Additionally, though many Graph Neural Networks (GNNs) have been pro…

    Submitted 20 February, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: 16 pages, 5 figures, 11 tables

  46. arXiv:2306.04997  [pdf, other]

    eess.SP cs.AI

    Blockage Prediction in Directional mmWave Links Using Liquid Time Constant Network

    Authors: Martin H. Nielsen, Chia-Yi Yeh, Ming Shen, Muriel Médard

    Abstract: We propose to use a liquid time constant (LTC) network to predict the future blockage status of a millimeter wave (mmWave) link using only the received signal power as the input to the system. The LTC network is based on an ordinary differential equation (ODE) system inspired by biology and specialized for near-future prediction for time sequence observation as the input. Using an experimental dat…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 2 pages, pre-print for IRMMW 2023 conference

  47. Robust and Efficient Fault Diagnosis of mm-Wave Active Phased Arrays using Baseband Signal

    Authors: Martin H. Nielsen, Yufeng Zhang, Changbin Xue, Jian Ren, Yingzeng Yin, Ming Shen, Gert F. Pedersen

    Abstract: One key communication block in 5G and 6G radios is the active phased array (APA). To ensure reliable operation, efficient and timely fault diagnosis of APAs on-site is crucial. To date, fault diagnosis has relied on measurement of frequency domain radiation patterns using costly equipment and multiple strictly controlled measurement probes, which are time-consuming, complex, and therefore infeasib…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 10 pages

    Journal ref: in IEEE Transactions on Antennas and Propagation, vol. 70, no. 7, pp. 5044-5053, July 2022

  48. arXiv:2305.17567  [pdf, other]

    cs.GT math.OC

    No-Regret Learning in Dynamic Competition with Reference Effects Under Logit Demand

    Authors: Mengzi Amy Guo, Donghao Ying, Javad Lavaei, Zuo-Jun Max Shen

    Abstract: This work is dedicated to the algorithm design in a competitive framework, with the primary goal of learning a stable equilibrium. We consider the dynamic price competition between two firms operating within an opaque marketplace, where each firm lacks information about its competitor. The demand follows the multinomial logit (MNL) choice model, which depends on the consumers' observed price and t…

    Submitted 27 May, 2023; originally announced May 2023.

  49. arXiv:2305.09299  [pdf, other]

    cs.CV cs.CL

    UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning

    Authors: Heqing Zou, Meng Shen, Chen Chen, Yuchen Hu, Deepu Rajan, Eng Siong Chng

    Abstract: Multimodal learning aims to imitate human beings to acquire complementary information from multiple modalities for various downstream tasks. However, traditional aggregation-based multimodal fusion methods ignore the inter-modality relationship, treat each modality equally, suffer sensor noise, and thus reduce multimodal learning performance. In this work, we propose a novel multimodal contrastive…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Findings

  50. arXiv:2305.07170  [pdf, other]

    cs.LG

    Towards Understanding and Improving GFlowNet Training

    Authors: Max W. Shen, Emmanuel Bengio, Ehsan Hajiramezanali, Andreas Loukas, Kyunghyun Cho, Tommaso Biancalani

    Abstract: Generative flow networks (GFlowNets) are a family of algorithms that learn a generative policy to sample discrete objects $x$ with non-negative reward $R(x)$. Learning objectives guarantee the GFlowNet samples $x$ from the target distribution $p^*(x) \propto R(x)$ when loss is globally minimized over all states or trajectories, but it is unclear how well they perform with practical limits on train…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted to ICML 2023