Showing 1–50 of 164 results for author: Ma, G

Searching in archive cs.
  1. arXiv:2406.08184  [pdf, other]

    cs.AI cs.HC

    MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents

    Authors: Luyuan Wang, Yongyu Deng, Yiwei Zha, Guodong Mao, Qinmin Wang, Tianchen Min, Wei Chen, Shoufa Chen

    Abstract: Large language model (LLM)-based mobile agents are increasingly popular due to their capability to interact directly with mobile phone Graphic User Interfaces (GUIs) and their potential to autonomously manage daily tasks. Despite their promising prospects in both academic and industrial sectors, little research has focused on benchmarking the performance of existing mobile agents, due to the inexh…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2406.07385  [pdf, other]

    cs.GT cs.CC

    Disrupting Bipartite Trading Networks: Matching for Revenue Maximization

    Authors: Luca D'Amico-Wong, Yannai A. Gonczarowski, Gary Qiurui Ma, David C. Parkes

    Abstract: We model the role of an online platform disrupting a market with unit-demand buyers and unit-supply sellers. Each seller can transact with a subset of the buyers whom she already knows, as well as with any additional buyers to whom she is introduced by the platform. Given these constraints on trade, prices and transactions are induced by a competitive equilibrium. The platform's revenue is proport…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted at the Twenty-Fifth ACM Conference on Economics and Computation (EC'24), 2024

  3. arXiv:2406.05426  [pdf, other]

    cs.LG

    Baking Symmetry into GFlowNets

    Authors: George Ma, Emmanuel Bengio, Yoshua Bengio, Dinghuai Zhang

    Abstract: GFlowNets have exhibited promising performance in generating diverse candidates with high rewards. These networks generate objects incrementally and aim to learn a policy that assigns probability of sampling objects in proportion to rewards. However, the current training pipelines of GFlowNets do not consider the presence of isomorphic actions, which are actions resulting in symmetric or isomorphi…

    Submitted 8 June, 2024; originally announced June 2024.

  4. arXiv:2406.02224  [pdf, other]

    cs.CL cs.AI

    FedMKT: Federated Mutual Knowledge Transfer for Large and Small Language Models

    Authors: Tao Fan, Guoqiang Ma, Yan Kang, Hanlin Gu, Yuanfeng Song, Lixin Fan, Kai Chen, Qiang Yang

    Abstract: Recent research in federated large language models (LLMs) has primarily focused on enabling clients to fine-tune their locally deployed homogeneous LLMs collaboratively or on transferring knowledge from server-based LLMs to small language models (SLMs) at downstream clients. However, a significant gap remains in the simultaneous mutual enhancement of both the server's LLM and clients' SLMs. To bri…

    Submitted 18 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  5. arXiv:2405.18378  [pdf, other]

    cs.LG

    A Canonization Perspective on Invariant and Equivariant Learning

    Authors: George Ma, Yifei Wang, Derek Lim, Stefanie Jegelka, Yisen Wang

    Abstract: In many applications, we desire neural networks to exhibit invariance or equivariance to certain groups due to symmetries inherent in the data. Recently, frame-averaging methods emerged to be a unified framework for attaining symmetries efficiently by averaging over input-dependent subsets of the group, i.e., frames. What we currently lack is a principled understanding of the design of frames. In…

    Submitted 29 May, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  6. arXiv:2405.14185  [pdf, other]

    cs.LG cs.PF

    A structure-aware framework for learning device placements on computation graphs

    Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Peiyu Zhang, Panagiotis Kyriakis, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan

    Abstract: Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we…

    Submitted 23 May, 2024; originally announced May 2024.

  7. arXiv:2405.05672  [pdf, other]

    cs.CV

    Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation

    Authors: Mo Guan, Yan Wang, Guangkun Ma, Jiarui Liu, Mingzu Sun

    Abstract: Sign language serves as a non-vocal means of communication, transmitting information and significance through gestures, facial expressions, and bodily movements. The majority of current approaches for sign language recognition (SLR) and translation rely on RGB video inputs, which are vulnerable to fluctuations in the background. Employing a keypoint-based strategy not only mitigates the effects of…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 15 pages

  8. arXiv:2405.01918  [pdf, other]

    cs.RO

    An Onboard Framework for Staircases Modeling Based on Point Clouds

    Authors: Chun Qing, Rongxiang Zeng, Xuan Wu, Yongliang Shi, Gan Ma

    Abstract: The detection of traversable regions on staircases and the physical modeling constitutes pivotal aspects of the mobility of legged robots. This paper presents an onboard framework tailored to the detection of traversable regions and the modeling of physical attributes of staircases by point cloud data. To mitigate the influence of illumination variations and the overfitting due to the dataset dive…

    Submitted 3 May, 2024; originally announced May 2024.

  9. arXiv:2404.13842  [pdf, other]

    cs.CV cs.CG

    On Support Relations Inference and Scene Hierarchy Graph Construction from Point Cloud in Clustered Environments

    Authors: Gang Ma, Hui Wei

    Abstract: Over the years, scene understanding has attracted a growing interest in computer vision, providing the semantic and physical scene information necessary for robots to complete some particular tasks autonomously. In 3D scenes, rich spatial geometric and topological information are often ignored by RGB-based approaches for scene understanding. In this study, we develop a bottom-up approach for scene…

    Submitted 21 April, 2024; originally announced April 2024.

  10. arXiv:2404.07671  [pdf]

    cs.CV

    Deep learning-driven pulmonary arteries and veins segmentation reveals demography-associated pulmonary vasculature anatomy

    Authors: Yuetan Chu, Gongning Luo, Longxi Zhou, Shaodong Cao, Guolin Ma, Xianglin Meng, Juexiao Zhou, Changchun Yang, Dexuan Xie, Ricardo Henao, Xigang Xiao, Lianming Wu, Zhaowen Qiu, Xin Gao

    Abstract: Pulmonary artery-vein segmentation is crucial for diagnosing pulmonary diseases and surgical planning, and is traditionally achieved by Computed Tomography Pulmonary Angiography (CTPA). However, concerns regarding adverse health effects from contrast agents used in CTPA have constrained its clinical utility. In contrast, identifying arteries and veins using non-contrast CT, a conventional and low-…

    Submitted 11 April, 2024; originally announced April 2024.

  11. arXiv:2403.10538  [pdf, other]

    cs.AR cs.AI cs.LG

    MATADOR: Automated System-on-Chip Tsetlin Machine Design Generation for Edge Applications

    Authors: Tousif Rahman, Gang Mao, Sidharth Maheshwari, Rishad Shafik, Alex Yakovlev

    Abstract: System-on-Chip Field-Programmable Gate Arrays (SoC-FPGAs) offer significant throughput gains for machine learning (ML) edge inference applications via the design of co-processor accelerator systems. However, the design effort for training and translating ML models into SoC-FPGA solutions can be substantial and requires specialist knowledge aware trade-offs between model performance, power consumpt…

    Submitted 3 March, 2024; originally announced March 2024.

  12. arXiv:2403.04158  [pdf, other]

    cs.CL cs.AI

    DA-Net: A Disentangled and Adaptive Network for Multi-Source Cross-Lingual Transfer Learning

    Authors: Ling Ge, Chunming Hu, Guanghui Ma, Jihong Liu, Hong Zhang

    Abstract: Multi-Source cross-lingual transfer learning deals with the transfer of task knowledge from multiple labelled source languages to an unlabeled target language under the language shift. Existing methods typically focus on weighting the predictions produced by language-specific classifiers of different sources that follow a shared encoder. However, all source languages share the same encoder, which…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: AAAI 2024

  13. A Novel Hybrid Feature Importance and Feature Interaction Detection Framework for Predictive Optimization in Industry 4.0 Applications

    Authors: Zhipeng Ma, Bo Nørregaard Jørgensen, Zheng Grace Ma

    Abstract: Advanced machine learning algorithms are increasingly utilized to provide data-based prediction and decision-making support in Industry 4.0. However, the prediction accuracy achieved by the existing models is insufficient to warrant practical implementation in real-world applications. This is because not all features present in real-world datasets possess a direct relevance to the predictive analy…

    Submitted 4 March, 2024; originally announced March 2024.

    Journal ref: IECON 2023- 49th Annual Conference of the IEEE Industrial Electronics Society

  14. arXiv:2402.07610  [pdf, other]

    cs.CL cs.AI

    Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping

    Authors: Haoyu Wang, Guozheng Ma, Ziqiao Meng, Zeyu Qin, Li Shen, Zhong Zhang, Bingzhe Wu, Liu Liu, Yatao Bian, Tingyang Xu, Xueqian Wang, Peilin Zhao

    Abstract: Self-alignment is an effective way to reduce the cost of human annotation while ensuring promising model capability. However, most current methods complete the data collection and training steps in a single round, which may overlook the continuously improving ability of self-aligned models. This gives rise to a key query: What if we do multi-time bootstrapping self-alignment? Does this strategy en…

    Submitted 27 June, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  15. arXiv:2402.02397  [pdf]

    physics.optics cs.CV cs.NE

    Multiplexed all-optical permutation operations using a reconfigurable diffractive optical network

    Authors: Guangdong Ma, Xilin Yang, Bijie Bai, Jingxi Li, Yuhang Li, Tianyi Gan, Che-Yung Shen, Yijie Zhang, Yuzhu Li, Mona Jarrahi, Aydogan Ozcan

    Abstract: Large-scale and high-dimensional permutation operations are important for various applications in e.g., telecommunications and encryption. Here, we demonstrate the use of all-optical diffractive computing to execute a set of high-dimensional permutation operations between an input and output field-of-view through layer rotations in a diffractive optical network. In this reconfigurable multiplexed…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 37 Pages, 10 Figures

  16. Business Models for Digitalization Enabled Energy Efficiency and Flexibility in Industry: A Survey with Nine Case Studies

    Authors: Zhipeng Ma, Bo Nørregaard Jørgensen, Michelle Levesque, Mouloud Amazouz, Zheng Grace Ma

    Abstract: Digitalization is challenging in heavy industrial sectors, and many pilot projects facing difficulties to be replicated and scaled. Case studies are strong pedagogical vehicles for learning and sharing experience & knowledge, but rarely available in the literature. Therefore, this paper conducts a survey to gather a diverse set of nine industry cases, which are subsequently subjected to analysis…

    Submitted 26 January, 2024; originally announced February 2024.

    Journal ref: Energy Informatics. EI.A 2023. Lecture Notes in Computer Science, vol 14467

  17. arXiv:2401.17268  [pdf, other]

    cs.CL cs.AI cs.LG

    Weaver: Foundation Models for Creative Writing

    Authors: Tiannan Wang, Jiamin Chen, Qingrui Jia, Shuai Wang, Ruoyu Fang, Huilin Wang, Zhaowei Gao, Chunzhao Xie, Chuou Xu, Jihong Dai, Yibin Liu, Jialong Wu, Shengwei Ding, Long Li, Zhiwei Huang, Xinle Deng, Teng Yu, Gangan Ma, Han Xiao, Zixin Chen, Danjun Xiang, Yunxia Wang, Yuanyuan Zhu, Yi Xiao, Jing Wang, et al. (21 additional authors not shown)

    Abstract: This work introduces Weaver, our first family of large language models (LLMs) dedicated to content creation. Weaver is pre-trained on a carefully selected corpus that focuses on improving the writing capabilities of large language models. We then fine-tune Weaver for creative and professional writing purposes and align it to the preference of professional writers using a suit of novel methods for…

    Submitted 30 January, 2024; originally announced January 2024.

  18. Energy Flexibility Potential in the Brewery Sector: A Multi-agent Based Simulation of 239 Danish Breweries

    Authors: Daniel Anthony Howard, Zheng Grace Ma, Jacob Alstrup Engvang, Morten Hagenau, Kathrine Lau Jorgensen, Jonas Fausing Olesen, Bo Nørregaard Jørgensen

    Abstract: The beverage industry is a typical food processing industry, accounts for significant energy consumption, and has flexible demands. However, the deployment of energy flexibility in the beverage industry is complex and challenging. Furthermore, activation of energy flexibility from the whole brewery industry is necessary to ensure grid stability. Therefore, this paper assesses the energy flexibilit…

    Submitted 26 January, 2024; originally announced January 2024.

  19. arXiv:2401.11248  [pdf, other]

    cs.IR cs.CL

    Drop your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval

    Authors: Guangyuan Ma, Xing Wu, Zijia Lin, Songlin Hu

    Abstract: Masked auto-encoder pre-training has emerged as a prevalent technique for initializing and enhancing dense retrieval systems. It generally utilizes additional Transformer decoder blocks to provide sustainable supervision signals and compress contextual information into dense representations. However, the underlying reasons for the effectiveness of such a pre-training technique remain unclear. The…

    Submitted 22 April, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

    Comments: Accepted by SIGIR24. Our code is available at https://github.com/ma787639046/bowdpr

  20. arXiv:2401.07329  [pdf, other]

    cs.NE

    Attention-based UNet enabled Lightweight Image Semantic Communication System over Internet of Things

    Authors: Guoxin Ma, Haonan Tong, Nuocheng Yang, Changchuan Yin

    Abstract: This paper studies the problem of the lightweight image semantic communication system that is deployed on Internet of Things (IoT) devices. In the considered system model, devices must use semantic communication techniques to support user behavior recognition in ultimate video service with high data transmission efficiency. However, it is computationally expensive for IoT devices to deploy semanti…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: 6 pages, 6 figures, accepted by IEEE WCNC 2024

  21. arXiv:2401.06748  [pdf, other]

    cs.CG

    Measure Theoretic Reeb Graphs and Reeb Spaces

    Authors: Qingsong Wang, Guanquan Ma, Raghavendra Sridharamurthy, Bei Wang

    Abstract: A Reeb graph is a graphical representation of a scalar function on a topological space that encodes the topology of the level sets. A Reeb space is a generalization of the Reeb graph to a multiparameter function. In this paper, we propose novel constructions of Reeb graphs and Reeb spaces that incorporate the use of a measure. Specifically, we introduce measure-theoretic Reeb graphs and Reeb space…

    Submitted 22 March, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

  22. Multi-Agent Based Simulation for Investigating Electric Vehicle Adoption and Its Impacts on Electricity Distribution Grids and CO2 Emissions

    Authors: Kristoffer Christensen, Zheng Grace Ma, Bo Nørregaard Jørgensen

    Abstract: Electric vehicles are expected to significantly contribute to CO2-eq. emissions reduction, but the increasing number of EVs also introduces challenges to the energy system, and to what extent it contributes to achieving climate goals remains unknown. Static modeling and assumption-based simulations have been used for such investigation, but they cannot capture the realistic ecosystem dynamics.…

    Submitted 11 January, 2024; originally announced January 2024.

    Journal ref: In: Energy Informatics. EI.A 2023. Lecture Notes in Computer Science, vol 14468

  23. A Modifiable Architectural Design for Commercial Greenhouses Energy Economic Dispatch Testbed

    Authors: Christian Skafte Beck Clausen, Bo Nørregaard Jørgensen, Zheng Grace Ma

    Abstract: Facing economic challenges due to the diverse objectives of businesses, and consumers, commercial greenhouses strive to minimize energy costs while addressing CO2 emissions. This scenario is intensified by rising energy costs and the global imperative to curtail CO2 emissions. To address these dynamic economic challenges, this paper proposes an architectural design for an energy economic dispatch…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: 19 pages

    Journal ref: In: Energy Informatics. EI.A 2023. Lecture Notes in Computer Science, vol 14467

  24. arXiv:2401.01155  [pdf, ps, other]

    cs.IT cs.LG

    Deep Learning-Based Detection for Marker Codes over Insertion and Deletion Channels

    Authors: Guochen Ma, Xiaopeng Jiao, Jianjun Mu, Hui Han, Yaming Yang

    Abstract: Marker code is an effective coding scheme to protect data from insertions and deletions. It has potential applications in future storage systems, such as DNA storage and racetrack memory. When decoding marker codes, perfect channel state information (CSI), i.e., insertion and deletion probabilities, are required to detect insertion and deletion errors. Sometimes, the perfect CSI is not easy to obt…

    Submitted 2 January, 2024; originally announced January 2024.

  25. arXiv:2312.08628  [pdf]

    cs.CV

    YOLO-OB: An improved anchor-free real-time multiscale colon polyp detector in colonoscopy

    Authors: Xiao Yang, Enmin Song, Guangzhi Ma, Yunfeng Zhu, Dongming Yu, Bowen Ding, Xianyuan Wang

    Abstract: Colon cancer is expected to become the second leading cause of cancer death in the United States in 2023. Although colonoscopy is one of the most effective methods for early prevention of colon cancer, up to 30% of polyps may be missed by endoscopists, thereby increasing patients' risk of developing colon cancer. Though deep neural networks have been proven to be an effective means of enhancing th…

    Submitted 13 December, 2023; originally announced December 2023.

  26. arXiv:2312.06718  [pdf, other]

    cs.AI

    Large Scale Foundation Models for Intelligent Manufacturing Applications: A Survey

    Authors: Haotian Zhang, Semujju Stuart Dereck, Zhicheng Wang, Xianwei Lv, Kang Xu, Liang Wu, Ye Jia, Jing Wu, Zhuo Long, Wensheng Liang, X. G. Ma, Ruiyan Zhuang

    Abstract: Although the applications of artificial intelligence especially deep learning had greatly improved various aspects of intelligent manufacturing, they still face challenges for wide employment due to the poor generalization ability, difficulties to establish high-quality training datasets, and unsatisfactory performance of deep learning methods. The emergence of large scale foundational models(LSFM…

    Submitted 22 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

  27. arXiv:2312.05657  [pdf, other]

    cs.LG cs.AI cs.PL cs.SE

    Leveraging Reinforcement Learning and Large Language Models for Code Optimization

    Authors: Shukai Duan, Nikos Kanakaris, Xiongye Xiao, Heng Ping, Chenyu Zhou, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Theodore L. Willke, Shahin Nazarian, Paul Bogdan

    Abstract: Code optimization is a daunting task that requires a significant level of expertise from experienced programmers. This level of expertise is not sufficient when compared to the rapid development of new hardware architectures. Towards advancing the whole code optimization process, recent approaches rely on machine learning and artificial intelligence techniques. This paper introduces a new framewor…

    Submitted 9 December, 2023; originally announced December 2023.

  28. arXiv:2311.18675  [pdf, other]

    cs.CV

    Cascaded Interaction with Eroded Deep Supervision for Salient Object Detection

    Authors: Hewen Xiao, Jie Mei, Guangfu Ma, Weiren Wu

    Abstract: Deep convolutional neural networks have been widely applied in salient object detection and have achieved remarkable results in this field. However, existing models suffer from information distortion caused by interpolation during up-sampling and down-sampling. In response to this drawback, this article starts from two directions in the network: feature and label. On the one hand, a novel cascaded…

    Submitted 30 November, 2023; originally announced November 2023.

  29. arXiv:2310.10095  [pdf, other]

    eess.IV cs.CV cs.LG

    A Multi-Scale Spatial Transformer U-Net for Simultaneously Automatic Reorientation and Segmentation of 3D Nuclear Cardiac Images

    Authors: Yangfan Ni, Duo Zhang, Gege Ma, Lijun Lu, Zhongke Huang, Wentao Zhu

    Abstract: Accurate reorientation and segmentation of the left ventricular (LV) is essential for the quantitative analysis of myocardial perfusion imaging (MPI), in which one critical step is to reorient the reconstructed transaxial nuclear cardiac images into standard short-axis slices for subsequent image processing. Small-scale LV myocardium (LV-MY) region detection and the diverse cardiac structures of i…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 17 pages, 7 figures

  30. arXiv:2310.10049  [pdf, other]

    cs.LG cs.AI

    FATE-LLM: A Industrial Grade Federated Learning Framework for Large Language Models

    Authors: Tao Fan, Yan Kang, Guoqiang Ma, Weijing Chen, Wenbin Wei, Lixin Fan, Qiang Yang

    Abstract: Large Language Models (LLMs), such as ChatGPT, LLaMA, GLM, and PaLM, have exhibited remarkable performances across various tasks in recent years. However, LLMs face two main challenges in real-world applications. One challenge is that training LLMs consumes vast computing resources, preventing LLMs from being adopted by small and medium-sized enterprises with limited computing resources. Another i…

    Submitted 16 October, 2023; originally announced October 2023.

  31. arXiv:2310.07418  [pdf, other]

    cs.LG cs.AI

    Revisiting Plasticity in Visual Reinforcement Learning: Data, Modules and Training Stages

    Authors: Guozheng Ma, Lu Li, Sen Zhang, Zixuan Liu, Zhen Wang, Yixin Chen, Li Shen, Xueqian Wang, Dacheng Tao

    Abstract: Plasticity, the ability of a neural network to evolve with new data, is crucial for high-performance and sample-efficient visual reinforcement learning (VRL). Although methods like resetting and regularization can potentially mitigate plasticity loss, the influences of various components within the VRL framework on the agent's plasticity are still poorly understood. In this work, we conduct a syst…

    Submitted 19 May, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: ICLR 2024 poster

  32. arXiv:2309.16178  [pdf, other]

    cs.SD eess.AS

    LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR

    Authors: Guodong Ma, Wenxuan Wang, Yuke Li, Yuting Yang, Binbin Du, Haoran Fu

    Abstract: Recently, to mitigate the confusion between different languages in code-switching (CS) automatic speech recognition (ASR), the conditionally factorized models, such as the language-aware encoder (LAE), explicitly disregard the contextual information between different languages. However, this information may be helpful for ASR modeling. To alleviate this issue, we propose the LAE-ST-MoE framework.…

    Submitted 7 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE ASRU 2023

  33. arXiv:2309.11166  [pdf, other]

    cs.CL cs.AI

    Are Large Language Models Really Robust to Word-Level Perturbations?

    Authors: Haoyu Wang, Guozheng Ma, Cong Yu, Ning Gui, Linrui Zhang, Zhiqi Huang, Suwei Ma, Yongzhe Chang, Sen Zhang, Li Shen, Xueqian Wang, Peilin Zhao, Dacheng Tao

    Abstract: The swift advancement in the scales and capabilities of Large Language Models (LLMs) positions them as promising tools for a variety of downstream tasks. In addition to the pursuit of better performance and the avoidance of violent feedback on a certain prompt, to ensure the responsibility of the LLM, much attention is drawn to the robustness of LLMs. However, existing evaluation methods mostly re…

    Submitted 27 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  34. arXiv:2309.09528  [pdf]

    cs.HC

    Gesture Recognition in Millimeter-Wave Radar Based on Spatio-Temporal Feature Sequences

    Authors: Qun Fang, YiHui Yan, GuoQing Ma

    Abstract: Gesture recognition is a pivotal technology in the realm of intelligent education, and millimeter-wave (mmWave) signals possess advantages such as high resolution and strong penetration capability. This paper introduces a highly accurate and robust gesture recognition method using mmWave radar. The method involves capturing the raw signals of hand movements with the mmWave radar module and preproc…

    Submitted 18 September, 2023; originally announced September 2023.

  35. arXiv:2309.08781  [pdf, ps, other]

    cs.GT

    Platform Equilibrium: Analayzing Social Welfare in Online Market Places

    Authors: Alon Eden, Gary Qiurui Ma, David C. Parkes

    Abstract: We introduce the theoretical study of a Platform Equilibrium in a market with unit-demand buyers and unit-supply sellers. Each seller can join a platform and transact with any buyer or remain off-platform and transact with a subset of buyers whom she knows. Given the constraints on trade, prices form a competitive equilibrium and clears the market. The platform charges a transaction fee to all on-…

    Submitted 20 June, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted at the Twenty-Fifth ACM Conference on Economics and Computation (EC'24), 2024

  36. arXiv:2309.02731  [pdf, other]

    cs.CL cs.AI

    HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus

    Authors: Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu

    Abstract: ChatGPT has gained significant interest due to its impressive performance, but people are increasingly concerned about its potential risks, particularly around the detection of AI-generated content (AIGC), which is often difficult for untrained humans to identify. Current datasets utilized for detecting ChatGPT-generated text primarily center around question-answering, yet they tend to disregard t…

    Submitted 25 January, 2024; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: This paper has been accepted by CIKM2023 workshop

  37. arXiv:2308.08285  [pdf, other]

    cs.IR cs.CL

    Pre-training with Large Language Model-based Document Expansion for Dense Passage Retrieval

    Authors: Guangyuan Ma, Xing Wu, Peng Wang, Zijia Lin, Songlin Hu

    Abstract: In this paper, we systematically study the potential of pre-training with Large Language Model(LLM)-based document expansion for dense passage retrieval. Concretely, we leverage the capabilities of LLMs for document expansion, i.e. query generation, and effectively transfer expanded knowledge to retrievers using pre-training strategies tailored for passage retrieval. These strategies include contr…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: 10 pages, 3 tables, 4 figures, under review

  38. arXiv:2308.04663  [pdf, other]

    eess.IV cs.CV cs.LG

    Classification of lung cancer subtypes on CT images with synthetic pathological priors

    Authors: Wentao Zhu, Yuan Jin, Gege Ma, Geng Chen, Jan Egger, Shaoting Zhang, Dimitris N. Metaxas

    Abstract: The accurate diagnosis on pathological subtypes for lung cancer is of significant importance for the follow-up treatments and prognosis managements. In this paper, we propose self-generating hybrid feature network (SGHF-Net) for accurately classifying lung cancer subtypes on computed tomography (CT) images. Inspired by studies stating that cross-scale associations exist in the image patterns betwe…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: 16 pages, 7 figures

    Journal ref: Medical Image Analysis 95, July 2024, 103199

  39. arXiv:2308.01531  [pdf]

    cs.SD eess.AS physics.app-ph

    Optimizing multi-user indoor sound communications with acoustic reconfigurable metasurfaces

    Authors: Hongkuan Zhang, Qiyuan Wang, Mathias Fink, Guancong Ma

    Abstract: Sound in indoor spaces forms a complex wavefield due to multiple scattering encountered by the sound. Indoor acoustic communication involving multiple sources and receivers thus inevitably suffers from cross-talks. Here, we demonstrate the isolation of acoustic communication channels in a room by wavefield shaping using acoustic reconfigurable metasurfaces (ARMs) controlled by optimization protoco…

    Submitted 10 February, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

    Journal ref: Nature Communications (2024)

  40. arXiv:2308.00920  [pdf]

    physics.med-ph cs.CV cs.LG eess.IV

    Virtual histological staining of unlabeled autopsy tissue

    Authors: Yuzhu Li, Nir Pillar, Jingxi Li, Tairan Liu, Di Wu, Songyu Sun, Guangdong Ma, Kevin de Haan, Luzhe Huang, Sepehr Hamidi, Anatoly Urisman, Tal Keidar Haran, William Dean Wallace, Jonathan E. Zuckerman, Aydogan Ozcan

    Abstract: Histological examination is a crucial step in an autopsy; however, the traditional histochemical staining of post-mortem samples faces multiple challenges, including the inferior staining quality due to autolysis caused by delayed fixation of cadaver tissue, as well as the resource-intensive nature of chemical staining procedures covering large tissue areas, which demand substantial labor, cost, a…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 24 Pages, 7 Figures

    Journal ref: Nature Communications (2024)

  41. arXiv:2307.05956  [pdf, other]

    cs.SD eess.AS

    Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition

    Authors: Wenxuan Wang, Guodong Ma, Yuke Li, Binbin Du

    Abstract: Multilingual speech recognition for both monolingual and code-switching speech is a challenging task. Recently, based on the Mixture of Experts (MoE), many works have made good progress in multilingual and code-switching ASR, but present huge computational complexity with the increase of supported languages. In this work, we propose a computation-efficient network named Language-Routing Mixture of…

    Submitted 13 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: To appear in Proc. INTERSPEECH 2023, August 20-24, 2023, Dublin, Ireland

  42. arXiv:2307.04024  [pdf, other

    cs.LG cs.CR

    Robust Ranking Explanations

    Authors: Chao Chen, Chenghua Guo, Guixiang Ma, Ming Zeng, Xi Zhang, Sihong Xie

    Abstract: Robust explanations of machine learning models are critical to establish human trust in the models. Due to limited cognition capability, most humans can only interpret the top few salient features. It is critical to make top salient features robust to adversarial attacks, especially those against the more vulnerable gradient-based explanations. Existing defense measures robustness using $\ell_p$-n…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: Accepted to IMLH (Interpretable ML in Healthcare) workshop at ICML 2023. arXiv admin note: substantial text overlap with arXiv:2212.14106

  43. arXiv:2306.12045  [pdf, other

    q-bio.NC cs.CV cs.LG cs.NE

    Temporal Conditioning Spiking Latent Variable Models of the Neural Response to Natural Visual Scenes

    Authors: Gehua Ma, Runhao Jiang, Rui Yan, Huajin Tang

    Abstract: Developing computational models of neural response is crucial for understanding sensory processing and neural computations. Current state-of-the-art neural network methods use temporal filters to handle temporal dependencies, resulting in an unrealistic and inflexible processing paradigm. Meanwhile, these methods target trial-averaged firing rates and fail to capture important features in spike tr…

    Submitted 19 December, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023 (https://openreview.net/forum?id=V4YeOvsQfu). 22 pages, 7 figures, 3 tables

  44. arXiv:2306.04357  [pdf, other

    cs.CL cs.AI

    Dial-MAE: ConTextual Masked Auto-Encoder for Retrieval-based Dialogue Systems

    Authors: Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu

    Abstract: Dialogue response selection aims to select an appropriate response from several candidates based on a given user and system utterance history. Most existing works primarily focus on post-training and fine-tuning tailored for cross-encoders. However, there are no post-training methods tailored for dense encoders in dialogue response selection. We argue that when the current language model, based on…

    Submitted 26 March, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: This paper has been accepted by NAACL 2024

  45. arXiv:2306.00656  [pdf, other

    cs.LG cs.AI

    Normalization Enhances Generalization in Visual Reinforcement Learning

    Authors: Lu Li, Jiafei Lyu, Guozheng Ma, Zilin Wang, Zhenjie Yang, Xiu Li, Zhiheng Li

    Abstract: Recent advances in visual reinforcement learning (RL) have led to impressive success in handling complex tasks. However, these methods have demonstrated limited generalization capability to visual disturbances, which poses a significant challenge for their real-world application and adaptability. Though normalization techniques have demonstrated huge success in supervised and unsupervised learning…

    Submitted 1 June, 2023; originally announced June 2023.

  46. arXiv:2305.16379  [pdf, other

    cs.LG cs.AI cs.CV

    Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning

    Authors: Guozheng Ma, Linrui Zhang, Haoyu Wang, Lu Li, Zilin Wang, Zhen Wang, Li Shen, Xueqian Wang, Dacheng Tao

    Abstract: Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms. Notably, employing simple observation transformations alone can yield outstanding performance without extra auxiliary representation tasks or pre-trained encoders. However, it remains unclear which attributes of DA account for its effectiveness in achieving sample-eff…

    Submitted 27 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 poster

  47. Exploiting Noise as a Resource for Computation and Learning in Spiking Neural Networks

    Authors: Gehua Ma, Rui Yan, Huajin Tang

    Abstract: $\textbf{Formal version available at}$ https://cell.com/patterns/fulltext/S2666-3899(23)00200-3 Networks of spiking neurons underpin the extraordinary information-processing capabilities of the brain and have become pillar models in neuromorphic artificial intelligence. Despite extensive research on spiking neural networks (SNNs), most studies are established on deterministic models, overlooking…

    Submitted 14 September, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 36 pages, 9 figures, 4 tables. Formal version available at https://cell.com/patterns/fulltext/S2666-3899(23)00200-3

  48. arXiv:2305.01524  [pdf, other

    cs.RO

    3D Laser-and-tissue Agnostic Data-driven Method for Robotic Laser Surgical Planning

    Authors: Guangshen Ma, Ravi Prakash, Brian Mann, Weston Ross, Patrick Codd

    Abstract: In robotic laser surgery, shape prediction of a one-shot ablation cavity is an important problem for minimizing errant overcutting of healthy tissue during the course of pathological tissue resection and precise tumor removal. Since it is difficult to physically model the laser-tissue interaction due to the variety of optical tissue properties, complicated process of heat transfer, and uncertaint…

    Submitted 2 May, 2023; originally announced May 2023.

  49. arXiv:2304.12633  [pdf, other

    cs.IR cs.CL

    PUNR: Pre-training with User Behavior Modeling for News Recommendation

    Authors: Guangyuan Ma, Hongtao Liu, Xing Wu, Wanhui Qian, Zhepeng Lv, Qing Yang, Songlin Hu

    Abstract: News recommendation aims to predict click behaviors based on user behaviors. How to effectively model the user representations is the key to recommending preferred news. Existing works are mostly focused on improvements in the supervised fine-tuning stage. However, there is still a lack of PLM-based unsupervised pre-training methods optimized for user representations. In this work, we propose an u…

    Submitted 30 October, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted by Findings of EMNLP23. Github Repo: https://github.com/ma787639046/punr

  50. arXiv:2304.10195  [pdf, other

    cs.CL cs.IR

    CoT-MoTE: Exploring ConTextual Masked Auto-Encoder Pre-training with Mixture-of-Textual-Experts for Passage Retrieval

    Authors: Guangyuan Ma, Xing Wu, Peng Wang, Songlin Hu

    Abstract: Passage retrieval aims to retrieve relevant passages from large collections of the open-domain corpus. Contextual Masked Auto-Encoding has been proven effective in representation bottleneck pre-training of a monolithic dual-encoder for passage retrieval. Siamese or fully separated dual-encoders are often adopted as basic retrieval architecture in the pre-training and fine-tuning stages for encodin…

    Submitted 20 April, 2023; originally announced April 2023.