Showing 1–50 of 482 results for author: Guo, S

Searching in archive cs.
  1. arXiv:2407.00014  [pdf]

    cs.RO eess.SY

    Simplifying Kinematic Parameter Estimation in sEMG Prosthetic Hands: A Two-Point Approach

    Authors: Gang Liu, Zhenxiang Wang, Ziyang He, Shanshan Guo, Rui Zhang, Dezhong Yao

    Abstract: Regression-based sEMG prosthetic hands are widely used for their ability to provide continuous kinematic parameters. However, establishing these models traditionally requires complex kinematic sensor systems to collect corresponding kinematic data in synchronization with EMG, which is cumbersome and user-unfriendly. This paper presents a simplified approach utilizing only two data points to depict…

    Submitted 1 May, 2024; originally announced July 2024.

    Comments: 13 pages

  2. arXiv:2406.19486  [pdf, other]

    cs.CL cs.AI cs.ET cs.LG eess.SP

    LoPT: Low-Rank Prompt Tuning for Parameter Efficient Language Models

    Authors: Shouchang Guo, Sonam Damani, Keng-hao Chang

    Abstract: In prompt tuning, a prefix or suffix text is added to the prompt, and the embeddings (soft prompts) or token indices (hard prompts) of the prefix/suffix are optimized to gain more control over language models for specific tasks. This approach eliminates the need for hand-crafted prompt engineering or explicit model fine-tuning. Prompt tuning is significantly more parameter-efficient than model fin…

    Submitted 27 June, 2024; originally announced June 2024.

  3. arXiv:2406.16878  [pdf, ps, other]

    eess.SP cs.AI cs.IT

    Benchmarking Semantic Communications for Image Transmission Over MIMO Interference Channels

    Authors: Yanhu Wang, Shuaishuai Guo, Anming Dong, Hui Zhao

    Abstract: Semantic communications offer promising prospects for enhancing data transmission efficiency. However, existing schemes have predominantly concentrated on point-to-point transmissions. In this paper, we aim to investigate the validity of this claim in interference scenarios compared to baseline approaches. Specifically, our focus is on general multiple-input multiple-output (MIMO) interference cha…

    Submitted 10 April, 2024; originally announced June 2024.

  4. arXiv:2406.14675  [pdf, other]

    cs.CV cs.AI cs.LG

    This Looks Better than That: Better Interpretable Models with ProtoPNeXt

    Authors: Frank Willard, Luke Moffett, Emmanuel Mokel, Jon Donnelly, Stark Guo, Julia Yang, Giyoung Kim, Alina Jade Barnett, Cynthia Rudin

    Abstract: Prototypical-part models are a popular interpretable alternative to black-box deep learning models for computer vision. However, they are difficult to train, with high sensitivity to hyperparameter tuning, inhibiting their application to new datasets and our understanding of which methods truly improve their performance. To facilitate the careful study of prototypical-part networks (ProtoPNets), w…

    Submitted 20 June, 2024; originally announced June 2024.

  5. arXiv:2406.14540  [pdf, other]

    cs.RO cs.AI cs.CV

    IRASim: Learning Interactive Real-Robot Action Simulators

    Authors: Fangqi Zhu, Hongtao Wu, Song Guo, Yuxiao Liu, Chilam Cheang, Tao Kong

    Abstract: Scalable robot learning in the real world is limited by the cost and safety issues of real robots. In addition, rolling out robot trajectories in the real world can be time-consuming and labor-intensive. In this paper, we propose to learn an interactive real-robot action simulator as an alternative. We introduce a novel method, IRASim, which leverages the power of generative models to generate ext…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Opensource, project website: https://gen-irasim.github.io

  6. arXiv:2406.14399  [pdf, other]

    cs.LG cs.CV physics.ao-ph stat.ML

    WEATHER-5K: A Large-scale Global Station Weather Dataset Towards Comprehensive Time-series Forecasting Benchmark

    Authors: Tao Han, Song Guo, Zhenghao Chen, Wanghan Xu, Lei Bai

    Abstract: Global Station Weather Forecasting (GSWF) is crucial for various sectors, including aviation, agriculture, energy, and disaster preparedness. Recent advancements in deep learning have significantly improved the accuracy of weather predictions by optimizing models based on public meteorological data. However, existing public datasets for GSWF optimization and benchmarking still suffer from signific…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 26 pages, 13 figures

  7. arXiv:2406.14302  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning

    Authors: Patrik Reizinger, Siyuan Guo, Ferenc Huszár, Bernhard Schölkopf, Wieland Brendel

    Abstract: Identifying latent representations or causal structures is important for good generalization and downstream task performance. However, both fields have been developed rather independently. We observe that several methods in both representation and causal structure learning rely on the same data-generating process (DGP), namely, exchangeable but not i.i.d. (independent and identically distributed)…

    Submitted 20 June, 2024; originally announced June 2024.

  8. arXiv:2406.14056  [pdf, other]

    cs.CV

    VGA: Vision GUI Assistant -- Minimizing Hallucinations through Image-Centric Fine-Tuning

    Authors: Ziyang Meng, Yu Dai, Zezheng Gong, Shaoxiong Guo, Minglong Tang, Tongquan Wei

    Abstract: Recent advances in Large Vision-Language Models (LVLMs) have significantly improved performance in image comprehension tasks, such as formatted charts and rich-content images. Yet, Graphical User Interfaces (GUIs) pose a greater challenge due to their structured format and detailed textual information. Existing LVLMs often overly depend on internal knowledge and neglect image content, resulting in ha…

    Submitted 21 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

    Comments: 18 pages

    MSC Class: 68-04 68-04 ACM Class: I.2.7; I.2.10

  9. arXiv:2406.13948  [pdf, other]

    cs.AI cs.CL cs.LG

    CityGPT: Empowering Urban Spatial Cognition of Large Language Models

    Authors: Jie Feng, Yuwei Du, Tianhui Liu, Siqi Guo, Yuming Lin, Yong Li

    Abstract: Large language models (LLMs) with powerful language generation and reasoning capabilities have already achieved success in many domains, e.g., math and code generation. However, due to the lack of physical-world corpus and knowledge during training, they usually fail to solve many real-life tasks in the urban space. In this paper, we propose CityGPT, a systematic framework for enhancing the ca…

    Submitted 19 June, 2024; originally announced June 2024.

  10. arXiv:2406.13945  [pdf, other]

    cs.AI cs.CL cs.LG

    CityBench: Evaluating the Capabilities of Large Language Model as World Model

    Authors: Jie Feng, Jun Zhang, Junbo Yan, Xin Zhang, Tianjian Ouyang, Tianhui Liu, Yuwei Du, Siqi Guo, Yong Li

    Abstract: Large language models (LLMs) with powerful generalization ability have been widely used in many domains. A systematic and reliable evaluation of LLMs is a crucial step in their development and applications, especially for specific professional fields. In the urban domain, there have been some early explorations about the usability of LLMs, but a systematic and scalable evaluation benchmark is still…

    Submitted 19 June, 2024; originally announced June 2024.

  11. arXiv:2406.12074  [pdf, other]

    cs.CL

    COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities

    Authors: Zihao He, Rebecca Dorn, Siyi Guo, Minh Duc Chu, Kristina Lerman

    Abstract: Social scientists use surveys to probe the opinions and beliefs of populations, but these methods are slow, costly, and prone to biases. Recent advances in large language models (LLMs) enable creating computational representations or "digital twins" of populations that generate human-like responses mimicking the population's language, styles, and attitudes. We introduce Community-Cross-Instruct, a…

    Submitted 17 June, 2024; originally announced June 2024.

  12. arXiv:2406.10506  [pdf, ps, other]

    cs.CY

    Validating an Instrument for Teachers' Acceptance of Artificial Intelligence in Education

    Authors: Shuchen Guo, Lehong Shi, Xiaoming Zhai

    Abstract: As artificial intelligence (AI) receives wider attention in education, examining teachers' acceptance of AI (TAAI) becomes essential. However, existing instruments measuring TAAI reported limited reliability and validity evidence and faced some design challenges, such as missing informed definitions of AI to participants. This study aimed to develop and validate a TAAI instrument, with providing s…

    Submitted 15 June, 2024; originally announced June 2024.

  13. arXiv:2406.08909  [pdf, other]

    cs.CV

    A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras

    Authors: Chenyang Shi, Shasha Guo, Boyi Wei, Hanxiao Liu, Yibo Zhang, Ningfang Song, Jing Jin

    Abstract: Event cameras are renowned for their high efficiency due to outputting a sparse, asynchronous stream of events. However, they are plagued by noisy events, especially in low light conditions. Denoising is an essential task for event cameras, but evaluating denoising performance is challenging. Label-dependent denoising metrics involve artificially adding noise to clean sequences, complicating evalu…

    Submitted 13 June, 2024; originally announced June 2024.

  14. arXiv:2406.08090  [pdf, other]

    cs.CV

    From Sim-to-Real: Toward General Event-based Low-light Frame Interpolation with Per-scene Optimization

    Authors: Ziran Zhang, Yongrui Ma, Yueting Chen, Feng Zhang, Jinwei Gu, Tianfan Xue, Shi Guo

    Abstract: Video Frame Interpolation (VFI) is important for video enhancement, frame rate up-conversion, and slow-motion generation. The introduction of event cameras, which capture per-pixel brightness changes asynchronously, has significantly enhanced VFI capabilities, particularly for high-speed, nonlinear motions. However, these event-based methods encounter challenges in low-light conditions, notably tr…

    Submitted 12 June, 2024; originally announced June 2024.

  15. arXiv:2406.06937  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation

    Authors: Zhengrui Ma, Qingkai Fang, Shaolei Zhang, Shoutao Guo, Yang Feng, Min Zhang

    Abstract: Simultaneous translation models play a crucial role in facilitating communication. However, existing research primarily focuses on text-to-text or speech-to-text models, necessitating additional cascade components to achieve speech-to-speech translation. These pipeline methods suffer from error propagation and accumulate delays in each cascade component, resulting in reduced synchronization betwee…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: ACL 2024; Codes and demos are at https://github.com/ictnlp/NAST-S2x

  16. arXiv:2406.06910  [pdf, other]

    cs.CL

    Agent-SiMT: Agent-assisted Simultaneous Machine Translation with Large Language Models

    Authors: Shoutao Guo, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng

    Abstract: Simultaneous Machine Translation (SiMT) generates target translations while reading the source sentence. It relies on a policy to determine the optimal timing for reading sentences and generating translations. Existing SiMT methods generally adopt the traditional Transformer architecture, which concurrently determines the policy and generates translations. While they excel at determining policies,…

    Submitted 12 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: 18 pages, 8 figures, 7 tables. v2 of arXiv:2402.13036

  17. arXiv:2406.03878  [pdf, other]

    cs.CL

    Decoder-only Streaming Transformer for Simultaneous Translation

    Authors: Shoutao Guo, Shaolei Zhang, Yang Feng

    Abstract: Simultaneous Machine Translation (SiMT) generates translation while reading source tokens, essentially producing the target prefix based on the source prefix. To achieve good performance, it leverages the relationship between source and target prefixes to exact a policy to guide the generation of translations. Although existing SiMT methods primarily focus on the Encoder-Decoder architecture, we e…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024. 14 pages, 10 Tables, 5 Figures

  18. arXiv:2406.03049  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning

    Authors: Shaolei Zhang, Qingkai Fang, Shoutao Guo, Zhengrui Ma, Min Zhang, Yang Feng

    Abstract: Simultaneous speech-to-speech translation (Simul-S2ST, a.k.a. streaming speech translation) outputs target speech while receiving streaming speech inputs, which is critical for real-time communication. Beyond accomplishing translation between speech, Simul-S2ST requires a policy to control the model to generate corresponding target speech at the opportune moment within speech inputs, thereby posing…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main conference, Project Page: https://ictnlp.github.io/StreamSpeech-site/

  19. arXiv:2406.02903  [pdf, other]

    cs.CL

    Open Grounded Planning: Challenges and Benchmark Construction

    Authors: Shiguang Guo, Ziliang Deng, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun

    Abstract: The emergence of large language models (LLMs) has increasingly drawn attention to the use of LLMs for human-like planning. Existing work on LLM-based planning either focuses on leveraging the inherent language generation capabilities of LLMs to produce free-style plans, or employs reinforcement learning approaches to learn decision-making for a limited set of actions within restricted environments…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 main conference

  20. arXiv:2406.01866  [pdf, other]

    cs.CL cs.CY cs.SI

    #EpiTwitter: Public Health Messaging During the COVID-19 Pandemic

    Authors: Ashwin Rao, Nazanin Sabri, Siyi Guo, Louiqa Raschid, Kristina Lerman

    Abstract: Effective communication during health crises is critical, with social media serving as a key platform for public health experts (PHEs) to engage with the public. However, it also amplifies pseudo-experts promoting contrarian views. Despite its importance, the role of emotional and moral language in PHEs' communication during COVID-19 remains underexplored. This study examines how PHEs and pseudo-…

    Submitted 10 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  21. arXiv:2406.01574  [pdf, other]

    cs.CL

    MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

    Authors: Yubo Wang, Xueguang Ma, Ge Zhang, Yuansheng Ni, Abhranil Chandra, Shiguang Guo, Weiming Ren, Aaran Arulraj, Xuan He, Ziyan Jiang, Tianle Li, Max Ku, Kai Wang, Alex Zhuang, Rongqi Fan, Xiang Yue, Wenhu Chen

    Abstract: In the age of large-scale language models, benchmarks like the Massive Multitask Language Understanding (MMLU) have been pivotal in pushing the boundaries of what AI can achieve in language comprehension and reasoning across diverse domains. However, as models continue to improve, their performance on these benchmarks has begun to plateau, making it increasingly difficult to discern differences in…

    Submitted 23 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  22. arXiv:2406.00894  [pdf, other]

    cs.LG cs.AI cs.CL

    Pretrained Hybrids with MAD Skills

    Authors: Nicholas Roberts, Samuel Guo, Zhiqi Gao, Satya Sai Srinath Namburi GNVV, Sonia Cromp, Chengjun Wu, Chengyu Duan, Frederic Sala

    Abstract: While Transformers underpin modern large language models (LMs), there is a growing list of alternative architectures with new capabilities, promises, and tradeoffs. This makes choosing the right LM architecture challenging. Recently-proposed $\textit{hybrid architectures}$ seek a best-of-all-worlds approach that reaps the benefits of all architectures. Hybrid design is difficult for two reasons: i…

    Submitted 2 June, 2024; originally announced June 2024.

  23. arXiv:2405.19327  [pdf, other]

    cs.CL cs.AI cs.LG

    MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

    Authors: Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin, Raven Yuan, Tuney Zheng, Wei Pang, Xinrun Du, Yiming Liang, Yinghao Ma, Yizhi Li, Ziyang Ma, Bill Lin, Emmanouil Benetos, Huan Yang, Junting Zhou, Kaijing Ma, Minghao Liu, Morry Niu, et al. (20 additional authors not shown)

    Abstract: Large Language Models (LLMs) have made great strides in recent years to achieve unprecedented performance across different tasks. However, due to commercial interest, the most competitive models like GPT, Gemini, and Claude have been gated behind proprietary interfaces without disclosing the training details. Recently, many institutions have open-sourced several strong LLMs like LLaMA-3, comparabl…

    Submitted 2 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: https://map-neo.github.io/

  24. arXiv:2405.18836  [pdf, other]

    stat.ME cs.LG

    Do Finetti: On Causal Effects for Exchangeable Data

    Authors: Siyuan Guo, Chi Zhang, Karthika Mohan, Ferenc Huszár, Bernhard Schölkopf

    Abstract: We study causal effect estimation in a setting where the data are not i.i.d. (independent and identically distributed). We focus on exchangeable data satisfying an assumption of independent causal mechanisms. Traditional causal effect estimation frameworks, e.g., relying on structural causal models and do-calculus, are typically limited to i.i.d. data and do not extend to more general exchangeable…

    Submitted 29 May, 2024; originally announced May 2024.

  25. arXiv:2405.16152  [pdf, other]

    cs.CV cs.HC

    SuDA: Support-based Domain Adaptation for Sim2Real Motion Capture with Flexible Sensors

    Authors: Jiawei Fang, Haishan Song, Chengxu Zuo, Xiaoxia Gao, Xiaowei Chen, Shihui Guo, Yipeng Qin

    Abstract: Flexible sensors hold promise for human motion capture (MoCap), offering advantages such as wearability, privacy preservation, and minimal constraints on natural movement. However, existing flexible sensor-based MoCap methods rely on deep learning and necessitate large and diverse labeled datasets for training. These data typically need to be collected in MoCap studios with specialized equipment a…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 20 pages conference, accepted ICML paper

  26. arXiv:2405.15485  [pdf, other]

    cs.AI cs.CL cs.LG

    Learning Beyond Pattern Matching? Assaying Mathematical Understanding in LLMs

    Authors: Siyuan Guo, Aniket Didolkar, Nan Rosemary Ke, Anirudh Goyal, Ferenc Huszár, Bernhard Schölkopf

    Abstract: We are beginning to see progress in language model assisted scientific discovery. Motivated by the use of LLMs as a general scientific assistant, this paper assesses the domain knowledge of LLMs through their understanding of different mathematical skills required to solve problems. In particular, we look at not just what the pre-trained model already knows, but how it learned to learn from informat…

    Submitted 24 May, 2024; originally announced May 2024.

  27. arXiv:2405.14744  [pdf, other]

    cs.CY

    Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View

    Authors: Xuan Liu, Jie Zhang, Song Guo, Haoyang Shang, Chengxu Yang, Quanyan Zhu

    Abstract: Large language models (LLMs) have been shown to face hallucination issues due to the data they are trained on often containing human bias; whether this is reflected in the decision-making process of LLM agents remains under-explored. As LLM Agents are increasingly employed in intricate social environments, a pressing and natural question emerges: Can LLM Agents leverage hallucinations to mirror human…

    Submitted 23 May, 2024; originally announced May 2024.

  28. arXiv:2405.13999  [pdf, other]

    cs.CV

    Computer-Vision-Enabled Worker Video Analysis for Motion Amount Quantification

    Authors: Hari Iyer, Neel Macwan, Shenghan Guo, Heejin Jeong

    Abstract: The performance of physical workers is significantly influenced by the quantity of their motions. However, monitoring and assessing these motions is challenging due to the complexities of motion sensing, tracking, and quantification. Recent advancements have utilized in-situ video analysis for real-time observation of worker behaviors, enabling data-driven quantification of motion amounts. Neverth…

    Submitted 22 May, 2024; originally announced May 2024.

  29. arXiv:2405.13080  [pdf, other]

    cs.CR cs.LG

    EmInspector: Combating Backdoor Attacks in Federated Self-Supervised Learning Through Embedding Inspection

    Authors: Yuwen Qian, Shuchi Wu, Kang Wei, Ming Ding, Di Xiao, Tao Xiang, Chuan Ma, Song Guo

    Abstract: Federated self-supervised learning (FSSL) has recently emerged as a promising paradigm that enables the exploitation of clients' vast amounts of unlabeled data while preserving data privacy. While FSSL offers advantages, its susceptibility to backdoor attacks, a concern identified in traditional federated supervised learning (FSL), has not been investigated. To fill the research gap, we undertake…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 18 pages, 12 figures

  30. arXiv:2405.12459  [pdf, other]

    cs.LG

    PLM4Traj: Cognizing Movement Patterns and Travel Purposes from Trajectories with Pre-trained Language Models

    Authors: Zeyu Zhou, Yan Lin, Haomin Wen, Shengnan Guo, Jilin Hu, Youfang Lin, Huaiyu Wan

    Abstract: Spatio-temporal trajectories play a vital role in various spatio-temporal data mining tasks. Developing a versatile trajectory learning approach that can adapt to different tasks while ensuring high accuracy is crucial. This requires effectively extracting movement patterns and travel purposes embedded in trajectories. However, this task is challenging due to limitations in the size and quality of…

    Submitted 20 May, 2024; originally announced May 2024.

  31. arXiv:2405.12205  [pdf, other]

    cs.AI cs.LG

    Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving

    Authors: Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora

    Abstract: Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and reasoning processes. Today's best LLMs clearly possess some reasoning processes. The paper gives evidence that they also have metacognitive knowledge, including ability to name skills and procedures to apply given a task. We explore this primarily in context of math reasoning, developing a prompt-guided interac…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review

  32. arXiv:2405.10132  [pdf, other]

    cs.CV

    Cooperative Visual-LiDAR Extrinsic Calibration Technology for Intersection Vehicle-Infrastructure: A review

    Authors: Xinyu Zhang, Yi** Xiong, Qianxin Qu, Renjie Wang, Xin Gao, **g Liu, Shichun Guo, Jun Li

    Abstract: In the typical urban intersection scenario, both vehicles and infrastructures are equipped with visual and LiDAR sensors. By successfully integrating the data from vehicle-side and road monitoring devices, a more comprehensive and accurate environmental perception and information acquisition can be achieved. The calibration of sensors, as an essential component of autonomous driving technology, ha…

    Submitted 16 May, 2024; originally announced May 2024.

  33. arXiv:2405.09552  [pdf, other]

    eess.IV cs.AI cs.CV

    ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

    Authors: Jiayi Wang, Yi-An Mao, Xiaoyu Ma, Sicen Guo, Yuting Shao, Xiao Lv, Wenting Han, Mark Christopher, Linda M. Zangwill, Yanlong Bi, Rui Fan

    Abstract: Optic nerve head (ONH) detection has been a crucial area of study in ophthalmology for years. However, the significant discrepancy between fundus image datasets, each generated using a single type of fundus camera, poses challenges to the generalizability of ONH detection approaches developed based on semantic segmentation networks. Despite the numerous recent advancements in general-purpose seman…

    Submitted 2 June, 2024; v1 submitted 15 April, 2024; originally announced May 2024.

  34. arXiv:2405.07409  [pdf, other]

    cs.NI

    ZBanner: Fast Stateless Scanning Capable of Obtaining Responses over TCP

    Authors: Chiyu Chen, Yuliang Lu, Guozheng Yang, Yi Xie, Shasha Guo

    Abstract: Fast large-scale network scanning is an important way to understand internet service configurations and security in real time, among which stateless scan is representative. Existing stateless scanners can perform single-packet scans for internet-wide network measurements but are limited to host discovery or port scanning. To obtain further information over TCP, slower stateful scanners must be use…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: The paper has been submitted and the code will be published later

  35. arXiv:2405.05499  [pdf, other]

    cs.LG cs.AI

    Multi-Scale Dilated Convolution Network for Long-Term Time Series Forecasting

    Authors: Feifei Li, Suhan Guo, Feng Han, Jian Zhao, Furao Shen

    Abstract: Accurate forecasting of long-term time series has important applications for decision making and planning. However, it remains challenging to capture the long-term dependencies in time series data. To better extract long-term dependencies, we propose Multi Scale Dilated Convolution Network (MSDCN), a method that utilizes a shallow dilated convolution architecture to capture the period and trend ch…

    Submitted 14 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  36. arXiv:2405.05275  [pdf, other]

    cs.SI cs.AI cs.IR

    SoMeR: Multi-View User Representation Learning for Social Media

    Authors: Siyi Guo, Keith Burghardt, Valeria Pantè, Kristina Lerman

    Abstract: User representation learning aims to capture user preferences, interests, and behaviors in low-dimensional vector representations. These representations have widespread applications in recommendation systems and advertising; however, existing methods typically rely on specific features like text content, activity patterns, or platform metadata, failing to holistically model user behavior across di…

    Submitted 2 May, 2024; originally announced May 2024.

  37. arXiv:2405.03376  [pdf, other]

    cs.LG cs.CV

    CRA5: Extreme Compression of ERA5 for Portable Global Climate and Weather Research via an Efficient Variational Transformer

    Authors: Tao Han, Zhenghao Chen, Song Guo, Wanghan Xu, Lei Bai

    Abstract: The advent of data-driven weather forecasting models, which learn from hundreds of terabytes (TB) of reanalysis data, has significantly advanced forecasting capabilities. However, the substantial costs associated with data storage and transmission present a major challenge for data providers and users, affecting resource-constrained researchers and limiting their accessibility to participate in AI…

    Submitted 7 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: Main text and supplementary, 22 pages, 13 figures

  38. arXiv:2405.01221  [pdf, other]

    cs.NI

    A Survey on Semantic Communication Networks: Architecture, Security, and Privacy

    Authors: Shaolong Guo, Yuntao Wang, Ning Zhang, Zhou Su, Tom H. Luan, Zhiyi Tian, Xuemin Shen

    Abstract: Semantic communication, emerging as a breakthrough beyond the classical Shannon paradigm, aims to convey the essential meaning of source data rather than merely focusing on precise yet content-agnostic bit transmission. By interconnecting diverse intelligent agents (e.g., autonomous vehicles and VR devices) via semantic communications, semantic communication networks (SemComNet) support seman…

    Submitted 2 May, 2024; originally announced May 2024.

  39. arXiv:2404.19141  [pdf, other]

    cs.LG

    Micro-Macro Spatial-Temporal Graph-based Encoder-Decoder for Map-Constrained Trajectory Recovery

    Authors: Tonglong Wei, Youfang Lin, Yan Lin, Shengnan Guo, Lan Zhang, Huaiyu Wan

    Abstract: Recovering intermediate missing GPS points in a sparse trajectory, while adhering to the constraints of the road network, could offer deep insights into users' moving behaviors in intelligent transportation systems. Although recent studies have demonstrated the advantages of achieving map-constrained trajectory recovery in an end-to-end manner, they still face two significant challenges. Firstly,…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted as a regular paper at IEEE TKDE

  40. arXiv:2404.12006  [pdf, other]

    cs.CL

    Variational Multi-Modal Hypergraph Attention Network for Multi-Modal Relation Extraction

    Authors: Qian Li, Cheng Ji, Shu Guo, Yong Zhao, Qianren Mao, Shangguang Wang, Yuntao Wei, Jianxin Li

    Abstract: Multi-modal relation extraction (MMRE) is a challenging task that aims to identify relations between entities in text leveraging image information. Existing methods are limited by their neglect of the multiple entity pairs in one sentence sharing very similar contextual information (i.e., the same text and image), resulting in increased difficulty in the MMRE task. To address this limitation, we pro…

    Submitted 18 April, 2024; originally announced April 2024.

  41. arXiv:2404.09681  [pdf, other]

    cs.CR

    An Empirical Study of Open Edge Computing Platforms: Ecosystem, Usage, and Security Risks

    Authors: Yu Bi, Mingshuo Yang, Yong Fang, Xianghang Mi, Shanqing Guo, Shujun Tang, Haixin Duan

    Abstract: Emerging in recent years, open edge computing platforms (OECPs) claim large-scale edge nodes, the extensive usage and adoption, as well as the openness to any third parties to join as edge nodes. For instance, OneThingCloud, a major OECP operated in China, advertises 5 million edge nodes, 70TB bandwidth, and 1,500PB storage. However, little information is publicly available for such OECPs with reg…

    Submitted 15 April, 2024; originally announced April 2024.

  42. arXiv:2404.06779  [pdf, other]

    cs.CV cs.GR

    Efficient and Scalable Chinese Vector Font Generation via Component Composition

    Authors: **yu Song, Weitao You, Shuhui Shi, Shuxuan Guo, Lingyun Sun, Wei Wang

    Abstract: Chinese vector font generation is challenging due to the complex structure and huge number of Chinese characters. Recent advances remain limited to generating a small set of characters with simple structure. In this work, we first observe that most Chinese characters can be disassembled into frequently-reused components. Therefore, we introduce the first efficient and scalable Chinese vector font…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 15 pages, 23 figures

  43. arXiv:2404.06393  [pdf, other]

    cs.SD cs.AI eess.AS

    MuPT: A Generative Symbolic Music Pretrained Transformer

    Authors: Xingwei Qu, Yuelin Bai, Yinghao Ma, Ziya Zhou, Ka Man Lo, Jiaheng Liu, Ruibin Yuan, Lejun Min, Xueling Liu, Tianyu Zhang, Xinrun Du, Shuyue Guo, Yiming Liang, Yizhi Li, Shangda Wu, Junting Zhou, Tianyu Zheng, Ziyang Ma, Fengze Han, Wei Xue, Gus Xia, Emmanouil Benetos, Xiang Yue, Chenghua Lin, Xu Tan, et al. (4 additional authors not shown)

    Abstract: In this paper, we explore the application of Large Language Models (LLMs) to the pre-training of music. While the prevalent use of MIDI in music modeling is well-established, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths, thereby enhancing the model's performance in musical composition. To address the chal…

    Submitted 10 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  44. arXiv:2404.05576  [pdf, other]

    cs.LG

    Dynamic Backtracking in GFlowNets: Enhancing Decision Steps with Reward-Dependent Adjustment Mechanisms

    Authors: Shuai Guo, Jielei Chu, Lei Zhu, Zhaoyu Li, Tianrui Li

    Abstract: Generative Flow Networks (GFlowNets or GFNs) are probabilistic models predicated on Markov flows, and they employ specific amortization algorithms to learn stochastic policies that generate compositional substances including biomolecules, chemical materials, etc. With a strong ability to generate high-performance biochemical molecules, GFNs accelerate the discovery of scientific substances, effect…

    Submitted 13 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  45. arXiv:2404.04575  [pdf, other]

    cs.LG cs.AI math.OC

    To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO

    Authors: Zi-Hao Qiu, Siqi Guo, Mao Xu, Tuo Zhao, Lijun Zhang, Tianbao Yang

    Abstract: The temperature parameter plays a profound role during training and/or inference with large foundation models (LFMs) such as large language models (LLMs) and CLIP models. Particularly, it adjusts the logits in the softmax function in LLMs, which is crucial for next token generation, and it scales the similarities in the contrastive loss for training CLIP models. A significant question remains: Is…

    Submitted 16 June, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: 41 pages, 10 figures, accepted by ICML 2024

  46. arXiv:2404.04286  [pdf, other]

    cs.CL cs.AI cs.LG

    Language Model Evolution: An Iterated Learning Perspective

    Authors: Yi Ren, Shangmin Guo, Linlu Qiu, Bailin Wang, Danica J. Sutherland

    Abstract: With the widespread adoption of Large Language Models (LLMs), the prevalence of iterative interactions among these models is anticipated to increase. Notably, recent advancements in multi-round self-improving methods allow LLMs to generate new examples for training subsequent models. At the same time, multi-agent LLM systems, involving automated interactions among agents, are also increasing in pr…

    Submitted 3 April, 2024; originally announced April 2024.

  47. arXiv:2404.03912  [pdf, other]

    cs.CL cs.AI

    Forget NLI, Use a Dictionary: Zero-Shot Topic Classification for Low-Resource Languages with Application to Luxembourgish

    Authors: Fred Philippy, Shohreh Haddadan, Siwen Guo

    Abstract: In NLP, zero-shot classification (ZSC) is the task of assigning labels to textual data without any labeled examples for the target classes. A common method for ZSC is to fine-tune a language model on a Natural Language Inference (NLI) dataset and then use it to infer the entailment between the input document and the target labels. However, this approach faces certain challenges, particularly for l…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 3rd Annual Meeting of the ELRA/ISCA Special Interest Group on Under-resourced Languages (SIGUL 2024)

  48. arXiv:2404.03543  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

    Authors: Jiawei Guo, Ziming Li, Xueling Liu, Kai**g Ma, Tianyu Zheng, Zhouliang Yu, Ding Pan, Yizhi LI, Ruibo Liu, Yue Wang, Shuyue Guo, Xingwei Qu, Xiang Yue, Ge Zhang, Wenhu Chen, Jie Fu

    Abstract: Large Language Models (LLMs) for code are rapidly evolving, with code editing emerging as a critical capability. We introduce CodeEditorBench, an evaluation framework designed to rigorously assess the performance of LLMs in code editing tasks, including debugging, translating, polishing, and requirement switching. Unlike existing benchmarks focusing solely on code generation, CodeEditorBench empha…

    Submitted 6 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  49. arXiv:2404.02616  [pdf, other]

    cs.IR cs.CL

    Improving Topic Relevance Model by Mix-structured Summarization and LLM-based Data Augmentation

    Authors: Yizhu Liu, Ran Tao, Shengyu Guo, Yifan Yang

    Abstract: Topic relevance between query and document is a very important part of social search, which can evaluate the degree of matching between document and user's requirement. In most social search scenarios such as Dian**, modeling search relevance always faces two challenges. One is that many documents in social search are very long and have much redundant information. The other is that the training… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

  50. arXiv:2404.01923  [pdf, other]

    cs.CL cs.AI

    SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation

    Authors: Shasha Guo, Lizi Liao, **g Zhang, Yanling Wang, Cui** Li, Hong Chen

    Abstract: Knowledge base question generation (KBQG) aims to generate natural language questions from a set of triplet facts extracted from KB. Existing methods have significantly boosted the performance of KBQG via pre-trained language models (PLMs) thanks to the richly endowed semantic knowledge. With the advance of pre-training techniques, large language models (LLMs) (e.g., GPT-3.5) undoubtedly possess m…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted by NAACL 2024 Findings