Skip to main content

Showing 1–50 of 499 results for author: Kang, J

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.00978  [pdf, other

    cs.AI cs.LG

    Hybrid RAG-empowered Multi-modal LLM for Secure Healthcare Data Management: A Diffusion-based Contract Theory Approach

    Authors: Cheng Su, **bo Wen, Jiawen Kang, Yonghua Wang, Hudan Pan, M. Shamim Hossain

    Abstract: Secure data management and effective data sharing have become paramount in the rapidly evolving healthcare landscape. The advancement of generative artificial intelligence has positioned Multi-modal Large Language Models (MLLMs) as crucial tools for managing healthcare data. MLLMs can support multi-modal inputs and generate diverse types of content by leveraging large-scale training on vast amount… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 12 pages, 6 figures

  2. arXiv:2406.18864  [pdf, other

    cs.CV

    Learning Modality Knowledge Alignment for Cross-Modality Transfer

    Authors: Wenxuan Ma, Shuang Li, Lincan Cai, **gxuan Kang

    Abstract: Cross-modality transfer aims to leverage large pretrained models to complete tasks that may not belong to the modality of pretraining data. Existing works achieve certain success in extending classical finetuning to cross-modal scenarios, yet we still lack understanding about the influence of modality gap on the transfer. In this work, a series of experiments focusing on the source representation… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  3. arXiv:2406.18763  [pdf, other

    cs.LG cs.AI

    Conformalized Link Prediction on Graph Neural Networks

    Authors: Tianyi Zhao, Jian Kang, Lu Cheng

    Abstract: Graph Neural Networks (GNNs) excel in diverse tasks, yet their applications in high-stakes domains are often hampered by unreliable predictions. Although numerous uncertainty quantification methods have been proposed to address this limitation, they often lack \textit{rigorous} uncertainty estimates. This work makes the first attempt to introduce a distribution-free and model-agnostic uncertainty… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  4. arXiv:2406.16030  [pdf, other

    cs.CL cs.AI

    Zero-Shot Cross-Lingual NER Using Phonemic Representations for Low-Resource Languages

    Authors: Jimin Sohn, Haeji Jung, Alex Cheng, Jooeon Kang, Yilin Du, David R. Mortensen

    Abstract: Existing zero-shot cross-lingual NER approaches require substantial prior knowledge of the target language, which is impractical for low-resource languages. In this paper, we propose a novel approach to NER using phonemic representation based on the International Phonetic Alphabet (IPA) to bridge the gap between representations of different languages. Our experiments show that our method significa… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 7 pages, 5 figures, 5 tables

  5. arXiv:2406.15664  [pdf, other

    stat.ML cs.LG

    Flat Posterior Does Matter For Bayesian Transfer Learning

    Authors: Sungjun Lim, Jeyoon Yeom, Sooyon Kim, Hoyoon Byun, **ho Kang, Yohan Jung, Jiyoung Jung, Kyungwoo Song

    Abstract: The large-scale pre-trained neural network has achieved notable success in enhancing performance for downstream tasks. Another promising approach for generalization is Bayesian Neural Network (BNN), which integrates Bayesian methods into neural network architectures, offering advantages such as Bayesian Model averaging (BMA) and uncertainty quantification. Despite these benefits, transfer learning… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  6. arXiv:2406.14333  [pdf, other

    cs.IR cs.SD eess.AS

    LARP: Language Audio Relational Pre-training for Cold-Start Playlist Continuation

    Authors: Rebecca Salganik, Xiaohao Liu, Yunshan Ma, Jian Kang, Tat-Seng Chua

    Abstract: As online music consumption increasingly shifts towards playlist-based listening, the task of playlist continuation, in which an algorithm suggests songs to extend a playlist in a personalized and musically cohesive manner, has become vital to the success of music streaming. Currently, many existing playlist continuation approaches rely on collaborative filtering methods to perform recommendation.… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.13964  [pdf, other

    cs.NI

    Hierarchical Micro-Segmentations for Zero-Trust Services via Large Language Model (LLM)-enhanced Graph Diffusion

    Authors: Yinqiu Liu, Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin Shen

    Abstract: In the rapidly evolving Next-Generation Networking (NGN) era, the adoption of zero-trust architectures has become increasingly crucial to protect security. However, provisioning zero-trust services in NGNs poses significant challenges, primarily due to the environmental complexity and dynamics. Motivated by these challenges, this paper explores efficient zero-trust service provisioning using hiera… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 13 pages

  8. arXiv:2406.12034  [pdf, other

    cs.CL cs.LG

    Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

    Authors: Junmo Kang, Leonid Karlinsky, Hongyin Luo, Zhen Wang, Jacob Hansen, James Glass, David Cox, Rameswar Panda, Rogerio Feris, Alan Ritter

    Abstract: We present Self-MoE, an approach that transforms a monolithic LLM into a compositional, modular system of self-specialized experts, named MiXSE (MiXture of Self-specialized Experts). Our approach leverages self-specialization, which constructs expert modules using self-generated synthetic data, each equipped with a shared base LLM and incorporating self-optimized routing. This allows for dynamic a… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  9. arXiv:2406.11208  [pdf

    cs.NI

    Privacy-preserving Pseudonym Schemes for Personalized 3D Avatars in Mobile Social Metaverses

    Authors: Cheng Su, Xiaofeng Luo, Zhenmou Liu, Jiawen Kang, Min Hao, Zehui Xiong, Zhaohui Yang, Chongwen Huang

    Abstract: The emergence of mobile social metaverses, a novel paradigm bridging physical and virtual realms, has led to the widespread adoption of avatars as digital representations for Social Metaverse Users (SMUs) within virtual spaces. Equipped with immersive devices, SMUs leverage Edge Servers (ESs) to deploy their avatars and engage with other SMUs in virtual spaces. To enhance immersion, SMUs incline t… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 6pages, 4 figures

  10. arXiv:2406.10671  [pdf

    cs.CL

    Augmenting Biomedical Named Entity Recognition with General-domain Resources

    Authors: Yu Yin, Hyunjae Kim, Xiao Xiao, Chih Hsuan Wei, Jaewoo Kang, Zhiyong Lu, Hua Xu, Meng Fang, Qingyu Chen

    Abstract: Training a neural network-based biomedical named entity recognition (BioNER) model usually requires extensive and costly human annotations. While several studies have employed multi-task learning with multiple BioNER datasets to reduce human effort, this approach does not consistently yield performance improvements and may introduce label ambiguity in different biomedical corpora. We aim to tackle… ▽ More

    Submitted 18 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: We make data, codes, and models publicly available via https://github.com/qingyu-qc/bioner_gerbera

  11. arXiv:2406.10382  [pdf, other

    cs.AI cs.CL

    Efficient Prompting for LLM-based Generative Internet of Things

    Authors: Bin Xiao, Burak Kantarci, Jiawen Kang, Dusit Niyato, Mohsen Guizani

    Abstract: Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently. Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network s… ▽ More

    Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures

  12. arXiv:2406.09003  [pdf, other

    cs.CV cs.LG

    Enhancing Cross-Modal Fine-Tuning with Gradually Intermediate Modality Generation

    Authors: Lincan Cai, Shuang Li, Wenxuan Ma, **gxuan Kang, Binhui Xie, Zixun Sun, Chengwei Zhu

    Abstract: Large-scale pretrained models have proven immensely valuable in handling data-intensive modalities like text and image. However, fine-tuning these models for certain specialized modalities, such as protein sequence and cosmic ray, poses challenges due to the significant modality discrepancy and scarcity of labeled data. In this paper, we propose an end-to-end method, PaRe, to enhance cross-modal f… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  13. arXiv:2406.08809  [pdf, other

    cs.SD cs.AI eess.AS

    Are we there yet? A brief survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges

    Authors: Jaeyong Kang, Dorien Herremans

    Abstract: Deep learning models for music have advanced drastically in the last few years. But how good are machine learning models at capturing emotion these days and what challenges are researchers facing? In this paper, we provide a comprehensive overview of the available music-emotion datasets and discuss evaluation standards as well as competitions in the field. We also provide a brief overview of vario… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  14. arXiv:2406.05914  [pdf, other

    eess.AS cs.SD eess.SP

    Soundscape Captioning using Sound Affective Quality Network and Large Language Model

    Authors: Yuanbo Hou, Qiaoqiao Ren, Andrew Mitchell, Wenwu Wang, Jian Kang, Tony Belpaeme, Dick Botteldooren

    Abstract: We live in a rich and varied acoustic world, which is experienced by individuals or communities as a soundscape. Computational auditory scene analysis, disentangling acoustic scenes by detecting and classifying events, focuses on objective attributes of sounds, such as their category and temporal characteristics, ignoring the effect of sounds on people and failing to explore the relationship betwe… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/Yuanbo2020/SoundSCaper

  15. arXiv:2406.05422  [pdf, other

    cs.AI cs.RO

    Diffusion-based Reinforcement Learning for Dynamic UAV-assisted Vehicle Twins Migration in Vehicular Metaverses

    Authors: Yongju Tong, Jiawen Kang, Junlong Chen, Minrui Xu, Gaolei Li, Weiting Zhang, Xincheng Yan

    Abstract: Air-ground integrated networks can relieve communication pressure on ground transportation networks and provide 6G-enabled vehicular Metaverses services offloading in remote areas with sparse RoadSide Units (RSUs) coverage and downtown areas where users have a high demand for vehicular services. Vehicle Twins (VTs) are the digital twins of physical vehicles to enable more immersive and realistic v… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

  16. arXiv:2406.05418  [pdf, other

    cs.AI cs.NI

    Multi-attribute Auction-based Resource Allocation for Twins Migration in Vehicular Metaverses: A GPT-based DRL Approach

    Authors: Yongju Tong, Junlong Chen, Minrui Xu, Jiawen Kang, Zehui Xiong, Dusit Niyato, Chau Yuen, Zhu Han

    Abstract: Vehicular Metaverses are developed to enhance the modern automotive industry with an immersive and safe experience among connected vehicles and roadside infrastructures, e.g., RoadSide Units (RSUs). For seamless synchronization with virtual spaces, Vehicle Twins (VTs) are constructed as digital representations of physical entities. However, resource-intensive VTs updating and high mobility of vehi… ▽ More

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 16 pages, 6 figures, 3 tables

  17. arXiv:2406.00014  [pdf, other

    cs.DB cs.AI cs.CL cs.IR

    KU-DMIS at EHRSQL 2024:Generating SQL query via question templatization in EHR

    Authors: Hajung Kim, Chanhwi Kim, Hoonick Lee, Kyochul Jang, Jiwoo Lee, Kyungjae Lee, Gangwoo Kim, Jaewoo Kang

    Abstract: Transforming natural language questions into SQL queries is crucial for precise data retrieval from electronic health record (EHR) databases. A significant challenge in this process is detecting and rejecting unanswerable questions that request information beyond the database's scope or exceed the system's capabilities. In this paper, we introduce a novel text-to-SQL framework that robustly handle… ▽ More

    Submitted 19 June, 2024; v1 submitted 21 May, 2024; originally announced June 2024.

    Comments: Published at ClinicalNLP workshop @ NAACL 2024

  18. arXiv:2405.20568  [pdf, other

    cs.LG cs.NI

    Generative AI for Deep Reinforcement Learning: Framework, Analysis, and Use Cases

    Authors: Geng Sun, Wenwen Xie, Dusit Niyato, Fang Mei, Jiawen Kang, Hongyang Du, Shiwen Mao

    Abstract: As a form of artificial intelligence (AI) technology based on interactive learning, deep reinforcement learning (DRL) has been widely applied across various fields and has achieved remarkable accomplishments. However, DRL faces certain limitations, including low sample efficiency and poor generalization. Therefore, we present how to leverage generative AI (GAI) to address these issues above and en… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  19. arXiv:2405.20567  [pdf, other

    cs.RO

    Fast Decentralized State Estimation for Legged Robot Locomotion via EKF and MHE

    Authors: Jiarong Kang, Yi Wang, Xiaobin Xiong

    Abstract: In this paper, we present a fast and decentralized state estimation framework for the control of legged locomotion. The nonlinear estimation of the floating base states is decentralized to an orientation estimation via Extended Kalman Filter (EKF) and a linear velocity estimation via Moving Horizon Estimation (MHE). The EKF fuses the inertia sensor with vision to estimate the floating base orienta… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

  20. arXiv:2405.16265  [pdf, other

    cs.LG

    MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time

    Authors: Jikun Kang, Xin Zhe Li, Xi Chen, Amirreza Kazemi, Qianyi Sun, Boxing Chen, Dong Li, Xu He, Quan He, Feng Wen, Jianye Hao, Jun Yao

    Abstract: Although Large Language Models (LLMs) achieve remarkable performance across various tasks, they often struggle with complex reasoning tasks, such as answering mathematical questions. Recent efforts to address this issue have primarily focused on leveraging mathematical datasets through supervised fine-tuning or self-improvement techniques. However, these methods often depend on high-quality datase… ▽ More

    Submitted 26 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  21. arXiv:2405.14222  [pdf, other

    cs.LG cs.CV eess.IV

    RAQ-VAE: Rate-Adaptive Vector-Quantized Variational Autoencoder

    Authors: Jiwan Seo, Joonhyuk Kang

    Abstract: Vector Quantized Variational AutoEncoder (VQ-VAE) is an established technique in machine learning for learning discrete representations across various modalities. However, its scalability and applicability are limited by the need to retrain the model to adjust the codebook for different data or model scales. We introduce the Rate-Adaptive VQ-VAE (RAQ-VAE) framework, which addresses this challenge… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Under review

  22. arXiv:2405.12701  [pdf, other

    cs.CL cs.AI

    OLAPH: Improving Factuality in Biomedical Long-form Question Answering

    Authors: Minbyul Jeong, Hyeon Hwang, Chanwoong Yoon, Taewhoo Lee, Jaewoo Kang

    Abstract: In the medical domain, numerous scenarios necessitate the long-form generation ability of large language models (LLMs). Specifically, when addressing patients' questions, it is essential that the model's response conveys factual claims, highlighting the need for an automated method to evaluate those claims. Thus, we introduce MedLFQA, a benchmark dataset reconstructed using long-form question-answ… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

  23. arXiv:2405.12472  [pdf, ps, other

    cs.NI

    Optimizing Generative AI Networking: A Dual Perspective with Multi-Agent Systems and Mixture of Experts

    Authors: Ruichen Zhang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, ** Zhang, Dong In Kim

    Abstract: In the continued development of next-generation networking and artificial intelligence content generation (AIGC) services, the integration of multi-agent systems (MAS) and the mixture of experts (MoE) frameworks is becoming increasingly important. Motivated by this, this article studies the contrasting and converging of MAS and MoE in AIGC-enabled networking. First, we discuss the architectural de… ▽ More

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  24. arXiv:2405.11473  [pdf, other

    cs.CV cs.AI

    FIFO-Diffusion: Generating Infinite Videos from Text without Training

    Authors: Jihwan Kim, Junoh Kang, **young Choi, Bohyung Han

    Abstract: We propose a novel inference technique based on a pretrained diffusion model for text-conditional video generation. Our approach, called FIFO-Diffusion, is conceptually capable of generating infinitely long videos without additional training. This is achieved by iteratively performing diagonal denoising, which concurrently processes a series of consecutive frames with increasing noise levels in a… ▽ More

    Submitted 12 June, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: Project Page: https://jjihwan.github.io/projects/FIFO-Diffusion

  25. arXiv:2405.04907  [pdf, other

    cs.NI

    Empowering Wireless Networks with Artificial Intelligence Generated Graph

    Authors: Jiacheng Wang, Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Haibo Zhou, Dong In Kim

    Abstract: In wireless communications, transforming network into graphs and processing them using deep learning models, such as Graph Neural Networks (GNNs), is one of the mainstream network optimization approaches. While effective, the generative AI (GAI) shows stronger capabilities in graph analysis, processing, and generation, than conventional methods such as GNN, offering a broader exploration space for… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  26. arXiv:2405.04198  [pdf, other

    cs.CR

    Enhancing Physical Layer Communication Security through Generative AI with Mixture of Experts

    Authors: Changyuan Zhao, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim, Xuemin, Shen, Khaled B. Letaief

    Abstract: AI technologies have become more widely adopted in wireless communications. As an emerging type of AI technologies, the generative artificial intelligence (GAI) gains lots of attention in communication security. Due to its powerful learning ability, GAI models have demonstrated superiority over conventional AI methods. However, GAI still has several limitations, including high computational comple… ▽ More

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures

  27. arXiv:2405.00523  [pdf, other

    cs.AI cs.CL

    CookingSense: A Culinary Knowledgebase with Multidisciplinary Assertions

    Authors: Donghee Choi, Mogan Gim, Donghyeon Park, Mujeen Sung, Hyunjae Kim, Jaewoo Kang, Jihun Choi

    Abstract: This paper introduces CookingSense, a descriptive collection of knowledge assertions in the culinary domain extracted from various sources, including web data, scientific papers, and recipes, from which knowledge covering a broad range of aspects is acquired. CookingSense is constructed through a series of dictionary-based filtering and language model-based semantic filtering techniques, which res… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: LREC-COLING 2024 Accepted

  28. arXiv:2404.18881  [pdf, other

    cs.HC cs.LG cs.SE

    Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking

    Authors: Hong ** Kang, Fabrice Harel-Canada, Muhammad Ali Gulzar, Violet Peng, Miryung Kim

    Abstract: Data augmentation techniques apply transformations to existing texts to generate additional data. The transformations may produce low-quality texts, where the meaning of the text is changed and the text may even be mangled beyond human comprehension. Analyzing the synthetically generated texts and their corresponding labels is slow and demanding. To winnow out texts with incorrect labels, we devel… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 Findings

  29. arXiv:2404.18756  [pdf, other

    cs.SE cs.PL

    K-CIRCT: A Layered, Composable, and Executable Formal Semantics for CIRCT Hardware IRs

    Authors: Jianhong Zhao, **hui Kang, Yongwang Zhao

    Abstract: CIRCT, an open-source EDA framework akin to LLVM for software, is a foundation for various hardware description languages. Despite its crucial role, CIRCT's lack of formal semantics challenges necessary rigorous hardware verification. Thus, this paper introduces K-CIRCT, the first formal semantics in {\K} for a substantial CIRCT subset adequate for simulating a RISC-V processor. Our semantics are… ▽ More

    Submitted 29 April, 2024; originally announced April 2024.

  30. arXiv:2404.18077  [pdf, other

    cs.NI cs.LG

    Generative AI for Low-Carbon Artificial Intelligence of Things

    Authors: **bo Wen, Ruichen Zhang, Dusit Niyato, Jiawen Kang, Hongyang Du, Yang Zhang, Zhu Han

    Abstract: By integrating Artificial Intelligence (AI) with the Internet of Things (IoT), Artificial Intelligence of Things (AIoT) has revolutionized many fields. However, AIoT is facing the challenges of energy consumption and carbon emissions due to the continuous advancement of mobile technology. Fortunately, Generative AI (GAI) holds immense potential to reduce carbon emissions of AIoT due to its excelle… ▽ More

    Submitted 28 April, 2024; originally announced April 2024.

  31. arXiv:2404.16947  [pdf, other

    cs.SE

    Fuzzing MLIR by Synthesizing Custom Mutations

    Authors: Ben Limpanukorn, Jiyuan Wang, Hong ** Kang, Eric Zitong Zhou, Miryung Kim

    Abstract: Multi-Level Intermediate Representation (MLIR) is an effort to enable faster compiler development by providing an extensible framework for downstream developers to define custom IRs with MLIR dialects. MLIR dialects define new IRs that are tailored for specific domains. The diversity and rapid evolution of these IRs make it impractical to pre-define custom generator logic for every available diale… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  32. arXiv:2404.16356  [pdf, other

    cs.NI cs.AI cs.LG

    Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey

    Authors: Minrui Xu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Yuguang Fang, Dong In Kim, Xuemin, Shen

    Abstract: Generative AI (GAI) can enhance the cognitive, reasoning, and planning capabilities of intelligent modules in the Internet of Vehicles (IoV) by synthesizing augmented datasets, completing sensor data, and making sequential decisions. In addition, the mixture of experts (MoE) can enable the distributed and collaborative execution of AI models without performance degradation between connected vehicl… ▽ More

    Submitted 25 April, 2024; originally announced April 2024.

  33. arXiv:2404.15292  [pdf, other

    eess.SP cs.IT

    Multi-objective Optimization for Multi-UAV-assisted Mobile Edge Computing

    Authors: Geng Sun, Yixian Wang, Zemin Sun, Qingqing Wu, Jiawen Kang, Dusit Niyato, Victor C. M. Leung

    Abstract: Recent developments in unmanned aerial vehicles (UAVs) and mobile edge computing (MEC) have provided users with flexible and resilient computing services. However, meeting the computing-intensive and latency-sensitive demands of users poses a significant challenge due to the limited resources of UAVs. To address this challenge, we present a multi-objective optimization approach for multi-UAV-assis… ▽ More

    Submitted 23 March, 2024; originally announced April 2024.

  34. arXiv:2404.14687  [pdf, other

    cs.MM cs.AI cs.CL cs.CV

    Pegasus-v1 Technical Report

    Authors: Raehyuk Jung, Hyojun Go, Jaehyuk Yi, Jiho Jang, Daniel Kim, Jay Suh, Aiden Lee, Cooper Han, Jae Lee, Jeff Kim, **-Young Kim, Junwan Kim, Kyle Park, Lucas Lee, Mars Ha, Minjoon Seo, Abraham Jo, Ed Park, Hassan Kianinejad, SJ Kim, Tony Moon, Wade Jeong, Andrei Popescu, Esther Kim, EK Yoon , et al. (19 additional authors not shown)

    Abstract: This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language. Pegasus-1 is designed to address the unique challenges posed by video data, such as interpreting spatiotemporal information, to offer nuanced video content comprehension across various lengths. This technical report overviews Pegasus-1's archi… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  35. arXiv:2404.13919  [pdf, other

    cs.CL cs.AI cs.HC

    Navigating the Path of Writing: Outline-guided Text Generation with Large Language Models

    Authors: Yukyung Lee, Soonwon Ka, Bokyung Son, Pilsung Kang, Jaewook Kang

    Abstract: Large Language Models (LLMs) have significantly impacted the writing process, enabling collaborative content creation and enhancing productivity. However, generating high-quality, user-aligned text remains challenging. In this paper, we propose Writing Path, a framework that uses explicit outlines to guide LLMs in generating goal-oriented, high-quality pieces of writing. Our approach draws inspira… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: under review

  36. arXiv:2404.13898  [pdf, other

    cs.NI

    Cross-Modal Generative Semantic Communications for Mobile AIGC: Joint Semantic Encoding and Prompt Engineering

    Authors: Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Shiwen Mao, ** Zhang, Xuemin Shen

    Abstract: Employing massive Mobile AI-Generated Content (AIGC) Service Providers (MASPs) with powerful models, high-quality AIGC services can become accessible for resource-constrained end users. However, this advancement, referred to as mobile AIGC, also introduces a significant challenge: users should download large AIGC outputs from the MASPs, leading to substantial bandwidth consumption and potential tr… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

  37. arXiv:2404.13289  [pdf, other

    cs.CL cs.MM cs.SD eess.AS

    Double Mixture: Towards Continual Event Detection from Speech

    Authors: **gqi Kang, Tongtong Wu, **ming Zhao, Guitao Wang, Yinwei Wei, Hao Yang, Guilin Qi, Yuan-Fang Li, Gholamreza Haffari

    Abstract: Speech event detection is crucial for multimedia retrieval, involving the tagging of both semantic and acoustic events. Traditional ASR systems often overlook the interplay between these events, focusing solely on content, even though the interpretation of dialogue can vary with environmental context. This paper tackles two primary challenges in speech event detection: the continual integration of… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: The first two authors contributed equally to this work

  38. Estimating the Hessian Matrix of Ranking Objectives for Stochastic Learning to Rank with Gradient Boosted Trees

    Authors: **gwei Kang, Maarten de Rijke, Harrie Oosterhuis

    Abstract: Stochastic learning to rank (LTR) is a recent branch in the LTR field that concerns the optimization of probabilistic ranking models. Their probabilistic behavior enables certain ranking qualities that are impossible with deterministic models. For example, they can increase the diversity of displayed documents, increase fairness of exposure over documents, and better balance exploitation and explo… ▽ More

    Submitted 9 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: SIGIR2024 conference Short paper track

  39. arXiv:2404.10556  [pdf, other

    cs.NI eess.SP

    Generative AI for Advanced UAV Networking

    Authors: Geng Sun, Wenwen Xie, Dusit Niyato, Hongyang Du, Jiawen Kang, **g Wu, Sumei Sun, ** Zhang

    Abstract: With the impressive achievements of chatGPT and Sora, generative artificial intelligence (GAI) has received increasing attention. Not limited to the field of content generation, GAI is also widely used to solve the problems in wireless communication scenarios due to its powerful learning and generalization capabilities. Therefore, we discuss key applications of GAI in improving unmanned aerial veh… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

  40. arXiv:2404.09699  [pdf, other

    cs.GT

    Generative AI for Game Theory-based Mobile Networking

    Authors: Long He, Geng Sun, Dusit Niyato, Hongyang Du, Fang Mei, Jiawen Kang, Mérouane Debbah, and Zhu Han

    Abstract: With the continuous advancement of network technology, various emerging complex networking optimization problems opened up a wide range of applications utilizating of game theory. However, since game theory is a mathematical framework, game theory-based solutions often require the experience and knowledge of human experts. Recently, the remarkable advantages exhibited by generative artificial inte… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

  41. arXiv:2404.09134  [pdf, ps, other

    cs.NI cs.LG

    Generative AI Agents with Large Language Model for Satellite Networks via a Mixture of Experts Transmission

    Authors: Ruichen Zhang, Hongyang Du, Yinqiu Liu, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Dong In Kim

    Abstract: In response to the needs of 6G global communications, satellite communication networks have emerged as a key solution. However, the large-scale development of satellite communication networks is constrained by the complex system models, whose modeling is challenging for massive users. Moreover, transmission interference between satellites and users seriously affects communication performance. To s… ▽ More

    Submitted 29 June, 2024; v1 submitted 13 April, 2024; originally announced April 2024.

    Comments: 15 pages, 10 figures

  42. arXiv:2404.08899  [pdf, other

    cs.NI

    ProSecutor: Protecting Mobile AIGC Services on Two-Layer Blockchain via Reputation and Contract Theoretic Approaches

    Authors: Yinqiu Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Xuemin, Shen

    Abstract: Mobile AI-Generated Content (AIGC) has achieved great attention in unleashing the power of generative AI and scaling the AIGC services. By employing numerous Mobile AIGC Service Providers (MASPs), ubiquitous and low-latency AIGC services for clients can be realized. Nonetheless, the interactions between clients and MASPs in public mobile networks, pertaining to three key mechanisms, namely MASP se… ▽ More

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: 17 pages

  43. arXiv:2404.07450  [pdf, other

    cs.NI cs.NE

    Collaborative Ground-Space Communications via Evolutionary Multi-objective Deep Reinforcement Learning

    Authors: Jiahui Li, Geng Sun, Qingqing Wu, Dusit Niyato, Jiawen Kang, Abbas Jamalipour, Victor C. M. Leung

    Abstract: In this paper, we propose a distributed collaborative beamforming (DCB)-based uplink communication paradigm for enabling ground-space direct communications. Specifically, DCB treats the terminals that are unable to establish efficient direct connections with the low Earth orbit (LEO) satellites as distributed antennas, forming a virtual antenna array to enhance the terminal-to-satellite uplink ach… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: This paper has been submitted to IEEE Journal on Selected Areas in Communications

  44. arXiv:2404.07405  [pdf, other

    cs.CV

    Simplifying Two-Stage Detectors for On-Device Inference in Remote Sensing

    Authors: Jaemin Kang, Hoeseok Yang, Hyungshin Kim

    Abstract: Deep learning has been successfully applied to object detection from remotely sensed images. Images are typically processed on the ground rather than on-board due to the computation power of the ground system. Such offloaded processing causes delays in acquiring target mission information, which hinders its application to real-time use cases. For on-device object detection, researches have been co… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

  45. arXiv:2404.04937  [pdf, other

    cs.CR cs.GT

    Optimizing Information Propagation for Blockchain-empowered Mobile AIGC: A Graph Attention Network Approach

    Authors: Jiana Liao, **bo Wen, Jiawen Kang, Yang Zhang, Jianbo Du, Qihao Li, Weiting Zhang, Dong Yang

    Abstract: Artificial Intelligence-Generated Content (AIGC) is a rapidly evolving field that utilizes advanced AI algorithms to generate content. Through integration with mobile edge networks, mobile AIGC networks have gained significant attention, which can provide real-time customized and personalized AIGC services and products. Since blockchains can facilitate decentralized and transparent data management… ▽ More

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.13237

  46. arXiv:2404.03321  [pdf, other

    cs.NI

    Fusion of Mixture of Experts and Generative Artificial Intelligence in Mobile Edge Metaverse

    Authors: Guangyuan Liu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Shiwen Mao, Dong In Kim

    Abstract: In the digital transformation era, Metaverse offers a fusion of virtual reality (VR), augmented reality (AR), and web technologies to create immersive digital experiences. However, the evolution of the Metaverse is slowed down by the challenges of content creation, scalability, and dynamic user interaction. Our study investigates an integration of Mixture of Experts (MoE) models with Generative Ar… ▽ More

    Submitted 4 April, 2024; originally announced April 2024.

  47. arXiv:2404.03139  [pdf, other

    cs.LG cs.SI

    Theoretical and Empirical Insights into the Origins of Degree Bias in Graph Neural Networks

    Authors: Arjun Subramonian, Jian Kang, Yizhou Sun

    Abstract: Graph Neural Networks (GNNs) often perform better for high-degree nodes than low-degree nodes on node classification tasks. This degree bias can reinforce social marginalization by, e.g., sidelining authors of lowly-cited papers when predicting paper topics in citation networks. While researchers have proposed numerous hypotheses for why GNN degree bias occurs, we find via a survey of 38 degree bi… ▽ More

    Submitted 3 April, 2024; originally announced April 2024.

  48. arXiv:2404.01954  [pdf, other

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seong** Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han , et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t… ▽ More

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  49. arXiv:2404.01583  [pdf, other

    cs.NI

    Defining Problem from Solutions: Inverse Reinforcement Learning (IRL) and Its Applications for Next-Generation Networking

    Authors: Yinqiu Liu, Ruichen Zhang, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Dong In Kim

    Abstract: Performance optimization is a critical concern in networking, on which Deep Reinforcement Learning (DRL) has achieved great success. Nonetheless, DRL training relies on precisely defined reward functions, which formulate the optimization objective and indicate the positive/negative progress towards the optimal. With the ever-increasing environmental complexity and human participation in Next-Gener… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 9 pages

  50. arXiv:2404.01503  [pdf, other

    cs.AI

    Some Orders Are Important: Partially Preserving Orders in Top-Quality Planning

    Authors: Michael Katz, Junkyu Lee, Jungkoo Kang, Shirin Sohrabi

    Abstract: The ability to generate multiple plans is central to using planning in real-life applications. Top-quality planners generate sets of such top-cost plans, allowing flexibility in determining equivalent ones. In terms of the order between actions in a plan, the literature only considers two extremes -- either all orders are important, making each plan unique, or all orders are unimportant, treating… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: To appear at SoCS 2024