Showing 51–100 of 396 results for author: Song, K

  1. arXiv:2401.04549  [pdf, ps, other]

    math.AP

    Riesz potential estimates for mixed local-nonlocal problems with measure data

    Authors: Iwona Chlebicka, Kyeong Song, Yeonghun Youn, Anna Zatorska-Goldstein

    Abstract: We study gradient regularity for mixed local-nonlocal problems modelled upon \[ -\Delta_p u + (-\Delta_p)^s u = \mu \qquad \text{for} \quad 2-\tfrac{1}{n}<p<\infty \quad \text{and} \quad s\in(0,1), \] where $\mu$ is a bounded Borel measure. We prove pointwise bounds for the gradient $Du$ in terms of the truncated 1-Riesz potential of $\mu$.

    Submitted 9 January, 2024; originally announced January 2024.

  2. arXiv:2401.03601  [pdf, other]

    cs.CL cs.AI

    InFoBench: Evaluating Instruction Following Ability in Large Language Models

    Authors: Yiwei Qin, Kaiqiang Song, Yebowen Hu, Wenlin Yao, Sangwoo Cho, Xiaoyang Wang, Xuansheng Wu, Fei Liu, Pengfei Liu, Dong Yu

    Abstract: This paper introduces the Decomposed Requirements Following Ratio (DRFR), a new metric for evaluating Large Language Models' (LLMs) ability to follow instructions. Addressing a gap in current methodologies, DRFR breaks down complex instructions into simpler criteria, facilitating a detailed analysis of LLMs' compliance with various aspects of tasks. Alongside this metric, we present InFoBench, a b…

    Submitted 7 January, 2024; originally announced January 2024.

  3. Nonlinear Rydberg exciton-polaritons in Cu$_2$O microcavities

    Authors: Maxim Makhonin, Anthonin Delphan, Kok Wee Song, Paul Walker, Tommi Isoniemi, Peter Claronino, Konstantinos Orfanakis, Sai Kiran Rajendran, Hamid Ohadi, Julian Heckötter, Marc Aßmann, Manfred Bayer, Alexander Tartakovskii, Maurice Skolnick, Oleksandr Kyriienko, Dmitry Krizhanovskii

    Abstract: Rydberg excitons (analogues of Rydberg atoms in condensed matter systems) are highly excited bound electron-hole states with large Bohr radii. The interaction between them as well as exciton coupling to light may lead to strong optical nonlinearity, with applications in sensing and quantum information processing. Here, we achieve strong effective photon-photon interactions (Kerr-like optical nonli…

    Submitted 14 March, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

  4. arXiv:2401.01721  [pdf, other]

    cs.IT eess.SP

    Limited Feedback on Measurements: Sharing a Codebook or a Generative Model?

    Authors: Nurettin Turan, Benedikt Fesl, Michael Joham, Zhengxiang Ma, Anthony C. K. Soong, Baoling Sheen, Weimin Xiao, Wolfgang Utschick

    Abstract: Discrete Fourier transform (DFT) codebook-based solutions are well-established for limited feedback schemes in frequency division duplex (FDD) systems. In recent years, data-aided solutions have been shown to achieve higher performance, enabled by the adaptivity of the feedback scheme to the propagation environment of the base station (BS) cell. In particular, a versatile limited feedback scheme u…

    Submitted 3 January, 2024; originally announced January 2024.

  5. arXiv:2312.16457  [pdf, other]

    cs.CV cs.GR

    City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web

    Authors: Kaiwen Song, Xiaoyi Zeng, Chenqu Ren, Juyong Zhang

    Abstract: Existing neural radiance field-based methods can achieve real-time rendering of small scenes on the web platform. However, extending these methods to large-scale scenes still poses significant challenges due to limited resources in computation, memory, and bandwidth. In this paper, we propose City-on-Web, the first method for real-time rendering of large-scale scenes on the web. We propose a block…

    Submitted 31 March, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: Project page: https://ustc3dv.github.io/City-on-Web/

  6. arXiv:2312.14746  [pdf, ps, other]

    cs.SE

    ESBMC v7.4: Harnessing the Power of Intervals

    Authors: Rafael Menezes, Mohannad Aldughaim, Bruno Farias, Xianzhiyu Li, Edoardo Manino, Fedor Shmarov, Kunjian Song, Franz Brauße, Mikhail R. Gadelha, Norbert Tihanyi, Konstantin Korovin, Lucas C. Cordeiro

    Abstract: ESBMC implements many state-of-the-art techniques for model checking. We report on new and improved features that allow us to obtain verification results for previously unsupported programs and properties. ESBMC employs a new static interval analysis of expressions in programs to increase verification performance. This includes interval-based reasoning over booleans and integers, forward and backw…

    Submitted 22 December, 2023; originally announced December 2023.

  7. arXiv:2312.14511  [pdf]

    cs.RO eess.SY

    3D Programming of Patterned Heterogeneous Interface for 4D Smart Robotics

    Authors: Kewei Song, Chunfeng Xiong, Ze Zhang, Kunlin Wu, Weiyang Wan, Yifan Wang, Shinjiro Umezu, Hirotaka Sato

    Abstract: Shape memory structures are playing an important role in many cutting-edge intelligent fields. However, the existing technologies can only realize 4D printing of a single polymer or metal, which limits practical applications. Here, we report a construction strategy for TSMP/M heterointerface, which uses Pd2+-containing shape memory polymer (AP-SMR) to induce electroless plating reaction and relies…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 37 Pages, 11 Figures

  8. arXiv:2312.13714  [pdf, other]

    cs.CV

    Bootstrap Masked Visual Modeling via Hard Patches Mining

    Authors: Haochen Wang, Junsong Fan, Yuxi Wang, Kaiyou Song, Tiancai Wang, Xiangyu Zhang, Zhaoxiang Zhang

    Abstract: Masked visual modeling has attracted much attention due to its promising potential in learning generalizable representations. Typical approaches urge models to predict specific contents of masked tokens, which can be intuitively considered as teaching a student (the model) to solve given problems (predicting masked contents). Under such settings, the performance is highly correlated with mask stra…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2304.05919

  9. arXiv:2312.11927  [pdf, other]

    cs.LG cs.SI stat.ME

    Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery

    Authors: Pengwei Yan, Kaisong Song, Zhuoren Jiang, Yangyang Kang, Tianqianjin Lin, Changlong Sun, Xiaozhong Liu

    Abstract: While self-supervised graph pretraining techniques have shown promising results in various domains, their application still experiences challenges of limited topology learning, human knowledge dependency, and incompetent multi-level interactions. To address these issues, we propose a novel solution, Dual-level Graph self-supervised Pretraining with Motif discovery (DGPM), which introduces a unique…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 14 pages, 6 figures, accepted by AAAI'24

  10. arXiv:2312.10457  [pdf, ps, other]

    cs.CV

    Semantic-Aware Autoregressive Image Modeling for Visual Representation Learning

    Authors: Kaiyou Song, Shan Zhang, Tong Wang

    Abstract: The development of autoregressive modeling (AM) in computer vision lags behind natural language processing (NLP) in self-supervised pre-training. This is mainly caused by the challenge that images are not sequential signals and lack a natural order when applying autoregressive modeling. In this study, inspired by human beings' way of grasping an image, i.e., focusing on the main object first, we p…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI2024

  11. arXiv:2312.08618  [pdf, other]

    cs.CL

    Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention

    Authors: Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu

    Abstract: This paper introduces a novel approach to enhance the capabilities of Large Language Models (LLMs) in processing and understanding extensive text sequences, a critical aspect in applications requiring deep comprehension and synthesis of large volumes of information. Recognizing the inherent challenges in extending the context window for LLMs, primarily built on Transformer architecture, we propose…

    Submitted 13 December, 2023; originally announced December 2023.

  12. arXiv:2312.06168  [pdf, other]

    cs.RO

    Motion Planning for Multiple Mobile Manipulator System in Complex Flipping Manipulation

    Authors: Wenhang Liu, Kun Song, Meng Ren, Jiawei Hu, Michael Yu Wang, Zhenhua Xiong

    Abstract: Multiple robot systems are favored for object manipulation and transportation, especially for large objects. However, in more complex manipulation such as flipping, these systems encounter a new challenge, configuration disconnectivity of manipulators. Grasping objects by manipulators will impose closed-chain constraints on the system, which in turn limits the feasible motions of manipulators and…

    Submitted 11 December, 2023; originally announced December 2023.

  13. arXiv:2312.05757  [pdf, ps, other]

    cs.LG cs.AI cs.DL cs.SI stat.ME

    Towards Human-like Perception: Learning Structural Causal Model in Heterogeneous Graph

    Authors: Tianqianjin Lin, Kaisong Song, Zhuoren Jiang, Yangyang Kang, Weikang Yuan, Xurui Li, Changlong Sun, Cui Huang, Xiaozhong Liu

    Abstract: Heterogeneous graph neural networks have become popular in various domains. However, their generalizability and interpretability are limited due to the discrepancy between their inherent inference flows and human reasoning logic or underlying causal relationships for the learning problem. This study introduces a novel solution, HG-SCM (Heterogeneous Graph as Structural Causal Model). It can mimic…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 28 pages, 10 figures, 6 tables, accepted by Information Processing & Management

    Journal ref: Information Processing & Management, 60 (2024) 1-21

  14. arXiv:2312.05274  [pdf, other]

    cs.LG cs.CV

    Target to Source: Guidance-Based Diffusion Model for Test-Time Adaptation

    Authors: Kaiyu Song, Hanjiang Lai

    Abstract: Most recent works of test-time adaptation (TTA) aim to alleviate domain shift problems by re-training source classifiers in each domain. On the other hand, the emergence of the diffusion model provides another solution to TTA, which directly maps the test data from the target domain to the source domain based on a diffusion model pre-trained in the source domain. The source classifier does not nee…

    Submitted 7 December, 2023; originally announced December 2023.

  15. arXiv:2312.04802  [pdf, other]

    cs.CV

    MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean Diffusion Model

    Authors: Kaiyu Song, Hanjiang Lai

    Abstract: Deep neural networks (DNNs) are vulnerable to adversarial perturbation, where an imperceptible perturbation is added to the image that can fool the DNNs. Diffusion-based adversarial purification focuses on using the diffusion model to generate a clean image against such adversarial attacks. Unfortunately, the generative process of the diffusion model is also inevitably affected by adversarial pert…

    Submitted 7 December, 2023; originally announced December 2023.

  16. arXiv:2312.04789  [pdf, other]

    cs.DC cs.OS

    Lightweight Frequency-Based Tiering for CXL Memory Systems

    Authors: Kevin Song, Jiacheng Yang, Sihang Liu, Gennady Pekhimenko

    Abstract: Modern workloads are demanding increasingly larger memory capacity. Compute Express Link (CXL)-based memory tiering has emerged as a promising solution for addressing this trend by utilizing traditional DRAM alongside slow-tier CXL-memory devices in the same system. Unfortunately, most prior tiering systems are recency-based, which cannot accurately identify hot and cold pages, since a recently ac…

    Submitted 7 December, 2023; originally announced December 2023.

  17. arXiv:2311.18760  [pdf, other]

    cs.CL cs.AI

    TaskBench: Benchmarking Large Language Models for Task Automation

    Authors: Yongliang Shen, Kaitao Song, Xu Tan, Wenqi Zhang, Kan Ren, Siyu Yuan, Weiming Lu, Dongsheng Li, Yueting Zhuang

    Abstract: Recently, the incredible progress of large language models (LLMs) has ignited the spark of task automation, which decomposes the complex tasks described by user instructions into sub-tasks, and invokes external tools to execute them, and plays a central role in autonomous agents. However, there lacks a systematic and standardized benchmark to foster the development of LLMs in task automation. To t…

    Submitted 9 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

  18. arXiv:2311.10774  [pdf, other]

    cs.CL cs.AI

    MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning

    Authors: Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu

    Abstract: With the rapid development of large language models (LLMs) and their integration into large multimodal models (LMMs), there has been impressive progress in zero-shot completion of user-oriented vision-language tasks. However, a gap remains in the domain of chart image understanding due to the distinct abstract components in charts. To address this, we introduce a large-scale MultiModal Chart Instr…

    Submitted 15 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024

  19. arXiv:2311.04732  [pdf, other]

    cond-mat.mtrl-sci physics.comp-ph

    General-purpose machine-learned potential for 16 elemental metals and their alloys

    Authors: Keke Song, Rui Zhao, Jiahui Liu, Yanzhou Wang, Eric Lindgren, Yong Wang, Shunda Chen, Ke Xu, Ting Liang, Penghua Ying, Nan Xu, Zhiqiang Zhao, Jiuyang Shi, Junjie Wang, Shuang Lyu, Zezhu Zeng, Shirong Liang, Haikuan Dong, Ligang Sun, Yue Chen, Zhuhua Zhang, Wanlin Guo, Ping Qian, Jian Sun, Paul Erhart, et al. (3 additional authors not shown)

    Abstract: Machine-learned potentials (MLPs) have exhibited remarkable accuracy, yet the lack of general-purpose MLPs for a broad spectrum of elements and their alloys limits their applicability. Here, we present a feasible approach for constructing a unified general-purpose MLP for numerous elements, demonstrated through a model (UNEP-v1) for 16 elemental metals and their alloys. To achieve a complete repre…

    Submitted 12 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Main text with 17 pages and 8 figures; supplementary with 26 figures and 4 tables; source code and training/test data available

  20. arXiv:2311.01723  [pdf, other]

    cs.CV cs.AI

    Towards Calibrated Robust Fine-Tuning of Vision-Language Models

    Authors: Changdae Oh, Hyesu Lim, Mijoo Kim, Dongyoon Han, Sangdoo Yun, Jaegul Choo, Alexander Hauptmann, Zhi-Qi Cheng, Kyungwoo Song

    Abstract: Improving out-of-distribution (OOD) generalization through in-distribution (ID) adaptation is a primary goal of robust fine-tuning methods beyond the naive fine-tuning approach. However, despite decent OOD generalization performance from recent robust fine-tuning methods, OOD confidence calibration for reliable machine learning has not been fully addressed. This work proposes a robust fine-tuning…

    Submitted 27 May, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: Presented at the NeurIPS 2023 Workshop on Distribution Shifts (DistShift)

  21. arXiv:2310.16869  [pdf]

    eess.IV physics.optics

    Single-pixel imaging based on deep learning

    Authors: Kai Song, Yaoxing Bian, Ku Wu, Hongrui Liu, Shuangping Han, Jiaming Li, Jiazhao Tian, Chengbin Qin, Jianyong Hu, Liantuan Xiao

    Abstract: Single-pixel imaging can collect images at the wavelengths outside the reach of conventional focal plane array detectors. However, the limited image quality and lengthy computational times for iterative reconstruction still impede the practical application of single-pixel imaging. Recently, deep learning has been introduced into single-pixel imaging, which has attracted a lot of attention due to i…

    Submitted 16 November, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

  22. arXiv:2310.15105  [pdf, other]

    cs.CV

    FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning

    Authors: Kun Song, Huimin Ma, Bochao Zou, Huishuai Zhang, Weiran Huang

    Abstract: Due to the limited availability of data, existing few-shot learning methods trained from scratch fail to achieve satisfactory performance. In contrast, large-scale pre-trained models such as CLIP demonstrate remarkable few-shot and zero-shot capabilities. To enhance the performance of pre-trained models for downstream tasks, fine-tuning the model on downstream data is frequently necessary. However…

    Submitted 17 November, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  23. arXiv:2310.13899  [pdf, other]

    cs.RO

    FHT-Map: Feature-based Hierarchical Topological Map for Relocalization and Path Planning

    Authors: Kun Song, Wenhang Liu, Gaoming Chen, Xiang Xu, Zhenhua Xiong

    Abstract: Topological maps are favorable for their small storage compared to geometric maps. However, they are limited in relocalization and path planning capabilities. To solve this problem, a feature-based hierarchical topological map (FHT-Map) is proposed along with a real-time map construction algorithm for robot exploration. Specifically, the FHT-Map utilizes both RGB cameras and LiDAR information and c…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 8 pages, 7 figures, 2 tables

  24. arXiv:2310.11954  [pdf, other]

    cs.CL cs.MM eess.AS

    MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

    Authors: Dingyao Yu, Kaitao Song, Peiling Lu, Tianyu He, Xu Tan, Wei Ye, Shikun Zhang, Jiang Bian

    Abstract: AI-empowered music processing is a diverse field that encompasses dozens of tasks, ranging from generation tasks (e.g., timbre synthesis) to comprehension tasks (e.g., music classification). For developers and amateurs, it is very difficult to grasp all of these tasks to satisfy their requirements in music processing, especially considering the huge differences in the representations of music data…

    Submitted 25 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  25. arXiv:2310.09158  [pdf, other]

    cs.AI

    Learning To Teach Large Language Models Logical Reasoning

    Authors: Meiqi Chen, Yubo Ma, Kaitao Song, Yixin Cao, Yan Zhang, Dongsheng Li

    Abstract: Large language models (LLMs) have gained enormous attention from both academia and industry, due to their exceptional ability in language generation and extremely powerful generalization. However, current LLMs still output unreliable content in practical reasoning tasks due to their inherent issues (e.g., hallucination). To better disentangle this problem, in this paper, we conduct an in-depth inv…

    Submitted 13 October, 2023; originally announced October 2023.

  26. arXiv:2310.06833  [pdf, other]

    cond-mat.stat-mech cond-mat.soft

    Milestoning estimators of dissipation in systems observed at a coarse resolution: When ignorance is truly bliss

    Authors: Kristian Blom, Kevin Song, Etienne Vouga, Aljaž Godec, Dmitrii E. Makarov

    Abstract: Many non-equilibrium, active processes are observed at a coarse-grained level, where different microscopic configurations are projected onto the same observable state. Such "lumped" observables display memory, and in many cases the irreversible character of the underlying microscopic dynamics becomes blurred, e.g., when the projection hides dissipative cycles. As a result, the observations appear…

    Submitted 11 October, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Journal ref: PNAS 121, e2318333121 (2024)

  27. arXiv:2310.01332  [pdf, other]

    cond-mat.mtrl-sci

    Deformation Localisation in Ion-Irradiated FeCr

    Authors: Kay Song, Dina Sheyfer, Wenjun Liu, Jonathan Z Tischler, Suchandrima Das, Kenichiro Mizohata, Hongbing Yu, David E J Armstrong, Felix Hofmann

    Abstract: Irradiation-induced ductility loss is a major concern facing structural steels in next-generation nuclear reactors. Currently, the mechanisms for this are unclear but crucial to address for the design of reactor components. Here, the deformation characteristics around nanoindents in Fe and Fe10Cr irradiated with Fe ions to $\sim$1 displacement-per-atom at 313 K are non-destructively studied. Defor…

    Submitted 2 October, 2023; originally announced October 2023.

    Comments: 23 pages, 9 figures (1 graphical abstract, 4 figures in main text, 4 figures in supplementary file)

  28. arXiv:2309.16451  [pdf, other]

    cs.CV

    Towards Novel Class Discovery: A Study in Novel Skin Lesions Clustering

    Authors: Wei Feng, Lie Ju, Lin Wang, Kaimin Song, Zongyuan Ge

    Abstract: Existing deep learning models have achieved promising performance in recognizing skin diseases from dermoscopic images. However, these models can only recognize samples from predefined categories; when they are deployed in the clinic, data from new unknown categories are constantly emerging. Therefore, it is crucial to automatically discover and identify new semantic categories from new data. In t…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 10 pages, 1 figure, accepted by MICCAI 2023

  29. arXiv:2309.09835  [pdf, ps, other]

    math.AP

    Singular elliptic measure data problems with irregular obstacles

    Authors: Sun-Sig Byun, Kyeong Song, Yeonghun Youn

    Abstract: We investigate elliptic irregular obstacle problems with $p$-growth involving measure data. Emphasis is on the strongly singular case $1 < p \le 2-1/n$, and we obtain several new comparison estimates to prove gradient potential estimates in an intrinsic form. Our approach can be also applied to derive zero-order potential estimates.

    Submitted 18 September, 2023; originally announced September 2023.

  30. arXiv:2309.09414  [pdf, ps, other]

    astro-ph.SR

    The triggering process of an X-class solar flare on a small quadrupolar active region

    Authors: Qiao Song, Jing-Song Wang, Xiaoxin Zhang, Hechao Chen, Shuhong Yang, Zhenyong Hou, Yijun Hou, Qian Ye, Peng Zhang, Xiuqing Hu, Jinping Dun, Weiguo Zong, Xianyong Bai, Bo Chen, Lingping He, Kefei Song

    Abstract: The occurrence of X-class solar flares and their potential impact on the space weather often receive greater attention than other flares. But predicting when and where an X-class flare will occur is still a challenge. With the multi-wavelength observation from the Solar Dynamics Observatory and FengYun-3E satellite, we investigate the triggering of a GOES X1.0 flare occurring in the NOAA active reg…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: 24 pages, 7 figures, accepted for publication in ApJ

  31. arXiv:2309.08532  [pdf, other]

    cs.CL cs.AI

    Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers

    Authors: Qingyan Guo, Rui Wang, Junliang Guo, Bei Li, Kaitao Song, Xu Tan, Guoqing Liu, Jiang Bian, Yujiu Yang

    Abstract: Large Language Models (LLMs) excel in various tasks, but they rely on carefully crafted prompts that often demand substantial human effort. To automate this process, in this paper, we propose a novel framework for discrete prompt optimization, called EvoPrompt, which borrows the idea of evolutionary algorithms (EAs) as they exhibit good performance and fast convergence. To enable EAs to work on di…

    Submitted 27 February, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: International Conference on Learning Representations (ICLR) 2024

  32. arXiv:2309.04087  [pdf, other]

    cs.CL

    Unsupervised Multi-document Summarization with Holistic Inference

    Authors: Haopeng Zhang, Sangwoo Cho, Kaiqiang Song, Xiaoyang Wang, Hongwei Wang, Jiawei Zhang, Dong Yu

    Abstract: Multi-document summarization aims to obtain core information from a collection of documents written on the same topic. This paper proposes a new holistic framework for unsupervised multi-document extractive summarization. Our method incorporates the holistic beam search inference method associated with the holistic measurements, named Subset Representative Index (SRI). SRI balances the importance…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: Findings of IJCNLP-AACL 2023

  33. arXiv:2309.03576  [pdf, other]

    cs.CV

    DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions

    Authors: Haochen Wang, Junsong Fan, Yuxi Wang, Kaiyou Song, Tong Wang, Zhaoxiang Zhang

    Abstract: As it is empirically observed that Vision Transformers (ViTs) are quite insensitive to the order of input tokens, the need for an appropriate self-supervised pretext task that enhances the location awareness of ViTs is becoming evident. To address this, we present DropPos, a novel pretext task designed to reconstruct Dropped Positions. The formulation of DropPos is simple: we first drop a large ra…

    Submitted 21 September, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted by NeurIPS 2023

  34. arXiv:2309.02285  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    PromptTTS 2: Describing and Generating Voices with Text Prompt

    Authors: Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian

    Abstract: Speech conveys more information than text, as the same word can be uttered in various voices to convey diverse information. Compared to traditional text-to-speech (TTS) methods relying on speech prompts (reference speech) for voice variability, using text prompts (descriptions) is more user-friendly since speech prompts can be hard to find or may not exist at all. TTS approaches based on the text…

    Submitted 11 October, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Demo page: https://speechresearch.github.io/prompttts2

  35. arXiv:2308.13764  [pdf, other]

    cs.CV cs.AI

    Unified Single-Stage Transformer Network for Efficient RGB-T Tracking

    Authors: Jianqiang Xia, DianXi Shi, Ke Song, Linna Song, XiaoLei Wang, Songchang Jin, Li Zhou, Yu Cheng, Lei Jin, Zheng Zhu, Jianan Li, Gang Wang, Junliang Xing, Jian Zhao

    Abstract: Most existing RGB-T tracking networks extract modality features in a separate manner, which lacks interaction and mutual guidance between modalities. This limits the network's ability to adapt to the diverse dual-modality appearances of targets and the dynamic relationships between the modalities. Additionally, the three-stage fusion tracking paradigm followed by these networks significantly restr…

    Submitted 26 August, 2023; originally announced August 2023.

  36. arXiv:2308.07735  [pdf, other]

    cond-mat.mtrl-sci

    Microstructural and material property changes in severely deformed Eurofer-97

    Authors: Kay Song, Guanze He, Abdallah Reza, Tamas Ungár, Phani Karamched, David Yang, Ivan Tolkachev, Kenichiro Mizohata, David E J Armstrong, Felix Hofmann

    Abstract: Severe plastic deformation changes the microstructure and properties of steels, which may be favourable for their use in structural components of nuclear reactors. In this study, high-pressure torsion (HPT) was used to refine the grain structure of Eurofer-97, a ferritic/martensitic steel. Electron microscopy and X-ray diffraction were used to characterise the microstructural changes. Following H…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: 59 pages, 19 figures

  37. arXiv:2308.06522  [pdf, other]

    cs.LG cs.AI

    SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models

    Authors: Sara Babakniya, Ahmed Roushdy Elkordy, Yahya H. Ezzeldin, Qingfeng Liu, Kee-Bong Song, Mostafa El-Khamy, Salman Avestimehr

    Abstract: Transfer learning via fine-tuning pre-trained transformer models has gained significant success in delivering state-of-the-art results across various NLP tasks. In the absence of centralized data, Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning. However, due to the limited communication, computation, and storage capabilities of edge devi…

    Submitted 12 August, 2023; originally announced August 2023.

  38. arXiv:2308.05649  [pdf, other]

    cs.LO

    ESBMC v7.3: Model Checking C++ Programs using Clang AST

    Authors: Kunjian Song, Mikhail R. Gadelha, Franz Brauße, Rafael S. Menezes, Lucas C. Cordeiro

    Abstract: This paper introduces ESBMC v7.3, the latest Efficient SMT-Based Context-Bounded Model Checker version, which now incorporates a new clang-based C++ front-end. While the previous CPROVER-based front-end served well for handling C++03 programs, it encountered challenges keeping up with the evolving C++ language. As new language and library features were added in each C++ version, the limitations of…

    Submitted 10 August, 2023; originally announced August 2023.

  39. arXiv:2308.05495  [pdf, other]

    cond-mat.mtrl-sci

    Topological soliton molecule in quasi 1D charge density wave

    Authors: Taehwan Im, Sun Kyu Song, Jae Whan Park, Han Woong Yeom

    Abstract: Soliton molecules, bound states of two solitons, can be important for the informatics using solitons and the quest for exotic particles in a wide range of physical systems from unconventional superconductors to nuclear matter and Higgs field, but have been observed only in temporal dimension for classical wave optical systems. Here, we identify a topological soliton molecule formed spatially in an…

    Submitted 10 August, 2023; originally announced August 2023.

  40. arXiv:2308.00771  [pdf, other]

    cond-mat.mtrl-sci

    Dose and compositional dependence of irradiation-induced property change in FeCr

    Authors: Kay Song, Dina Sheyfer, Kenichiro Mizohata, Minyi Zhang, Wenjun Liu, Doğa Gürsoy, David Yang, Ivan Tolkachev, Hongbing Yu, David E J Armstrong, Felix Hofmann

    Abstract: Ferritic/martensitic steels will be used as structural components in next generation nuclear reactors. Their successful operation relies on an understanding of irradiation-induced defect behaviour in the material. In this study, Fe and FeCr alloys (3-12%Cr) were irradiated with 20 MeV Fe-ions at 313 K to doses ranging from 0.00008 dpa to 6.0 dpa. This dose range covers six orders of magnitude,…

    Submitted 4 March, 2024; v1 submitted 1 August, 2023; originally announced August 2023.

    Comments: 49 pages, 9 figures, 3 tables

  41. arXiv:2308.00304  [pdf, other]

    cs.CL

    Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

    Authors: Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen

    Abstract: We consider the problem of eliciting compositional generalization capabilities in large language models (LLMs) with a novel type of prompting strategy. Compositional generalization empowers the LLMs to solve problems that are harder than the ones they have seen (i.e., easy-to-hard generalization), which is a critical reasoning capability of human-like intelligence. However, even the current state-…

    Submitted 14 August, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

  42. arXiv:2307.16642  [pdf, other]

    stat.ME

    A Spectral Approach for the Dynamic Bradley-Terry Model

    Authors: Xin-Yu Tian, Jian Shi, Xiaotong Shen, Kai Song

    Abstract: The dynamic ranking, due to its increasing importance in many applications, is becoming crucial, especially with the collection of voluminous time-dependent data. One such application is sports statistics, where dynamic ranking aids in forecasting the performance of competitive teams, drawing on historical and current data. Despite its usefulness, predicting and inferring rankings pose challenges…

    Submitted 4 August, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

  43. arXiv:2307.13127  [pdf, other]

    stat.ML cs.LG

    A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning

    Authors: Spencer Giddens, Yiwang Zhou, Kevin R. Krull, Tara M. Brinkman, Peter X. K. Song, Fang Liu

    Abstract: It is commonplace to use data containing personal information to build predictive models in the framework of empirical risk minimization (ERM). While these models can be highly accurate in prediction, results obtained from these models with the use of sensitive data may be susceptible to privacy attacks. Differential privacy (DP) is an appealing framework for addressing such data privacy issues by…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 24 pages and 2 figures for the main manuscript, 5 pages and 2 figures for the supplementary materials

  44. arXiv:2307.05122  [pdf, other]

    econ.EM

    Synthetic Decomposition for Counterfactual Predictions

    Authors: Nathan Canen, Kyungchul Song

    Abstract: Counterfactual predictions are challenging when the policy variable goes beyond its pre-policy support. However, in many cases, information about the policy of interest is available from different ("source") regions where a similar policy has already been implemented. In this paper, we propose a novel method of using such data from source regions to predict a new policy in a target region. Instead…

    Submitted 11 July, 2023; originally announced July 2023.

  45. arXiv:2307.04630  [pdf, other]

    cs.SD eess.AS

    The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task

    Authors: Kun Song, Yi Lei, Peikun Chen, Yiqing Cao, Kun Wei, Yongmao Zhang, Lei Xie, Ning Jiang, Guoqing Zhao

    Abstract: This paper describes the NPU-MSXF system for the IWSLT 2023 speech-to-speech translation (S2ST) task which aims to translate from English speech of multi-source to Chinese speech. The system is built in a cascaded manner consisting of automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS). We make tremendous efforts to handle the challenging multi-source input. Spec…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: IWSLT@ACL 2023 system paper. Our submitted system ranks 1st in the S2ST task of the IWSLT 2023 evaluation campaign

  46. arXiv:2307.00852  [pdf, other]

    cs.CL

    VOLTA: Improving Generative Diversity by Variational Mutual Information Maximizing Autoencoder

    Authors: Yueen Ma, Dafeng Chi, Jingjing Li, Kai Song, Yuzheng Zhuang, Irwin King

    Abstract: The natural language generation domain has witnessed great success thanks to Transformer models. Although they have achieved state-of-the-art generative quality, they often neglect generative diversity. Prior attempts to tackle this issue suffer from either low model capacity or over-complicated architectures. Some recent methods employ the VAE framework to enhance diversity, but their latent vari…

    Submitted 18 March, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

  47. ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading

    Authors: Yujia Xiao, Shaofei Zhang, Xi Wang, Xu Tan, Lei He, Sheng Zhao, Frank K. Soong, Tan Lee

    Abstract: While state-of-the-art Text-to-Speech systems can generate natural speech of very high quality at sentence level, they still meet great challenges in speech generation for paragraph / long-form reading. Such deficiencies are due to i) ignorance of cross-sentence contextual information, and ii) high computation and memory cost for long-form synthesis. To address these issues, this work develops a l…

    Submitted 7 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: 5 pages, 4 figures, Proceedings of Interspeech 2023

  48. arXiv:2306.06841  [pdf, other]

    cs.AI

    Leveraging Skill-to-Skill Supervision for Knowledge Tracing

    Authors: Hyeondey Kim, Jinwoo Nam, Minjae Lee, Yun Jegal, Kyungwoo Song

    Abstract: Knowledge tracing plays a pivotal role in intelligent tutoring systems. This task aims to predict the probability of students answering correctly to specific questions. To do so, knowledge tracing systems should trace the knowledge state of the students by utilizing their problem-solving history and knowledge about the problems. Recent advances in knowledge tracing models have enabled better explo…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: AAAI2023 Artificial Intelligence for Education

  49. arXiv:2306.05629  [pdf, other]

    cs.IT eess.SY

    R-PMAC: A Robust Preamble Based MAC Mechanism Applied in Industrial Internet of Things

    Authors: Kai Song, Biqian Feng, Yongpeng Wu, Zhen Gao, Wenjun Zhang

    Abstract: This paper proposes a novel media access control (MAC) mechanism, called the robust preamble-based MAC mechanism (R-PMAC), which can be applied to power line communication (PLC) networks in the context of the Industrial Internet of Things (IIoT). Compared with other MAC mechanisms such as P-MAC and the MAC layer of IEEE1901.1, R-PMAC has higher networking speed. Besides, it supports whitelist auth…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: This paper has been accepted by IEEE Internet of Things Journal

  50. arXiv:2306.05414  [pdf, other]

    cs.CV

    Improving Tuning-Free Real Image Editing with Proximal Guidance

    Authors: Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Anastasis Stathopoulos, Xiaoxiao He, Yuxiao Chen, Di Liu, Qilong Zhangli, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas

    Abstract: DDIM inversion has revealed the remarkable potential of real image editing within diffusion-based methods. However, the accuracy of DDIM reconstruction degrades as larger classifier-free guidance (CFG) scales being used for enhanced editing. Null-text inversion (NTI) optimizes null embeddings to align the reconstruction and inversion trajectories with larger CFG scales, enabling real image editing…

    Submitted 5 July, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: Added inversion guidance, and fixed typos