
Showing 1–50 of 256 results for author: Ye, S

  1. arXiv:2406.18021  [pdf, other]

    cs.SD cs.LG eess.AS

    SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR

    Authors: Shuaishuai Ye, Shunfei Chen, Xinhui Hu, Xinkang Xu

    Abstract: In this work, we propose a Switch-Conformer-based MoE system named SC-MoE for unified streaming and non-streaming code-switching (CS) automatic speech recognition (ASR), where we design a streaming MoE layer consisting of three language experts, which correspond to Mandarin, English, and blank, respectively, and equipped with a language identification (LID) network with a Connectionist Temporal Cl…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by InterSpeech 2024; 5 pages, 2 figures

  2. arXiv:2406.17525  [pdf, ps, other]

    cond-mat.supr-con

    Indications of superconductivities in blend of variant apatite and covellite

    Authors: Hongyang Wang, Yijing Zhao, Hao Wu, Ling Wang, Zhixing Wu, Zhihui Geng, Jiewen Xiao, Weiwei Xue, Shufeng Ye, Ning Chen, Xianfeng Qiao, Yao Yao

    Abstract: Through heavily doping sulfur into an apatite framework, we synthesize a new blend mainly comprising variant apatite and covellite (copper sulfide). Magnetic measurement exhibits that significant diamagnetism appears at around 260 K and drops dramatically below 30 K implying coexistence of two superconducting phases. The upper critical magnetic field is larger than 1000 Oe at 250 K. Electric measu…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 15 pages, 4 figures

  3. arXiv:2406.11813  [pdf, other]

    cs.CL

    How Do Large Language Models Acquire Factual Knowledge During Pretraining?

    Authors: Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo

    Abstract: Despite the recent observation that large language models (LLMs) can store substantial factual knowledge, there is a limited understanding of the mechanisms of how they acquire factual knowledge through pretraining. This work addresses this gap by studying how LLMs acquire factual knowledge during pretraining. The findings reveal several important insights into the dynamics of factual knowledge ac…

    Submitted 17 June, 2024; originally announced June 2024.

    ACM Class: I.2.7

  4. arXiv:2406.08418  [pdf, other]

    cs.CV cs.AI

    OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

    Authors: Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang, et al. (15 additional authors not shown)

    Abstract: Image-text interleaved data, consisting of multiple images and texts arranged in a natural document format, aligns with the presentation paradigm of internet data and closely resembles human reading habits. Recent studies have shown that such data aids multimodal in-context learning and maintains the capabilities of large language models during multimodal fine-tuning. However, the limited scale an…

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  5. arXiv:2406.05761  [pdf, other]

    cs.CL

    The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

    Authors: Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, et al. (7 additional authors not shown)

    Abstract: As language models (LMs) become capable of handling a wide range of tasks, their evaluation is becoming as challenging as their development. Most generation benchmarks currently assess LMs using abstract evaluation criteria like helpfulness and harmlessness, which often lack the flexibility and granularity of human assessment. Additionally, these benchmarks tend to focus disproportionately on spec…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: Work in Progress

  6. arXiv:2405.17245  [pdf, other]

    cs.DC cs.AI cs.LG cs.NI

    Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference

    Authors: Shengyuan Ye, Jiangsu Du, Liekang Zeng, Wenzhong Ou, Xiaowen Chu, Yutong Lu, Xu Chen

    Abstract: Transformer-based models have unlocked a plethora of powerful intelligent applications at the edge, such as voice assistant in smart home. Traditional deployment approaches offload the inference workloads to the remote cloud server, which would induce substantial pressure on the backbone network as well as raise users' privacy concerns. To address that, in-situ inference has been recently recogniz…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE International Conference on Computer Communications 2024

  7. arXiv:2405.05498  [pdf, other]

    cs.SD eess.AS

    The RoyalFlush Automatic Speech Diarization and Recognition System for In-Car Multi-Channel Automatic Speech Recognition Challenge

    Authors: Jingguang Tian, Shuaishuai Ye, Shunfei Chen, Yang Xiang, Zhaohui Yin, Xinhui Hu, Xinkang Xu

    Abstract: This paper presents our system submission for the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) Challenge, which focuses on speaker diarization and speech recognition in complex multi-speaker scenarios. To address these challenges, we develop end-to-end speaker diarization models that notably decrease the diarization error rate (DER) by 49.58\% compared to the official baseline on t…

    Submitted 8 May, 2024; originally announced May 2024.

  8. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding , et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  9. arXiv:2405.03567  [pdf, other]

    cs.SD cs.AI eess.AS

    Deep Space Separable Distillation for Lightweight Acoustic Scene Classification

    Authors: ShuQi Ye, Yuan Tian

    Abstract: Acoustic scene classification (ASC) is highly important in the real world. Recently, deep learning-based methods have been widely employed for acoustic scene classification. However, these methods are currently not lightweight enough as well as their performance is not satisfactory. To solve these problems, we propose a deep space separable distillation network. Firstly, the network performs high-…

    Submitted 6 May, 2024; originally announced May 2024.

  10. arXiv:2405.02575  [pdf, other]

    econ.GN

    Monetary Policies on Green Financial Markets: Evidence from a Multi-Moment Connectedness Network

    Authors: Tingguo Zheng, Hongyin Zhang, Shiqi Ye

    Abstract: This paper introduces a novel multi-moment connectedness network approach for analyzing the interconnectedness of green financial market. Focusing on the impact of monetary policy shocks, our study reveals that connectedness within the green bond and equity markets varies with different moments (returns, volatility, skewness, and kurtosis) and changes significantly around Federal Open Market Commi…

    Submitted 4 May, 2024; originally announced May 2024.

  11. arXiv:2404.17766  [pdf, other]

    cs.LG cs.AI cs.DC cs.NI

    Implementation of Big AI Models for Wireless Networks with Collaborative Edge Computing

    Authors: Liekang Zeng, Shengyuan Ye, Xu Chen, Yang Yang

    Abstract: Big Artificial Intelligence (AI) models have emerged as a crucial element in various intelligent applications at the edge, such as voice assistants in smart homes and autonomous robotics in smart factories. Training big AI models, e.g., for personalized fine-tuning and continual model refinement, poses significant challenges to edge devices due to the inherent conflict between limited computing re…

    Submitted 26 April, 2024; originally announced April 2024.

  12. arXiv:2404.16821  [pdf, other]

    cs.CV

    How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

    Authors: Zhe Chen, Weiyun Wang, Hao Tian, Shenglong Ye, Zhangwei Gao, Erfei Cui, Wenwen Tong, Kongzhi Hu, Jiapeng Luo, Zheng Ma, Ji Ma, Jiaqi Wang, Xiaoyi Dong, Hang Yan, Hewei Guo, Conghui He, Botian Shi, Zhenjiang Jin, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, et al. (10 additional authors not shown)

    Abstract: In this report, we introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple improvements: (1) Strong Vision Encoder: we explored a continuous learning strategy for the large-scale vision foundation model -- InternViT-6B, boosting its visual…

    Submitted 29 April, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: Technical report

  13. arXiv:2404.16418  [pdf, other]

    cs.CL

    Instruction Matters, a Simple yet Effective Task Selection Approach in Instruction Tuning for Specific Tasks

    Authors: Changho Lee, Janghoon Han, Seonghyeon Ye, Stanley Jungkyu Choi, Honglak Lee, Kyunghoon Bae

    Abstract: Instruction tuning has shown its ability to not only enhance zero-shot generalization across various tasks but also its effectiveness in improving the performance of specific tasks. A crucial aspect in instruction tuning for a particular task is a strategic selection of related tasks that offer meaningful supervision, thereby enhancing efficiency and preventing performance degradation from irrelev…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: 21 pages, 6 figures, 16 tables

  14. arXiv:2404.10346  [pdf, other]

    cs.CL

    Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards

    Authors: Hyeonbin Hwang, Doyoung Kim, Seungone Kim, Seonghyeon Ye, Minjoon Seo

    Abstract: Training on large amounts of rationales (i.e., CoT Fine-tuning) is effective at improving the reasoning capabilities of large language models (LLMs). However, acquiring human-authored rationales or augmenting rationales from proprietary models is costly and not scalable. In this paper, we study the problem of whether LLMs could self-improve their reasoning capabilities. To this end, we propose Sel…

    Submitted 16 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Preprint Under Review

  15. arXiv:2403.12982  [pdf]

    cond-mat.mtrl-sci cs.LG physics.chem-ph

    Knowledge-Reuse Transfer Learning Methods in Molecular and Material Science

    Authors: An Chen, Zhilong Wang, Karl Luigi Loza Vidaurre, Yanqiang Han, Simin Ye, Kehao Tao, Shiwei Wang, Jing Gao, Jinjin Li

    Abstract: Molecules and materials are the foundation for the development of modern advanced industries such as energy storage systems and semiconductor devices. However, traditional trial-and-error methods or theoretical calculations are highly resource-intensive, and extremely long R&D (Research and Development) periods cannot meet the urgent need for molecules/materials in industrial development. Machine…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 42 pages, 10 figures

  16. arXiv:2403.11126  [pdf, ps, other]

    cond-mat.supr-con

    Observation of diamagnetic strange-metal phase in sulfur-copper codoped lead apatite

    Authors: Hongyang Wang, Hao Wu, Ning Chen, Xianfeng Qiao, Ling Wang, Zhixing Wu, Zhihui Geng, Weiwei Xue, Shufeng Ye, Yao Yao

    Abstract: By codoping sulfur and copper into lead apatite, the crystal grains are directionally stacked and the room-temperature resistivity is reduced from insulating to $2\times10^{-5}~Ω\cdot$m. The resistance-temperature curve exhibits a nearly linear relationship at low temperature suggesting the presence of strange-metal phase, and a second-order phase transition is then observed at around 230~K during…

    Submitted 6 May, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: 12 pages, 4 figures

  17. arXiv:2403.10809  [pdf, other]

    cs.RO

    Efficient Trajectory Forecasting and Generation with Conditional Flow Matching

    Authors: Sean Ye, Matthew Gombolay

    Abstract: Trajectory prediction and generation are vital for autonomous robots navigating dynamic environments. While prior research has typically focused on either prediction or generation, our approach unifies these tasks to provide a versatile framework and achieve state-of-the-art performance. Diffusion models, which are currently state-of-the-art for learned trajectory generation in long-horizon planni…

    Submitted 16 March, 2024; originally announced March 2024.

  18. arXiv:2403.10794  [pdf, other]

    cs.RO cs.LG cs.MA

    Diffusion-Reinforcement Learning Hierarchical Motion Planning in Adversarial Multi-agent Games

    Authors: Zixuan Wu, Sean Ye, Manisha Natarajan, Matthew C. Gombolay

    Abstract: Reinforcement Learning- (RL-)based motion planning has recently shown the potential to outperform traditional approaches from autonomous navigation to robot manipulation. In this work, we focus on a motion planning task for an evasive target in a partially observable multi-agent adversarial pursuit-evasion games (PEG). These pursuit-evasion problems are relevant to various applications, such as se…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: This work has been submitted to the IEEE Robotics and Automation Letters (RA-L) for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  19. arXiv:2403.05265  [pdf, other]

    cs.AI

    MMoE: Robust Spoiler Detection with Multi-modal Information and Domain-aware Mixture-of-Experts

    Authors: Zinan Zeng, Sen Ye, Zijian Cai, Heng Wang, Yuhan Liu, Haokai Zhang, Minnan Luo

    Abstract: Online movie review websites are valuable for information and discussion about movies. However, the massive spoiler reviews detract from the movie-watching experience, making spoiler detection an important task. Previous methods simply focus on reviews' text content, ignoring the heterogeneity of information in the platform. For instance, the metadata and the corresponding user's information of a…

    Submitted 13 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  20. arXiv:2402.17242  [pdf, other]

    cs.SI cs.DB

    Scalable Community Search with Accuracy Guarantee on Attributed Graphs

    Authors: Yuxiang Wang, Shuzhan Ye, Xiaoliang Xu, Yuxia Geng, Zhenghe Zhao, Xiangyu Ke, Tianxing Wu

    Abstract: Given an attributed graph $G$ and a query node $q$, Community Search over Attributed Graphs (CS-AG) aims to find a structure- and attribute-cohesive subgraph from $G$ that contains $q$. Although CS-AG has been widely studied, they still face three challenges. (1) Exact methods based on graph traversal are time-consuming, especially for large graphs.…

    Submitted 29 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  21. arXiv:2402.14334  [pdf, other]

    cs.CL

    INSTRUCTIR: A Benchmark for Instruction Following of Information Retrieval Models

    Authors: Hanseok Oh, Hyunji Lee, Seonghyeon Ye, Haebin Shin, Hansol Jang, Changwook Jun, Minjoon Seo

    Abstract: Despite the critical need to align search targets with users' intention, retrievers often only prioritize query information without delving into the users' intended search context. Enhancing the capability of retrievers to understand intentions and preferences of users, akin to language model instructions, has the potential to yield more aligned search targets. Prior studies restrict the applicati…

    Submitted 22 February, 2024; originally announced February 2024.

  22. arXiv:2402.12444  [pdf, other]

    astro-ph.HE astro-ph.GA

    The Redshift Evolution of the Binary Black Hole Mass Distribution from Dense Star Clusters

    Authors: Claire S. Ye, Maya Fishbach

    Abstract: Gravitational-wave detectors are unveiling a population of binary black hole (BBH) mergers out to redshifts $z \approx 1$, and are starting to constrain how the BBH population evolves with redshift. We present predictions for the redshift evolution of the BBH mass and spin distributions for systems originating from dense star clusters. Utilizing a grid of 144 state-of-the-art dynamical models for…

    Submitted 3 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 10 pages, 5 figures. Published at ApJ

  23. arXiv:2401.13267  [pdf, other]

    cs.CV

    Dual-modal Dynamic Traceback Learning for Medical Report Generation

    Authors: Shuchang Ye, Mingyuan Meng, Mingjian Li, Dagan Feng, Jinman Kim

    Abstract: With increasing reliance on medical imaging in clinical practices, automated report generation from medical images is in great demand. Existing report generation methods typically adopt an encoder-decoder deep learning framework to build a uni-directional image-to-report mapping. However, such a framework ignores the bi-directional mutual associations between images and reports, thus incurring dif…

    Submitted 6 March, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

  24. arXiv:2401.00999  [pdf, ps, other]

    cond-mat.supr-con

    Possible Meissner effect near room temperature in copper-substituted lead apatite

    Authors: Hongyang Wang, Yao Yao, Ke Shi, Yijing Zhao, Hao Wu, Zhixing Wu, Zhihui Geng, Shufeng Ye, Ning Chen

    Abstract: With copper-substituted lead apatite below room temperature, we observe diamagnetic dc magnetization under magnetic field of 25 Oe with remarkable bifurcation between zero-field-cooling and field-cooling measurements, and under 200 Oe it changes to be paramagnetism. A glassy memory effect is found during cooling. Typical hysteresis loops for superconductors are detected below 250 K, along with an…

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: 7 pages, 4 figures

  25. arXiv:2312.15244  [pdf, ps, other]

    cs.IT eess.SP

    Fluid Antenna Array Enhanced Over-the-Air Computation

    Authors: Deyou Zhang, Sicong Ye, Ming Xiao, Kezhi Wang, Marco Di Renzo, Mikael Skoglund

    Abstract: Over-the-air computation (AirComp) has emerged as a promising technology for fast wireless data aggregation by harnessing the superposition property of wireless multiple-access channels. This paper investigates a fluid antenna (FA) array-enhanced AirComp system, employing the new degrees of freedom achieved by antenna movements. Specifically, we jointly optimize the transceiver design and antenna…

    Submitted 23 December, 2023; originally announced December 2023.

  26. arXiv:2312.08681  [pdf, other]

    math.GR math.GT

    Splittings and poly-freeness of triangle Artin groups

    Authors: Xiaolei Wu, Shengkui Ye

    Abstract: We prove that the triangle Artin group $\mathrm{Art}_{23M}$ splits as a graph of free groups if and only if $M$ is greater than $5$ and even. This answers two questions of Jankiewicz \cite[Question 2.2, Question 2.3]{Jan21} in the negative. Combined with the results of Squier and Jankiewicz, this completely determines when a triangle Artin group splits as a graph of free groups. Furthermore, we pr…

    Submitted 14 December, 2023; originally announced December 2023.

  27. arXiv:2312.04921  [pdf, other]

    astro-ph.IM cs.DC

    Integrating the PanDA Workload Management System with the Vera C. Rubin Observatory

    Authors: Edward Karavakis, Wen Guan, Zhaoyu Yang, Tadashi Maeno, Torre Wenaus, Jennifer Adelman-McCarthy, Fernando Barreiro Megino, Kaushik De, Richard Dubois, Michelle Gower, Tim Jenness, Alexei Klimentov, Tatiana Korchuganova, Mikolaj Kowalik, Fa-Hui Lin, Paul Nilsson, Sergey Padolski, Wei Yang, Shuwei Ye

    Abstract: The Vera C. Rubin Observatory will produce an unprecedented astronomical data set for studies of the deep and dynamic universe. Its Legacy Survey of Space and Time (LSST) will image the entire southern sky every three to four days and produce tens of petabytes of raw image data and associated calibration data over the course of the experiment's run. More than 20 terabytes of data must be stored ev…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 8 pages, 3 figures, 26th International Conference on Computing in High Energy & Nuclear Physics

  28. arXiv:2311.08106  [pdf, other]

    cs.CL

    Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models

    Authors: Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, Se-young Yun

    Abstract: The dynamic nature of knowledge in an ever-changing world presents challenges for language models trained on static data; the model in the real world often requires not only acquiring new knowledge but also overwriting outdated information into updated ones. To study the ability of language models for these time-dependent dynamics in human language, we introduce a novel task, EvolvingQA, a tempora…

    Submitted 20 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 15 pages, 10 figures, 5 tables; accepted to NAACL 2024

  29. arXiv:2311.04933  [pdf]

    cs.CL cs.AI

    Evaluating Large Language Models in Ophthalmology

    Authors: Jason Holmes, Shuyuan Ye, Yiwei Li, Shi-Nan Wu, Zhengliang Liu, Zihao Wu, Jinyu Hu, Huan Zhao, Xi Jiang, Wei Liu, Hong Wei, Jie Zou, Tianming Liu, Yi Shao

    Abstract: Purpose: The performance of three different large language models (LLMS) (GPT-3.5, GPT-4, and PaLM2) in answering ophthalmology professional questions was evaluated and compared with that of three different professional populations (medical undergraduates, medical masters, and attending physicians). Methods: A 100-item ophthalmology single-choice test was administered to three different LLMs (GPT-…

    Submitted 7 November, 2023; originally announced November 2023.

  30. arXiv:2310.14049  [pdf, other]

    cs.AR

    Post-Layout Simulation Driven Analog Circuit Sizing

    Authors: Xiaohan Gao, Haoyi Zhang, Siyuan Ye, Mingjie Liu, David Z. Pan, Linxiao Shen, Runsheng Wang, Yibo Lin, Ru Huang

    Abstract: Post-layout simulation provides accurate guidance for analog circuit design, but post-layout performance is hard to be directly optimized at early design stages. Prior work on analog circuit sizing often utilizes pre-layout simulation results as the optimization objective. In this work, we propose a post-layout-simulation-driven (post-simulation-driven for short) analog circuit sizing framework th…

    Submitted 21 October, 2023; originally announced October 2023.

  31. arXiv:2310.04897  [pdf]

    cs.CY cs.AI

    Generative AI May Prefer to Present National-level Characteristics of Cities Based on Stereotypical Geographic Impressions at the Continental Level

    Authors: Shan Ye

    Abstract: A simple experiment was conducted to test the ability of the Chinese-based generative artificial intelligence (AI) platform, Wenxin Yige, to render images of urban street views of different countries. The study found that images generated by this AI platform may contain continental-level stereotypes in terms of showing the level of economic development and modernization. Street view images generat…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: 9 pages, 3 figures

  32. arXiv:2310.00434  [pdf, other]

    cs.CV cs.GR

    DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models

    Authors: Zhiyao Sun, Tian Lv, Sheng Ye, Matthieu Lin, Jenny Sheng, Yu-Hui Wen, Minjing Yu, Yong-Jin Liu

    Abstract: The generation of stylistic 3D facial animations driven by speech presents a significant challenge as it requires learning a many-to-many mapping between speech, style, and the corresponding natural facial motion. However, existing methods either employ a deterministic model for speech-to-motion mapping or encode the style using a one-hot encoding scheme. Notably, the one-hot encoding approach fai…

    Submitted 14 May, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: SIGGRAPH 2024 (Journal Track). Project page: https://diffposetalk.github.io/

  33. arXiv:2309.13122  [pdf, other]

    astro-ph.GA astro-ph.HE astro-ph.SR

    The dominant mechanism(s) for populating the outskirts of star clusters with neutron star binaries

    Authors: Nathan W. C. Leigh, Claire S. Ye, Steffani M. Grondin, Giacomo Fragione, Jeremy J. Webb, Craig O. Heinke

    Abstract: It has been argued that heavy binaries composed of neutron stars (NSs) and millisecond pulsars (MSPs) can end up in the outskirts of star clusters via an interaction with a massive black hole (BH) binary expelling them from the core. We argue here, however, that this mechanism will rarely account for such observed objects. Only for primary masses $\lesssim$ 100 M$_{\odot}$ and a narrow range of or…

    Submitted 22 September, 2023; originally announced September 2023.

    Comments: 13 pages, 7 figures, 2 tables, submitted to MNRAS

  34. arXiv:2309.11043  [pdf, other]

    cs.CV

    Score Mismatching for Generative Modeling

    Authors: Senmao Ye, Fei Liu

    Abstract: We propose a new score-based model with one-step sampling. Previously, score-based models were burdened with heavy computations due to iterative sampling. For substituting the iterative process, we train a standalone generator to compress all the time steps with the gradient backpropagated from the score network. In order to produce meaningful gradients for the generator, the score network is trai…

    Submitted 19 September, 2023; originally announced September 2023.

  35. arXiv:2309.09260  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci cond-mat.str-el

    Visualizing the Zhang-Rice singlet, molecular orbitals and pair formation in cuprate

    Authors: Shusen Ye, Jianfa Zhao, Zhiheng Yao, Sixuan Chen, Zehao Dong, Xintong Li, Luchuan Shi, Qingqing Liu, Changqing Jin, Yayu Wang

    Abstract: The parent compound of cuprates is a charge-transfer-type Mott insulator with strong hybridization between the Cu $3d_{\mathrm x^2-y^2}$ and O $2p$ orbitals. A key question concerning the pairing mechanism is the behavior of doped holes in the antiferromagnetic (AF) Mott insulator background, which is a prototypical quantum many-body problem. It was proposed that doped hole on the O site tends to…

    Submitted 17 September, 2023; originally announced September 2023.

  36. arXiv:2309.08159  [pdf, other]

    cs.CV cs.IR cs.LG

    AdSEE: Investigating the Impact of Image Style Editing on Advertisement Attractiveness

    Authors: Liyao Jiang, Chenglin Li, Haolan Chen, Xiaodong Gao, Xinwang Zhong, Yang Qiu, Shani Ye, Di Niu

    Abstract: Online advertisements are important elements in e-commerce sites, social media platforms, and search engines. With the increasing popularity of mobile browsing, many online ads are displayed with visual information in the form of a cover image in addition to text descriptions to grab the attention of users. Various recent studies have focused on predicting the click rates of online advertisements…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted to KDD 2023 Applied Data Science Track

  37. arXiv:2309.08097  [pdf, other]

    cs.CV

    Detail Reinforcement Diffusion Model: Augmentation Fine-Grained Visual Categorization in Few-Shot Conditions

    Authors: Tianxu Wu, Shuo Ye, Shuhuang Chen, Qinmu Peng, Xinge You

    Abstract: The challenge in fine-grained visual categorization lies in how to explore the subtle differences between different subclasses and achieve accurate discrimination. Previous research has relied on large-scale annotated data and pre-trained deep models to achieve the objective. However, when only a limited amount of samples is available, similar methods may become less effective. Diffusion models ha…

    Submitted 15 May, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted by TETCI

  38. arXiv:2309.07640  [pdf, other]

    cs.CV

    Indoor Scene Reconstruction with Fine-Grained Details Using Hybrid Representation and Normal Prior Enhancement

    Authors: Sheng Ye, Yubin Hu, Matthieu Lin, Yu-Hui Wen, Wang Zhao, Yong-Jin Liu, Wenping Wang

    Abstract: The reconstruction of indoor scenes from multi-view RGB images is challenging due to the coexistence of flat and texture-less regions alongside delicate and fine-grained regions. Recent methods leverage neural radiance fields aided by predicted surface normal priors to recover the scene geometry. These methods excel in producing complete and smooth results for floor and wall areas. However, they s…

    Submitted 25 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

  39. arXiv:2309.06877  [pdf, other]

    cs.CV cs.MM

    Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization

    Authors: Zhenguang Liu, Xinyang Yu, Ruili Wang, Shuai Ye, Zhe Ma, Jianfeng Dong, Sifeng He, Feng Qian, Xiaobo Zhang, Roger Zimmermann, Lei Yang

    Abstract: The self-media era provides us tremendous high quality videos. Unfortunately, frequent video copyright infringements are now seriously damaging the interests and enthusiasm of video creators. Identifying infringing videos is therefore a compelling task. Current state-of-the-art methods tend to simply feed high-dimensional mixed video features into deep neural networks and count on the networks to…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: This paper is accepted by ACM MM 2023

  40. arXiv:2309.00884  [pdf, other]

    astro-ph.EP physics.plasm-ph

    Excitation of extraordinary modes inside the source of Saturn's kilometric radiation

    Authors: Hao Ning, Yao Chen, Chuanyang Li, Shengyi Ye, Alexey Kuznetsov, Siyuan Wu

    Abstract: The electron cyclotron maser instability (ECMI) of extraordinary mode waves was investigated with the parameters observed in Saturn's kilometric radiation (SKR) sources. Previous studies employed simplified dispersion relations, and did not consider the excitation of the relativistic (R) mode. This mode is introduced by considering the relativistic effect in plasmas consisting of both cold and hot…

    Submitted 2 September, 2023; originally announced September 2023.

    Journal ref: A&A 678, A94 (2023)

  41. arXiv:2308.16713  [pdf, other]

    q-bio.BM

    Accurate Prediction of Antibody Function and Structure Using Bio-Inspired Antibody Language Model

    Authors: Hongtai Jing, Zhengtao Gao, Sheng Xu, Tao Shen, Zhangzhi Peng, Shwai He, Tao You, Shuang Ye, Wei Lin, Siqi Sun

    Abstract: In recent decades, antibodies have emerged as indispensable therapeutics for combating diseases, particularly viral infections. However, their development has been hindered by limited structural information and labor-intensive engineering processes. Fortunately, significant advancements in deep learning methods have facilitated the precise prediction of protein structure and function by leveraging…

    Submitted 31 August, 2023; originally announced August 2023.

  42. arXiv:2308.09591  [pdf, other]

    cs.CV

    O$^2$-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model

    Authors: Yubin Hu, Sheng Ye, Wang Zhao, Matthieu Lin, Yuze He, Yu-Hui Wen, Ying He, Yong-Jin Liu

    Abstract: Occlusion is a common issue in 3D reconstruction from RGB-D videos, often blocking the complete reconstruction of objects and presenting an ongoing problem. In this paper, we propose a novel framework, empowered by a 2D diffusion-based in-painting model, to reconstruct complete surfaces for the hidden parts of objects. Specifically, we utilize a pre-trained diffusion model to fill in the hidden ar…

    Submitted 19 March, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: AAAI 2024

  43. Single Millisecond Pulsars from Dynamical Interaction Processes in Dense Star Clusters

    Authors: Claire S. Ye, Kyle Kremer, Scott M. Ransom, Frederic A. Rasio

    Abstract: Globular clusters (GCs) are particularly efficient at forming millisecond pulsars. Among these pulsars, about half lack a companion star, a significantly higher fraction than in the Galactic field. This fraction increases further in some of the densest GCs, especially those that have undergone core collapse, suggesting that dynamical interaction processes play a key role. For the first time, we cr…

    Submitted 19 January, 2024; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: 19 pages, 12 figures, 4 tables. Published at ApJ

  44. arXiv:2307.10928  [pdf, other]

    cs.CL cs.AI

    FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets

    Authors: Seonghyeon Ye, Doyoung Kim, Sungdong Kim, Hyeonbin Hwang, Seungone Kim, Yongrae Jo, James Thorne, Juho Kim, Minjoon Seo

    Abstract: Evaluation of Large Language Models (LLMs) is challenging because instruction-following necessitates alignment with human values and the required set of skills varies depending on the instruction. However, previous studies have mainly focused on coarse-grained evaluation (i.e. overall preference-based evaluation), which limits interpretability since it does not consider the nature of user instruct…

    Submitted 14 April, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: ICLR 2024 Spotlight

  45. arXiv:2307.06244  [pdf, other]

    cs.RO cs.LG cs.MA

    Diffusion Models for Multi-target Adversarial Tracking

    Authors: Sean Ye, Manisha Natarajan, Zixuan Wu, Matthew Gombolay

    Abstract: Target tracking plays a crucial role in real-world scenarios, particularly in drug-trafficking interdiction, where the knowledge of an adversarial target's location is often limited. Improving autonomous tracking systems will enable unmanned aerial, surface, and underwater vehicles to better assist in interdicting smugglers that use manned surface, semi-submersible, and aerial vessels. As unmanned…

    Submitted 12 January, 2024; v1 submitted 12 July, 2023; originally announced July 2023.

  46. arXiv:2307.04858  [pdf, other]

    cs.HC cs.CV q-bio.NC

    AmadeusGPT: a natural language interface for interactive animal behavioral analysis

    Authors: Shaokai Ye, Jessy Lauer, Mu Zhou, Alexander Mathis, Mackenzie W. Mathis

    Abstract: The process of quantifying and analyzing animal behavior involves translating the naturally occurring descriptive language of their actions into machine-readable code. Yet, codifying behavior analysis is often challenging without deep understanding of animal behavior and technical machine learning knowledge. To limit this gap, we introduce AmadeusGPT: a natural language interface that turns natura…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: demo available https://github.com/AdaptiveMotorControlLab/AmadeusGPT

    Journal ref: Published in Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS) 2023

  47. Augmenting Sports Videos with VisCommentator

    Authors: Chen Zhu-Tian, Shuainan Ye, Xiangtong Chu, Haijun Xia, Hui Zhang, Huamin Qu, Yingcai Wu

    Abstract: Visualizing data in sports videos is gaining traction in sports analytics, given its ability to communicate insights and explicate player strategies engagingly. However, augmenting sports videos with such data visualizations is challenging, especially for sports analysts, as it requires considerable expertise in video editing. To ease the creation process, we present a design space that characteri…

    Submitted 10 May, 2024; v1 submitted 23 June, 2023; originally announced June 2023.

    Journal ref: IEEE Transactions on Visualization and Computer Graphics ( Volume: 28, Issue: 1, January 2022)

  48. arXiv:2306.12870  [pdf, other]

    cs.SI

    HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention

    Authors: Sen Ye, Zhaoxuan Tan, Zhenyu Lei, Ruijie He, Hongrui Wang, Qinghua Zheng, Minnan Luo

    Abstract: Twitter bot detection has become an increasingly important and challenging task to combat online misinformation, facilitate social content moderation, and safeguard the integrity of social platforms. Though existing graph-based Twitter bot detection methods achieved state-of-the-art performance, they are all based on the homophily assumption, which assumes users with the same label are more likely…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 11 pages, 7 figures

  49. arXiv:2306.11301  [pdf, other]

    cs.LG cs.AI cs.RO

    Adversarial Search and Tracking with Multiagent Reinforcement Learning in Sparsely Observable Environment

    Authors: Zixuan Wu, Sean Ye, Manisha Natarajan, Letian Chen, Rohan Paleja, Matthew C. Gombolay

    Abstract: We study a search and tracking (S&T) problem where a team of dynamic search agents must collaborate to track an adversarial, evasive agent. The heterogeneous search team may only have access to a limited number of past adversary trajectories within a large search space. This problem is challenging for both model-based searching and reinforcement learning (RL) methods since the adversary exhibits r…

    Submitted 20 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Accepted by IEEE International Symposium on Multi-Robot & Multi-Agent Systems (MRS) 2023

  50. arXiv:2306.11168  [pdf, other]

    cs.LG cs.AI cs.MA

    Learning Models of Adversarial Agent Behavior under Partial Observability

    Authors: Sean Ye, Manisha Natarajan, Zixuan Wu, Rohan Paleja, Letian Chen, Matthew C. Gombolay

    Abstract: The need for opponent modeling and tracking arises in several real-world scenarios, such as professional sports, video game design, and drug-trafficking interdiction. In this work, we present Graph based Adversarial Modeling with Mutual Information (GrAMMI) for modeling the behavior of an adversarial opponent agent. GrAMMI is a novel graph neural network (GNN) based approach that uses mutual inform…

    Submitted 5 July, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: 8 pages, 3 figures, 2 tables