Showing 201–250 of 18,850 results for author: Zhang, Y

  1. arXiv:2406.11003  [pdf, other]

    cs.CV cs.AI

    3D Gaze Tracking for Studying Collaborative Interactions in Mixed-Reality Environments

    Authors: Eduardo Davalos, Yike Zhang, Ashwin T. S., Joyce H. Fonteles, Umesh Timalsina, Gautam Biswas

    Abstract: This study presents a novel framework for 3D gaze tracking tailored for mixed-reality settings, aimed at enhancing joint attention and collaborative efforts in team-based scenarios. Conventional gaze tracking, often limited by monocular cameras and traditional eye-tracking apparatus, struggles with simultaneous data synchronization and analysis from multiple participants in group contexts. Our pro… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 9 pages, 8 figures, conference, submitted to ICMI 2024

  2. arXiv:2406.10992  [pdf, other]

    math.RA

    Extending Structures for Dendriform Algebras

    Authors: Yuanyuan Zhang, Junwen Wang

    Abstract: In this paper, we study extending structures for dendriform algebras. First, we define extending datums and unified products of dendriform algebras, and theoretically solve the extending structure problem. As an application, we consider flag datums as a special case of extending structures, and give an example of the extending structure problem. Second, we introduce matched pairs and bicrossed…

    Submitted 25 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: 38 pages, comments are welcome

    MSC Class: 16W99

  3. arXiv:2406.10902  [pdf, other]

    cs.CV cs.CL

    Light Up the Shadows: Enhance Long-Tailed Entity Grounding with Concept-Guided Vision-Language Models

    Authors: Yikai Zhang, Qianyu He, Xintao Wang, Siyu Yuan, Jiaqing Liang, Yanghua Xiao

    Abstract: Multi-Modal Knowledge Graphs (MMKGs) have proven valuable for various downstream tasks. However, scaling them up is challenging because building large-scale MMKGs often introduces mismatched images (i.e., noise). Most entities in KGs belong to the long tail, meaning there are few images of them available online. This scarcity makes it difficult to determine whether a found image matches the entity… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.10833  [pdf, other]

    cs.CL

    A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery

    Authors: Yu Zhang, Xiusi Chen, Bowen Jin, Sheng Wang, Shuiwang Ji, Wei Wang, Jiawei Han

    Abstract: In many scientific fields, large language models (LLMs) have revolutionized the way with which text and other modalities of data (e.g., molecules and proteins) are dealt, achieving superior performance in various applications and augmenting the scientific discovery process. Nevertheless, previous surveys on scientific LLMs often concentrate on one to two fields or a single modality. In this paper,… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: 33 pages (GitHub: https://github.com/yuzhimanhua/Awesome-Scientific-Language-Models)

  5. arXiv:2406.10793  [pdf, ps, other]

    math.OC

    Symplectic Extra-gradient Type Method for Solving General Non-monotone Inclusion Problem

    Authors: Ya-xiang Yuan, Yi Zhang

    Abstract: In recent years, accelerated extra-gradient methods have attracted much attention by researchers, for solving monotone inclusion problems. A limitation of most current accelerated extra-gradient methods lies in their direct utilization of the initial point, which can potentially decelerate numerical convergence rate. In this work, we present a new accelerated extra-gradient method, by utilizing th… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 37 pages, 7 figures, 1 table

  6. arXiv:2406.10749  [pdf, ps, other]

    math.CV math.AP

    Unique continuation of Schrödinger-type equations for $\bar\partial$ II

    Authors: Yifei Pan, Yuan Zhang

    Abstract: In this paper, we extend our earlier unique continuation results \cite{PZ2} for the Schrödinger-type inequality $ |\bar\partial u| \le V|u|$ on a domain in $\mathbb C^n$ by removing the smoothness assumption on solutions $u = (u_1, \ldots, u_N)$. More specifically, we establish the unique continuation property for $W_{loc}^{1,1}$ solutions when the potential $V\in L_{loc}^p $, $ p>2n$; and for… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: 10 pages

    MSC Class: Primary 32W05; Secondary 35J10

  7. arXiv:2406.10744   

    cs.CV

    Technique Report of CVPR 2024 PBDL Challenges

    Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Jose Alvarez, Coert van Gemeren, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, et al. (77 additional authors not shown)

    Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, a… ▽ More

    Submitted 27 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

    Comments: The author list and contents need to be verified by all authors

  8. arXiv:2406.10737  [pdf, other]

    cs.LG cs.CV

    Dynamic Domains, Dynamic Solutions: DPCore for Continual Test-Time Adaptation

    Authors: Yunbei Zhang, Akshay Mehra, Jihun Hamm

    Abstract: Continual Test-Time Adaptation (TTA) seeks to adapt a source pre-trained model to continually changing, unlabeled target domains. Existing TTA methods are typically designed for environments where domain changes occur gradually and can struggle in more dynamic scenarios. Inspired by the principles of online K-Means, this paper introduces a novel approach to continual TTA through visual prompting.… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

  9. arXiv:2406.10551  [pdf, other]

    cond-mat.str-el

    Electron dynamics in a three-dimensional Brillouin zone analysed by machine learning

    Authors: Paulina Majchrzak, Charlotte Sanders, Yu Zhang, Andrii Kuibarov, Oleksandr Suvorov, Emma Springate, Iryna Kovalchuk, Saicharan Aswartham, Grigory Shipunov, Bernd Büchner, Alexander Yaresko, Sergey Borisenko, Philip Hofmann

    Abstract: The electron dynamics in the unoccupied states of the Weyl semimetal PtBi$_2$ is studied by time- and angle-resolved photoemission spectroscopy (TR-ARPES). The measurement's result is the photoemission intensity $I$ as a function of at least four parameters: the emission angle and kinetic energy of the photoelectrons, the time delay between pump and probe laser pulses, and the probe laser photon e… ▽ More

    Submitted 18 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

  10. arXiv:2406.10550  [pdf, other]

    cond-mat.str-el

    Ultrafast carrier dynamics throughout the three-dimensional Brillouin zone of the Weyl semimetal PtBi$_2$

    Authors: Paulina Majchrzak, Charlotte Sanders, Yu Zhang, Andrii Kuibarov, Oleksandr Suvorov, Emma Springate, Iryna Kovalchuk, Saicharan Aswartham, Grigory Shipunov, Bernd Büchner, Alexander Yaresko, Sergey Borisenko, Philip Hofmann

    Abstract: Using time- and angle-resolved photoemission spectroscopy, we examine the unoccupied electronic structure and electron dynamics of the type-I Weyl semimetal PtBi$_2$. Using the ability to change the probe photon energy over a wide range, we identify the predicted Weyl points in the unoccupied three-dimensional band structure and we discuss the effect of $k_\perp$ broadening in the normally unoccup… ▽ More

    Submitted 18 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

  11. arXiv:2406.10519  [pdf, other]

    cs.CV cs.AI

    Self Pre-training with Topology- and Spatiality-aware Masked Autoencoders for 3D Medical Image Segmentation

    Authors: Pengfei Gu, Yejia Zhang, Huimin Li, Hongxiao Wang, Yizhe Zhang, Chaoli Wang, Danny Z. Chen

    Abstract: Masked Autoencoders (MAEs) have been shown to be effective in pre-training Vision Transformers (ViTs) for natural and medical image analysis problems. By reconstructing missing pixel/voxel information in visible patches, a ViT encoder can aggregate contextual information for downstream tasks. But, existing MAE pre-training methods, which were specifically developed with the ViT architecture, lack… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

  12. arXiv:2406.10474  [pdf, other]

    cs.DC

    Federated Neural Radiance Field for Distributed Intelligence

    Authors: Yintian Zhang, Ziyu Shao

    Abstract: Novel view synthesis (NVS) is an important technology for many AR and VR applications. The recently proposed Neural Radiance Field (NeRF) approach has demonstrated superior performance on NVS tasks, and has been applied to other related fields. However, certain application scenarios with distributed data storage may pose challenges on acquiring training images for the NeRF approach, due to strict… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  13. arXiv:2406.10434  [pdf, other]

    eess.SY

    Risk-Aware Value-Oriented Net Demand Forecasting for Virtual Power Plants

    Authors: Yufan Zhang, Jiajun Han, Yuanyuan Shi

    Abstract: This paper develops a risk-aware net demand forecasting product for virtual power plants, which helps reduce the risk of high operation costs. At the training phase, a bilevel program for parameter estimation is formulated, where the upper level optimizes over the forecast model parameter to minimize the conditional value-at-risk (a risk metric) of operation costs. The lower level solves the opera… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Submitted to The 56th North American Power Symposium (NAPS 2024)

  14. arXiv:2406.10393  [pdf, other]

    cs.CL

    EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems

    Authors: Mohammad Dehghan, Mohammad Ali Alomrani, Sunyam Bagga, David Alfonso-Hermelo, Khalil Bibi, Abbas Ghaddar, Yingxue Zhang, Xiaoguang Li, Jianye Hao, Qun Liu, Jimmy Lin, Boxing Chen, Prasanna Parthasarathi, Mahdi Biparva, Mehdi Rezagholizadeh

    Abstract: The emerging citation-based QA systems are gaining more attention especially in generative AI search applications. The importance of extracted knowledge provided to these systems is vital from both accuracy (completeness of information) and efficiency (extracting the information in a timely manner). In this regard, citation-based QA systems are suffering from two shortcomings. First, they usually… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  15. arXiv:2406.10361  [pdf, other]

    eess.IV

    On Efficient Neural Network Architectures for Image Compression

    Authors: Yichi Zhang, Zhihao Duan, Fengqing Zhu

    Abstract: Recent advances in learning-based image compression typically come at the cost of high complexity. Designing computationally efficient architectures remains an open challenge. In this paper, we empirically investigate the impact of different network designs in terms of rate-distortion performance and computational complexity. Our experiments involve testing various transforms, including convolutio… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 2024 IEEE International Conference on Image Processing (ICIP2024)

  16. arXiv:2406.10284  [pdf, other]

    cs.CL cs.SD eess.AS

    Improving child speech recognition with augmented child-like speech

    Authors: Yuanyuan Zhang, Zhengjun Yue, Tanvina Patel, Odette Scharenborg

    Abstract: State-of-the-art ASRs show suboptimal performance for child speech. The scarcity of child speech limits the development of child speech recognition (CSR). Therefore, we studied child-to-child voice conversion (VC) from existing child speakers in the dataset and additional (new) child speakers via monolingual and cross-lingual (Dutch-to-German) VC, respectively. The results showed that cross-lingua… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 5 pages, 1 figure. Accepted to INTERSPEECH 2024

  17. arXiv:2406.10252  [pdf, other]

    cs.IR cs.AI cs.CL

    AutoSurvey: Large Language Models Can Automatically Write Surveys

    Authors: Yidong Wang, Qi Guo, Wenjin Yao, Hongbo Zhang, Xin Zhang, Zhen Wu, Meishan Zhang, Xinyu Dai, Min Zhang, Qingsong Wen, Wei Ye, Shikun Zhang, Yue Zhang

    Abstract: This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys in rapidly evolving fields like artificial intelligence. Traditional survey paper creation faces challenges due to the vast volume and complexity of information, prompting the need for efficient survey methods. While large language models (LLMs) offer promise in… ▽ More

    Submitted 17 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  18. arXiv:2406.10193  [pdf]

    cond-mat.str-el cond-mat.supr-con

    Three-dimensional quantum Griffiths singularity in bulk iron-pnictide superconductors

    Authors: Shao-Bo Liu, Congkuan Tian, Yongqing Cai, Hang Cui, Xinjian Wei, Mantang Chen, Yang Zhao, Yuan Sui, Shuyue Guan, Shuang Jia, Yu Zhang, Ya Feng, Jiankun Li, Jian Cui, Yuanjun Song, Tingting Hao, Chaoyu Chen, Jian-Hao Chen

    Abstract: The quantum Griffiths singularity (QGS) is a phenomenon driven by quenched disorders that break conventional scaling invariance and result in a divergent dynamical critical exponent during quantum phase transitions (QPT). While this phenomenon has been well-documented in low-dimensional conventional superconductors and in three-dimensional (3D) magnetic metal systems, its presence in 3D supercondu… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 17 pages, 4 figures

  19. arXiv:2406.10158  [pdf, other]

    cs.DB cs.DC

    Harnessing GPU Power for Enhanced OLTP: A Study in Concurrency Control Schemes

    Authors: Zihan Sun, Yong Zhang, Chao Li, Chunxiao Xing

    Abstract: GPUs, whose performance has gone through a huge leap over the past decade, have proved their ability to accelerate Online Analytical Processing (OLAP) operations. On the other hand, there is still a huge gap in the field of GPU-accelerated Online Transaction Processing (OLTP) operations since it was generally believed that GPUs were not suitable for OLTP in the past. However, the massive parallelis…

    Submitted 14 June, 2024; originally announced June 2024.

  20. arXiv:2406.10125  [pdf, other]

    cs.CV

    MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report

    Authors: Zhongyu Yang, Mai Liu, Jinluo Xie, Yueming Zhang, Chen Shen, Wei Shao, Jichao Jiao, Tengfei Xing, Runbo Hu, Pengfei Xu

    Abstract: Autonomous driving without high-definition (HD) maps demands a higher level of active scene understanding. In this competition, the organizers provided the multi-perspective camera images and standard-definition (SD) maps to explore the boundaries of scene reasoning capabilities. We found that most existing algorithms construct Bird's Eye View (BEV) features from these multi-perspective images and… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  21. arXiv:2406.10100  [pdf, other]

    cs.CV cs.AI

    SkySenseGPT: A Fine-Grained Instruction Tuning Dataset and Model for Remote Sensing Vision-Language Understanding

    Authors: Junwei Luo, Zhen Pang, Yongjun Zhang, Tingzhu Wang, Linlin Wang, Bo Dang, Jiangwei Lao, Jian Wang, Jingdong Chen, Yihua Tan, Yansheng Li

    Abstract: Remote Sensing Large Multi-Modal Models (RSLMMs) are developing rapidly and showcase significant capabilities in remote sensing imagery (RSI) comprehension. However, due to the limitations of existing datasets, RSLMMs have shortcomings in understanding the rich semantic relations among objects in complex remote sensing scenes. To unlock RSLMMs' complex comprehension ability, we propose a large-sca…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 30 pages, 5 figures, 19 tables, dataset and code see https://github.com/Luo-Z13/SkySenseGPT

  22. arXiv:2406.09989  [pdf, other]

    q-bio.NC eess.SY

    Suppressing seizure via optimal electrical stimulation to the hub of epileptic brain network

    Authors: Zhichao Liang, Guanyi Zhao, Yinuo Zhang, Weiting Sun, Jingzhe Lin, Jialin Wang, Quanying Liu

    Abstract: Electrical stimulation of the seizure onset zone (SOZ) serves as an efficient approach to seizure suppression. Recently, seizure dynamics have gained widespread attention for their network propagation mechanisms. Compared with direct stimulation of the SOZ, other brain network-level approaches that can effectively suppress epileptic seizures remain under-explored. In this study, we introduce a p…

    Submitted 14 June, 2024; originally announced June 2024.

  23. arXiv:2406.09961  [pdf, other]

    cs.SE cs.CL cs.CV

    ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

    Authors: Chufan Shi, Cheng Yang, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang

    Abstract: We introduce a new benchmark, ChartMimic, aimed at assessing the visually-grounded code generation capabilities of large multimodal models (LMMs). ChartMimic utilizes information-intensive visual charts and textual instructions as inputs, requiring LMMs to generate the corresponding code for chart rendering. ChartMimic includes 1,000 human-curated (figure, instruction, code) triplets, which repres… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Data and code are available at https://github.com/ChartMimic/ChartMimic

  24. arXiv:2406.09904  [pdf, other]

    cs.LG

    QQQ: Quality Quattuor-Bit Quantization for Large Language Models

    Authors: Ying Zhang, Peng Zhang, Mincong Huang, Jingyang Xiang, Yujie Wang, Chao Wang, Yineng Zhang, Lei Yu, Chuan Liu, Wei Lin

    Abstract: Quantization is a proven effective method for compressing large language models. Although popular techniques like W8A8 and W4A16 effectively maintain model performance, they often fail to concurrently speed up the prefill and decoding stages of inference. W4A8 is a promising strategy to accelerate both of them while usually leads to a significant performance degradation. To address these issues, w… ▽ More

    Submitted 28 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  25. arXiv:2406.09881  [pdf, other]

    cs.CL

    A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation

    Authors: Yongkang Liu, Ercong Nie, Shi Feng, Zheng Hua, Zifeng Ding, Daling Wang, Yifei Zhang, Hinrich Schütze

    Abstract: Current state-of-the-art dialogue systems heavily rely on extensive training datasets. However, challenges arise in domains where domain-specific training datasets are insufficient or entirely absent. To tackle this challenge, we propose a novel data Augmentation framework for Multi-Domain Dialogue Generation, referred to as AMD$^2$G. The AMD…

    Submitted 28 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 17 pages, ECML-PKDD

    Journal ref: 2024 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases

  26. arXiv:2406.09815  [pdf, other]

    cs.CL cs.AI

    Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments

    Authors: Zhenrui Yue, Huimin Zeng, Lanyu Shang, Yifan Liu, Yang Zhang, Dong Wang

    Abstract: The rapid propagation of misinformation poses substantial risks to public interest. To combat misinformation, large language models (LLMs) are adapted to automatically verify claim credibility. Nevertheless, existing methods heavily rely on the embedded knowledge within LLMs and / or black-box APIs for evidence collection, leading to subpar performance with smaller LLMs or upon unreliable context.… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024

  27. arXiv:2406.09742  [pdf, other]

    cs.IR

    IFA: Interaction Fidelity Attention for Entire Lifelong Behaviour Sequence Modeling

    Authors: Wenhui Yu, Chao Feng, Yanze Zhang, Lantao Hu, Peng Jiang, Han Li

    Abstract: The lifelong user behavior sequence provides abundant information of user preference and gains impressive improvement in the recommendation task, however increases computational consumption significantly. To meet the severe latency requirement in online service, a short sub-sequence is sampled based on similarity to the target item. Unfortunately, items not in the sub-sequence are abandoned, leadi… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 7 pages, 2 figures

  28. arXiv:2406.09687  [pdf]

    cond-mat.mes-hall cond-mat.str-el

    Interplay between topology and correlations in the second moiré band of twisted bilayer MoTe2

    Authors: Fan Xu, Xumin Chang, Jiayong Xiao, Yixin Zhang, Feng Liu, Zheng Sun, Ning Mao, Nikolai Peshcherenko, Jiayi Li, Kenji Watanabe, Takashi Taniguchi, Bingbing Tong, Li Lu, Jinfeng Jia, Dong Qian, Zhiwen Shi, Yang Zhang, Xiaoxue Liu, Shengwei Jiang, Tingxin Li

    Abstract: Topological flat bands formed in two-dimensional lattice systems offer unique opportunity to study the fractional phases of matter in the absence of an external magnetic field. Celebrated examples include fractional quantum anomalous Hall (FQAH) effects and fractional topological insulators. Recently, FQAH effects have been experimentally realized in both the twisted bilayer MoTe2 (tMoTe2) system… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  29. arXiv:2406.09679  [pdf, other]

    cs.CV

    Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters

    Authors: Yuhang Zhou, Zihua Zhao, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang

    Abstract: Training a unified model to take multiple targets into account is a trend towards artificial general intelligence. However, how to efficiently mitigate the training conflicts among heterogeneous data collected from different domains or tasks remains under-explored. In this study, we explore to leverage Mixture of Low-rank Adapters (MoLA) to mitigate conflicts in heterogeneous data training, which… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: ICML2024

  30. arXiv:2406.09655  [pdf, ps, other]

    math.RA

    N-fold module factorizations

    Authors: Yongliang Sun, Yaohua Zhang

    Abstract: Module factorizations with two factors of a regular normal element in a ring are newly introduced by Xiao-Wu Chen. In the paper, we introduce n-fold module factorizations, that is, module factorizations with n factors. First we realize the category of n-fold module factorizations as the module category of a matrix subring. Then we relate n-fold module factorizations with Gorenstein projective comp… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 17 pages, all comments are welcome

    MSC Class: 16E65; 18G80; 18G65; 18G25

  31. arXiv:2406.09475  [pdf, other]

    hep-ex

    Search for $X(1870)$ via the decay $J/\psi\to \omega K^+ K^- \eta$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (644 additional authors not shown)

    Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/\psi$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^- \eta$ via the $J/\psi\to \omega K^+ K^- \eta$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $J/\psi\to \omega X(1870) \to \omega K^+ K^- \eta$ is determined to be $9.55\times 10^{-7}$ at the…

    Submitted 13 June, 2024; originally announced June 2024.

  32. arXiv:2406.09412  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    Explore the Limits of Omni-modal Pretraining at Scale

    Authors: Yiyuan Zhang, Handong Li, Jing Liu, Xiangyu Yue

    Abstract: We propose to build omni-modal intelligence, which is capable of understanding any modality and learning universal representations. In specific, we propose a scalable pretraining paradigm, named Multimodal Context (MiCo), which can scale up the numbers of modalities and amount of data, together with the model parameters, in the pretraining process. With MiCo, the pretrained models show significant… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: Project Website: https://invictus717.github.io/MiCo/

  33. arXiv:2406.09410  [pdf, other]

    cs.CV cs.AI

    Scene Graph Generation in Large-Size VHR Satellite Imagery: A Large-Scale Dataset and A Context-Aware Approach

    Authors: Yansheng Li, Linlin Wang, Tingzhu Wang, Xue Yang, Junwei Luo, Qi Wang, Youming Deng, Wenbin Wang, Xian Sun, Haifeng Li, Bo Dang, Yongjun Zhang, Yi Yu, Junchi Yan

    Abstract: Scene graph generation (SGG) in satellite imagery (SAI) benefits promoting intelligent understanding of geospatial scenarios from perception to cognition. In SAI, objects exhibit great variations in scales and aspect ratios, and there exist rich relationships between objects (even between spatially disjoint objects), which makes it necessary to holistically conduct SGG in large-size very-high-reso… ▽ More

    Submitted 30 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: This paper releases a SAI-oriented SGG toolkit with about 30 OBD methods and 10 SGG methods, and develops a benchmark based on RSG where our HOD-Net and RPCM significantly outperform the state-of-the-art methods in both OBD and SGG tasks. The RSG dataset and SAI-oriented toolkit will be made publicly available at https://linlin-dev.github.io/project/RSG

  34. Less Cybersickness, Please: Demystifying and Detecting Stereoscopic Visual Inconsistencies in VR Apps

    Authors: Shuqing Li, Cuiyun Gao, Jianping Zhang, Yujia Zhang, Yepang Liu, Jiazhen Gu, Yun Peng, Michael R. Lyu

    Abstract: The quality of Virtual Reality (VR) apps is vital, particularly the rendering quality of the VR Graphical User Interface (GUI). Different from traditional 2D apps, VR apps create a 3D digital scene for users, by rendering two distinct 2D images for the user's left and right eyes, respectively. Stereoscopic visual inconsistency (denoted as "SVI") issues, however, undermine the rendering process of… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This work has been accepted at the ACM International Conference on the Foundations of Software Engineering (FSE) 2024, Porto de Galinhas, Brazil. DOI: https://doi.org/10.1145/3660803

  35. arXiv:2406.09167  [pdf, other]

    cs.SD eess.AS

    Vision Transformer Segmentation for Visual Bird Sound Denoising

    Authors: Sahil Kumar, Jialu Li, Youshan Zhang

    Abstract: Audio denoising, especially in the context of bird sounds, remains a challenging task due to persistent residual noise. Traditional and deep learning methods often struggle with artificial or low-frequency noise. In this work, we propose ViTVS, a novel approach that leverages the power of the vision transformer (ViT) architecture. ViTVS adeptly combines segmentation techniques to disentangle clean… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: INTERSPEECH 2024

  36. arXiv:2406.09161  [pdf, other]

    cs.SD eess.AS

    Complex Image-Generative Diffusion Transformer for Audio Denoising

    Authors: Junhui Li, Pu Wang, Jialu Li, Youshan Zhang

    Abstract: The audio denoising technique has captured widespread attention in the deep neural network field. Recently, the audio denoising problem has been converted into an image generation task, and deep learning-based approaches have been applied to tackle this problem. However, its performance is still limited, leaving room for further improvement. In order to enhance audio denoising performance, this pa… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: INTERSPEECH 2024

  37. arXiv:2406.09154  [pdf, other]

    cs.SD cs.CL eess.AS

    Diffusion Gaussian Mixture Audio Denoise

    Authors: Pu Wang, Junhui Li, Jialu Li, Liangdong Guo, Youshan Zhang

    Abstract: Recent diffusion models have achieved promising performances in audio-denoising tasks. The unique property of the reverse process could recover clean signals. However, the distribution of real-world noises does not comply with a single Gaussian distribution and is even unknown. The sampling of Gaussian noise conditions limits its application scenarios. To overcome these challenges, we propose a Di… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: INTERSPEECH 2024

  38. arXiv:2406.09016  [pdf, other]

    cs.CV

    Cross-Modal Learning for Anomaly Detection in Fused Magnesium Smelting Process: Methodology and Benchmark

    Authors: Gaochang Wu, Yapeng Zhang, Lan Deng, Jingxin Zhang, Tianyou Chai

    Abstract: Fused Magnesium Furnace (FMF) is a crucial industrial equipment in the production of magnesia, and anomaly detection plays a pivotal role in ensuring its efficient, stable, and secure operation. Existing anomaly detection methods primarily focus on analyzing dominant anomalies using the process variables (such as arc current) or constructing neural networks based on abnormal visual features, while… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 14 pages, 6 figures, 5 tables. Submitted to IEEE

  39. arXiv:2406.08980  [pdf, other]

    q-bio.BM cs.LG

    From Theory to Therapy: Reframing SBDD Model Evaluation via Practical Metrics

    Authors: Bowen Gao, Haichuan Tan, Yanwen Huang, Minsi Ren, Xiao Huang, Wei-Ying Ma, Ya-Qin Zhang, Yanyan Lan

    Abstract: Recent advancements in structure-based drug design (SBDD) have significantly enhanced the efficiency and precision of drug discovery by generating molecules tailored to bind specific protein pockets. Despite these technological strides, their practical application in real-world drug development remains challenging due to the complexities of synthesizing and testing these molecules. The reliability… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  40. arXiv:2406.08969  [pdf, other]

    hep-th

    New Factorizations of Yang-Mills Amplitudes

    Authors: Yong Zhang

    Abstract: We propose a new factorization pattern for tree-level Yang-Mills (YM) amplitudes, where they decompose into a sum of products of two lower-point amplitudes by setting specific two-point non-planar Mandelstam variables within a rectangular configuration to zero. This approach manifests the hidden zeros of YM amplitudes recently identified. Furthermore, by setting specific Lorentz products involving… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 5+2 pages

  41. arXiv:2406.08961  [pdf, other

    q-bio.BM cs.LG

    SIU: A Million-Scale Structural Small Molecule-Protein Interaction Dataset for Unbiased Bioactivity Prediction

    Authors: Yanwen Huang, Bowen Gao, Yinjun Jia, Hongbo Ma, Wei-Ying Ma, Ya-Qin Zhang, Yanyan Lan

    Abstract: Small molecules play a pivotal role in modern medicine, and scrutinizing their interactions with protein targets is essential for the discovery and development of novel, life-saving therapeutics. The term "bioactivity" encompasses various biological effects resulting from these interactions, including both binding and functional responses. The magnitude of bioactivity dictates the therapeutic or t…

    Submitted 13 June, 2024; originally announced June 2024.

  42. arXiv:2406.08928  [pdf, other

    cs.CV eess.IV

    Multiple Prior Representation Learning for Self-Supervised Monocular Depth Estimation via Hybrid Transformer

    Authors: Guodong Sun, Junjie Liu, Mingxuan Liu, Moyun Liu, Yang Zhang

    Abstract: Self-supervised monocular depth estimation aims to infer depth information without relying on labeled data. However, the lack of labeled information poses a significant challenge to the model's representation, limiting its ability to capture the intricate details of the scene accurately. Prior information can potentially mitigate this issue, enhancing the model's understanding of scene structure a…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 28 pages, 12 figures

  43. arXiv:2406.08909  [pdf, other

    cs.CV

    A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras

    Authors: Chenyang Shi, Shasha Guo, Boyi Wei, Hanxiao Liu, Yibo Zhang, Ningfang Song, Jing Jin

    Abstract: Event cameras are renowned for their high efficiency due to outputting a sparse, asynchronous stream of events. However, they are plagued by noisy events, especially in low light conditions. Denoising is an essential task for event cameras, but evaluating denoising performance is challenging. Label-dependent denoising metrics involve artificially adding noise to clean sequences, complicating evalu…

    Submitted 13 June, 2024; originally announced June 2024.

  44. arXiv:2406.08845  [pdf, other

    cs.CV

    Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality

    Authors: Tianle Zhang, Langtian Ma, Yuchen Yan, Yuchen Zhang, Kai Wang, Yue Yang, Ziyao Guo, Wenqi Shao, Yang You, Yu Qiao, Ping Luo, Kaipeng Zhang

    Abstract: Recent text-to-video (T2V) technology advancements, as demonstrated by models such as Gen2, Pika, and Sora, have significantly broadened its applicability and popularity. Despite these strides, evaluating these models poses substantial challenges. Primarily, due to the limitations inherent in automatic metrics, manual evaluation is often considered a superior method for assessing T2V generation. H…

    Submitted 13 June, 2024; originally announced June 2024.

  45. arXiv:2406.08810  [pdf, other

    cs.CV

    Few-Shot Anomaly Detection via Category-Agnostic Registration Learning

    Authors: Chaoqin Huang, Haoyan Guan, Aofan Jiang, Yanfeng Wang, Michael Spratling, Xinchao Wang, Ya Zhang

    Abstract: Most existing anomaly detection methods require a dedicated model for each category. Such a paradigm, despite its promising results, is computationally expensive and inefficient, thereby failing to meet the requirements for real-world applications. Inspired by how humans detect anomalies, by comparing a query image to known normal ones, this paper proposes a novel few-shot anomaly detection (FSAD)…

    Submitted 13 June, 2024; originally announced June 2024.

  46. arXiv:2406.08806  [pdf, ps, other

    eess.SY

    Adaptive Cooperative Streaming of Holographic Video Over Wireless Networks: A Proximal Policy Optimization Solution

    Authors: Wanli Wen, Jiping Yan, Yulu Zhang, Zhen Huang, Liang Liang, Yunjian Jia

    Abstract: Adapting holographic video streaming to fluctuating wireless channels is essential to maintain consistent and satisfactory Quality of Experience (QoE) for users, which, however, is a challenging task due to the dynamic and uncertain characteristics of wireless networks. To address this issue, we propose a holographic video cooperative streaming framework designed for a generic wireless network in…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted for publication in IEEE Wireless Communications Letters

  47. arXiv:2406.08789  [pdf, ps, other

    cond-mat.supr-con

    Growth and characterization of the La$_{3}$Ni$_{2}$O$_{7-δ}$ thin films: dominant contribution of the $d_{x^{2}-y^{2}}$ orbital at ambient pressure

    Authors: Yuecong Liu, Mengjun Ou, Haifeng Chu, Huan Yang, Qing Li, Yingjie Zhang, Hai-Hu Wen

    Abstract: By using the pulsed-laser-ablation technique, we have successfully grown the La$_{3}$Ni$_{2}$O$_{7-δ}$ thin films with $c$-axis orientation perpendicular to the film surface. X-ray diffraction shows that the (00l) peaks can be well indexed to the La$_{3}$Ni$_{2}$O$_{7-δ}$ phase. Resistive measurements show that the samples can be tuned from weak insulating to metallic behavior through adjusting th…

    Submitted 12 June, 2024; originally announced June 2024.

  48. arXiv:2406.08698  [pdf, other

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  49. arXiv:2406.08688  [pdf, other

    cs.SE cs.AI

    On Security Weaknesses and Vulnerabilities in Deep Learning Systems

    Authors: Zhongzheng Lai, Huaming Chen, Ruoxi Sun, Yu Zhang, Minhui Xue, Dong Yuan

    Abstract: The security guarantee of AI-enabled software systems (particularly using deep learning techniques as a functional core) is pivotal against the adversarial attacks exploiting software vulnerabilities. However, little attention has been paid to a systematic investigation of vulnerabilities in such systems. A common situation learned from the open source software community is that deep learning engi…

    Submitted 12 June, 2024; originally announced June 2024.

  50. arXiv:2406.08675  [pdf, other

    quant-ph

    Multi-reference Quantum Davidson Algorithm for Quantum Dynamics

    Authors: Noah Berthusen, Faisal Alam, Yu Zhang

    Abstract: Simulating quantum systems is one of the most promising tasks where quantum computing can potentially outperform classical computing. However, the robustness needed for reliable simulations of medium to large systems is beyond the reach of existing quantum devices. To address this, Quantum Krylov Subspace (QKS) methods have been developed, enhancing the ability to perform accelerated simulations o…

    Submitted 12 June, 2024; originally announced June 2024.

    Report number: LA-UR-24-25256