Skip to main content

Showing 1–50 of 1,949 results for author: shi, J

.
  1. arXiv:2407.05728  [pdf, other

    math.OC

    Stochastic Linear-Quadratic Stackelberg Differential Game with Asymmetric Informational Uncertainties: Robust Optimization Approach

    Authors: Na Xiang, **gtao Shi

    Abstract: This paper is concerned with a two-person zero-sum indefinite stochastic linear-quadratic Stackelberg differential game with asymmetric informational uncertainties, where both the leader and follower face different and unknown disturbances. We take a robust optimization approach and soft-constraint analysis, a min-max stochastic linear-quadratic optimal control problem is solved by the follower fi… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 31 pages, 1 figure

    MSC Class: 93E20; 60H10; 91A15; 91A65

  2. arXiv:2407.05717  [pdf

    eess.SY cs.RO eess.SP

    A New Framework for Nonlinear Kalman Filters

    Authors: Shida Jiang, Junzhe Shi, Scott Moura

    Abstract: The Kalman filter (KF) is a state estimation algorithm that optimally combines system knowledge and measurements to minimize the mean squared error of the estimated states. While KF was initially designed for linear systems, numerous extensions of it, such as extended Kalman filter (EKF), unscented Kalman filter (UKF), cubature Kalman filter (CKF), etc., have been proposed for nonlinear systems. A… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 26 pages, 5 + 9 figues, 2 tables

  3. arXiv:2407.05647  [pdf, other

    cs.CV

    Learning to Adapt Category Consistent Meta-Feature of CLIP for Few-Shot Classification

    Authors: Jiaying Shi, Xuetong Xue, Shenghui Xu

    Abstract: The recent CLIP-based methods have shown promising zero-shot and few-shot performance on image classification tasks. Existing approaches such as CoOp and Tip-Adapter only focus on high-level visual features that are fully aligned with textual features representing the ``Summary" of the image. However, the goal of few-shot learning is to classify unseen images of the same category with few labeled… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  4. arXiv:2407.05165  [pdf, other

    cs.SE

    Feedback-Driven Automated Whole Bug Report Reproduction for Android Apps

    Authors: Dingbang Wang, Yu Zhao, Sidong Feng, Zhaoxu Zhang, William G. J. Halfond, Chunyang Chen, Xiaoxia Sun, Jiangfan Shi, Tingting Yu

    Abstract: In software development, bug report reproduction is a challenging task. This paper introduces ReBL, a novel feedback-driven approach that leverages GPT-4, a large-scale language model, to automatically reproduce Android bug reports. Unlike traditional methods, ReBL bypasses the use of Step to Reproduce (S2R) entities. Instead, it leverages the entire textual bug report and employs innovative promp… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: Accepted by ISSTA 2024

  5. arXiv:2407.04201  [pdf, ps, other

    math.OC

    A General Maximum Principle for Progressive Optimal Control of Fully Coupled Forward-Backward Stochastic Systems with Jumps

    Authors: Bin Wang, Yu Si, **gtao Shi

    Abstract: This paper is concerned with a general maximum principle for the fully coupled forward-backward stochastic optimal control problem with jumps, where the control domain is not necessarily convex, within the progressively measurable framework. It is worth noting that not only the control variable enters into all the coefficients, but also the jump size "$e$" . We first proposed that the solution… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 32 pages

    MSC Class: 93E20; 49K45; 60H10; 60G55

  6. arXiv:2407.03886  [pdf, other

    cs.CV eess.IV

    DSMix: Distortion-Induced Sensitivity Map Based Pre-training for No-Reference Image Quality Assessment

    Authors: **song Shi, Pan Gao, Xiaojiang Peng, Jie Qin

    Abstract: Image quality assessment (IQA) has long been a fundamental challenge in image understanding. In recent years, deep learning-based IQA methods have shown promising performance. However, the lack of large amounts of labeled data in the IQA field has hindered further advancements in these methods. This paper introduces DSMix, a novel data augmentation technique specifically designed for IQA tasks, ai… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  7. arXiv:2407.03713  [pdf, other

    astro-ph.GA

    Compositions of the Hercules-Aquila Cloud and Virgo Over-density

    Authors: Dashuang Ye, Cuihua Du, Mingji Deng, Jiwei Liao, Yang Huang, Jianrong Shi, Jun Ma

    Abstract: Based on a sample of K giant from Large sky Area Multi-Object fiber Spectroscopic Telescope (LAMOST) Data Release 8 and a sample of RR Lyrae (RRL) from \textit{Gaia} Data Release 3, we investigate the compositions of the Hercules-Aquila Cloud (HAC) and Virgo Over-density (VOD) and their collective contribution to the tilt and triaxiality of the stellar halo ($r\,\textless\,40\,{\rm kpc}$) as well… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 11 pages, 9 figures, accepted for publication in MNRAS

  8. arXiv:2407.02899  [pdf, other

    hep-ex

    Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

    Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  9. arXiv:2407.01621  [pdf, other

    cs.LG q-bio.QM stat.ME stat.ML

    Deciphering interventional dynamical causality from non-intervention systems

    Authors: Jifan Shi, Yang Li, Juan Zhao, Siyang Leng, Kazuyuki Aihara, Luonan Chen, Wei Lin

    Abstract: Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies. However, causal studies on non-intervention systems attract much attention but remain extremely challenging. To address this challenge, we propose a framework named Interventional Dynamical Causality (IntDC) for such non-intervention systems, along with its computational crite… ▽ More

    Submitted 28 June, 2024; originally announced July 2024.

  10. arXiv:2407.01505  [pdf, other

    cs.CL cs.AI

    Self-Cognition in Large Language Models: An Exploratory Study

    Authors: Dong** Chen, Jiawen Shi, Yao Wan, Pan Zhou, Neil Zhenqiang Gong, Lichao Sun

    Abstract: While Large Language Models (LLMs) have achieved remarkable success across various applications, they also raise concerns regarding self-cognition. In this paper, we perform a pioneering study to explore self-cognition in LLMs. Specifically, we first construct a pool of self-cognition instruction prompts to evaluate where an LLM exhibits self-cognition and four well-designed principles to quantify… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted at ICML 2024 Large Language Models and Cognition Workshop

  11. arXiv:2407.00837  [pdf, other

    cs.CL cs.AI cs.SD eess.AS

    Towards Robust Speech Representation Learning for Thousands of Languages

    Authors: William Chen, Wangyou Zhang, Yifan Peng, Xinjian Li, **chuan Tian, Jiatong Shi, Xuankai Chang, Soumi Maiti, Karen Livescu, Shinji Watanabe

    Abstract: Self-supervised learning (SSL) has helped extend speech technologies to more languages by reducing the need for labeled data. However, models are still far from supporting the world's 7000+ languages. We propose XEUS, a Cross-lingual Encoder for Universal Speech, trained on over 1 million hours of data across 4057 languages, extending the language coverage of SSL models 4-fold. We combine 1 millio… ▽ More

    Submitted 2 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: Updated affiliations; 20 pages

  12. arXiv:2406.19190  [pdf, ps, other

    hep-ex

    Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  13. arXiv:2406.18995  [pdf, other

    cs.LG cs.AI

    FedMLP: Federated Multi-Label Medical Image Classification under Task Heterogeneity

    Authors: Zhaobin Sun, Nannan Wu, Junjie Shi, Li Yu, Xin Yang, Kwang-Ting Cheng, Zengqiang Yan

    Abstract: Cross-silo federated learning (FL) enables decentralized organizations to collaboratively train models while preserving data privacy and has made significant progress in medical image classification. One common assumption is task homogeneity where each client has access to all classes during training. However, in clinical practice, given a multi-label classification task, constrained by the level… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Early accepted by MICCAI 2024

  14. arXiv:2406.18769  [pdf

    cond-mat.mes-hall cond-mat.other

    Subharmonic oscillations in the Floquet circuit with the frequency-synthesis dimension

    Authors: Bo Lv, Shiyun Xia, Ye Tian, Ting Liu, Hongyang Mu, Zhichao Shen, Sijie Wang, Zheng Zhu, Huibin Tao, Fanyi Meng, **hui Shi

    Abstract: The period-doubling oscillation emerges with the coexistence between zero and π modes in Floquet topological insulator. Here, utilized the flexibility of the circuit, we construct the Floquet circuit with frequency-synthetic dimension and find the topological-protected deeply-subharmonic oscillations with the period extensively exceeding the doubling-driven period. In the construction framework, t… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 31 pages, 10 figures

  15. arXiv:2406.18610  [pdf, other

    cs.CV

    Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo Labeling

    Authors: Haoran Li, Xingjian Li, Jiahua Shi, Huaming Chen, Bo Du, Daisuke Kihara, Johan Barthelemy, Jun Shen, Min Xu

    Abstract: Cryo-Electron Tomography (cryo-ET) is a 3D imaging technology facilitating the study of macromolecular structures at near-atomic resolution. Recent volumetric segmentation approaches on cryo-ET images have drawn widespread interest in biological sector. However, existing methods heavily rely on manually labeled data, which requires highly professional skills, thereby hindering the adoption of full… ▽ More

    Submitted 30 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: 11 pages

  16. arXiv:2406.18344  [pdf, other

    cs.CV

    AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space

    Authors: Huzheng Yang, James Gee, Jianbo Shi

    Abstract: We study the intriguing connection between visual data, deep networks, and the brain. Our method creates a universal channel alignment by using brain voxel fMRI response prediction as the training objective. We discover that deep networks, trained with different objectives, share common feature channels across various models. These channels can be clustered into recurring sets, corresponding to di… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  17. arXiv:2406.18198  [pdf, other

    cs.CV

    VDG: Vision-Only Dynamic Gaussian for Driving Simulation

    Authors: Hao Li, **gfeng Li, Dingwen Zhang, Chenming Wu, Jieqi Shi, Chen Zhao, Haocheng Feng, Errui Ding, **gdong Wang, Junwei Han

    Abstract: Dynamic Gaussian splatting has led to impressive scene reconstruction and image synthesis advances in novel views. Existing methods, however, heavily rely on pre-computed poses and Gaussian initialization by Structure from Motion (SfM) algorithms or expensive sensors. For the first time, this paper addresses this issue by integrating self-supervised VO into our pose-free dynamic Gaussian method (V… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  18. arXiv:2406.18183  [pdf, other

    hep-ex

    Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (638 additional authors not shown)

    Abstract: Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 26 pages,5 tables, 4 figures

  19. arXiv:2406.18083  [pdf, other

    hep-ex

    Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 19 pages, 2 figures

  20. arXiv:2406.17452  [pdf, ps, other

    hep-ex

    Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (649 additional authors not shown)

    Abstract: We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and… ▽ More

    Submitted 25 June, 2024; originally announced June 2024.

  21. arXiv:2406.16841  [pdf, ps, other

    physics.atom-ph

    Proposal for the generation of continuous-wave vacuum ultraviolet laser light for Th-229 isomer precision spectroscopy

    Authors: Qi Xiao, Gleb Penyazkov, Ruihan Yu, Beichen Huang, Jiatong Li, Juanlang Shi, Yanmei Yu, Yuxiang Mo, Shiqian Ding

    Abstract: We propose to generate continuous-wave vacuum ultraviolet (VUV) laser light at 148.4 nm using four-wave mixing in cadmium vapor for precision spectroscopy of the Th-229 isomer transition. Due to the large transition matrix elements of cadmium, the readily accessible wavelengths for the incident laser beams, and the high coherence of the four-wave mixing process, over 30 $μ$W of VUV power can be ge… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  22. arXiv:2406.16495  [pdf, other

    cs.CL cs.AI

    OTCE: Hybrid SSM and Attention with Cross Domain Mixture of Experts to construct Observer-Thinker-Conceiver-Expresser

    Authors: **gze Shi, Ting Xie, Bingheng Wu, Chunjun Zheng, Kai Wang

    Abstract: Recent research has shown that combining Mamba with Transformer architecture, which has selective state space and quadratic self-attention mechanism, outperforms using Mamba or Transformer architecture alone in language modeling tasks. The quadratic self-attention mechanism effectively alleviates the shortcomings of selective state space in handling long-term dependencies of any element in the seq… ▽ More

    Submitted 24 June, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

  23. arXiv:2406.15030  [pdf, ps, other

    hep-ex

    Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (639 additional authors not shown)

    Abstract: Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 11 pages, 3 figures

  24. arXiv:2406.13638  [pdf, other

    physics.data-an astro-ph.IM hep-ex physics.ins-det

    XENONnT WIMP Search: Signal & Background Modeling and Statistical Inference

    Authors: XENON Collaboration, E. Aprile, J. Aalbers, K. Abe, S. Ahmed Maouloud, L. Althueser, B. Andrieu, E. Angelino, D. Antón Martin, F. Arneodo, L. Baudis, M. Bazyk, L. Bellagamba, R. Biondi, A. Bismark, K. Boese, A. Brown, G. Bruno, R. Budnik, J. M. R. Cardoso, A. P. Cimental Chávez, A. P. Colijn, J. Conrad, J. J. Cuenca-García, V. D'Andrea , et al. (139 additional authors not shown)

    Abstract: The XENONnT experiment searches for weakly-interacting massive particle (WIMP) dark matter scattering off a xenon nucleus. In particular, XENONnT uses a dual-phase time projection chamber with a 5.9-tonne liquid xenon target, detecting both scintillation and ionization signals to reconstruct the energy, position, and type of recoil. A blind search for nuclear recoil WIMPs with an exposure of 1.1 t… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 20 pages, 10 figures

  25. arXiv:2406.12638  [pdf, other

    cs.CV cs.LG

    Efficient and Long-Tailed Generalization for Pre-trained Vision-Language Model

    Authors: Jiang-Xin Shi, Chi Zhang, Tong Wei, Yu-Feng Li

    Abstract: Pre-trained vision-language models like CLIP have shown powerful zero-shot inference ability via image-text matching and prove to be strong few-shot learners in various downstream tasks. However, in real-world scenarios, adapting CLIP to downstream tasks may encounter the following challenges: 1) data may exhibit long-tailed data distributions and might not have abundant samples for all the classe… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024

  26. arXiv:2406.12390  [pdf, other

    physics.optics quant-ph

    Multimer states in multilevel waveguide QED

    Authors: Jiaming Shi, Alexander N. Poddubny

    Abstract: We study theoretically the interplay of spontaneous emission and interactions for the multiple-excited quasistationary eigenstates in a finite periodic array of multilevel atoms coupled to the waveguide. We develop an analytical approach to calculate such eigenstates based on the subradiant dimer basis. Our calculations reveal the peculiar multimerization effect driven by the anharmonicity of the… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 5 pages, 4 figures + Supplementary

  27. arXiv:2406.12180  [pdf

    cond-mat.mtrl-sci quant-ph

    Unusual charge density wave introduced by Janus structure in monolayer vanadium dichalcogenides

    Authors: Ziqiang Xu, Yan Shao, Chun Huang, Genyu Hu, Shihao Hu, Zhi-Lin Li, Xiaoyu Hao, Yanhui Hou, Teng Zhang, **-An Shi, Chen Liu, Jia-Ou Wang, Wu Zhou, Jiadong Zhou, Wei Ji, **gsi Qiao, Xu Wu, Hong-Jun Gao, Yeliang Wang

    Abstract: As a fundamental structural feature, the symmetry of materials determines the exotic quantum properties in transition metal dichalcogenides (TMDs) with charge density wave (CDW). Breaking the inversion symmetry, the Janus structure, an artificially constructed lattice, provides an opportunity to tune the CDW states and the related properties. However, limited by the difficulties in atomic-level fa… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  28. arXiv:2406.11144  [pdf, other

    math.OC

    Modified Line Search Sequential Quadratic Methods for Equality-Constrained Optimization with Unified Global and Local Convergence Guarantees

    Authors: Albert S. Berahas, Raghu Bollapragada, Jiahao Shi

    Abstract: In this paper, we propose a method that has foundations in the line search sequential quadratic programming paradigm for solving general nonlinear equality constrained optimization problems. The method employs a carefully designed modified line search strategy that utilizes second-order information of both the objective and constraint functions, as required, to mitigate the Maratos effect. Contrar… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

  29. arXiv:2406.10981  [pdf, other

    cs.CV

    ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models

    Authors: Kaifeng Gao, Jiaxin Shi, Hanwang Zhang, Chun** Wang, Jun Xiao

    Abstract: With the advance of diffusion models, today's video generation has achieved impressive quality. But generating temporal consistent long videos is still challenging. A majority of video diffusion models (VDMs) generate long videos in an autoregressive manner, i.e., generating subsequent clips conditioned on last frames of previous clip. However, existing approaches all involve bidirectional computa… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Code will be available at https://github.com/Dawn-LX/Causal-VideoGen

  30. arXiv:2406.10911  [pdf, other

    cs.SD eess.AS

    SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction

    Authors: Yuxun Tang, Jiatong Shi, Yuning Wu, Qin **

    Abstract: In speech generation tasks, human subjective ratings, usually referred to as the opinion score, are considered the "gold standard" for speech quality evaluation, with the mean opinion score (MOS) serving as the primary evaluation metric. Due to the high cost of human annotation, several MOS prediction systems have emerged in the speech domain, demonstrating good performance. These MOS prediction m… ▽ More

    Submitted 20 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

  31. arXiv:2406.10583  [pdf, other

    hep-ex

    Demonstration of neutron identification in neutrino interactions in the MicroBooNE liquid argon time projection chamber

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, A. Barnard, G. Barr, D. Barrow, J. Barrow, V. Basque, J. Bateman, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book , et al. (165 additional authors not shown)

    Abstract: A significant challenge in measurements of neutrino oscillations is reconstructing the incoming neutrino energies. While modern fully-active tracking calorimeters such as liquid argon time projection chambers in principle allow the measurement of all final state particles above some detection threshold, undetected neutrons remain a considerable source of missing energy with little to no data const… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Report number: FERMILAB-PUB-24-0301

  32. arXiv:2406.10123  [pdf, other

    hep-ex physics.ins-det

    Improving neutrino energy estimation of charged-current interaction events with recurrent neural networks in MicroBooNE

    Authors: MicroBooNE collaboration, P. Abratenko, O. Alterkait, D. Andrade Aldana, L. Arellano, J. Asaadi, A. Ashkenazi, S. Balasubramanian, B. Baller, A. Barnard, G. Barr, D. Barrow, J. Barrow, V. Basque, J. Bateman, O. Benevides Rodrigues, S. Berkman, A. Bhanderi, A. Bhat, M. Bhattacharya, M. Bishai, A. Blake, B. Bogart, T. Bolton, J. Y. Book , et al. (164 additional authors not shown)

    Abstract: We present a deep learning-based method for estimating the neutrino energy of charged-current neutrino-argon interactions. We employ a recurrent neural network (RNN) architecture for neutrino energy estimation in the MicroBooNE experiment, utilizing liquid argon time projection chamber (LArTPC) detector technology. Traditional energy estimation approaches in LArTPCs, which largely rely on reconstr… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Report number: FERMILAB-PUB-24-0287

  33. arXiv:2406.09913  [pdf, other

    cs.CV

    OpenECAD: An Efficient Visual Language Model for Computer-Aided Design

    Authors: Zhe Yuan, Jianqi Shi, Yanhong Huang

    Abstract: Computer-aided design (CAD) tools are utilized in the manufacturing industry for modeling everything from cups to spacecraft. These programs are complex to use and typically require years of training and experience to master. Structured and well-constrained 2D sketches and 3D constructions are crucial components of CAD modeling. A well-executed CAD model can be seamlessly integrated into the manuf… ▽ More

    Submitted 22 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  34. arXiv:2406.09869  [pdf, ps, other

    cs.SD eess.AS

    MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model

    Authors: Jiatong Shi, Xutai Ma, Hirofumi Inaguma, Anna Sun, Shinji Watanabe

    Abstract: Speech discrete representation has proven effective in various downstream applications due to its superior compression rate of the waveform, fast convergence during training, and compatibility with other modalities. Discrete units extracted from self-supervised learning (SSL) models have emerged as a prominent approach for obtaining speech discrete representation. However, while discrete units hav… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech2024

  35. arXiv:2406.09475  [pdf, other

    hep-ex

    Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  36. arXiv:2406.08905  [pdf, other

    cs.SD eess.AS

    SingOMD: Singing Oriented Multi-resolution Discrete Representation Construction from Speech Models

    Authors: Yuxun Tang, Yuning Wu, Jiatong Shi, Qin **

    Abstract: Discrete representation has shown advantages in speech generation tasks, wherein discrete tokens are derived by discretizing hidden features from self-supervised learning (SSL) pre-trained models. However, the direct application of speech SSL models to singing generation encounters domain gaps between speech and singing. Furthermore, singing generation necessitates a more refined representation th… ▽ More

    Submitted 20 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  37. arXiv:2406.08761  [pdf, other

    cs.SD eess.AS

    VISinger2+: End-to-End Singing Voice Synthesis Augmented by Self-Supervised Learning Representation

    Authors: Yifeng Yu, Jiatong Shi, Yuning Wu, Shinji Watanabe

    Abstract: Singing Voice Synthesis (SVS) has witnessed significant advancements with the advent of deep learning techniques. However, a significant challenge in SVS is the scarcity of labeled singing voice data, which limits the effectiveness of supervised learning methods. In response to this challenge, this paper introduces a novel approach to enhance the quality of SVS by leveraging unlabeled data from pr… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 4 pages, 2 figures

  38. arXiv:2406.08641  [pdf, ps, other

    cs.SD cs.CL eess.AS

    ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

    Authors: Jiatong Shi, Shih-Heng Wang, William Chen, Martijn Bartelds, Vanya Bannihatti Kumar, **chuan Tian, Xuankai Chang, Dan Jurafsky, Karen Livescu, Hung-yi Lee, Shinji Watanabe

    Abstract: ML-SUPERB evaluates self-supervised learning (SSL) models on the tasks of language identification and automatic speech recognition (ASR). This benchmark treats the models as feature extractors and uses a single shallow downstream model, which can be fine-tuned for a downstream task. However, real-world use cases may require different configurations. This paper presents ML-SUPERB~2.0, which is a ne… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  39. arXiv:2406.08416  [pdf, other

    cs.SD eess.AS

    TokSing: Singing Voice Synthesis based on Discrete Tokens

    Authors: Yuning Wu, Chunlei zhang, Jiatong Shi, Yuxun Tang, Shan Yang, Qin **

    Abstract: Recent advancements in speech synthesis witness significant benefits by leveraging discrete tokens extracted from self-supervised learning (SSL) models. Discrete tokens offer higher storage efficiency and greater operability in intermediate representations compared to traditional continuous Mel spectrograms. However, when it comes to singing voice synthesis(SVS), achieving higher levels of melody… ▽ More

    Submitted 20 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted by Interspeech 2024

  40. arXiv:2406.08225  [pdf, ps, other

    hep-ex

    Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (636 additional authors not shown)

    Abstract: Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  41. arXiv:2406.07725  [pdf, ps, other

    cs.SD eess.AS

    The Interspeech 2024 Challenge on Speech Processing Using Discrete Units

    Authors: Xuankai Chang, Jiatong Shi, **chuan Tian, Yuning Wu, Yuxun Tang, Yihan Wu, Shinji Watanabe, Yossi Adi, Xie Chen, Qin **

    Abstract: Representing speech and audio signals in discrete units has become a compelling alternative to traditional high-dimensional feature vectors. Numerous studies have highlighted the efficacy of discrete units in various applications such as speech compression and restoration, speech recognition, and speech generation. To foster exploration in this domain, we introduce the Interspeech 2024 Challenge,… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: This manuscript has been accepted by Interspeech2024

  42. arXiv:2406.07686  [pdf, other

    cs.CV

    AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation

    Authors: Kai Wang, Shijian Deng, **g Shi, Dimitrios Hatzinakos, Yapeng Tian

    Abstract: Recent Diffusion Transformers (DiTs) have shown impressive capabilities in generating high-quality single-modality content, including images, videos, and audio. However, it is still under-explored whether the transformer-based diffuser can efficiently denoise the Gaussian noises towards superb multimodal content creation. To bridge this gap, we introduce AV-DiT, a novel and efficient audio-visual… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

  43. arXiv:2406.07571  [pdf, other

    cs.CY

    Supporting Self-Reflection at Scale with Large Language Models: Insights from Randomized Field Experiments in Classrooms

    Authors: Harsh Kumar, Ruiwei Xiao, Benjamin Lawson, Ilya Musabirov, Jiakai Shi, Xinyuan Wang, Huayin Luo, Joseph Jay Williams, Anna Rafferty, John Stamper, Michael Liut

    Abstract: Self-reflection on learning experiences constitutes a fundamental cognitive process, essential for the consolidation of knowledge and the enhancement of learning efficacy. However, traditional methods to facilitate reflection often face challenges in personalization, immediacy of feedback, engagement, and scalability. Integration of Large Language Models (LLMs) into the reflection process could mi… ▽ More

    Submitted 31 May, 2024; originally announced June 2024.

    Comments: Accepted at L@S'24

  44. arXiv:2406.06118  [pdf, other

    hep-ex

    Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (644 additional authors not shown)

    Abstract: The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The wea… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

  45. arXiv:2406.05827  [pdf, ps, other

    hep-ex

    Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (634 additional authors not shown)

    Abstract: We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$,… ▽ More

    Submitted 9 June, 2024; originally announced June 2024.

  46. arXiv:2406.04614  [pdf, ps, other

    cs.CL cs.AI

    LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model

    Authors: Zhi Zhou, Jiang-Xin Shi, Peng-Xiao Song, Xiao-Wen Yang, Yi-Xuan **, Lan-Zhe Guo, Yu-Feng Li

    Abstract: Large language models (LLMs), including both proprietary and open-source models, have showcased remarkable capabilities in addressing a wide range of downstream tasks. Nonetheless, when it comes to practical Chinese legal tasks, these models fail to meet the actual requirements. Proprietary models do not ensure data privacy for sensitive legal cases, while open-source models demonstrate unsatisfac… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Technical Report

  47. arXiv:2406.04329  [pdf, other

    cs.LG stat.ML

    Simplified and Generalized Masked Diffusion for Discrete Data

    Authors: Jiaxin Shi, Kehang Han, Zhe Wang, Arnaud Doucet, Michalis K. Titsias

    Abstract: Masked (or absorbing) diffusion is actively explored as an alternative to autoregressive models for generative modeling of discrete data. However, existing work in this area has been hindered by unnecessarily complex model formulations and unclear relationships between different perspectives, leading to suboptimal parameterization, training objectives, and ad hoc adjustments to counteract these is… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  48. arXiv:2406.03805  [pdf, other

    cs.CR

    AutoJailbreak: Exploring Jailbreak Attacks and Defenses through a Dependency Lens

    Authors: Lin Lu, Hai Yan, Zenghui Yuan, Jiawen Shi, Wenqi Wei, Pin-Yu Chen, Pan Zhou

    Abstract: Jailbreak attacks in large language models (LLMs) entail inducing the models to generate content that breaches ethical and legal norm through the use of malicious prompts, posing a substantial threat to LLM security. Current strategies for jailbreak attack and defense often focus on optimizing locally within specific algorithmic frameworks, resulting in ineffective optimization and limited scalabi… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 32 pages, 2 figures

  49. arXiv:2406.02950  [pdf, other

    eess.AS cs.CL cs.SD

    4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders

    Authors: Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Brian Yan, Jiatong Shi, Yifan Peng, Shinji Watanabe

    Abstract: End-to-end automatic speech recognition (E2E-ASR) can be classified into several network architectures, such as connectionist temporal classification (CTC), recurrent neural network transducer (RNN-T), attention-based encoder-decoder, and mask-predict models. Each network architecture has advantages and disadvantages, leading practitioners to switch between these different models depending on appl… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: submitted to IEEE/ACM Transactions on Audio Speech and Language Processing

  50. arXiv:2406.02931  [pdf, other

    hep-ex

    Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

    Abstract: Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 9 pages, 7 figures