
Showing 151–200 of 714 results for author: Yin, X

  1. arXiv:2306.13307  [pdf, other]

    eess.AS cs.CL

    Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems

    Authors: Mingyu Cui, Jiawen Kang, Jiajun Deng, Xi Yin, Yutao Xie, Xie Chen, Xunying Liu

    Abstract: Current ASR systems are mainly trained and evaluated at the utterance level. Long range cross utterance context can be incorporated. A key task is to derive a suitable compact representation of the most relevant history contexts. In contrast to previous researches based on either LSTM-RNN encoded histories that attenuate the information from longer range contexts, or frame level concatenation of t…

    Submitted 25 June, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: Accepted by INTERSPEECH 2023

  2. Chiral and nonreciprocal single-photon scattering in a chiral-giant-molecule waveguide-QED system

    Authors: Juan Zhou, Xian-Li Yin, Jie-Qiao Liao

    Abstract: We study chiral and nonreciprocal single-photon scattering in a chiral-giant-molecule waveguide-QED system. Here, the giant molecule consists of two coupled giant atoms, which interact with two linear waveguides, forming a four-port quantum device. We obtain the exact analytical expressions of the four scattering amplitudes using a real-space method. Under the Markovian limit, we find that the sin…

    Submitted 19 June, 2023; originally announced June 2023.

    Comments: 13 pages, 10 figures

    Journal ref: Phys. Rev. A 107, 063703 (2023)

  3. arXiv:2306.09027  [pdf, other]

    physics.optics physics.app-ph

    Ultra-low-loss optical interconnect enabled by topological unidirectional guided resonance

    Authors: Haoran Wang, Yi Zuo, Xuefan Yin, Zihao Chen, Zixuan Zhang, Feifan Wang, Yuefeng Hu, Xiaoyu Zhang, Chao Peng

    Abstract: Grating couplers that interconnect photonic chips to off-chip components are of essential importance for various optoelectronics applications. Despite numerous efforts in past decades, existing grating couplers still suffer from poor energy efficiency and thus hinder photonic integration toward a larger scale. Here, we theoretically propose and experimentally demonstrate a method to achieve ultra-…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 21 pages, 5 figures

  4. arXiv:2306.08219  [pdf, other]

    cs.IR cs.SD eess.AS

    Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects

    Authors: Xinghua Qu, Hongyang Liu, Zhu Sun, Xiang Yin, Yew Soon Ong, Lu Lu, Zejun Ma

    Abstract: Conversational recommender systems (CRSs) have become crucial emerging research topics in the field of RSs, thanks to their natural advantages of explicitly acquiring user preferences via interactive conversations and revealing the reasons behind recommendations. However, the majority of current CRSs are text-based, which is less user-friendly and may pose challenges for certain users, such as tho…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: Accepted by SIGIR 2023 Resource Track

  5. arXiv:2306.07294  [pdf, other]

    cs.LG cs.AI cs.NE

    Computational and Storage Efficient Quadratic Neurons for Deep Neural Networks

    Authors: Chuangtao Chen, Grace Li Zhang, Xunzhao Yin, Cheng Zhuo, Ulf Schlichtmann, Bing Li

    Abstract: Deep neural networks (DNNs) have been widely deployed across diverse domains such as computer vision and natural language processing. However, the impressive accomplishments of DNNs have been realized alongside extensive computational demands, thereby impeding their applicability on resource-constrained devices. To address this challenge, many researchers have been focusing on basic neuron structu…

    Submitted 27 November, 2023; v1 submitted 10 June, 2023; originally announced June 2023.

    Comments: Accepted by Design Automation and Test in Europe (DATE) 2024

  6. arXiv:2306.04830  [pdf, other]

    eess.SY

    Extended Neighboring Extremal Optimal Control with State and Preview Perturbations

    Authors: Amin Vahidi-Moghaddam, Kaixiang Zhang, Zhaojian Li, Xunyuan Yin, Ziyou Song, Yan Wang

    Abstract: Optimal control schemes have achieved remarkable performance in numerous engineering applications. However, they typically require high computational cost, which has limited their use in real-world engineering systems with fast dynamics and/or limited computation power. To address this challenge, Neighboring Extremal (NE) has been developed as an efficient optimal adaption strategy to adapt a pre-…

    Submitted 7 June, 2023; originally announced June 2023.

  7. arXiv:2306.03509  [pdf, other]

    eess.AS cs.AI cs.SD

    Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

    Authors: Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: Scaling text-to-speech to a large and wild dataset has been proven to be highly effective in achieving timbre and speech style generalization, particularly in zero-shot TTS. However, previous works usually encode speech into latent using audio codec and use autoregressive language models or diffusion models to generate it, which ignores the intrinsic nature of speech and may lead to inferior or un…

    Submitted 6 June, 2023; originally announced June 2023.

  8. arXiv:2306.03504  [pdf, other]

    cs.CV cs.SD eess.AS

    Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis

    Authors: Zhenhui Ye, Ziyue Jiang, Yi Ren, Jinglin Liu, Chen Zhang, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: We are interested in a novel task, namely low-resource text-to-talking avatar. Given only a few-minute-long talking person video with the audio track as the training data and arbitrary texts as the driving input, we aim to synthesize high-quality talking portrait videos corresponding to the input text. This task has broad application prospects in the digital human industry but has not been technic…

    Submitted 2 August, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Accepted by ICML 2023 Workshop, 6 pages, 3 figures

  9. arXiv:2306.02236  [pdf, other]

    cs.CV cs.AI cs.LG

    Detector Guidance for Multi-Object Text-to-Image Generation

    Authors: Luping Liu, Zijian Zhang, Yi Ren, Rongjie Huang, Xiang Yin, Zhou Zhao

    Abstract: Diffusion models have demonstrated impressive performance in text-to-image generation. They utilize a text encoder and cross-attention blocks to infuse textual information into images at a pixel level. However, their capability to generate images with text containing multiple objects is still restricted. Previous works identify the problem of information mixing in the CLIP text encoder and introdu…

    Submitted 3 June, 2023; originally announced June 2023.

  10. arXiv:2305.19642  [pdf, other]

    quant-ph

    Continuous-Variable Quantum Key Distribution at 10 GBaud using an Integrated Photonic-Electronic Receiver

    Authors: Adnan A. E. Hajomer, Cedric Bruynsteen, Ivan Derkach, Nitin Jain, Axl Bomhals, Sarah Bastiaens, Ulrik L. Andersen, Xin Yin, Tobias Gehring

    Abstract: Quantum key distribution (QKD) is a well-known application of quantum information theory that guarantees information-theoretically secure key exchange. As QKD becomes more and more commercially viable, challenges such as scalability, network integration, and high production costs need to be addressed. Photonic and electronic integrated circuits that can be produced in large volumes at low cost hol…

    Submitted 31 May, 2023; originally announced May 2023.

  11. arXiv:2305.18474  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

    Authors: Jiawei Huang, Yi Ren, Rongjie Huang, Dongchao Yang, Zhenhui Ye, Chen Zhang, Jinglin Liu, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: Large diffusion models have been successful in text-to-audio (T2A) synthesis tasks, but they often suffer from common issues such as semantic misalignment and poor temporal consistency due to limited natural language understanding and data scarcity. Additionally, 2D spatial structures widely used in T2A works lead to unsatisfactory audio quality when generating variable-length audio samples since…

    Submitted 29 May, 2023; originally announced May 2023.

  12. arXiv:2305.17732  [pdf, other]

    cs.SD eess.AS

    StyleS2ST: Zero-shot Style Transfer for Direct Speech-to-speech Translation

    Authors: Kun Song, Yi Ren, Yi Lei, Chunfeng Wang, Kun Wei, Lei Xie, Xiang Yin, Zejun Ma

    Abstract: Direct speech-to-speech translation (S2ST) has gradually become popular as it has many advantages compared with cascade S2ST. However, current research mainly focuses on the accuracy of semantic translation and ignores the speech style transfer from a source language to a target language. The lack of high-fidelity expressive parallel data makes such style transfer challenging, especially in more p…

    Submitted 25 July, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: Accepted to Interspeech 2023

  13. arXiv:2305.16342  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

    Authors: Zhi-Hao Lai, Tian-Hao Zhang, Qi Liu, Xinyuan Qian, Li-Fang Wei, Song-Lu Chen, Feng Chen, Xu-Cheng Yin

    Abstract: The local and global features are both essential for automatic speech recognition (ASR). Many recent methods have verified that simply combining local and global features can further promote ASR performance. However, these methods pay less attention to the interaction of local and global features, and their series architectures are rigid to reflect local and global relationships. To address these…

    Submitted 29 May, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023

  14. arXiv:2305.15602  [pdf, other]

    eess.SY cs.AI cs.LG math.DS

    Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness

    Authors: Song Bo, Bernard T. Agyeman, Xunyuan Yin, Jinfeng Liu

    Abstract: Reinforcement learning (RL) is an area of significant research interest, and safe RL in particular is attracting attention due to its ability to handle safety-driven constraints that are crucial for real-world applications. This work proposes a novel approach to RL training, called control invariant set (CIS) enhanced RL, which leverages the advantages of utilizing the explicit form of CIS to impr…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2304.05509

  15. arXiv:2305.15403  [pdf, other]

    cs.CL cs.SD eess.AS

    AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

    Authors: Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, Zhenhui Ye, Jinzheng He, Lichao Zhang, Jinglin Liu, Xiang Yin, Zhou Zhao

    Abstract: Direct speech-to-speech translation (S2ST) aims to convert speech from one language into another, and has demonstrated significant progress to date. Despite the recent success, current S2ST models still suffer from distinct degradation in noisy environments and fail to translate visual speech (i.e., the movement of lips and teeth). In this work, we present AV-TranSpeech, the first audio-visual spe…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  16. arXiv:2305.14049  [pdf, other]

    cs.CL cs.SD eess.AS

    Rethinking Speech Recognition with A Multimodal Perspective via Acoustic and Semantic Cooperative Decoding

    Authors: Tian-Hao Zhang, Hai-Bo Qin, Zhi-Hao Lai, Song-Lu Chen, Qi Liu, Feng Chen, Xinyuan Qian, Xu-Cheng Yin

    Abstract: Attention-based encoder-decoder (AED) models have shown impressive performance in ASR. However, most existing AED methods neglect to simultaneously leverage both acoustic and semantic features in decoder, which is crucial for generating more accurate and informative semantic states. In this paper, we propose an Acoustic and Semantic Cooperative Decoder (ASCD) for ASR. In particular, unlike vanilla…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023

  17. arXiv:2305.12457  [pdf, other]

    cs.CV cs.MM

    Unsupervised Multi-view Pedestrian Detection

    Authors: Mengyin Liu, Chao Zhu, Shiqi Ren, Xu-Cheng Yin

    Abstract: With the prosperity of the video surveillance, multiple cameras have been applied to accurately locate pedestrians in a specific area. However, previous methods rely on the human-labeled annotations in every video frame and camera view, leading to heavier burden than necessary camera calibration and synchronization. Therefore, we propose in this paper an Unsupervised Multi-view Pedestrian Detectio…

    Submitted 19 November, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

  18. arXiv:2305.10763  [pdf, other]

    cs.SD eess.AS

    CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training

    Authors: Zhenhui Ye, Rongjie Huang, Yi Ren, Ziyue Jiang, Jinglin Liu, Jinzheng He, Xiang Yin, Zhou Zhao

    Abstract: Improving text representation has attracted much attention to achieve expressive text-to-speech (TTS). However, existing works only implicitly learn the prosody with masked token reconstruction tasks, which leads to low training efficiency and difficulty in prosody modeling. We propose CLAPSpeech, a cross-modal contrastive pre-training framework that explicitly learns the prosody variance of the s…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted by ACL 2023 (Main Conference)

  19. arXiv:2305.07895  [pdf, other]

    cs.CV cs.CL

    On the Hidden Mystery of OCR in Large Multimodal Models

    Authors: Yuliang Liu, Zhang Li, Biao Yang, Chunyuan Li, Xucheng Yin, Cheng-lin Liu, Lianwen Jin, Xiang Bai

    Abstract: Large models have recently played a dominant role in natural language processing and multimodal vision-language learning. However, their effectiveness in text-related visual tasks remains relatively unexplored. In this paper, we conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks including Text Recognition, Scene Text-Cent…

    Submitted 17 January, 2024; v1 submitted 13 May, 2023; originally announced May 2023.

  20. arXiv:2305.05871  [pdf, other]

    cs.CV

    Medical supervised masked autoencoders: Crafting a better masking strategy and efficient fine-tuning schedule for medical image classification

    Authors: Jiawei Mao, Shujian Guo, Yuanqi Chang, Xuesong Yin, Binling Nie

    Abstract: Masked autoencoders (MAEs) have displayed significant potential in the classification and semantic segmentation of medical images in the last year. Due to the high similarity of human tissues, even slight changes in medical images may represent diseased tissues, necessitating fine-grained inspection to pinpoint diseased tissues. The random masking strategy of MAEs is likely to result in areas of l…

    Submitted 9 May, 2023; originally announced May 2023.

  21. arXiv:2305.05652  [pdf, other]

    eess.SY math.DS

    Distributed economic predictive control of integrated energy systems for enhanced synergy and grid response: A decomposition and cooperation strategy

    Authors: Long Wu, Xunyuan Yin, Lei Pan, Jinfeng Liu

    Abstract: The close integration of increasing operating units into an integrated energy system (IES) results in complex interconnections between these units. The strong dynamic interactions create barriers to designing a successful distributed coordinated controller to achieve synergy between all the units and unlock the potential for grid response. To address these challenges, we introduce a directed graph…

    Submitted 9 May, 2023; originally announced May 2023.

  22. arXiv:2305.04491  [pdf]

    cond-mat.supr-con physics.app-ph

    Self-passivated freestanding superconducting oxide film for flexible electronics

    Authors: Zhuoyue Jia, Chi Sin Tang, Jing Wu, Changjian Li, Wanting Xu, Kairong Wu, Difan Zhou, Ping Yang, Shengwei Zeng, Zhigang Zeng, Dengsong Zhang, Ariando Ariando, Mark B. H. Breese, Chuanbing Cai, Xinmao Yin

    Abstract: The integration of high-temperature superconducting YBa2Cu3O6+x (YBCO) into flexible electronic devices has the potential to revolutionize the technology industry. The effective preparation of high-quality flexible YBCO films therefore plays a key role in this development. We present a novel approach for transferring water-sensitive YBCO films onto flexible substrates without any buffer layer. Fre…

    Submitted 6 July, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: 22 pages, 4 figures, references added

    Journal ref: Applied Physics Reviews 10, 031401 (2023)

  23. arXiv:2305.00787  [pdf, other]

    cs.CV

    GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

    Authors: Zhenhui Ye, Jinzheng He, Ziyue Jiang, Rongjie Huang, Jiawei Huang, Jinglin Liu, Yi Ren, Xiang Yin, Zejun Ma, Zhou Zhao

    Abstract: Generating talking person portraits with arbitrary speech audio is a crucial problem in the field of digital human and metaverse. A modern talking face generation method is expected to achieve the goals of generalized audio-lip synchronization, good video quality, and high system efficiency. Recently, neural radiance field (NeRF) has become a popular rendering technique in this field since it coul…

    Submitted 1 May, 2023; originally announced May 2023.

    Comments: 18 Pages, 7 figures

  24. arXiv:2305.00361  [pdf, ps, other]

    math.PR

    Large and moderate deviations for empirical density fields of stochastic SEIR epidemics with vertex-dependent transition rates

    Authors: Xiaofeng Xue, Xueting Yin

    Abstract: In this paper, we are concerned with stochastic susceptible-exposed-infected-removed epidemics on complete graphs with vertex-dependent transition rates. Large and moderate deviations of empirical density fields of our models are given. Proofs of our main results utilize exponential martingale strategies. Mathematical difficulties are mainly in checks of exponential tightness of fluctuation densit…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: 54 pages

  25. arXiv:2304.14476  [pdf, other]

    quant-ph

    Causal State Estimation and the Heisenberg Uncertainty Principle

    Authors: Junxin Chen, Benjamin B. Lane, Su Direkci, Dhruva Ganapathy, Xinghui Yin, Nergis Mavalvala, Yanbei Chen, Vivishek Sudhir

    Abstract: The observables of a noisy quantum system can be estimated by appropriately filtering the records of their continuous measurement. Such filtering is relevant for state estimation and measurement-based quantum feedback control. It is therefore imperative that the observables estimated through a causal filter satisfy the Heisenberg uncertainty principle. In the Markovian setting, prior work implicit…

    Submitted 17 October, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

  26. arXiv:2304.11732  [pdf, other]

    stat.ML cs.LG

    Quantile Extreme Gradient Boosting for Uncertainty Quantification

    Authors: Xiaozhe Yin, Masoud Fallah-Shorshani, Rob McConnell, Scott Fruin, Yao-Yi Chiang, Meredith Franklin

    Abstract: As the availability, size and complexity of data have increased in recent years, machine learning (ML) techniques have become popular for modeling. Predictions resulting from applying ML models are often used for inference, decision-making, and downstream applications. A crucial yet often overlooked aspect of ML is uncertainty quantification, which can significantly impact how predictions from mod…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: 15 pages, 7 figures and 4 tables

  27. arXiv:2304.08477  [pdf, other]

    cs.CV

    Latent-Shift: Latent Diffusion with Temporal Shift for Efficient Text-to-Video Generation

    Authors: Jie An, Songyang Zhang, Harry Yang, Sonal Gupta, Jia-Bin Huang, Jiebo Luo, Xi Yin

    Abstract: We propose Latent-Shift -- an efficient text-to-video generation method based on a pretrained text-to-image generation model that consists of an autoencoder and a U-Net diffusion model. Learning a video diffusion model in the latent space is much more efficient than in the pixel space. The latter is often limited to first generating a low-resolution video followed by a sequence of frame interpolat…

    Submitted 17 April, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: https://latent-shift.github.io

  28. arXiv:2304.06426  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci physics.app-ph

    Essential role of liquid phase on melt-processed GdBCO single-grain superconductors

    Authors: Xiongfang Liu, Xuechun Wang, **yu He, Yixue Fu, Xinmao Yin, Chuanbing Cai, Yibing Zhang, Difan Zhou

    Abstract: RE-Ba-Cu-O (RE denotes rare earth elements) single-grain superconductors have garnered considerable attention owing to their ability to trap strong magnetic field and self-stability for maglev. Here, we employed a modified melt-growth method by adding liquid source (LS) to provide a liquid rich environment during crystal growth. It further enables a significantly low maximum processing temperatur…

    Submitted 13 April, 2023; originally announced April 2023.

  29. arXiv:2304.05514  [pdf, other]

    eess.SY math.DS

    State estimation of a carbon capture process through POD model reduction and neural network approximation

    Authors: Siyu Liu, Xunyuan Yin, Jinfeng Liu

    Abstract: This paper presents an efficient approach for state estimation of post-combustion CO2 capture plants (PCCPs) by using reduced-order neural network models. The method involves extracting lower-dimensional feature vectors from high-dimensional operational data of the PCCP and constructing a reduced-order process model using proper orthogonal decomposition (POD). Multi-layer perceptron (MLP) neural n…

    Submitted 11 April, 2023; originally announced April 2023.

  30. arXiv:2304.05509  [pdf, other]

    eess.SY cs.AI cs.LG

    Control invariant set enhanced reinforcement learning for process control: improved sampling efficiency and guaranteed stability

    Authors: Song Bo, Xunyuan Yin, Jinfeng Liu

    Abstract: Reinforcement learning (RL) is an area of significant research interest, and safe RL in particular is attracting attention due to its ability to handle safety-driven constraints that are crucial for real-world applications of RL algorithms. This work proposes a novel approach to RL training, called control invariant set (CIS) enhanced RL, which leverages the benefits of CIS to improve stability gu…

    Submitted 11 April, 2023; originally announced April 2023.

  31. arXiv:2304.04773  [pdf, other]

    eess.IV cs.CV

    HDR Video Reconstruction with a Large Dynamic Dataset in Raw and sRGB Domains

    Authors: Huanjing Yue, Yubo Peng, Biting Yu, Xuanwu Yin, Zhenyu Zhou, Jingyu Yang

    Abstract: High dynamic range (HDR) video reconstruction is attracting more and more attention due to the superior visual quality compared with those of low dynamic range (LDR) videos. The availability of LDR-HDR training pairs is essential for the HDR reconstruction quality. However, there are still no real LDR-HDR pairs for dynamic scenes due to the difficulty in capturing LDR-HDR frames simultaneously. In…

    Submitted 12 April, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

  32. arXiv:2304.03135  [pdf, other]

    cs.CV cs.AI cs.MM

    VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision

    Authors: Mengyin Liu, Jie Jiang, Chao Zhu, Xu-Cheng Yin

    Abstract: Detecting pedestrians accurately in urban scenes is significant for realistic applications like autonomous driving or video surveillance. However, confusing human-like objects often lead to wrong detections, and small scale or heavily occluded pedestrians are easily missed due to their unusual appearances. To address these challenges, only object regions are inadequate, thus how to fully utilize m…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2023

  33. arXiv:2304.02554  [pdf, other]

    cs.CL

    Human-like Summarization Evaluation with ChatGPT

    Authors: Mingqi Gao, Jie Ruan, Renliang Sun, Xunjian Yin, Shiping Yang, Xiaojun Wan

    Abstract: Evaluating text summarization is a challenging problem, and existing evaluation metrics are far from satisfactory. In this study, we explored ChatGPT's ability to perform human-like summarization evaluation using four human evaluation methods on five datasets. We found that ChatGPT was able to complete annotations relatively smoothly using Likert scale scoring, pairwise comparison, Pyramid, and bi…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 9 pages, 5 figures, in process

  34. arXiv:2304.00729  [pdf, other]

    eess.SY

    Data-Driven Safe Controller Synthesis for Deterministic Systems: A Posteriori Method With Validation Tests

    Authors: Yu Chen, Chao Shang, Xiaolin Huang, Xiang Yin

    Abstract: In this work, we investigate the data-driven safe control synthesis problem for unknown dynamic systems. We first formulate the safety synthesis problem as a robust convex program (RCP) based on notion of control barrier function. To resolve the issue of unknown system dynamic, we follow the existing approach by converting the RCP to a scenario convex program (SCP) by randomly collecting finite sa…

    Submitted 3 April, 2023; originally announced April 2023.

  35. arXiv:2304.00212  [pdf, other]

    cs.CV cs.LG

    Devil is in the Queries: Advancing Mask Transformers for Real-world Medical Image Segmentation and Out-of-Distribution Localization

    Authors: Mingze Yuan, Yingda Xia, Hexin Dong, Zifan Chen, Jiawen Yao, Mingyan Qiu, Ke Yan, Xiaoli Yin, Yu Shi, Xin Chen, Zaiyi Liu, Bin Dong, Jingren Zhou, Le Lu, Ling Zhang, Li Zhang

    Abstract: Real-world medical image segmentation has tremendous long-tailed complexity of objects, among which tail conditions correlate with relatively rare diseases and are clinically significant. A trustworthy medical AI algorithm should demonstrate its effectiveness on tail conditions to avoid clinically dangerous damage in these out-of-distribution (OOD) cases. In this paper, we adopt the concept of obj…

    Submitted 31 March, 2023; originally announced April 2023.

    Comments: CVPR 2023 Highlight

  36. arXiv:2303.16976  [pdf, other]

    cs.CV

    MaLP: Manipulation Localization Using a Proactive Scheme

    Authors: Vishal Asnani, Xi Yin, Tal Hassner, Xiaoming Liu

    Abstract: Advancements in the generation quality of various Generative Models (GMs) has made it necessary to not only perform binary manipulation detection but also localize the modified pixels in an image. However, prior works termed as passive for manipulation localization exhibit poor generalization performance over unseen GMs and attribute modifications. To combat this issue, we propose a proactive sche…

    Submitted 4 April, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: Published at Conference on Computer Vision and Pattern Recognition 2023

  37. arXiv:2303.16550  [pdf, other]

    quant-ph physics.ao-ph physics.flu-dyn

    Potential quantum advantage for simulation of fluid dynamics

    Authors: Xiangyu Li, Xiaolong Yin, Nathan Wiebe, Jaehun Chun, Gregory K. Schenter, Margaret S. Cheung, Johannes Mülmenstädt

    Abstract: Numerical simulation of turbulent fluid dynamics needs to either parameterize turbulence-which introduces large uncertainties-or explicitly resolve the smallest scales-which is prohibitively expensive. Here we provide evidence through analytic bounds and numerical studies that a potential quantum exponential speedup can be achieved to simulate the Navier-Stokes equations governing turbulence using…

    Submitted 28 March, 2024; v1 submitted 29 March, 2023; originally announced March 2023.

    Report number: PNNL-SA-181572

  38. arXiv:2303.15790  [pdf, other]

    hep-ex hep-ph physics.ins-det

    STCF Conceptual Design Report: Volume 1 -- Physics & Detector

    Authors: M. Achasov, X. C. Ai, R. Aliberti, L. P. An, Q. An, X. Z. Bai, Y. Bai, O. Bakina, A. Barnyakov, V. Blinov, V. Bobrovnikov, D. Bodrov, A. Bogomyagkov, A. Bondar, I. Boyko, Z. H. Bu, F. M. Cai, H. Cai, J. J. Cao, Q. H. Cao, Z. Cao, Q. Chang, K. T. Chao, D. Y. Chen, H. Chen, et al. (413 additional authors not shown)

    Abstract: The Super $\tau$-Charm facility (STCF) is an electron-positron collider proposed by the Chinese particle physics community. It is designed to operate in a center-of-mass energy range from 2 to 7 GeV with a peak luminosity of $0.5\times 10^{35}{\rm cm}^{-2}{\rm s}^{-1}$ or higher. The STCF will produce a data sample about a factor of 100 larger than that by the present $\tau$-Charm factory -- the BEPCII,…

    Submitted 5 October, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

    Journal ref: Front. Phys. 19(1), 14701 (2024)

  39. arXiv:2303.14764  [pdf, other]

    hep-ph

    The $Z$ resonance, inelastic dark matter, and new physics anomalies in the Simple Extension of the Standard Model (SESM) with general scalar potential

    Authors: Wenxing Zhang, Tianjun Li, Xiangwei Yin

    Abstract: We consider the generic scalar potential with CP-violation, and study the $Z$ resonance and inelastic dark matter in the Simple Extension of the Standard Model (SESM), which can explain the dark matter as well as new physics anomalies such as the B physics anomalies and muon anomalous magnetic moment, etc. With the new scalar potential terms, we obtain the mass splittings for the real and imaginar…

    Submitted 5 October, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

    Comments: 22 pages, 6 figures, 5 tables

  40. arXiv:2303.14746   

    quant-ph

    Giant-atom entanglement in waveguide-QED systems including non-Markovian effect

    Authors: Xian-Li Yin, Jie-Qiao Liao

    Abstract: We study the generation of quantum entanglement between two giant atoms coupled to a common one-dimensional waveguide. Here each giant atom interacts with the waveguide at two separate coupling points. Within the Wigner-Weisskopf framework for single coupling points, we obtain the time-delayed quantum master equations governing the evolution of the two giant atoms for three different coupling conf…

    Submitted 8 June, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

    Comments: We withdraw this submission because the obtained time-delayed quantum master equations do not have complete positivity during some periods of the dynamic evolution

  41. arXiv:2303.09988  [pdf, other]

    cs.CV

    Star-Net: Improving Single Image Desnowing Model With More Efficient Connection and Diverse Feature Interaction

    Authors: Jiawei Mao, Yuanqi Chang, Xuesong Yin, Binling Nie

    Abstract: Compared to other severe weather image restoration tasks, single image desnowing is a more challenging task. This is mainly due to the diversity and irregularity of snow shape, which makes it extremely difficult to restore images in snowy scenes. Moreover, snow particles also have a veiling effect similar to haze or mist. Although current works can effectively remove snow particles with various sh…

    Submitted 17 March, 2023; originally announced March 2023.

  42. An in-depth exploration of LAMOST Unknown spectra based on density clustering

    Authors: Haifeng Yang, Xiaona Yin, Jianghui Cai, Yuqing Yang, Ali Luo, Zhongrui Bai, Lichan Zhou, Xujun Zhao, Yaling Xun

    Abstract: LAMOST (Large Sky Area Multi-Object Fiber Spectroscopic Telescope) has completed the observation of nearly 20 million celestial objects, including a class of spectra labeled `Unknown'. Besides low signal-to-noise ratio, these spectra often show some anomalous features that do not work well with current templates. In this paper, a total of 638,000 `Unknown' spectra from LAMOST DR5 are selected, and…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: 18 pages, 15 figures

  43. arXiv:2303.08229  [pdf, other]

    eess.SY math.DS

    Sensor network design for post-combustion CO2 capture plants: economy, complexity and robustness

    Authors: Siyu Liu, Xunyuan Yin, Jinfeng Liu

    Abstract: State estimation is crucial for the monitoring and control of post-combustion CO2 capture plants (PCCPs). The performance of state estimation is highly reliant on the configuration of sensors. In this work, we consider the problem of sensor selection for PCCPs and propose a computationally efficient method to determine an appropriate number of sensors and the corresponding placement of the sensors…

    Submitted 14 March, 2023; originally announced March 2023.

  44. arXiv:2303.01086  [pdf, other]

    cs.CL cs.SD eess.AS

    LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion

    Authors: Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma

    Abstract: As a key component of automated speech recognition (ASR) and the front-end in text-to-speech (TTS), grapheme-to-phoneme (G2P) plays the role of converting letters to their corresponding pronunciations. Existing methods are either slow or poor in performance, and are limited in application scenarios, particularly in the process of on-device inference. In this paper, we integrate the advantages of b…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP2023

  45. arXiv:2302.09839  [pdf, other]

    hep-ph hep-ex

    Boosting indirect detection of a secluded dark matter sector

    Authors: Jinmian Li, Takaaki Nomura, Junle Pei, Xiangwei Yin, Cong Zhang

    Abstract: Dark Matter (DM) residing in a secluded sector with suppressed portal interaction could evade direct detections and collider searches. The indirect detections provide the most robust probe to this scenario. Depending on the structure of the dark sector, novel DM annihilation spectra are possible. The dark shower is a common phenomenon for particles in the dark sector which take part in strong inte…

    Submitted 11 August, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: 9 pages, 4 figures, including supplemental material, accepted for publication in PRD

  46. arXiv:2302.06063  [pdf]

    cond-mat.mtrl-sci

    Spin State Disproportionation in Insulating Ferromagnetic LaCoO3 Epitaxial Thin Films

    Authors: Shanquan Chen, Jhong-Yi Chang, Qinghua Zhang, Qiuyue Li, Ting Lin, Fanqi Meng, Haoliang Huang, Shengwei Zeng, Xinmao Yin, My Ngoc Duong, Yalin Lu, Lang Chen, Er-Jia Guo, Hanghui Chen, Chun-Fu Chang, Chang-Yang Kuo, Zuhuang Chen

    Abstract: The origin of insulating ferromagnetism in epitaxial LaCoO3 films under tensile strain remains elusive despite extensive research efforts have been devoted. Surprisingly, the spin state of its Co ions, the main parameter of its ferromagnetism, is still to be determined. Here, we have systematically investigated the spin state in epitaxial LaCoO3 thin films to clarify the mechanism of strain induce…

    Submitted 12 February, 2023; originally announced February 2023.

    Journal ref: Advanced Science, 2303630 (2023)

  47. Research on data integration of overseas discrete archives from the perspective of digital humanties

    Authors: Rina Su, Yumeng Li, Xin Yang, Xin Yin, Tao Chen

    Abstract: The digitization of displaced archives is of great historical and cultural significance. Through the construction of digital humanistic platforms represented by MISS platform, and the comprehensive application of IIIF technology, knowledge graph technology, ontology technology, and other popular information technologies. We can find that the digital framework of displaced archives built through th…

    Submitted 9 February, 2023; originally announced February 2023.

    Journal ref: International Journal of Web & Semantic Technology, 2023, Vol. 14, No. 1

  48. arXiv:2301.12661  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

    Authors: Rongjie Huang, Jiawei Huang, Dongchao Yang, Yi Ren, Luping Liu, Mingze Li, Zhenhui Ye, Jinglin Liu, Xiang Yin, Zhou Zhao

    Abstract: Large-scale multimodal generative modeling has created milestones in text-to-image and text-to-video generation. Its application to audio still lags behind for two main reasons: the lack of large-scale datasets with high-quality text-audio pairs, and the complexity of modeling long continuous audio data. In this work, we propose Make-An-Audio with a prompt-enhanced diffusion model that addresses t…

    Submitted 29 January, 2023; originally announced January 2023.

    Comments: Audio samples are available at https://Text-to-Audio.github.io

  49. arXiv:2301.12291  [pdf, other]

    eess.IV cs.CV

    CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans

    Authors: Jieneng Chen, Yingda Xia, Jiawen Yao, Ke Yan, Jianpeng Zhang, Le Lu, Fakai Wang, Bo Zhou, Mingyan Qiu, Qihang Yu, Mingze Yuan, Wei Fang, Yuxing Tang, Minfeng Xu, Jian Zhou, Yuqian Zhao, Qifeng Wang, Xianghua Ye, Xiaoli Yin, Yu Shi, Xin Chen, Jingren Zhou, Alan Yuille, Zaiyi Liu, Ling Zhang

    Abstract: Human readers or radiologists routinely perform full-body multi-organ multi-disease detection and diagnosis in clinical practice, while most medical AI systems are built to focus on single organs with a narrow list of a few diseases. This might severely limit AI's clinical adoption. A certain number of AI models need to be assembled non-trivially to match the diagnostic process of a human reading…

    Submitted 6 October, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

    Comments: ICCV 2023 Camera Ready Version

  50. arXiv:2301.12181  [pdf, other]

    cs.AR

    A Survey on Approximate Multiplier Designs for Energy Efficiency: From Algorithms to Circuits

    Authors: Ying Wu, Chuangtao Chen, Weihua Xiao, Xuan Wang, Chenyi Wen, Jie Han, Xunzhao Yin, Weikang Qian, Cheng Zhuo

    Abstract: Given the stringent requirements of energy efficiency for Internet-of-Things edge devices, approximate multipliers, as a basic component of many processors and accelerators, have been constantly proposed and studied for decades, especially in error-resilient applications. The computation error and energy efficiency largely depend on how and where the approximation is introduced into a design. Thus…

    Submitted 29 June, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

    Comments: 38 pages, 37 figures