-
GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization
Authors:
Yirui Chen,
Xudong Huang,
Quan Zhang,
Wei Li,
Mingjian Zhu,
Qiangyu Yan,
Simiao Li,
Hanting Chen,
Hailin Hu,
Jie Yang,
Wei Liu,
Jie Hu
Abstract:
The extraordinary ability of generative models emerges as a new trend in image editing and generating realistic images, posing a serious threat to the trustworthiness of multimedia data and driving the research of image manipulation detection and location(IMDL). However, the lack of a large-scale data foundation makes IMDL task unattainable. In this paper, a local manipulation pipeline is designed…
▽ More
The extraordinary ability of generative models emerges as a new trend in image editing and generating realistic images, posing a serious threat to the trustworthiness of multimedia data and driving the research of image manipulation detection and location(IMDL). However, the lack of a large-scale data foundation makes IMDL task unattainable. In this paper, a local manipulation pipeline is designed, incorporating the powerful SAM, ChatGPT and generative models. Upon this basis, We propose the GIM dataset, which has the following advantages: 1) Large scale, including over one million pairs of AI-manipulated images and real images. 2) Rich Image Content, encompassing a broad range of image classes 3) Diverse Generative Manipulation, manipulated images with state-of-the-art generators and various manipulation tasks. The aforementioned advantages allow for a more comprehensive evaluation of IMDL methods, extending their applicability to diverse images. We introduce two benchmark settings to evaluate the generalization capability and comprehensive performance of baseline methods. In addition, we propose a novel IMDL framework, termed GIMFormer, which consists of a ShadowTracer, Frequency-Spatial Block (FSB), and a Multi-window Anomalous Modelling (MWAM) Module. Extensive experiments on the GIM demonstrate that GIMFormer surpasses previous state-of-the-art works significantly on two different benchmarks.
△ Less
Submitted 24 June, 2024;
originally announced June 2024.
-
Bounding-Box Inference for Error-Aware Model-Based Reinforcement Learning
Authors:
Erin J. Talvitie,
Zilei Shao,
Huiying Li,
**ghan Hu,
Jacob Boerma,
Rory Zhao,
Xintong Wang
Abstract:
In model-based reinforcement learning, simulated experiences from the learned model are often treated as equivalent to experience from the real environment. However, when the model is inaccurate, it can catastrophically interfere with policy learning. Alternatively, the agent might learn about the model's accuracy and selectively use it only when it can provide reliable predictions. We empirically…
▽ More
In model-based reinforcement learning, simulated experiences from the learned model are often treated as equivalent to experience from the real environment. However, when the model is inaccurate, it can catastrophically interfere with policy learning. Alternatively, the agent might learn about the model's accuracy and selectively use it only when it can provide reliable predictions. We empirically explore model uncertainty measures for selective planning and show that best results require distribution insensitive inference to estimate the uncertainty over model-based updates. To that end, we propose and evaluate bounding-box inference, which operates on bounding-boxes around sets of possible states and other quantities. We find that bounding-box inference can reliably support effective selective planning.
△ Less
Submitted 23 June, 2024;
originally announced June 2024.
-
Pairwise-Independent Contention Resolution
Authors:
Anupam Gupta,
**qiao Hu,
Gregory Kehne,
Roie Levin
Abstract:
We study online contention resolution schemes (OCRSs) and prophet inequalities for non-product distributions. Specifically, when the active set is sampled according to a pairwise-independent (PI) distribution, we show a $(1-o_k(1))$-selectable OCRS for uniform matroids of rank $k$, and $Ω(1)$-selectable OCRSs for laminar, graphic, cographic, transversal, and regular matroids. These imply prophet i…
▽ More
We study online contention resolution schemes (OCRSs) and prophet inequalities for non-product distributions. Specifically, when the active set is sampled according to a pairwise-independent (PI) distribution, we show a $(1-o_k(1))$-selectable OCRS for uniform matroids of rank $k$, and $Ω(1)$-selectable OCRSs for laminar, graphic, cographic, transversal, and regular matroids. These imply prophet inequalities with the same ratios when the set of values is drawn according to a PI distribution. Our results complement recent work of Dughmi, Kalayci, and Patel (STOC '24) showing that no $ω(1/k)$-selectable OCRS exists in the PI setting for general matroids of rank $k$.
△ Less
Submitted 22 June, 2024;
originally announced June 2024.
-
Breaking Secure Aggregation: Label Leakage from Aggregated Gradients in Federated Learning
Authors:
Zhibo Wang,
Zhiwei Chang,
Jiahui Hu,
Xiaoyi Pang,
Jiacheng Du,
Yongle Chen,
Kui Ren
Abstract:
Federated Learning (FL) exhibits privacy vulnerabilities under gradient inversion attacks (GIAs), which can extract private information from individual gradients. To enhance privacy, FL incorporates Secure Aggregation (SA) to prevent the server from obtaining individual gradients, thus effectively resisting GIAs. In this paper, we propose a stealthy label inference attack to bypass SA and recover…
▽ More
Federated Learning (FL) exhibits privacy vulnerabilities under gradient inversion attacks (GIAs), which can extract private information from individual gradients. To enhance privacy, FL incorporates Secure Aggregation (SA) to prevent the server from obtaining individual gradients, thus effectively resisting GIAs. In this paper, we propose a stealthy label inference attack to bypass SA and recover individual clients' private labels. Specifically, we conduct a theoretical analysis of label inference from the aggregated gradients that are exclusively obtained after implementing SA. The analysis results reveal that the inputs (embeddings) and outputs (logits) of the final fully connected layer (FCL) contribute to gradient disaggregation and label restoration. To preset the embeddings and logits of FCL, we craft a fishing model by solely modifying the parameters of a single batch normalization (BN) layer in the original model. Distributing client-specific fishing models, the server can derive the individual gradients regarding the bias of FCL by resolving a linear system with expected embeddings and the aggregated gradients as coefficients. Then the labels of each client can be precisely computed based on preset logits and gradients of FCL's bias. Extensive experiments show that our attack achieves large-scale label recovery with 100\% accuracy on various datasets and model architectures.
△ Less
Submitted 22 June, 2024;
originally announced June 2024.
-
Sublattice Dichotomy in Monolayer FeSe Superconductor
Authors:
Cui Ding,
Zhipeng Xu,
Xiaotong Jiao,
Qiyin Hu,
Wenxuan Zhao,
Lexian Yang,
Kun Jiang,
**-Feng Jia,
Lili Wang,
Jiang** Hu,
Qi-Kun Xue
Abstract:
The pairing mechanism behind the monolayer FeSe is one essential question for iron-based superconductors. In this work, we show the sublattice degree of freedoms of monolayer FeSe plays a special role in its pairing properties, namely the sublattice dichotomy. The high-quality monolayer FeSe samples with atomic flat $1\times1$ topography on the SrTiO$_3$(001) substrates are grown by molecular beam…
▽ More
The pairing mechanism behind the monolayer FeSe is one essential question for iron-based superconductors. In this work, we show the sublattice degree of freedoms of monolayer FeSe plays a special role in its pairing properties, namely the sublattice dichotomy. The high-quality monolayer FeSe samples with atomic flat $1\times1$ topography on the SrTiO$_3$(001) substrates are grown by molecular beam epitaxy. By comparing the tunneling spectra at $α$ and $β$ Fe sublattices, we find the coherence peak of $α$-Fe at the inner gap $+V_i$ is higher than $β$-Fe while the coherence peak of $β$-Fe at $-V_i$ is higher than $α$-Fe with a similar amount. We also observed a reversed effect at the outer gap $\pm V_o$. We propose the $η$-pairing mechanism between $k$ and $-k+Q$ is the key mechanism for this unconventional sublattice dichotomy effect.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction…
▽ More
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information for the production of the $χ_{c1}(3872)$ at $e^+e^-$ collider and deepen our understanding about the nature of this particle.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Testing cosmic anisotropy with Pade approximation and Pantheon+ sample
Authors:
Jian** Hu,
Jian Hu,
Xuandong Jia,
Baoquan Gao,
Fayin Wang
Abstract:
Cosmography can be used to constrain the kinematics of universe in a model-independent way. In this work, we attempted to combine the Pad$\rm \acute{e}$ approximations with the latest Pantheon+ sample for testing cosmological principle. Based on the Pad$\rm \acute{e}$ approximations, we first gave the cosmographic constraints on the different order polynomials including third-order (Pad…
▽ More
Cosmography can be used to constrain the kinematics of universe in a model-independent way. In this work, we attempted to combine the Pad$\rm \acute{e}$ approximations with the latest Pantheon+ sample for testing cosmological principle. Based on the Pad$\rm \acute{e}$ approximations, we first gave the cosmographic constraints on the different order polynomials including third-order (Pad$\rm \acute{e}$$_{(2,1)}$), fourth-order (Pad$\rm \acute{e}$$_{(2,2)}$) and fifth-order (Pad$\rm \acute{e}$$_{(3,2)}$). Based on the Pad$\rm \acute{e}$$_{(2,1)}$ ($j_{0}$ = 1) polynomial and hemisphere comparison (HC) method, we tested the cosmological principle and found the preferred directions of cosmic anisotropy, such as (l, b) = (304.6$^{\circ}$$_{-37.4}^{+51.4}$, $-$18.7$^{\circ}$$_{-20.3}^{+14.7}$) and (311.1$^{\circ}$$_{-8.4}^{+17.4}$, $-$17.53$^{\circ}$$_{-7.7}^{+7.8}$) for $q_{0}$ and $H_{0}$, respectively. These two directions are consistent with each other in $1σ$ confidence level, but the corresponding results of statistical isotropy analyses including Isotropy and Isotropy with real positions (RP) are quite different. The statistical significance of $H_{0}$ are stronger than that of $q_{0}$, i.e., 4.75$σ$ and 4.39$σ$ for the Isotropy and Isotropy with RP respectively. Reanalysis with fixed $q_{0} = -0.55$ (corresponds to $Ω_{m}$ = 0.30) gives similar results. Overall, our model-independent results provide clear indications for a possible cosmic anisotropy, which must be taken seriously. Further test is needed to better understand this signal.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
Wide Field of View Large Aperture Meta-Doublet Eyepiece
Authors:
Anna Wirth-Singh,
Johannes E. Fröch,
Fan Yang,
Louis Martin,
Hualiang Zhang,
Quentin T. Tanguy,
Zhihao Zhou,
Luocheng Huang,
Demis D. John,
Biljana Stamenic,
Juejun Hu,
Tian Gu,
Arka Majumdar
Abstract:
Wide field of view and light weight optics are critical for advanced eyewear, with applications in augmented/virtual reality and night vision. Conventional refractive lenses are often stacked to correct aberrations at wide field of view, leading to limited performance and increased size and weight. In particular, simultaneously achieving wide field of view and large aperture for light collection i…
▽ More
Wide field of view and light weight optics are critical for advanced eyewear, with applications in augmented/virtual reality and night vision. Conventional refractive lenses are often stacked to correct aberrations at wide field of view, leading to limited performance and increased size and weight. In particular, simultaneously achieving wide field of view and large aperture for light collection is desirable but challenging to realize in a compact form-factor. Here, we demonstrate a wide field of view (greater than 60$^\circ$) meta-optic doublet eyepiece with an entrance aperture of 2.1 cm. At the design wavelength of 633 nm, the meta-optic doublet achieves comparable performance to a refractive lens-based eyepiece system. This meta-doublet eyepiece illustrates the potential for meta-optics to play an important role in the development of high-quality monochrome near-eye display and night vision systems.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
Perspective+ Unet: Enhancing Segmentation with Bi-Path Fusion and Efficient Non-Local Attention for Superior Receptive Fields
Authors:
**tong Hu,
Siyan Chen,
Zhiyi Pan,
Sen Zeng,
Wenming Yang
Abstract:
Precise segmentation of medical images is fundamental for extracting critical clinical information, which plays a pivotal role in enhancing the accuracy of diagnoses, formulating effective treatment plans, and improving patient outcomes. Although Convolutional Neural Networks (CNNs) and non-local attention methods have achieved notable success in medical image segmentation, they either struggle to…
▽ More
Precise segmentation of medical images is fundamental for extracting critical clinical information, which plays a pivotal role in enhancing the accuracy of diagnoses, formulating effective treatment plans, and improving patient outcomes. Although Convolutional Neural Networks (CNNs) and non-local attention methods have achieved notable success in medical image segmentation, they either struggle to capture long-range spatial dependencies due to their reliance on local features, or face significant computational and feature integration challenges when attempting to address this issue with global attention mechanisms. To overcome existing limitations in medical image segmentation, we propose a novel architecture, Perspective+ Unet. This framework is characterized by three major innovations: (i) It introduces a dual-pathway strategy at the encoder stage that combines the outcomes of traditional and dilated convolutions. This not only maintains the local receptive field but also significantly expands it, enabling better comprehension of the global structure of images while retaining detail sensitivity. (ii) The framework incorporates an efficient non-local transformer block, named ENLTB, which utilizes kernel function approximation for effective long-range dependency capture with linear computational and spatial complexity. (iii) A Spatial Cross-Scale Integrator strategy is employed to merge global dependencies and local contextual cues across model stages, meticulously refining features from various levels to harmonize global and local information. Experimental results on the ACDC and Synapse datasets demonstrate the effectiveness of our proposed Perspective+ Unet. The code is available in the supplementary material.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
Model structure arising from one hereditary cotorsion pair on extriangulated categories
Authors:
Jiangsheng Hu,
Dongdong Zhang,
Panyue Zhou
Abstract:
Let $\mathcal{C}$ be a weakly idempotent complete extriangulated category. In contrast with the Hovey correspondence of admissible model structures on weakly idempotent complete exact categories from two complete cotorsion pairs, we give a construction of model structures on $\mathcal{C}$ from only one complete cotorsion pair. Our main result not only generalizes the work by Beligiannis-Reiten and…
▽ More
Let $\mathcal{C}$ be a weakly idempotent complete extriangulated category. In contrast with the Hovey correspondence of admissible model structures on weakly idempotent complete exact categories from two complete cotorsion pairs, we give a construction of model structures on $\mathcal{C}$ from only one complete cotorsion pair. Our main result not only generalizes the work by Beligiannis-Reiten and Cui-Lu-Zhang, but also provides methods to construct model structures from silting objects of $\mathcal{C}$ and co-$t$-structures in triangulated categories.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
Authors:
Bofei Gao,
Zefan Cai,
Runxin Xu,
Peiyi Wang,
Ce Zheng,
Runji Lin,
Keming Lu,
Dayiheng Liu,
Chang Zhou,
Wen Xiao,
Junjie Hu,
Tianyu Liu,
Baobao Chang
Abstract:
Mathematical verfier achieves success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate the aforementioned insufficiency of binary labels, we introduce step-wise natural language feedbacks as rationale la…
▽ More
Mathematical verfier achieves success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate the aforementioned insufficiency of binary labels, we introduce step-wise natural language feedbacks as rationale labels (i.e., the correctness of the current step and the explanations). In this paper, we propose \textbf{Math-Minos}, a natural language feedback enhanced verifier by constructing automatically-generated training data and a two-stage training paradigm for effective training and efficient inference. Our experiments reveal that a small set (30k) of natural language feedbacks can significantly boost the performance of the verifier by the accuracy of 1.6\% (86.6\% $\rightarrow$ 88.2\%) on GSM8K and 0.8\% (37.8\% $\rightarrow$ 38.6\%) on MATH. We have released our code and data for further exploration.
△ Less
Submitted 8 July, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
Formation of super-thin galaxies in Illustris-TNG
Authors:
Jianhong Hu,
Dandan Xu,
Cheng Li
Abstract:
Superthin galaxies are observed to have stellar disks with extremely small minor-to-major axis ratios. In this work, we investigate the formation of superthin galaxies in the TNG100 simulation. We trace the merger history and investigate the evolution of galaxy properties of a selected sample of superthin galaxies and a control sample of galaxies that share the same joint probability distribution…
▽ More
Superthin galaxies are observed to have stellar disks with extremely small minor-to-major axis ratios. In this work, we investigate the formation of superthin galaxies in the TNG100 simulation. We trace the merger history and investigate the evolution of galaxy properties of a selected sample of superthin galaxies and a control sample of galaxies that share the same joint probability distribution in the stellar-mass and color diagram. Through making comparisons between the two galaxy samples, we find that present-day superthin galaxies had similar morphologies as the control sample counterparts at higher redshifts, but have developed extended flat `superthin' morphologies since $z \sim 1$. During this latter evolution stage, superthin galaxies undergo overwhelmingly higher frequency of prograde mergers (with orbit-spin angle $θ_{\rm orb} \leqslant 40^\circ$). Accordingly the spins of their dark matter halos have grown significantly and become noticeably higher than that of their normal disk counterparts. This further results in the buildup of their stellar disks at larger distances much beyond the regimes of normal disk galaxies. We also discuss the formation scenario of those superthin galaxies that live in larger dark matter halos as satellite galaxies therein.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
Dynamical phase-field model of cavity electromagnonic systems
Authors:
Shihao Zhuang,
Yujie Zhu,
Changchun Zhong,
Liang Jiang,
Xufeng Zhang,
Jia-Mian Hu
Abstract:
Cavity electromagnonic system, which simultaneously consists of cavities for photons, magnons (quanta of spin waves), and acoustic phonons, provides an exciting platform to achieve coherent energy transduction among different physical systems down to single quantum level. Here we report a dynamical phase-field model that allows simulating the coupled dynamics of the electromagnetic waves, magnetiz…
▽ More
Cavity electromagnonic system, which simultaneously consists of cavities for photons, magnons (quanta of spin waves), and acoustic phonons, provides an exciting platform to achieve coherent energy transduction among different physical systems down to single quantum level. Here we report a dynamical phase-field model that allows simulating the coupled dynamics of the electromagnetic waves, magnetization, and strain in 3D multiphase systems. As examples of application, we computationally demonstrate the excitation of hybrid magnon-photon modes (magnon polaritons), Floquet-induced magnonic Aulter-Townes splitting, dynamical energy exchange (Rabi oscillation) and relative phase control (Ramsey interference) between the two magnon polariton modes. The simulation results are consistent with analytical calculations based on Floquet Hamiltonian theory. Simulations are also performed to design a cavity electro-magno-mechanical system that enables the triple phonon-magnon-photon resonance, where the resonant excitation of a chiral, fundamental (n=1) transverse acoustic phonon mode by magnon polaritons is demonstrated. With the capability to predict coupling strength, dissipation rates, and temporal evolution of photon/magnon/phonon mode profiles using fundamental materials parameters as the inputs, the present dynamical phase-field model represents a valuable computational tool to guide the fabrication of the cavity electromagnonic system and the design of operating conditions for applications in quantum sensing, transduction, and communication.
△ Less
Submitted 19 June, 2024;
originally announced June 2024.
-
Precision measurement of the $Ξ^-_b$ baryon lifetime
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1064 additional authors not shown)
Abstract:
A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second sys…
▽ More
A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second systematic. This value is averaged with the corresponding value from Run 1 to obtain ${r_τ^{\rm Run\,1,2} = 1.078\pm0.012\pm0.007}$. Multiplying by the world-average value of the $Λ^0_b$ lifetime yields $τ_{Ξ^-_b}^{\rm Run~1,2} = 1.578\pm0.018\pm0.010\pm0.011$ ps, where the uncertainties are statistical, systematic, and due to the limited knowledge of the $Λ^0_b$ lifetime. This measurement improves the precision of the current world average of the $Ξ^-_b$ lifetime by about a factor of two, and is in good agreement with the most recent theoretical predictions.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Experimental verification of the optimal fingerprint method for detecting climate change
Authors:
**bo Hu,
Hong Yuan,
Letian Chen,
Nan Zhao,
C. P. Sun
Abstract:
The optimal fingerprint method serves as a potent approach for detecting and attributing climate change. However, its experimental validation encounters challenges due to the intricate nature of climate systems. Here, we experimentally examine the optimal fingerprint method simulated by a precisely controlled magnetic resonance system of spins. The spin dynamic under an applied deterministic drivi…
▽ More
The optimal fingerprint method serves as a potent approach for detecting and attributing climate change. However, its experimental validation encounters challenges due to the intricate nature of climate systems. Here, we experimentally examine the optimal fingerprint method simulated by a precisely controlled magnetic resonance system of spins. The spin dynamic under an applied deterministic driving field and a noise field is utilized to emulate the complex climate system with external forcing and internal variability. Our experimental results affirm the theoretical prediction regarding the existence of an optimal detection direction which maximizes the signal-to-noise ratio, thereby validating the optimal fingerprint method. This work offers direct empirical verification of the optimal fingerprint method, crucial for comprehending climate change and its societal impacts.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
Wake dynamics of wind turbines in unsteady streamwise flow conditions
Authors:
Nathaniel J. Wei,
Adnan El Makdah,
JiaCheng Hu,
Frieder Kaiser,
David E. Rival,
John O. Dabiri
Abstract:
The unsteady flow physics of wind-turbine wakes under dynamic forcing conditions are critical to the modeling and control of wind farms for optimal power density. Unsteady forcing in the streamwise direction may be generated by unsteady inflow conditions in the atmospheric boundary layer, dynamic induction control of the turbine, or streamwise surge motions of a floating offshore wind turbine due…
▽ More
The unsteady flow physics of wind-turbine wakes under dynamic forcing conditions are critical to the modeling and control of wind farms for optimal power density. Unsteady forcing in the streamwise direction may be generated by unsteady inflow conditions in the atmospheric boundary layer, dynamic induction control of the turbine, or streamwise surge motions of a floating offshore wind turbine due to floating-platform oscillations. This study seeks to identify the dominant flow mechanisms in unsteady wakes forced by a periodic upstream inflow condition. A theoretical framework for the problem is derived, which describes traveling-wave undulations in the wake radius and streamwise velocity. These dynamics encourage the aggregation of tip vortices into large structures that are advected along in the wake. Flow measurements in the wake of a periodically surging turbine were obtained in an optically accessible towing-tank facility, with an average diameter-based Reynolds number of 300,000 and with surge-velocity amplitudes of up to 40% of the mean inflow velocity. Qualitative agreement between trends in the measurements and model predictions is observed, supporting the validity of the theoretical analyses. The experiments also demonstrate large enhancements in the recovery of the wake relative to the steady-flow case, with wake-length reductions of up to 46.5% and improvements in the available power at 10 diameters downstream of up to 15.7%. These results provide fundamental insights into the dynamics of unsteady wakes and serve as additional evidence that unsteady fluid mechanics can be leveraged to increase the power density of wind farms.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
GUICourse: From General Vision Language Models to Versatile GUI Agents
Authors:
Wentong Chen,
Junbo Cui,
**yi Hu,
Yujia Qin,
Junjie Fang,
Yue Zhao,
Chongyi Wang,
Jun Liu,
Guirong Chen,
Yupeng Huo,
Yuan Yao,
Yankai Lin,
Zhiyuan Liu,
Maosong Sun
Abstract:
Utilizing Graphic User Interface (GUI) for human-computer interaction is essential for accessing a wide range of digital tools. Recent advancements in Vision Language Models (VLMs) highlight the compelling potential to develop versatile agents to help humans finish GUI navigation tasks. However, current VLMs are challenged in terms of fundamental abilities (OCR and grounding) and GUI knowledge (th…
▽ More
Utilizing Graphic User Interface (GUI) for human-computer interaction is essential for accessing a wide range of digital tools. Recent advancements in Vision Language Models (VLMs) highlight the compelling potential to develop versatile agents to help humans finish GUI navigation tasks. However, current VLMs are challenged in terms of fundamental abilities (OCR and grounding) and GUI knowledge (the functions and control methods of GUI elements), preventing them from becoming practical GUI agents. To solve these challenges, we contribute GUICourse, a suite of datasets to train visual-based GUI agents from general VLMs. First, we introduce the GUIEnv dataset to strengthen the OCR and grounding capabilities of VLMs. Then, we introduce the GUIAct and GUIChat datasets to enrich their knowledge of GUI components and interactions. Experiments demonstrate that our GUI agents have better performance on common GUI tasks than their baseline VLMs. Even the small-size GUI agent (with 3.1B parameters) can still work well on single-step and multi-step GUI tasks. Finally, we analyze the different varieties in the training stage of this agent by ablation study. Our source codes and datasets are released at https://github.com/yiye3/GUICourse.
△ Less
Submitted 17 June, 2024;
originally announced June 2024.
-
Asymptotic properties of a multicolored random reinforced urn model with an application to multi-armed bandits
Authors:
Li Yang,
Jiang Hu,
Jianghao Li,
Zhidong Bai
Abstract:
The random self-reinforcement mechanism, characterized by the principle of ``the rich get richer'', has demonstrated significant utility across various domains. One prominent model embodying this mechanism is the random reinforcement urn model. This paper investigates a multicolored, multiple-drawing variant of the random reinforced urn model. We establish the limiting behavior of the normalized u…
▽ More
The random self-reinforcement mechanism, characterized by the principle of ``the rich get richer'', has demonstrated significant utility across various domains. One prominent model embodying this mechanism is the random reinforcement urn model. This paper investigates a multicolored, multiple-drawing variant of the random reinforced urn model. We establish the limiting behavior of the normalized urn composition and demonstrate strong convergence upon scaling the counts of each color. Additionally, we derive strong convergence estimators for the reinforcement means, i.e., for the expectations of the replacement matrix's diagonal elements, and prove their joint asymptotic normality. It is noteworthy that the estimators of the largest reinforcement mean are asymptotically independent of the estimators of the other smaller reinforcement means. Additionally, if a reinforcement mean is not the largest, the estimators of these smaller reinforcement means will also demonstrate asymptotic independence among themselves. Furthermore, we explore the parallels between the reinforced mechanisms in random reinforced urn models and multi-armed bandits, addressing hypothesis testing for expected payoffs in the latter context.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
DiffusionBlend: Learning 3D Image Prior through Position-aware Diffusion Score Blending for 3D Computed Tomography Reconstruction
Authors:
Bowen Song,
Jason Hu,
Zhaoxu Luo,
Jeffrey A. Fessler,
Liyue Shen
Abstract:
Diffusion models face significant challenges when employed for large-scale medical image reconstruction in real practice such as 3D Computed Tomography (CT). Due to the demanding memory, time, and data requirements, it is difficult to train a diffusion model directly on the entire volume of high-dimensional data to obtain an efficient 3D diffusion prior. Existing works utilizing diffusion priors o…
▽ More
Diffusion models face significant challenges when employed for large-scale medical image reconstruction in real practice such as 3D Computed Tomography (CT). Due to the demanding memory, time, and data requirements, it is difficult to train a diffusion model directly on the entire volume of high-dimensional data to obtain an efficient 3D diffusion prior. Existing works utilizing diffusion priors on single 2D image slice with hand-crafted cross-slice regularization would sacrifice the z-axis consistency, which results in severe artifacts along the z-axis. In this work, we propose a novel framework that enables learning the 3D image prior through position-aware 3D-patch diffusion score blending for reconstructing large-scale 3D medical images. To the best of our knowledge, we are the first to utilize a 3D-patch diffusion prior for 3D medical image reconstruction. Extensive experiments on sparse view and limited angle CT reconstruction show that our DiffusionBlend method significantly outperforms previous methods and achieves state-of-the-art performance on real-world CT reconstruction problems with high-dimensional 3D image (i.e., $256 \times 256 \times 500$). Our algorithm also comes with better or comparable computational efficiency than previous state-of-the-art methods.
△ Less
Submitted 14 June, 2024;
originally announced June 2024.
-
Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the…
▽ More
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching faction $B(J/ψ\toωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Investigation of reflection-based measurements of microwave kinetic inductance detectors in the optical bands
Authors:
Jie Hu,
Faouzi Boussaha,
Paul Nicaise,
Christine Chaumont,
Maria Appavou,
Viet Dung Pham,
Michel Piat
Abstract:
In this paper, we investigate the single photon response from the reflection of the Microwave Kinetic Inductance Detector (MKID) array. Reflection measurements are carried out using two configurations: one is measured simultaneously with the transmission, and the other is obtained with a single-ended MKID array terminated with an open load. Compared with the transmission, reflection measurements s…
▽ More
In this paper, we investigate the single photon response from the reflection of the Microwave Kinetic Inductance Detector (MKID) array. Reflection measurements are carried out using two configurations: one is measured simultaneously with the transmission, and the other is obtained with a single-ended MKID array terminated with an open load. Compared with the transmission, reflection measurements significantly reduce the readout noise of the single-ended MKID array. This is also reflected in the improvement of the median energy resolving power by around 20%-30% under pulsed photon illumination at $λ= 405$~nm, mainly due to an increase in the size of the resonance circle on the IQ plane. This method has the potential to be used to read out large MKID arrays.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
Authors:
Roman Bachmann,
Oğuzhan Fatih Kar,
David Mizrahi,
Ali Garjani,
Mingfei Gao,
David Griffiths,
Jiaming Hu,
Afshin Dehghan,
Amir Zamir
Abstract:
Current multimodal and multitask foundation models like 4M or UnifiedIO show promising results, but in practice their out-of-the-box abilities to accept diverse inputs and perform diverse tasks are limited by the (usually rather small) number of modalities and tasks they are trained on. In this paper, we expand upon the capabilities of them by training a single model on tens of highly diverse moda…
▽ More
Current multimodal and multitask foundation models like 4M or UnifiedIO show promising results, but in practice their out-of-the-box abilities to accept diverse inputs and perform diverse tasks are limited by the (usually rather small) number of modalities and tasks they are trained on. In this paper, we expand upon the capabilities of them by training a single model on tens of highly diverse modalities and by performing co-training on large-scale multimodal datasets and text corpora. This includes training on several semantic and geometric modalities, feature maps from recent state of the art models like DINOv2 and ImageBind, pseudo labels of specialist models like SAM and 4DHumans, and a range of new modalities that allow for novel ways to interact with the model and steer the generation, for example image metadata or color palettes. A crucial step in this process is performing discrete tokenization on various modalities, whether they are image-like, neural network feature maps, vectors, structured data like instance segmentation or human poses, or data that can be represented as text. Through this, we expand on the out-of-the-box capabilities of multimodal models and specifically show the possibility of training one model to solve at least 3x more tasks/modalities than existing ones and doing so without a loss in performance. This enables more fine-grained and controllable multimodal generation capabilities and allows us to study the distillation of models trained on diverse data and objectives into a unified model. We successfully scale the training to a three billion parameter model using tens of modalities and different datasets. The resulting models and training code are open sourced at 4m.epfl.ch.
△ Less
Submitted 14 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
Authors:
Yucheng Han,
Rui Wang,
Chi Zhang,
Juntao Hu,
Pei Cheng,
Bin Fu,
Hanwang Zhang
Abstract:
Recent advancements in image generation have enabled the creation of high-quality images from text conditions. However, when facing multi-modal conditions, such as text combined with reference appearances, existing methods struggle to balance multiple conditions effectively, typically showing a preference for one modality over others. To address this challenge, we introduce EMMA, a novel image gen…
▽ More
Recent advancements in image generation have enabled the creation of high-quality images from text conditions. However, when facing multi-modal conditions, such as text combined with reference appearances, existing methods struggle to balance multiple conditions effectively, typically showing a preference for one modality over others. To address this challenge, we introduce EMMA, a novel image generation model accepting multi-modal prompts built upon the state-of-the-art text-to-image (T2I) diffusion model, ELLA. EMMA seamlessly incorporates additional modalities alongside text to guide image generation through an innovative Multi-modal Feature Connector design, which effectively integrates textual and supplementary modal information using a special attention mechanism. By freezing all parameters in the original T2I diffusion model and only adjusting some additional layers, we reveal an interesting finding that the pre-trained T2I diffusion model can secretly accept multi-modal prompts. This interesting property facilitates easy adaptation to different existing frameworks, making EMMA a flexible and effective tool for producing personalized and context-aware images and even videos. Additionally, we introduce a strategy to assemble learned EMMA modules to produce images conditioned on multiple modalities simultaneously, eliminating the need for additional training with mixed multi-modal prompts. Extensive experiments demonstrate the effectiveness of EMMA in maintaining high fidelity and detail in generated images, showcasing its potential as a robust solution for advanced multi-modal conditional image generation tasks.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Research on Early Warning Model of Cardiovascular Disease Based on Computer Deep Learning
Authors:
Yuxiang Hu,
**xin Hu,
Ting Xu,
Bo Zhang,
Jiajie Yuan,
Haozhang Deng
Abstract:
This project intends to study a cardiovascular disease risk early warning model based on one-dimensional convolutional neural networks. First, the missing values of 13 physiological and symptom indicators such as patient age, blood glucose, cholesterol, and chest pain were filled and Z-score was standardized. The convolutional neural network is converted into a 2D matrix, the convolution function…
▽ More
This project intends to study a cardiovascular disease risk early warning model based on one-dimensional convolutional neural networks. First, the missing values of 13 physiological and symptom indicators such as patient age, blood glucose, cholesterol, and chest pain were filled and Z-score was standardized. The convolutional neural network is converted into a 2D matrix, the convolution function of 1,3, and 5 is used for the first-order convolution operation, and the Max Pooling algorithm is adopted for dimension reduction. Set the learning rate and output rate. It is optimized by the Adam algorithm. The result of classification is output by a soft classifier. This study was conducted based on Statlog in the UCI database and heart disease database respectively. The empirical data indicate that the forecasting precision of this technique has been enhanced by 11.2%, relative to conventional approaches, while there is a significant improvement in the logarithmic curve fitting. The efficacy and applicability of the novel approach are corroborated through the examination employing a one-dimensional convolutional neural network.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Improving Adversarial Robustness via Feature Pattern Consistency Constraint
Authors:
Jiacong Hu,
**gwen Ye,
Zunlei Feng,
Jiazhen Yang,
Shunyu Liu,
Xiaotian Yu,
Lingxiang Jia,
Mingli Song
Abstract:
Convolutional Neural Networks (CNNs) are well-known for their vulnerability to adversarial attacks, posing significant security concerns. In response to these threats, various defense methods have emerged to bolster the model's robustness. However, most existing methods either focus on learning from adversarial perturbations, leading to overfitting to the adversarial examples, or aim to eliminate…
▽ More
Convolutional Neural Networks (CNNs) are well-known for their vulnerability to adversarial attacks, posing significant security concerns. In response to these threats, various defense methods have emerged to bolster the model's robustness. However, most existing methods either focus on learning from adversarial perturbations, leading to overfitting to the adversarial examples, or aim to eliminate such perturbations during inference, inevitably increasing computational burdens. Conversely, clean training, which strengthens the model's robustness by relying solely on clean examples, can address the aforementioned issues. In this paper, we align with this methodological stream and enhance its generalizability to unknown adversarial examples. This enhancement is achieved by scrutinizing the behavior of latent features within the network. Recognizing that a correct prediction relies on the correctness of the latent feature's pattern, we introduce a novel and effective Feature Pattern Consistency Constraint (FPCC) method to reinforce the latent feature's capacity to maintain the correct feature pattern. Specifically, we propose Spatial-wise Feature Modification and Channel-wise Feature Selection to enhance latent features. Subsequently, we employ the Pattern Consistency Loss to constrain the similarity between the feature pattern of the latent features and the correct feature pattern. Our experiments demonstrate that the FPCC method empowers latent features to uphold correct feature patterns even in the face of adversarial examples, resulting in inherent adversarial robustness surpassing state-of-the-art models.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Nonconvex Federated Learning on Compact Smooth Submanifolds With Heterogeneous Data
Authors:
Jiaojiao Zhang,
Jiang Hu,
Anthony Man-Cho So,
Mikael Johansson
Abstract:
Many machine learning tasks, such as principal component analysis and low-rank matrix completion, give rise to manifold optimization problems. Although there is a large body of work studying the design and analysis of algorithms for manifold optimization in the centralized setting, there are currently very few works addressing the federated setting. In this paper, we consider nonconvex federated l…
▽ More
Many machine learning tasks, such as principal component analysis and low-rank matrix completion, give rise to manifold optimization problems. Although there is a large body of work studying the design and analysis of algorithms for manifold optimization in the centralized setting, there are currently very few works addressing the federated setting. In this paper, we consider nonconvex federated learning over a compact smooth submanifold in the setting of heterogeneous client data. We propose an algorithm that leverages stochastic Riemannian gradients and a manifold projection operator to improve computational efficiency, uses local updates to improve communication efficiency, and avoids client drift. Theoretically, we show that our proposed algorithm converges sub-linearly to a neighborhood of a first-order optimal solution by using a novel analysis that jointly exploits the manifold structure and properties of the loss functions. Numerical experiments demonstrate that our algorithm has significantly smaller computational and communication overhead than existing methods.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (636 additional authors not shown)
Abstract:
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur…
▽ More
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measured in both destructive and constructive interference scenarios for the first time. The mass and width of the $η_{c}(1S)$ are measured to be $M=(2984.14 \pm 0.13 \pm 0.38)$ MeV/$c^{2}$ and $Γ=(28.82 \pm 0.11 \pm 0.82)$ MeV, respectively. Clear signals for the decays of the $χ_{cJ}(J=0,1,2)$ and the $η_{c}(2S)$ to $2(π^{+}π^{-})η$ are also observed for the first time, and the corresponding branching fractions are measured. The ratio of the branching fractions between the $η_{c}(2S)$ and $η_{c}(1S)$ decays is significantly lower than the theoretical prediction, which might suggest different dynamics in their decays.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
AI.vs.Clinician: Unveiling Intricate Interactions Between AI and Clinicians through an Open-Access Database
Authors:
Wanling Gao,
Yuan Liu,
Zhuoming Yu,
Dandan Cui,
Wen**g Liu,
Xiaoshuang Liang,
Jiahui Zhao,
Jiyue Xie,
Hao Li,
Li Ma,
Ning Ye,
Yumiao Kang,
Dingfeng Luo,
Peng Pan,
Wei Huang,
Zhongmou Liu,
Jizhong Hu,
Fan Huang,
Gangyuan Zhao,
Chongrong Jiang,
Tianyi Wei,
Zhifei Zhang,
Yunyou Huang,
Jianfeng Zhan
Abstract:
Artificial Intelligence (AI) plays a crucial role in medical field and has the potential to revolutionize healthcare practices. However, the success of AI models and their impacts hinge on the synergy between AI and medical specialists, with clinicians assuming a dominant role. Unfortunately, the intricate dynamics and interactions between AI and clinicians remain undiscovered and thus hinder AI f…
▽ More
Artificial Intelligence (AI) plays a crucial role in medical field and has the potential to revolutionize healthcare practices. However, the success of AI models and their impacts hinge on the synergy between AI and medical specialists, with clinicians assuming a dominant role. Unfortunately, the intricate dynamics and interactions between AI and clinicians remain undiscovered and thus hinder AI from being translated into medical practice. To address this gap, we have curated a groundbreaking database called AI.vs.Clinician. This database is the first of its kind for studying the interactions between AI and clinicians. It derives from 7,500 collaborative diagnosis records on a life-threatening medical emergency -- Sepsis -- from 14 medical centers across China. For the patient cohorts well-chosen from MIMIC databases, the AI-related information comprises the model property, feature input, diagnosis decision, and inferred probabilities of sepsis onset presently and within next three hours. The clinician-related information includes the viewed examination data and sequence, viewed time, preliminary and final diagnosis decisions with or without AI assistance, and recommended treatment.
△ Less
Submitted 15 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results
Authors:
Xin **,
Chunle Guo,
Xiaoming Li,
Zongsheng Yue,
Chongyi Li,
Shangchen Zhou,
Ruicheng Feng,
Yuekun Dai,
Peiqing Yang,
Chen Change Loy,
Ruoqi Li,
Chang Liu,
Ziyi Wang,
Yao Du,
**g**g Yang,
Long Bao,
Heng Sun,
Xiangyu Kong,
Xiaoxia Xing,
**long Wu,
Yuanyang Xue,
Hyunhee Park,
Sejun Song,
Changho Kim,
**gfan Tan
, et al. (17 additional authors not shown)
Abstract:
The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photogra…
▽ More
The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Few-shot RAW Image Denoising track on MIPI 2024. In total, 165 participants were successfully registered, and 7 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art erformance on Few-shot RAW Image Denoising. More details of this challenge and the link to the dataset can be found at https://mipichallenge.org/MIPI2024.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
Ensemble Model With Bert,Roberta and Xlnet For Molecular property prediction
Authors:
Junling Hu
Abstract:
This paper presents a novel approach for predicting molecular properties with high accuracy without the need for extensive pre-training. Employing ensemble learning and supervised fine-tuning of BERT, RoBERTa, and XLNet, our method demonstrates significant effectiveness compared to existing advanced models. Crucially, it addresses the issue of limited computational resources faced by experimental…
▽ More
This paper presents a novel approach for predicting molecular properties with high accuracy without the need for extensive pre-training. Employing ensemble learning and supervised fine-tuning of BERT, RoBERTa, and XLNet, our method demonstrates significant effectiveness compared to existing advanced models. Crucially, it addresses the issue of limited computational resources faced by experimental groups, enabling them to accurately predict molecular properties. This innovation provides a cost-effective and resource-efficient solution, potentially advancing further research in the molecular domain.
△ Less
Submitted 30 May, 2024;
originally announced June 2024.
-
Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The wea…
▽ More
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The weak-$CP$ test is performed in the subsequent decays of their daughter particles $Λ$ and $\barΛ$. Also for the first time, the transverse polarizations of the $Σ^0$ hyperons in $J/ψ$ and $ψ(3686)$ decays are observed with opposite directions, and the ratios between the S-wave and D-wave contributions of the $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ decays are obtained. These results are crucial to understand the decay dynamics of the charmonium states and the production mechanism of the $Σ^0-\barΣ^0$ pairs.
△ Less
Submitted 10 June, 2024;
originally announced June 2024.
-
Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$,…
▽ More
We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$, $8.157 \pm 0.031$~fb$^{-1}$, and $4.191 \pm 0.016$~fb$^{-1}$, respectively, by analyzing large angle Bhabha scattering events. The uncertainties are dominated by systematic effects and the statistical uncertainties are negligible. Our results provide essential input for future analyses and precision measurements.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
Methodology and Real-World Applications of Dynamic Uncertain Causality Graph for Clinical Diagnosis with Explainability and Invariance
Authors:
Zhan Zhang,
Qin Zhang,
Yang Jiao,
Lin Lu,
Lin Ma,
Aihua Liu,
Xiao Liu,
Juan Zhao,
Yajun Xue,
Bing Wei,
Mingxia Zhang,
Ru Gao,
Hong Zhao,
Jie Lu,
Fan Li,
Yang Zhang,
Yiming Wang,
Lei Zhang,
Fengwei Tian,
Jie Hu,
Xin Gou
Abstract:
AI-aided clinical diagnosis is desired in medical care. Existing deep learning models lack explainability and mainly focus on image analysis. The recently developed Dynamic Uncertain Causality Graph (DUCG) approach is causality-driven, explainable, and invariant across different application scenarios, without problems of data collection, labeling, fitting, privacy, bias, generalization, high cost…
▽ More
AI-aided clinical diagnosis is desired in medical care. Existing deep learning models lack explainability and mainly focus on image analysis. The recently developed Dynamic Uncertain Causality Graph (DUCG) approach is causality-driven, explainable, and invariant across different application scenarios, without problems of data collection, labeling, fitting, privacy, bias, generalization, high cost and high energy consumption. Through close collaboration between clinical experts and DUCG technicians, 46 DUCG models covering 54 chief complaints were constructed. Over 1,000 diseases can be diagnosed without triage. Before being applied in real-world, the 46 DUCG models were retrospectively verified by third-party hospitals. The verified diagnostic precisions were no less than 95%, in which the diagnostic precision for every disease including uncommon ones was no less than 80%. After verifications, the 46 DUCG models were applied in the real-world in China. Over one million real diagnosis cases have been performed, with only 17 incorrect diagnoses identified. Due to DUCG's transparency, the mistakes causing the incorrect diagnoses were found and corrected. The diagnostic abilities of the clinicians who applied DUCG frequently were improved significantly. Following the introduction to the earlier presented DUCG methodology, the recommendation algorithm for potential medical checks is presented and the key idea of DUCG is extracted.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
QGEval: A Benchmark for Question Generation Evaluation
Authors:
Wei** Fu,
Bifan Wei,
Jianxiang Hu,
Zhongmin Cai,
Jun Liu
Abstract:
Automatically generated questions often suffer from problems such as unclear expression or factual inaccuracies, requiring a reliable and comprehensive evaluation of their quality. Human evaluation is frequently used in the field of question generation (QG) and is one of the most accurate evaluation methods. It also serves as the standard for automatic metrics. However, there is a lack of unified…
▽ More
Automatically generated questions often suffer from problems such as unclear expression or factual inaccuracies, requiring a reliable and comprehensive evaluation of their quality. Human evaluation is frequently used in the field of question generation (QG) and is one of the most accurate evaluation methods. It also serves as the standard for automatic metrics. However, there is a lack of unified evaluation criteria, which hampers the development of both QG technologies and automatic evaluation methods. To address this, we propose QGEval, a multi-dimensional Evaluation benchmark for Question Generation, which evaluates both generated questions and existing automatic metrics across 7 dimensions: fluency, clarity, conciseness, relevance, consistency, answerability, and answer consistency. We demonstrate the appropriateness of these dimensions by examining their correlations and distinctions. Analysis with QGEval reveals that 1) most QG models perform unsatisfactorily in terms of answerability and answer consistency, and 2) existing metrics fail to align well with human assessments when evaluating generated questions across the 7 dimensions. We expect this work to foster the development of both QG technologies and automatic metrics for QG.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
Efficient Beamforming Feedback Information-Based Wi-Fi Sensing by Feature Selection
Authors:
Xin Li,
**gzhi Hu,
Jun Luo
Abstract:
Wi-Fi sensing leveraging plain-text beamforming feedback information (BFI) in multiple-input-multiple-output (MIMO) systems attracts increasing attention. However, due to the implicit relationship between BFI and the channel state information (CSI), quantifying the sensing capability of BFI poses a challenge in building efficient BFI-based sensing algorithms. In this letter, we first derive a math…
▽ More
Wi-Fi sensing leveraging plain-text beamforming feedback information (BFI) in multiple-input-multiple-output (MIMO) systems attracts increasing attention. However, due to the implicit relationship between BFI and the channel state information (CSI), quantifying the sensing capability of BFI poses a challenge in building efficient BFI-based sensing algorithms. In this letter, we first derive a mathematical model of BFI, characterizing its relationship with CSI explicitly, and then develop a closed-form expression of BFI for 2x2 MIMO systems. To enhance the efficiency of BFI-based sensing by selecting only the most informative features, we quantify the sensing capacity of BFI using the Cramer-Rao bound (CRB) and then propose an efficient CRB-based BFI feature selection algorithm. Simulation results verify that BFI and CSI exhibit comparable sensing capabilities and that the proposed algorithm halves the number of features, reducing 20% more parameters than baseline methods, at the cost of only slightly increasing positioning errors.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Authors:
Zanlin Ni,
Yulin Wang,
Ren** Zhou,
Jiayi Guo,
**yi Hu,
Zhiyuan Liu,
Shiji Song,
Yuan Yao,
Gao Huang
Abstract:
The field of image synthesis is currently flourishing due to the advancements in diffusion models. While diffusion models have been successful, their computational intensity has prompted the pursuit of more efficient alternatives. As a representative work, non-autoregressive Transformers (NATs) have been recognized for their rapid generation. However, a major drawback of these models is their infe…
▽ More
The field of image synthesis is currently flourishing due to the advancements in diffusion models. While diffusion models have been successful, their computational intensity has prompted the pursuit of more efficient alternatives. As a representative work, non-autoregressive Transformers (NATs) have been recognized for their rapid generation. However, a major drawback of these models is their inferior performance compared to diffusion models. In this paper, we aim to re-evaluate the full potential of NATs by revisiting the design of their training and inference strategies. Specifically, we identify the complexities in properly configuring these strategies and indicate the possible sub-optimality in existing heuristic-driven designs. Recognizing this, we propose to go beyond existing methods by directly solving the optimal strategies in an automatic framework. The resulting method, named AutoNAT, advances the performance boundaries of NATs notably, and is able to perform comparably with the latest diffusion models at a significantly reduced inference cost. The effectiveness of AutoNAT is validated on four benchmark datasets, i.e., ImageNet-256 & 512, MS-COCO, and CC3M. Our code is available at https://github.com/LeapLabTHU/ImprovedNAT.
△ Less
Submitted 8 June, 2024;
originally announced June 2024.
-
End-to-End Design of Polar Coded Integrated Data and Energy Networking
Authors:
Jie Hu,
**gwen Cui,
Kun Yang
Abstract:
In order to transmit data and transfer energy to the low-power Internet of Things (IoT) devices, integrated data and energy networking (IDEN) system may be harnessed. In this context, we propose a bitwise end-to-end design for polar coded IDEN systems, where the conventional encoding/decoding, modulation/demodulation, and energy harvesting (EH) modules are replaced by the neural networks (NNs). In…
▽ More
In order to transmit data and transfer energy to the low-power Internet of Things (IoT) devices, integrated data and energy networking (IDEN) system may be harnessed. In this context, we propose a bitwise end-to-end design for polar coded IDEN systems, where the conventional encoding/decoding, modulation/demodulation, and energy harvesting (EH) modules are replaced by the neural networks (NNs). In this way, the entire system can be treated as an AutoEncoder (AE) and trained in an end-to-end manner. Hence achieving global optimization. Additionally, we improve the common NN-based belief propagation (BP) decoder by adding an extra hypernetwork, which generates the corresponding NN weights for the main network under different number of iterations, thus the adaptability of the receiver architecture can be further enhanced. Our numerical results demonstrate that our BP-based end-to-end design is superior to conventional BP-based counterparts in terms of both the BER and power transfer, but it is inferior to the successive cancellation list (SCL)-based conventional IDEN system, which may be due to the inherent performance gap between the BP and SCL decoders.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
An Inexact Bregman Proximal Difference-of-Convex Algorithm with Two Types of Relative Stop** Criteria
Authors:
Lei Yang,
**g**g Hu,
Kim-Chuan Toh
Abstract:
In this paper, we consider a class of difference-of-convex (DC) optimization problems, where the global Lipschitz gradient continuity assumption on the smooth part of the objective function is not required. Such problems are prevalent in many contemporary applications such as compressed sensing, statistical regression, and machine learning, and can be solved by a general Bregman proximal DC algori…
▽ More
In this paper, we consider a class of difference-of-convex (DC) optimization problems, where the global Lipschitz gradient continuity assumption on the smooth part of the objective function is not required. Such problems are prevalent in many contemporary applications such as compressed sensing, statistical regression, and machine learning, and can be solved by a general Bregman proximal DC algorithm (BPDCA). However, the existing BPDCA is developed based on the stringent requirement that the involved subproblems must be solved exactly, which is often impractical and limits the applicability of the BPDCA. To facilitate the practical implementations and wider applications of the BPDCA, we develop an inexact Bregman proximal difference-of-convex algorithm (iBPDCA) by incorporating two types of relative-type stop** criteria for solving the subproblems. The proposed inexact framework has considerable flexibility to encompass many existing exact and inexact methods, and can accommodate different types of errors that may occur when solving the subproblem. This enables the potential application of our inexact framework across different DC decompositions to facilitate the design of a more efficient DCA scheme in practice. The global subsequential convergence and the global sequential convergence of our iBPDCA are established under suitable conditions including the Kurdyka-Łojasiewicz property. Some numerical experiments on the $\ell_{1-2}$ regularized least squares problem and the constrained $\ell_{1-2}$ sparse optimization problem are conducted to show the superior performance of our iBPDCA in comparison to existing algorithms. These results also empirically verify the necessity of develo** different types of stop** criteria to facilitate the efficient computation of the subproblem in each iteration of our iBPDCA.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Authors:
Yang Sui,
Yanyu Li,
Anil Kag,
Yerlan Idelbayev,
Junli Cao,
Ju Hu,
Dhritiman Sagar,
Bo Yuan,
Sergey Tulyakov,
Jian Ren
Abstract:
Diffusion-based image generation models have achieved great success in recent years by showing the capability of synthesizing high-quality content. However, these models contain a huge number of parameters, resulting in a significantly large model size. Saving and transferring them is a major bottleneck for various applications, especially those running on resource-constrained devices. In this wor…
▽ More
Diffusion-based image generation models have achieved great success in recent years by showing the capability of synthesizing high-quality content. However, these models contain a huge number of parameters, resulting in a significantly large model size. Saving and transferring them is a major bottleneck for various applications, especially those running on resource-constrained devices. In this work, we develop a novel weight quantization method that quantizes the UNet from Stable Diffusion v1.5 to 1.99 bits, achieving a model with 7.9X smaller size while exhibiting even better generation quality than the original one. Our approach includes several novel techniques, such as assigning optimal bits to each layer, initializing the quantized model for better performance, and improving the training strategy to dramatically reduce quantization error. Furthermore, we extensively evaluate our quantized model across various benchmark datasets and through human evaluation to demonstrate its superior generation quality.
△ Less
Submitted 6 June, 2024;
originally announced June 2024.
-
On Limitation of Transformer for Learning HMMs
Authors:
Jiachen Hu,
Qinghua Liu,
Chi **
Abstract:
Despite the remarkable success of Transformer-based architectures in various sequential modeling tasks, such as natural language processing, computer vision, and robotics, their ability to learn basic sequential models, like Hidden Markov Models (HMMs), is still unclear. This paper investigates the performance of Transformers in learning HMMs and their variants through extensive experimentation an…
▽ More
Despite the remarkable success of Transformer-based architectures in various sequential modeling tasks, such as natural language processing, computer vision, and robotics, their ability to learn basic sequential models, like Hidden Markov Models (HMMs), is still unclear. This paper investigates the performance of Transformers in learning HMMs and their variants through extensive experimentation and compares them to Recurrent Neural Networks (RNNs). We show that Transformers consistently underperform RNNs in both training speed and testing accuracy across all tested HMM models. There are even challenging HMM instances where Transformers struggle to learn, while RNNs can successfully do so. Our experiments further reveal the relation between the depth of Transformers and the longest sequence length it can effectively learn, based on the types and the complexity of HMMs. To address the limitation of transformers in modeling HMMs, we demonstrate that a variant of the Chain-of-Thought (CoT), called $\textit{block CoT}$ in the training phase, can help transformers to reduce the evaluation error and to learn longer sequences at a cost of increasing the training time. Finally, we complement our empirical findings by theoretical results proving the expressiveness of transformers in approximating HMMs with logarithmic depth.
△ Less
Submitted 6 June, 2024;
originally announced June 2024.
-
TwinS: Revisiting Non-Stationarity in Multivariate Time Series Forecasting
Authors:
Jiaxi Hu,
Qingsong Wen,
Sijie Ruan,
Li Liu,
Yuxuan Liang
Abstract:
Recently, multivariate time series forecasting tasks have garnered increasing attention due to their significant practical applications, leading to the emergence of various deep forecasting models. However, real-world time series exhibit pronounced non-stationary distribution characteristics. These characteristics are not solely limited to time-varying statistical properties highlighted by non-sta…
▽ More
Recently, multivariate time series forecasting tasks have garnered increasing attention due to their significant practical applications, leading to the emergence of various deep forecasting models. However, real-world time series exhibit pronounced non-stationary distribution characteristics. These characteristics are not solely limited to time-varying statistical properties highlighted by non-stationary Transformer but also encompass three key aspects: nested periodicity, absence of periodic distributions, and hysteresis among time variables. In this paper, we begin by validating this theory through wavelet analysis and propose the Transformer-based TwinS model, which consists of three modules to address the non-stationary periodic distributions: Wavelet Convolution, Period-Aware Attention, and Channel-Temporal Mixed MLP. Specifically, The Wavelet Convolution models nested periods by scaling the convolution kernel size like wavelet transform. The Period-Aware Attention guides attention computation by generating period relevance scores through a convolutional sub-network. The Channel-Temporal Mixed MLP captures the overall relationships between time series through channel-time mixing learning. TwinS achieves SOTA performance compared to mainstream TS models, with a maximum improvement in MSE of 25.8\% over PatchTST.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Measurement of the branching fraction ratios $R(D^{+})$ and $R(D^{*+})$ using muonic $τ$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1063 additional authors not shown)
Abstract:
The branching fraction ratios of $\overline{B}^0\to D^+τ^-\overlineν_τ$ and $\overline{B}^0\to D^{*+}τ^-\overlineν_τ$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining…
▽ More
The branching fraction ratios of $\overline{B}^0\to D^+τ^-\overlineν_τ$ and $\overline{B}^0\to D^{*+}τ^-\overlineν_τ$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining $D^+$ mesons with $τ^-\toμ^-\overlineν_μν_τ$ candidates, where the $D^+$ is reconstructed via the $D^+\to K^-π^+π^+$ decay. The results are
\begin{align*}
R(D^{+}) &= 0.249 \pm 0.043 \pm 0.047,
R(D^{*+}) &= 0.402 \pm 0.081\pm 0.085,
\end{align*}
where the first uncertainties are statistical and the second systematic. The two measurements have a correlation coefficient of $-0.39$ and are compatible with the Standard Model.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Observation of new charmonium(-like) states in $B^+ \to D^{*\pm} D^{\mp} K^+$ decays
Authors:
LHCb collaboration,
R. Aaij,
A. S. W. Abdelmotteleb,
C. Abellan Beteta,
F. Abudinén,
T. Ackernley,
A. A. Adefisoye,
B. Adeva,
M. Adinolfi,
P. Adlarson,
C. Agapopoulou,
C. A. Aidala,
Z. Ajaltouni,
S. Akar,
K. Akiba,
P. Albicocco,
J. Albrecht,
F. Alessio,
M. Alexander,
Z. Aliouche,
P. Alvarez Cartelle,
R. Amalric,
S. Amato,
J. L. Amey,
Y. Amhis
, et al. (1062 additional authors not shown)
Abstract:
A study of resonant structures in $B^{+}\rightarrow{D^{\ast+}D^{-}K^{+}}$ and $B^{+}\rightarrow{D^{\ast-}D^{+}K^{+}}$ decays is performed, using proton-proton collision data at centre-of-mass energies of $\sqrt{s}=7, 8$, and $13$ TeV recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. A simultaneous amplitude fit is performed to the two channels with contribu…
▽ More
A study of resonant structures in $B^{+}\rightarrow{D^{\ast+}D^{-}K^{+}}$ and $B^{+}\rightarrow{D^{\ast-}D^{+}K^{+}}$ decays is performed, using proton-proton collision data at centre-of-mass energies of $\sqrt{s}=7, 8$, and $13$ TeV recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. A simultaneous amplitude fit is performed to the two channels with contributions from resonances decaying to $D^{\ast-}D^{+}$ and $D^{\ast+}D^{-}$ states linked by $C$ parity. This procedure allows the $C$-parities of resonances in the $D^{\ast\pm}D^{\mp}$ mass spectra to be determined. Four charmonium(-like) states are observed decaying into $D^{\ast\pm}D^{\mp}$: $η_c(3945)$, $h_c(4000)$, $χ_{c1}(4010)$ and $h_c(4300)$, with quantum numbers $J^{PC}$ equal to $0^{-+}$, $1^{+-}$, $1^{++}$ and $1^{+-}$, respectively. At least three of these states have not been observed previously. In addition, the existence of the $T_{\bar{c}\bar{s}0}^{*}(2870)^{0}$ and $T_{\bar{c}\bar{s}1}^{*}(2900)^{0}$ resonances in the $D^-K^+$ mass spectrum, already observed in the $B^+ \to D^+ D^- K^+$ decay, is confirmed in a different production channel.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
A Combination Model for Time Series Prediction using LSTM via Extracting Dynamic Features Based on Spatial Smoothing and Sequential General Variational Mode Decomposition
Authors:
Jianyu Liu,
Wei Chen,
Yong Zhang,
Zhenfeng Chen,
Bin Wan,
**wei Hu
Abstract:
In order to solve the problems such as difficult to extract effective features and low accuracy of sales volume prediction caused by complex relationships such as market sales volume in time series prediction, we proposed a time series prediction method of market sales volume based on Sequential General VMD and spatial smoothing Long short-term memory neural network (SS-LSTM) combination model. Fi…
▽ More
In order to solve the problems such as difficult to extract effective features and low accuracy of sales volume prediction caused by complex relationships such as market sales volume in time series prediction, we proposed a time series prediction method of market sales volume based on Sequential General VMD and spatial smoothing Long short-term memory neural network (SS-LSTM) combination model. Firstly, the spatial smoothing algorithm is used to decompose and calculate the sample data of related industry sectors affected by the linkage effect of market sectors, extracting modal features containing information via Sequential General VMD on overall market and specific price trends; Then, according to the background of different Market data sets, LSTM network is used to model and predict the price of fundamental data and modal characteristics. The experimental results of data prediction with seasonal and periodic trends show that this method can achieve higher price prediction accuracy and more accurate accuracy in specific market contexts compared to traditional prediction methods Describe the changes in market sales volume.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models
Authors:
Jerry Yao-Chieh Hu,
Maojiang Su,
En-Jui Kuo,
Zhao Song,
Han Liu
Abstract:
We study the computational limits of Low-Rank Adaptation (LoRA) update for finetuning transformer-based models using fine-grained complexity theory. Our key observation is that the existence of low-rank decompositions within the gradient computation of LoRA adaptation leads to possible algorithmic speedup. This allows us to (i) identify a phase transition behavior and (ii) prove the existence of n…
▽ More
We study the computational limits of Low-Rank Adaptation (LoRA) update for finetuning transformer-based models using fine-grained complexity theory. Our key observation is that the existence of low-rank decompositions within the gradient computation of LoRA adaptation leads to possible algorithmic speedup. This allows us to (i) identify a phase transition behavior and (ii) prove the existence of nearly linear algorithms by controlling the LoRA update computation term by term, assuming the Strong Exponential Time Hypothesis (SETH). For the former, we identify a sharp transition in the efficiency of all possible rank-$r$ LoRA update algorithms for transformers, based on specific norms resulting from the multiplications of the input sequence $\mathbf{X}$, pretrained weights $\mathbf{W^\star}$, and adapter matrices $α\mathbf{B} \mathbf{A} / r$. Specifically, we derive a shared upper bound threshold for such norms and show that efficient (sub-quadratic) approximation algorithms of LoRA exist only below this threshold. For the latter, we prove the existence of nearly linear approximation algorithms for LoRA adaptation by utilizing the hierarchical low-rank structures of LoRA gradients and approximating the gradients with a series of chained low-rank approximations. To showcase our theory, we consider two practical scenarios: partial (e.g., only $\mathbf{W}_V$ and $\mathbf{W}_Q$) and full adaptations (e.g., $\mathbf{W}_Q$, $\mathbf{W}_V$, and $\mathbf{W}_K$) of weights in attention heads.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for…
▽ More
Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for $h_c \to K^+ K^- π^0$ and $h_c \to K^+ K^- η$ are found with significances of $3.5σ$ and $3.3σ$, respectively, after considering the systematic uncertainties. The branching fractions of these decays are measured to be $\mathcal{B}(h_c \to π^+ π^- π^0)=(1.36\pm0.16\pm0.14)\times10^{-3}$, $\mathcal{B}(h_c \to K^+ K^- π^0)=(3.26\pm0.84\pm0.36)\times10^{-4}$, and $\mathcal{B}(h_c \to K^+ K^- η)=(3.13\pm1.08\pm0.38)\times10^{-4}$, where the first uncertainties are statistical and the second are systematic. No significant signal of $h_c\toπ^+π^-η$ is found, and the upper limit of its decay branching fraction is determined to be $\mathcal{B}(h_c\toπ^+π^-η) < 4.0 \times 10^{-4}$ at 90% confidence level.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Dynamically Expanding Capacity of Autonomous Driving with Near-Miss Focused Training Framework
Authors:
Ziyuan Yang,
Zhaoyang Li,
Jianming Hu,
Yi Zhang
Abstract:
The long-tail distribution of real driving data poses challenges for training and testing autonomous vehicles (AV), where rare yet crucial safety-critical scenarios are infrequent. And virtual simulation offers a low-cost and efficient solution. This paper proposes a near-miss focused training framework for AV. Utilizing the driving scenario information provided by sensors in the simulator, we des…
▽ More
The long-tail distribution of real driving data poses challenges for training and testing autonomous vehicles (AV), where rare yet crucial safety-critical scenarios are infrequent. And virtual simulation offers a low-cost and efficient solution. This paper proposes a near-miss focused training framework for AV. Utilizing the driving scenario information provided by sensors in the simulator, we design novel reward functions, which enable background vehicles (BV) to generate near-miss scenarios and ensure gradients exist not only in collision-free scenes but also in collision scenarios. And then leveraging the Robust Adversarial Reinforcement Learning (RARL) framework for simultaneous training of AV and BV to gradually enhance AV and BV capabilities, as well as generating near-miss scenarios tailored to different levels of AV capabilities. Results from three testing strategies indicate that the proposed method generates scenarios closer to near-miss, thus enhancing the capabilities of both AVs and BVs throughout training.
△ Less
Submitted 4 June, 2024;
originally announced June 2024.
-
Safeguarding Large Language Models: A Survey
Authors:
Yi Dong,
Ronghui Mu,
Yanghao Zhang,
Siqi Sun,
Tianle Zhang,
Changshun Wu,
Gaojie **,
Yi Qi,
**wei Hu,
Jie Meng,
Saddek Bensalem,
Xiaowei Huang
Abstract:
In the burgeoning field of Large Language Models (LLMs), develo** a robust safety mechanism, colloquially known as "safeguards" or "guardrails", has become imperative to ensure the ethical use of LLMs within prescribed boundaries. This article provides a systematic literature review on the current status of this critical mechanism. It discusses its major challenges and how it can be enhanced int…
▽ More
In the burgeoning field of Large Language Models (LLMs), develo** a robust safety mechanism, colloquially known as "safeguards" or "guardrails", has become imperative to ensure the ethical use of LLMs within prescribed boundaries. This article provides a systematic literature review on the current status of this critical mechanism. It discusses its major challenges and how it can be enhanced into a comprehensive mechanism dealing with ethical issues in various contexts. First, the paper elucidates the current landscape of safeguarding mechanisms that major LLM service providers and the open-source community employ. This is followed by the techniques to evaluate, analyze, and enhance some (un)desirable properties that a guardrail might want to enforce, such as hallucinations, fairness, privacy, and so on. Based on them, we review techniques to circumvent these controls (i.e., attacks), to defend the attacks, and to reinforce the guardrails. While the techniques mentioned above represent the current status and the active research trends, we also discuss several challenges that cannot be easily dealt with by the methods and present our vision on how to implement a comprehensive guardrail through the full consideration of multi-disciplinary approach, neural-symbolic method, and systems development lifecycle.
△ Less
Submitted 3 June, 2024;
originally announced June 2024.
-
Learning Image Priors through Patch-based Diffusion Models for Solving Inverse Problems
Authors:
Jason Hu,
Bowen Song,
Xiaojian Xu,
Liyue Shen,
Jeffrey A. Fessler
Abstract:
Diffusion models can learn strong image priors from underlying data distribution and use them to solve inverse problems, but the training process is computationally expensive and requires lots of data. Such bottlenecks prevent most existing works from being feasible for high-dimensional and high-resolution data such as 3D images. This paper proposes a method to learn an efficient data prior for th…
▽ More
Diffusion models can learn strong image priors from underlying data distribution and use them to solve inverse problems, but the training process is computationally expensive and requires lots of data. Such bottlenecks prevent most existing works from being feasible for high-dimensional and high-resolution data such as 3D images. This paper proposes a method to learn an efficient data prior for the entire image by training diffusion models only on patches of images. Specifically, we propose a patch-based position-aware diffusion inverse solver, called PaDIS, where we obtain the score function of the whole image through scores of patches and their positional encoding and utilize this as the prior for solving inverse problems. First of all, we show that this diffusion model achieves an improved memory efficiency and data efficiency while still maintaining the capability to generate entire images via positional encoding. Additionally, the proposed PaDIS model is highly flexible and can be plugged in with different diffusion inverse solvers (DIS). We demonstrate that the proposed PaDIS approach enables solving various inverse problems in both natural and medical image domains, including CT reconstruction, deblurring, and superresolution, given only patch-based priors. Notably, PaDIS outperforms previous DIS methods trained on entire image priors in the case of limited training data, demonstrating the data efficiency of our proposed approach by learning patch-based prior.
△ Less
Submitted 4 June, 2024;
originally announced June 2024.
-
PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling
Authors:
Zefan Cai.,
Yichi Zhang,
Bofei Gao,
Yuliang Liu,
Tianyu Liu,
Keming Lu,
Wayne Xiong,
Yue Dong,
Baobao Chang,
Junjie Hu,
Wen Xiao
Abstract:
In this study, we investigate whether attention-based information flow inside large language models (LLMs) is aggregated through noticeable patterns for long context processing. Our observations reveal that LLMs aggregate information through Pyramidal Information Funneling where attention is scattering widely in lower layers, progressively consolidating within specific contexts, and ultimately foc…
▽ More
In this study, we investigate whether attention-based information flow inside large language models (LLMs) is aggregated through noticeable patterns for long context processing. Our observations reveal that LLMs aggregate information through Pyramidal Information Funneling where attention is scattering widely in lower layers, progressively consolidating within specific contexts, and ultimately focusin on critical tokens (a.k.a massive activation or attention sink) in higher layers. Motivated by these insights, we developed PyramidKV, a novel and effective KV cache compression method. This approach dynamically adjusts the KV cache size across different layers, allocating more cache in lower layers and less in higher ones, diverging from traditional methods that maintain a uniform KV cache size. Our experimental evaluations, utilizing the LongBench benchmark, show that PyramidKV matches the performance of models with a full KV cache while retaining only 12% of the KV cache, thus significantly reducing memory usage. In scenarios emphasizing memory efficiency, where only 0.7% of the KV cache is maintained, PyramidKV surpasses other KV cache compression techniques achieving up to a 20.5 absolute accuracy improvement on TREC.
△ Less
Submitted 16 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.