-
Triage of 3D pathology data via 2.5D multiple-instance learning to guide pathologist assessments
Authors:
Gan Gao,
Andrew H. Song,
Fiona Wang,
David Brenes,
Rui Wang,
Sarah S. L. Chow,
Kevin W. Bishop,
Lawrence D. True,
Faisal Mahmood,
Jonathan T. C. Liu
Abstract:
Accurate patient diagnoses based on human tissue biopsies are hindered by current clinical practice, where pathologists assess only a limited number of thin 2D tissue slices sectioned from 3D volumetric tissue. Recent advances in non-destructive 3D pathology, such as open-top light-sheet microscopy, enable comprehensive imaging of spatially heterogeneous tissue morphologies, making it feasible to improve diagnostic determinations. A potential early route towards clinical adoption for 3D pathology is to rely on pathologists for final diagnosis based on viewing familiar 2D H&E-like image sections from the 3D datasets. However, manual examination of the massive 3D pathology datasets is infeasible. To address this, we present CARP3D, a deep learning triage approach that automatically identifies the highest-risk 2D slices within a 3D volumetric biopsy, enabling time-efficient review by pathologists. For a given slice in the biopsy, we estimate its risk by performing attention-based aggregation of 2D patches within each slice, followed by pooling of the neighboring slices to compute a context-aware 2.5D risk score. For prostate cancer risk stratification, CARP3D achieves an area under the curve (AUC) of 90.4% for triaging slices, outperforming methods relying on independent analysis of 2D sections (AUC=81.3%). These results suggest that integrating additional depth context enhances the model's discriminative capabilities. In conclusion, CARP3D has the potential to improve pathologist diagnosis via accurate triage of high-risk slices within large-volume 3D pathology datasets.
Submitted 11 June, 2024;
originally announced June 2024.
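The 2.5D scoring described in the CARP3D abstract (attention-based aggregation of patches within a slice, then pooling over neighboring slices) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the linear attention vector `w_attn`, the linear risk head `w_risk`, the mean pooling over a fixed `radius`, and the sigmoid output are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1D array."""
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def slice_embedding(patch_feats, w_attn):
    """Attention-pool patch features (n_patches, d) into one slice vector (d,)."""
    logits = patch_feats @ w_attn      # one attention logit per patch (assumed linear)
    weights = softmax(logits)          # attention weights sum to 1 over patches
    return weights @ patch_feats

def risk_scores_2p5d(volume_feats, w_attn, w_risk, radius=1):
    """volume_feats: (n_slices, n_patches, d). Returns one risk score per slice."""
    emb = np.stack([slice_embedding(s, w_attn) for s in volume_feats])
    n = len(emb)
    scores = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        context = emb[lo:hi].mean(axis=0)   # mean-pool the neighboring slices (assumed)
        scores[i] = 1.0 / (1.0 + np.exp(-(context @ w_risk)))  # sigmoid risk head (assumed)
    return scores

def top_k_slices(scores, k):
    """Indices of the k highest-risk slices, highest first."""
    return np.argsort(scores)[::-1][:k]
```

A pathologist-facing tool would then present only the slices returned by `top_k_slices`; setting `radius=0` reduces the sketch to independent 2D scoring of each slice.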
-
Electrically Tunable Magnetoconductance of Close-Packed CVD Bilayer Graphene Layer Stacking Walls
Authors:
Qicheng Zhang,
Sheng Wang,
Zhaoli Gao,
Sebastian Hurtado-Parra,
Joel Berry,
Zachariah Addison,
Paul Masih Das,
William M. Parkin,
Marija Drndic,
James M. Kikkawa,
Feng Wang,
Eugene J. Mele,
A. T. Charlie Johnson,
Zhengtang Luo
Abstract:
Quantum valley Hall (QVH) domain wall states are a new class of one-dimensional (1D) one-way conductors that are topologically protected in the absence of valley mixing. Development beyond a single QVH channel raises important new questions as to how QVH channels in close spatial proximity interact with each other, and how that interaction may be controlled. Scalable epitaxial bilayer graphene synthesis produces layer stacking wall (LSW) bundles, where QVH channels are bound, providing an excellent platform to study QVH channel interactions. Here we show that distinct strain sources lead to the formation of both well-separated LSWs and close-packed LSW bundles. Comparative studies of electronic transport in these two regimes reveal that close-packed LSW bundles support electrically tunable magnetoconductance. The coexistence of different strain sources offers a potential pathway to realize a scalable quantum transport platform based on LSWs, where electrical tunability enables programmable functionality.
Submitted 10 June, 2024;
originally announced June 2024.
-
A quasar-galaxy merger at $z\sim 6.2$: rapid host growth via accretion of two massive satellite galaxies
Authors:
Roberto Decarli,
Federica Loiacono,
Emanuele Paolo Farina,
Massimo Dotti,
Alessandro Lupi,
Romain A. Meyer,
Marco Mignoli,
Antonio Pensabene,
Michael A. Strauss,
Bram Venemans,
Jinyi Yang,
Fabian Walter,
Julien Wolf,
Eduardo Bañados,
Laura Blecha,
Sarah Bosman,
Chris L. Carilli,
Andrea Comastri,
Thomas Connor,
Tiago Costa,
Anna-Christina Eilers,
Xiaohui Fan,
Roberto Gilli,
Hyunsung D. Jun,
Weizhe Liu
, et al. (16 additional authors not shown)
Abstract:
We present JWST/NIRSpec Integral Field Spectroscopy in the rest-frame optical bands of the system PJ308-21, a quasar at $z=6.2342$ caught as its host galaxy interacts with companion galaxies. We detect spatially extended emission of several emission lines (H$α$, H$β$, [OIII], [NII], [SII], HeII), which we use to study the properties of the ionized phase of the interstellar medium: the source and hardness of the photoionizing radiation field, metallicity, dust reddening, electron density and temperature, and star formation. We also marginally detect continuum starlight emission associated with the companion sources. We find that at least two independent satellite galaxies are part of the system. While the quasar host appears highly enriched and obscured, with AGN-like photoionization conditions, the western companion shows minimal dust extinction, low metallicity ($Z\sim0.4$ Z$_\odot$), and star-formation driven photoionization. The eastern companion shows higher extinction and metallicity ($Z\sim0.8$ Z$_\odot$) compared to the western companion, and it is at least partially photoionized by the nearby quasar. We do not find any indication of AGN in the companion sources. Our study shows that while the quasar host galaxy is already very massive ($M_{\rm dyn}>10^{11}$ M$_\odot$), it is still rapidly building up by accreting two relatively massive ($M_{\rm star}\sim 10^{10}$ M$_\odot$) companion sources. This dataset showcases the power of JWST in exposing the build-up of massive galaxies in the first Gyr of the Universe.
Submitted 10 June, 2024;
originally announced June 2024.
-
RAG-based Crowdsourcing Task Decomposition via Masked Contrastive Learning with Prompts
Authors:
Jing Yang,
Xiao Wang,
Yu Zhao,
Yuhang Liu,
Fei-Yue Wang
Abstract:
Crowdsourcing is a critical technology in social manufacturing, which leverages an extensive and boundless reservoir of human resources to handle a wide array of complex tasks. The successful execution of these complex tasks relies on task decomposition (TD) and allocation, with the former being a prerequisite for the latter. Recently, pre-trained language models (PLMs)-based methods have garnered significant attention. However, they are constrained to handling straightforward common-sense tasks due to their inherent restrictions involving limited and difficult-to-update knowledge as well as the presence of hallucinations. To address these issues, we propose a retrieval-augmented generation-based crowdsourcing framework that reimagines TD as event detection from the perspective of natural language understanding. However, the existing detection methods fail to distinguish differences between event types and always depend on heuristic rules and external semantic analyzing tools. Therefore, we present a Prompt-Based Contrastive learning framework for TD (PBCT), which incorporates a prompt-based trigger detector to overcome dependence. Additionally, trigger-attentive sentinel and masked contrastive learning are introduced to provide varying attention to trigger and contextual features according to different event types. Experiment results demonstrate the competitiveness of our method in both supervised and zero-shot detection. A case study on printed circuit board manufacturing is showcased to validate its adaptability to unknown professional domains.
Submitted 4 June, 2024;
originally announced June 2024.
-
Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The weak-$CP$ test is performed in the subsequent decays of their daughter particles $Λ$ and $\barΛ$. Also for the first time, the transverse polarizations of the $Σ^0$ hyperons in $J/ψ$ and $ψ(3686)$ decays are observed with opposite directions, and the ratios between the S-wave and D-wave contributions of the $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ decays are obtained. These results are crucial to understand the decay dynamics of the charmonium states and the production mechanism of the $Σ^0-\barΣ^0$ pairs.
Submitted 10 June, 2024;
originally announced June 2024.
-
Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$, $8.157 \pm 0.031$~fb$^{-1}$, and $4.191 \pm 0.016$~fb$^{-1}$, respectively, by analyzing large angle Bhabha scattering events. The uncertainties are dominated by systematic effects and the statistical uncertainties are negligible. Our results provide essential input for future analyses and precision measurements.
Submitted 9 June, 2024;
originally announced June 2024.
-
SPA-SVC: Self-supervised Pitch Augmentation for Singing Voice Conversion
Authors:
Bingsong Bai,
Feng** Wang,
Yingming Gao,
Ya Li
Abstract:
Diffusion-based singing voice conversion (SVC) models have shown better synthesis quality compared to traditional methods. However, in cross-domain SVC scenarios, where there is a significant disparity in pitch between the source and target voice domains, the models tend to generate hoarse audio, posing challenges in achieving high-quality vocal outputs. Therefore, in this paper, we propose a Self-supervised Pitch Augmentation method for Singing Voice Conversion (SPA-SVC), which can enhance the voice quality in SVC tasks without requiring additional data or increasing model parameters. We introduce a cycle pitch shifting training strategy and a Structural Similarity Index (SSIM) loss into our SVC model, effectively enhancing its performance. Experimental results on the public singing dataset M4Singer indicate that our proposed method significantly improves model performance in both general SVC scenarios and particularly in cross-domain SVC scenarios.
Submitted 9 June, 2024;
originally announced June 2024.
-
Mmm whatcha say? Uncovering distal and proximal context effects in first and second-language word perception using psychophysical reverse correlation
Authors:
Paige Tuttösí,
H. Henny Yeung,
Yue Wang,
Fenqi Wang,
Guillaume Denis,
Jean-Julien Aucouturier,
Angelica Lim
Abstract:
Acoustic context effects, where surrounding changes in pitch, rate or timbre influence the perception of a sound, are well documented in speech perception, but how they interact with language background remains unclear. Using a reverse-correlation approach, we systematically varied the pitch and speech rate in phrases around different pairs of vowels for second language (L2) speakers of English (/i/-/I/) and French (/u/-/y/), thus reconstructing, in a data-driven manner, the prosodic profiles that bias their perception. Testing English and French speakers (n=25), we showed that vowel perception is in fact influenced by conflicting effects from the surrounding pitch and speech rate: a congruent proximal effect 0.2s pre-target and a distal contrastive effect up to 1s before; and found that L1 and L2 speakers exhibited strikingly similar prosodic profiles in perception. We provide a novel method to investigate acoustic context effects across stimuli, timescales, and acoustic domain.
Submitted 8 June, 2024;
originally announced June 2024.
-
AttnDreamBooth: Towards Text-Aligned Personalized Text-to-Image Generation
Authors:
Lianyu Pang,
Jian Yin,
Baoquan Zhao,
Feize Wu,
Fu Lee Wang,
Qing Li,
Xudong Mao
Abstract:
Recent advances in text-to-image models have enabled high-quality personalized image synthesis of user-provided concepts with flexible textual control. In this work, we analyze the limitations of two primary techniques in text-to-image personalization: Textual Inversion and DreamBooth. When integrating the learned concept into new prompts, Textual Inversion tends to overfit the concept, while DreamBooth often overlooks it. We attribute these issues to the incorrect learning of the embedding alignment for the concept. We introduce AttnDreamBooth, a novel approach that addresses these issues by separately learning the embedding alignment, the attention map, and the subject identity in different training stages. We also introduce a cross-attention map regularization term to enhance the learning of the attention map. Our method demonstrates significant improvements in identity preservation and text alignment compared to the baseline methods.
Submitted 7 June, 2024;
originally announced June 2024.
-
ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks
Authors:
Feiyang Wang,
Xingquan Zuo,
Hai Huang,
Gang Chen
Abstract:
Many machine learning models are susceptible to adversarial attacks, with decision-based black-box attacks representing the most critical threat in real-world applications. These attacks are extremely stealthy, generating adversarial examples using hard labels obtained from the target machine learning model. This is typically realized by optimizing perturbation directions, guided by decision boundaries identified through query-intensive exact search, significantly limiting the attack success rate. This paper introduces a novel approach using the Approximation Decision Boundary (ADB) to efficiently and accurately compare perturbation directions without precisely determining decision boundaries. The effectiveness of our ADB approach (ADBA) hinges on promptly identifying suitable ADB, ensuring reliable differentiation of all perturbation directions. For this purpose, we analyze the probability distribution of decision boundaries, confirming that using the distribution's median value as ADB can effectively distinguish different perturbation directions, giving rise to the development of the ADBA-md algorithm. ADBA-md only requires four queries on average to differentiate any pair of perturbation directions, which is highly query-efficient. Extensive experiments on six well-known image classifiers clearly demonstrate the superiority of ADBA and ADBA-md over multiple state-of-the-art black-box attacks. The source code is available at https://github.com/BUPTAIOC/ADBA.
Submitted 12 June, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.
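The core idea in the ADBA abstract, comparing two perturbation directions against a shared threshold distance instead of locating each decision boundary exactly, can be sketched in a few lines. This is an illustrative reading of the abstract, not the released code at the repository above: `compare_directions`, the hard-label oracle `is_adversarial`, and the toy classifier are hypothetical, and the sketch assumes each ray crosses the decision boundary only once.

```python
import numpy as np

def compare_directions(is_adversarial, x, d1, d2, adb):
    """Compare two unit directions with two hard-label queries at distance `adb`.

    Returns 1 if d1 reaches the adversarial region within `adb` and d2 does not
    (so d1 has the closer boundary), 2 for the opposite case, and 0 if this
    threshold cannot tell the two directions apart.
    """
    a1 = is_adversarial(x + adb * d1)
    a2 = is_adversarial(x + adb * d2)
    if a1 and not a2:
        return 1
    if a2 and not a1:
        return 2
    return 0  # both or neither adversarial: a different threshold is needed

def toy_is_adversarial(x):
    """Hypothetical hard-label oracle: the label flips once x[0] exceeds 1."""
    return bool(x[0] > 1.0)

x0 = np.zeros(2)
d_a = np.array([1.0, 0.0])   # boundary at distance 1.0 along this direction
d_b = np.array([0.6, 0.8])   # boundary at distance 1/0.6 along this direction
print(compare_directions(toy_is_adversarial, x0, d_a, d_b, adb=1.2))
```

When the result is 0, the threshold has to be moved before the pair can be separated; the abstract states that ADBA-md chooses the threshold from the median of the decision-boundary distribution, which this sketch does not attempt to reproduce.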
-
Predicting Polymer Properties Based on Multimodal Multitask Pretraining
Authors:
Fanmeng Wang,
Wentao Guo,
Minjie Cheng,
Shen Yuan,
Hongteng Xu,
Zhifeng Gao
Abstract:
In the past few decades, polymers, high-molecular-weight compounds formed by bonding numerous identical or similar monomers covalently, have played an essential role in various scientific fields. In this context, accurate prediction of their properties is becoming increasingly crucial. Typically, the properties of a polymer, such as plasticity, conductivity, bio-compatibility, and so on, are highly correlated with its 3D structure. However, current methods for predicting polymer properties heavily rely on information from polymer SMILES sequences (P-SMILES strings) while ignoring crucial 3D structural information, leading to sub-optimal performance. In this work, we propose MMPolymer, a novel multimodal multitask pretraining framework incorporating both polymer 1D sequential information and 3D structural information to enhance downstream polymer property prediction tasks. Besides, to overcome the limited availability of polymer 3D data, we further propose the "Star Substitution" strategy to extract 3D structural information effectively. During pretraining, MMPolymer not only predicts masked tokens and recovers 3D coordinates but also achieves the cross-modal alignment of latent representation. Subsequently, we further fine-tune the pretrained MMPolymer for downstream polymer property prediction tasks in the supervised learning paradigm. Experimental results demonstrate that MMPolymer achieves state-of-the-art performance in various polymer property prediction tasks. Moreover, leveraging the pretrained MMPolymer and using only one modality (either P-SMILES string or 3D conformation) during fine-tuning can also surpass existing polymer property prediction methods, highlighting the exceptional capability of MMPolymer in polymer feature extraction and utilization. Our online platform for polymer property prediction is available at https://app.bohrium.dp.tech/mmpolymer.
Submitted 7 June, 2024;
originally announced June 2024.
-
GLACE: Global Local Accelerated Coordinate Encoding
Authors:
Fangjinhua Wang,
Xudong Jiang,
Silvano Galliani,
Christoph Vogel,
Marc Pollefeys
Abstract:
Scene coordinate regression (SCR) methods are a family of visual localization methods that directly regress 2D-3D matches for camera pose estimation. They are effective in small-scale scenes but face significant challenges in large-scale scenes that are further amplified in the absence of ground truth 3D point clouds for supervision. Here, the model can only rely on reprojection constraints and needs to implicitly triangulate the points. The challenges stem from a fundamental dilemma: The network has to be invariant to observations of the same landmark at different viewpoints and lighting conditions, etc., but at the same time discriminate unrelated but similar observations. The latter becomes more relevant and severe in larger scenes. In this work, we tackle this problem by introducing the concept of co-visibility to the network. We propose GLACE, which integrates pre-trained global and local encodings and enables SCR to scale to large scenes with only a single small-sized network. Specifically, we propose a novel feature diffusion technique that implicitly groups the reprojection constraints with co-visibility and avoids overfitting to trivial solutions. Additionally, our position decoder parameterizes the output positions for large-scale scenes more effectively. Without using 3D models or depth maps for supervision, our method achieves state-of-the-art results on large-scale scenes with a low-map-size model. On Cambridge landmarks, with a single model, we achieve 17% lower median position error than Poker, the ensemble variant of the state-of-the-art SCR method ACE. Code is available at: https://github.com/cvg/glace.
Submitted 6 June, 2024;
originally announced June 2024.
-
Enhanced Semantic Segmentation Pipeline for WeatherProof Dataset Challenge
Authors:
Nan Zhang,
Xidan Zhang,
Jianing Wei,
Fangjun Wang,
Zhiming Tan
Abstract:
This report describes the winning solution to the WeatherProof Dataset Challenge (CVPR 2024 UG2+ Track 3). Details regarding the challenge are available at https://cvpr2024ug2challenge.github.io/track3.html. We propose an enhanced semantic segmentation pipeline for this challenge. First, we improve semantic segmentation models, using a backbone pretrained with Depth Anything to improve the UperNet and SETRMLA models, and adding language guidance based on both weather and category information to the InternImage model. Second, we introduce a new dataset, WeatherProofExtra, with a wider viewing angle and employ data augmentation methods, including adverse weather and super-resolution. Finally, effective training strategies and an ensemble method are applied to further improve final performance. Our solution is ranked 1st on the final leaderboard. Code will be available at https://github.com/KaneiGi/WeatherProofChallenge.
Submitted 6 June, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
The formation rate and luminosity function of fast radio bursts
Authors:
J. H. Chen,
X. D. Jia,
X. F. Dong,
F. Y. Wang
Abstract:
Fast radio bursts (FRBs) are millisecond-duration flashes with unknown origins. Their formation rate is crucial for unveiling their physical origins. However, the luminosity and formation rate are degenerate when directly fitting the redshift distribution of FRBs. In contrast to previous forward-fitting methods, we use Lynden-Bell's $c^{-}$ method to derive the luminosity function and formation rate of FRBs without any assumptions. Using the non-repeating FRBs from the first CHIME/FRB catalog, we find a relatively strong luminosity evolution, and the luminosity function can be fitted by a broken power-law model with a break at $1.33\times10^{41}\ \mathrm{erg}\ \mathrm{s}^{-1}$. The formation rate declines rapidly as $(1+z)^{-4.9\pm0.3}$ with a local rate of $1.13\times10^4\ \mathrm{Gpc}^{-3}\ \mathrm{yr}^{-1}$. This monotonic decrease is similar to the rate of short gamma-ray bursts. After comparing it with the star formation rate and stellar mass density, we conclude that old populations, including neutron stars and black holes, are closely related to the origins of FRBs. Monte Carlo simulations are used to test our results. The distributions of the mock sample are consistent with the observational data.
Submitted 5 June, 2024;
originally announced June 2024.
-
Study of hybrid stars with nonstrange quark matter cores
Authors:
Cheng-Ming Li,
He-Rui Zheng,
Shu-Yu Zuo,
Ya-Peng Zhao,
Fei Wang,
Yong-Feng Huang
Abstract:
In this work, under the hypothesis that quark matter may not be strange [Phys. Rev. Lett. 120, 222001 (2018)], we adopt a modification of the coupling constant of the four-quark scalar interaction $G\rightarrow G_1+G_2\langle\barψψ\rangle$ in the 2-flavor Nambu-Jona-Lasinio model to study nonstrange hybrid stars. According to lattice QCD simulation results of the critical temperature at zero chemical potential, $G_1$ and $G_2$ are constrained as $G_1\in(1.935, 1.972)$ GeV$^{-2}$, and $G_2\in(-1.582, -0.743)$ GeV$^{-5}$, respectively. To obtain hybrid equations of state, the Maxwell construction is used to describe the first-order confinement-deconfinement phase transition in hybrid stars. With recent measurements of neutron star mass, radius, and tidal deformability, the hybrid equations of state are constrained. The result suggests that pure nonstrange quark matter cores can exist in hybrid stars, with masses of 0.014-0.026 solar masses and the bag constant $B^{1/4}$ in a range of 148-161 MeV. It is argued that the binary neutron stars in GW170817 should be hadron stars.
Submitted 23 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidence for $h_c \to K^+ K^- π^0$ and $h_c \to K^+ K^- η$ is found with significances of $3.5σ$ and $3.3σ$, respectively, after considering the systematic uncertainties. The branching fractions of these decays are measured to be $\mathcal{B}(h_c \to π^+ π^- π^0)=(1.36\pm0.16\pm0.14)\times10^{-3}$, $\mathcal{B}(h_c \to K^+ K^- π^0)=(3.26\pm0.84\pm0.36)\times10^{-4}$, and $\mathcal{B}(h_c \to K^+ K^- η)=(3.13\pm1.08\pm0.38)\times10^{-4}$, where the first uncertainties are statistical and the second are systematic. No significant signal of $h_c\toπ^+π^-η$ is found, and the upper limit of its decay branching fraction is determined to be $\mathcal{B}(h_c\toπ^+π^-η) < 4.0 \times 10^{-4}$ at 90% confidence level.
Submitted 5 June, 2024;
originally announced June 2024.
-
Uncorrelated estimations of $H_0$ redshift evolution from DESI baryon acoustic oscillation observations
Authors:
X. D. Jia,
J. P. Hu,
F. Y. Wang
Abstract:
The Dark Energy Spectroscopic Instrument (DESI) collaboration recently released its first-year baryon acoustic oscillation (BAO) data. Based on the five different tracers, the cosmological constraints show a hint of deviation from the standard $Λ$CDM model. In this letter, we combine the DESI BAOs with other cosmic probes to constrain the evolution of the Hubble constant as a function of redshift in the flat $Λ$CDM model. A non-parametric method is used to estimate the value of the Hubble constant in different redshift bins. The correlations among different bins are removed by diagonalizing the covariance matrix. The joint data sample demonstrates a decreasing trend of the Hubble constant with a significance of $8.6 σ$, which can naturally resolve the Hubble tension. It may be due to dynamical dark energy or modified gravity.
Submitted 4 June, 2024;
originally announced June 2024.
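The abstract above mentions removing correlations among redshift bins by diagonalizing the covariance matrix. A minimal sketch of that general idea follows, assuming a plain eigen-decomposition of the covariance; the paper's actual transformation may differ (for instance, it could be built from the inverse covariance). The function names `jacobi_eigh` and `decorrelate` and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def jacobi_eigh(cov, tol=1e-24, max_sweeps=50):
    """Eigen-decompose a symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, V), where the columns of V are the eigenvectors.
    """
    n = len(cov)
    a = [row[:] for row in cov]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        # Stop once the sum of squared off-diagonal entries is negligible.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def decorrelate(estimates, cov):
    """Rotate correlated bin estimates into uncorrelated linear combinations."""
    variances, v = jacobi_eigh(cov)
    n = len(estimates)
    # y = V^T x has covariance V^T C V, which is diagonal.
    rotated = [sum(v[k][j] * estimates[k] for k in range(n)) for j in range(n)]
    return rotated, variances, v


# Toy inputs for illustration only (NOT values from the paper).
toy_estimates = [70.0, 68.5, 67.0]
toy_cov = [
    [4.0, 2.0, 0.6],
    [2.0, 2.0, 0.5],
    [0.6, 0.5, 1.0],
]
```

Calling `decorrelate(toy_estimates, toy_cov)` returns the rotated estimates, their variances (the eigenvalues of the covariance), and the rotation matrix. Each rotated estimate is a linear combination of the original bins, so its physical interpretation depends on the weights in the corresponding column of the rotation matrix.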
-
Pionic transitions of the spin-2 partner of $X(3872)$ to $χ_{cJ}$
Authors:
Shi-Dong Liu,
Fan Wang,
Zhao-Sai Jia,
Gang Li,
Xiao-Hai Liu,
Ju-Jun Xie
Abstract:
We investigated the pionic transitions between the $X_2$ [spin-2 partner of the $X(3872)$] and $χ_{c1,2}$ using a nonrelativistic effective field theory. The $X_2$ is assumed to be a bound state of the $D^{*}$ and $\bar{D}^*$ mesons and to decay through several kinds of loops, including the bubble, triangle and box loops. Within the present model, the widths for the single-pion decays $X_2\toπ^0χ_{cJ}$ are predicted to be about $3$--$30$ keV. For the dipion decays, the widths are a few keV. These widths yield a branching fraction of $10^{-3}$--$10^{-2}$. The ratio $R_{\mathrm{c}0}=Γ(X_2\toπ^+π^-χ_{cJ})/Γ(X_2\toπ^0π^0χ_{cJ}) \simeq 1.6$, which is a bit smaller than the expected value of $2$, and $R_{21}=Γ(X_2\toππχ_{c2})/Γ(X_2\toππχ_{c1}) \simeq 0.85$. These ratios are nearly independent of the $X_2$ mass and the coupling constants, which might make them good quantities for experiments. Moreover, the invariant mass spectra of the $π^0χ_{cJ}$ final state for the dipion processes are presented, showing a cusp structure at the $D {\bar D}^*$ threshold enhanced and narrowed by the nearby triangle singularity.
Submitted 3 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of semileptonic $D^{+}_s$ decays via $e^+e^-\to D_s^{*+}D_s^{*-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are ${\mathcal B}(D_s^+\to ηe^+ν_e)=(2.35\pm0.11_{\rm stat}\pm 0.10_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to η^\prime e^+ν_e)=(0.82\pm0.09_{\rm stat}\pm 0.04_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to φe^+ν_e)=(2.21\pm0.16_{\rm stat}\pm 0.11_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to f_0(980) e^+ν_e,f_0(980)\toπ^+π^-)=(0.15\pm0.02_{\rm stat}\pm 0.01_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to K^0 e^+ν_e)=(0.24\pm0.04_{\rm stat}\pm 0.01_{\rm syst})\%,$ and ${\mathcal B}(D_s^+\to K^{*0} e^+ν_e)=(0.19\pm0.03_{\rm stat}\pm 0.01_{\rm syst})\%.$ These results are consistent with those measured via the $e^+e^-\to D_s^{*\pm}D_s^{\mp}$ process by BESIII and CLEO. The hadronic transition form factors $D^+_s\to ηe^+ν_e$, $D^+_s\to η^\prime e^+ν_e$, and $D^+_s\to K^0 e^+ν_e$ at four-momentum transfer squared $q^2$ = 0 are determined to be $f^η_+(0) = 0.482 \pm 0.011_{\rm stat} \pm 0.009_{\rm syst}\pm0.004_{\rm input},$ $f^{η^{\prime}}_+(0) = 0.562 \pm 0.031_{\rm stat} \pm 0.014_{\rm syst}\pm0.003_{\rm input},$ and $f^{K^0}_+(0) = 0.624 \pm 0.052_{\rm stat} \pm 0.013_{\rm syst}\pm0.002_{\rm input}.$
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Measurement of Electron Antineutrino Oscillation Amplitude and Frequency via Neutron Capture on Hydrogen at Daya Bay
Authors:
Daya Bay collaboration,
F. P. An,
W. D. Bai,
A. B. Balantekin,
M. Bishai,
S. Blyth,
G. F. Cao,
J. Cao,
J. F. Chang,
Y. Chang,
H. S. Chen,
H. Y. Chen,
S. M. Chen,
Y. Chen,
Y. X. Chen,
Z. Y. Chen,
J. Cheng,
J. Cheng,
Y. -C. Cheng,
Z. K. Cheng,
J. J. Cherwinka,
M. C. Chu,
J. P. Cummings,
O. Dalager,
F. S. Deng
, et al. (177 additional authors not shown)
Abstract:
This Letter reports the first measurement of the oscillation amplitude and frequency of reactor antineutrinos at Daya Bay via neutron capture on hydrogen using 1958 days of data. With over 3.6 million signal candidates, an optimized candidate selection, improved treatment of backgrounds and efficiencies, refined energy calibration, and an energy response model for the capture-on-hydrogen sensitive region, the relative $\overline{ν}_{e}$ rates and energy spectra variation among the near and far detectors give $\mathrm{sin}^22θ_{13} = 0.0759_{-0.0049}^{+0.0050}$ and $Δm^2_{32} = (2.72^{+0.14}_{-0.15})\times10^{-3}$ eV$^2$ assuming the normal neutrino mass ordering, and $Δm^2_{32} = (-2.83^{+0.15}_{-0.14})\times10^{-3}$ eV$^2$ for the inverted neutrino mass ordering. This estimate of $\sin^2 2θ_{13}$ is consistent with and essentially independent of the one obtained using the capture-on-gadolinium sample at Daya Bay. The combination of these two results yields $\mathrm{sin}^22θ_{13}= 0.0833\pm0.0022$, which represents an 8% relative improvement in precision over the Daya Bay full 3158-day capture-on-gadolinium result.
Submitted 3 June, 2024;
originally announced June 2024.
-
Profiled Transfer Learning for High Dimensional Linear Model
Authors:
Ziqian Lin,
Junlong Zhao,
Fang Wang,
Hansheng Wang
Abstract:
We develop here a novel transfer learning methodology called Profiled Transfer Learning (PTL). The method is based on the \textit{approximate-linear} assumption between the source and target parameters. Compared with the commonly assumed \textit{vanishing-difference} assumption and \textit{low-rank} assumption in the literature, the \textit{approximate-linear} assumption is more flexible and less stringent. Specifically, the PTL estimator is constructed in two major steps. First, we regress the response on the transferred feature, leading to the profiled responses. Subsequently, we learn the regression relationship between the profiled responses and the covariates on the target data. The final estimator is then assembled based on the \textit{approximate-linear} relationship. To theoretically support the PTL estimator, we derive the non-asymptotic upper bound and minimax lower bound. We find that the PTL estimator is minimax optimal under appropriate regularity conditions. Extensive simulation studies are presented to demonstrate the finite sample performance of the new method. A real data example about sentence prediction is also presented with very encouraging results.
Submitted 5 June, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
High Performance Operation of a Direct-Current and Superconducting Radio-Frequency Combined Photocathode Gun
Authors:
H. Jia,
T. Li,
T. Wang,
Y. Zhao,
X. Zhang,
H. Xu,
Z. Liu,
J. Liu,
L. Lin,
H. Xie,
L. Feng,
F. Wang,
F. Zhu,
J. Hao,
S. Quan,
K. Liu,
S. Huang
Abstract:
Superconducting radio-frequency (SRF) guns are promising candidates to deliver high brightness continuous-wave (CW) electron beams for new generations of coherent linac light sources, ultrafast electron diffraction, MeV pulsed beam applications, etc. To solve the compatibility problem of semiconductor photocathodes, a hybrid gun combining a direct-current gap and an SRF cavity has been developed. The gun, employing K2CsSb photocathodes driven by a green laser, has been brought into stable CW operation with a dark current below 100 pA, delivering electron beams at an energy gain of 2.4 MeV, an electron bunch charge of 100 pC, and a repetition rate of 1 MHz. A normalized beam emittance of 0.54 mm-mrad has been achieved at a bunch charge of 100 pC and a peak current of about 6 A. CW operation at a repetition rate of 81.25 MHz has also been tested, with the maximum average beam current reaching 3 mA.
Submitted 2 June, 2024;
originally announced June 2024.
-
A Survey on Large Language Models for Code Generation
Authors:
Juyong Jiang,
Fan Wang,
Jiasi Shen,
Sungju Kim,
Sunghun Kim
Abstract:
Large Language Models (LLMs) have garnered remarkable advancements across diverse code-related tasks, known as Code LLMs, particularly in code generation that generates source code with LLM from natural language descriptions. This burgeoning field has captured significant interest from both academic researchers and industry professionals due to its practical significance in software development, e.g., GitHub Copilot. Despite the active exploration of LLMs for a variety of code tasks, either from the perspective of natural language processing (NLP) or software engineering (SE) or both, there is a noticeable absence of a comprehensive and up-to-date literature review dedicated to LLM for code generation. In this survey, we aim to bridge this gap by providing a systematic literature review that serves as a valuable reference for researchers investigating the cutting-edge progress in LLMs for code generation. We introduce a taxonomy to categorize and discuss the recent developments in LLMs for code generation, covering aspects such as data curation, latest advances, performance evaluation, and real-world applications. In addition, we present a historical overview of the evolution of LLMs for code generation and offer an empirical comparison using the widely recognized HumanEval and MBPP benchmarks to highlight the progressive enhancements in LLM capabilities for code generation. We identify critical challenges and promising opportunities regarding the gap between academia and practical development. Furthermore, we have established a dedicated resource website (https://codellm.github.io) to continuously document and disseminate the most recent advances in the field.
Submitted 1 June, 2024;
originally announced June 2024.
-
SCALM: Towards Semantic Caching for Automated Chat Services with Large Language Models
Authors:
Jiaxing Li,
Chi Xu,
Feng Wang,
Isaac M von Riedemann,
Cong Zhang,
Jiangchuan Liu
Abstract:
Large Language Models (LLMs) have become increasingly popular, transforming a wide range of applications across various domains. However, the real-world effectiveness of their query cache systems has not been thoroughly investigated. In this work, we conduct the first analysis of real-world human-to-LLM interaction data, identifying key challenges in existing caching solutions for LLM-based chat services. Our findings reveal that current caching methods fail to leverage semantic connections, leading to inefficient cache performance and extra token costs. To address these issues, we propose SCALM, a new cache architecture that emphasizes semantic analysis and identifies significant cache entries and patterns. We also detail the implementations of the corresponding cache storage and eviction strategies. Our evaluations show that SCALM increases cache hit ratios and reduces operational costs for LLMChat services. Compared with other state-of-the-art solutions in GPTCache, SCALM shows, on average, a relative increase of 63% in cache hit ratio and a relative improvement of 77% in token savings.
Submitted 24 May, 2024;
originally announced June 2024.
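To make the idea of semantic caching in the abstract above concrete, here is a minimal sketch of a cache that matches queries by similarity instead of exact string equality. It is not SCALM's architecture: the bag-of-words `embed` function is a toy stand-in for a real sentence encoder, and the `SemanticCache` class, its `threshold` parameter, and the absence of any eviction strategy are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (a stand-in for a real sentence encoder)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


class SemanticCache:
    """Hypothetical cache that returns a stored response for similar queries."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # minimum similarity counted as a hit
        self.entries = []           # list of (embedding, response) pairs
        self.hits = 0
        self.misses = 0

    def get(self, query):
        q = embed(query)
        best_score, best_response = 0.0, None
        for emb, response in self.entries:
            score = cosine(q, emb)
            if score > best_score:
                best_score, best_response = score, response
        if best_response is not None and best_score >= self.threshold:
            self.hits += 1
            return best_response
        self.misses += 1
        return None

    def put(self, query, response):
        self.entries.append((embed(query), response))
```

With this sketch, a query such as "how do i reset my password please" would hit an entry stored under "how do i reset my password", whereas an exact-match cache would miss. A production system would also need an eviction policy and a far better embedding, both of which are out of scope here.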
-
The First Billion Years, According to JWST
Authors:
Angela Adamo,
Hakim Atek,
Micaela B. Bagley,
Eduardo Bañados,
Kirk S. S. Barrow,
Danielle A. Berg,
Rachel Bezanson,
Maruša Bradač,
Gabriel Brammer,
Adam C. Carnall,
John Chisholm,
Dan Coe,
Pratika Dayal,
Daniel J. Eisenstein,
Jan J. Eldridge,
Andrea Ferrara,
Seiji Fujimoto,
Anna de Graaff,
Melanie Habouzit,
Taylor A. Hutchison,
Jeyhan S. Kartaltepe,
Susan A. Kassin,
Mariska Kriek,
Ivo Labbé,
Roberto Maiolino
, et al. (24 additional authors not shown)
Abstract:
With stunning clarity, JWST has revealed the Universe's first billion years. The scientific community is analyzing a wealth of JWST imaging and spectroscopic data from that era, and is in the process of rewriting the astronomy textbooks. Here, 1.5 years into the JWST science mission, we provide a snapshot of the great progress made towards understanding the initial chapters of our cosmic history. We highlight discoveries and breakthroughs, topics and issues that are not yet understood, and questions that will be addressed in the coming years, as JWST continues its revolutionary observations of the Early Universe. While this compendium is written by a small number of authors, invited to ISSI Bern in March 2024 as part of the 2024 ISSI Breakthrough Workshop, we acknowledge the work of a large community that is advancing our collective understanding of the evolution of the Early Universe.
Submitted 31 May, 2024;
originally announced May 2024.
-
Search for $e^{+}e^{-}\toη'ψ(2S)$ at center-of-mass energies from 4.66 to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using data samples with an integrated luminosity of $4.67~\mathrm{fb}^{-1}$ collected by the BESIII detector operating at the BEPCII collider, we search for the process $e^+e^- \rightarrow η' ψ(2S)$ at center-of-mass energies from $4.66$ to $4.95~\mathrm{GeV}$. No significant signal is observed, and upper limits for the Born cross sections $σ^B(e^+e^-\rightarrowη'ψ(2S))$ at the 90\% confidence level are determined.
Submitted 31 May, 2024;
originally announced May 2024.
-
Study of the decays $χ_{cJ} \rightarrow Λ\barΛφ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (637 additional authors not shown)
Abstract:
Based on $(2712.4 \pm 14.3) \times 10^{6}$ $ e^{+}e^{-}\toψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, we report the first evidence of $χ_{c0}\to Λ\bar Λφ$ decays and the first observation of $χ_{c1,2}\to Λ\bar Λφ$ decays, with significances of $4.5σ$, $11.3σ$ and $13.0σ$, respectively. The decay branching fractions of $χ_{c0,1,2}\to Λ\bar Λφ$ are measured to be $( 2.99\pm1.24\pm0.19) \times 10^{-5}$, $(6.01\pm0.90\pm0.40 )\times 10^{-5}$, and $(7.13\pm0.81\pm0.36) \times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No obvious enhancement near the $Λ\barΛ$ production threshold or excited $Λ$ state is found in the $Λφ$ (or $\barΛφ$) system.
Submitted 31 May, 2024;
originally announced May 2024.
-
Faces of the Mind: Unveiling Mental Health States Through Facial Expressions in 11,427 Adolescents
Authors:
Xiao Xu,
Keyin Zhou,
Yan Zhang,
Yang Wang,
Fei Wang,
Xizhe Zhang
Abstract:
Mood disorders, including depression and anxiety, often manifest through facial expressions. While previous research has explored the connection between facial features and emotions, machine learning algorithms for estimating mood disorder severity have been hindered by small datasets and limited real-world application. To address this gap, we analyzed facial videos of 11,427 participants, a dataset two orders of magnitude larger than previous studies. This comprehensive collection includes standardized facial expression videos from reading tasks, along with a detailed psychological scale that measures depression, anxiety, and stress. By examining the relationships among these emotional states and employing clustering analysis, we identified distinct subgroups embodying different emotional profiles. We then trained tree-based classifiers and deep learning models to estimate emotional states from facial features. Results indicate that models previously effective on small datasets experienced decreased performance when applied to our large dataset, highlighting the importance of data scale and mitigating overfitting in practical settings. Notably, our study identified subtle shifts in pupil dynamics and gaze orientation as potential markers of mood disorders, providing valuable information on the interaction between facial expressions and mental health. This research marks the first large-scale and comprehensive investigation of facial expressions in the context of mental health, laying the groundwork for future data-driven advancements in this field.
Submitted 30 May, 2024;
originally announced May 2024.
-
Uniform Inviscid Damping and Inviscid Limit of the 2D Navier-Stokes equation with Navier Boundary Conditions
Authors:
Jacob Bedrossian,
Siming He,
Sameer Iyer,
Fei Wang
Abstract:
We consider the 2D, incompressible Navier-Stokes equations near the Couette flow, $ω^{(NS)} = 1 + εω$, set on the channel $\mathbb{T} \times [-1, 1]$, supplemented with Navier boundary conditions on the perturbation, $ω|_{y = \pm 1} = 0$. We are simultaneously interested in two asymptotic regimes that are classical in hydrodynamic stability: the long time, $t \rightarrow \infty$, stability of background shear flows, and the inviscid limit, $ν\rightarrow 0$ in the presence of boundaries. Given small ($ε\ll 1$, but independent of $ν$) Gevrey 2- datum, $ω_0^{(ν)}(x, y)$, that is supported away from the boundaries $y = \pm 1$, we prove the following results: \begin{align*} & \|ω^{(ν)}(t) - \frac{1}{2π}\int ω^{(ν)}(t) dx \|_{L^2} \lesssim εe^{-δν^{1/3} t}, & \text{(Enhanced Dissipation)} \\ & \langle t \rangle \|u_1^{(ν)}(t) - \frac{1}{2π} \int u_1^{(ν)}(t) dx\|_{L^2} + \langle t \rangle^2 \|u_2^{(ν)}(t)\|_{L^2} \lesssim εe^{-δν^{1/3} t}, & \text{(Inviscid Damping)} \\ &\| ω^{(ν)} - ω^{(0)} \|_{L^\infty} \lesssim ενt^{3+η}, \quad\quad t \lesssim ν^{-1/(3+η)} & \text{(Long-time Inviscid Limit)} \end{align*} This is the first nonlinear asymptotic stability result of its type, which combines three important physical phenomena at the nonlinear level: inviscid damping, enhanced dissipation, and long-time inviscid limit in the presence of boundaries. The techniques we develop represent a major departure from prior works on nonlinear inviscid damping as physical space techniques necessarily play a central role. In this paper, we focus on the primary nonlinear result, while tools for handling the linearized parabolic and elliptic equations are developed in our separate, companion work.
Submitted 29 May, 2024;
originally announced May 2024.
-
Pseudo-Gevrey Smoothing for the Passive Scalar Equations near Couette
Authors:
Jacob Bedrossian,
Siming He,
Sameer Iyer,
Fei Wang
Abstract:
In this article, we study the regularity theory for two linear equations that are important in fluid dynamics: the passive scalar equation for (time-varying) shear flows close to Couette in $\mathbb T \times [-1,1]$ with vanishing diffusivity $ν\to 0$ and the Poisson equation with right-hand side behaving in similar function spaces to such a passive scalar. The primary motivation for this work is to develop some of the main technical tools required for our treatment of the (nonlinear) 2D Navier-Stokes equations, carried out in our companion work. Both equations are studied with homogeneous Dirichlet conditions (the analogue of a Navier slip-type boundary condition) and the initial condition is taken to be compactly supported away from the walls. We develop smoothing estimates with the following three features:
[1] Uniform-in-$ν$ regularity is with respect to $\partial_x$ and a time-dependent adapted vector-field $Γ$ which approximately commutes with the passive scalar equation (as opposed to `flat' derivatives), and a scaled gradient $\sqrtν \nabla$;
[2] $(\partial_x, Γ)$-regularity estimates are performed in Gevrey spaces with regularity that depends on the spatial coordinate, $y$ (what we refer to as `pseudo-Gevrey');
[3] The regularity of these pseudo-Gevrey spaces degenerates to finite regularity near the center of the channel and hence standard Gevrey product rules and other amenable properties do not hold.
Nonlinear analysis in such a delicate functional setting is one of the key ingredients of our companion paper, \cite{BHIW24a}, which proves the full nonlinear asymptotic stability of the Couette flow with slip boundary conditions. The present article introduces new estimates for the associated linear problems in these degenerate pseudo-Gevrey spaces, which are of independent interest.
Submitted 29 May, 2024;
originally announced May 2024.
-
Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization
Authors:
Zhiwei Tang,
Jiangweizhi Peng,
Jiasheng Tang,
Mingyi Hong,
Fan Wang,
Tsung-Hui Chang
Abstract:
In this work, we focus on the alignment problem of diffusion models with a continuous reward function, which represents specific objectives for downstream tasks, such as improving human preference. The central goal of the alignment problem is to adjust the distribution learned by diffusion models such that the generated samples maximize the target reward function. We propose a novel alignment approach, named Direct Noise Optimization (DNO), that optimizes the injected noise during the sampling process of diffusion models. By design, DNO is tuning-free and prompt-agnostic, as the alignment occurs in an online fashion during generation. We rigorously study the theoretical properties of DNO and also propose variants to deal with non-differentiable reward functions. Furthermore, we identify that naive implementation of DNO occasionally suffers from the out-of-distribution reward hacking problem, where optimized samples have high rewards but are no longer in the support of the pretrained distribution. To remedy this issue, we leverage classical high-dimensional statistics theory and propose to augment the DNO loss with certain probability regularization. We conduct extensive experiments on several popular reward functions trained on human feedback data and demonstrate that the proposed DNO approach achieves state-of-the-art reward scores as well as high image quality, all within a reasonable time budget for generation.
Submitted 29 May, 2024;
originally announced May 2024.
-
Phased Consistency Model
Authors:
Fu-Yun Wang,
Zhaoyang Huang,
Alexander William Bergman,
Dazhong Shen,
Peng Gao,
Michael Lingelbach,
Keqiang Sun,
Weikang Bian,
Guanglu Song,
Yu Liu,
Hongsheng Li,
Xiaogang Wang
Abstract:
The consistency model (CM) has recently made significant progress in accelerating the generation of diffusion models. However, its application to high-resolution, text-conditioned image generation in the latent space (a.k.a., LCM) remains unsatisfactory. In this paper, we identify three key flaws in the current design of LCM. We investigate the reasons behind these limitations and propose the Phased Consistency Model (PCM), which generalizes the design space and addresses all identified limitations. Our evaluations demonstrate that PCM significantly outperforms LCM across 1--16 step generation settings. While PCM is specifically designed for multi-step refinement, its 1-step generation results are comparable or even superior to those of previous state-of-the-art methods designed specifically for 1-step generation. Furthermore, we show that PCM's methodology is versatile and applicable to video generation, enabling us to train the state-of-the-art few-step text-to-video generator. More details are available at https://g-u-n.github.io/projects/pcm/.
Submitted 28 May, 2024;
originally announced May 2024.
-
Yuan 2.0-M32: Mixture of Experts with Attention Router
Authors:
Shaohua Wu,
Jiangang Luo,
Xi Chen,
Lingjun Li,
Xudong Zhao,
Tong Yu,
Chao Wang,
Yue Wang,
Fei Wang,
Weixu Qiao,
Houbo He,
Zeru Zhang,
Zeyu Sun,
Junxiong Mao,
Chong Shen
Abstract:
Yuan 2.0-M32, with a base architecture similar to that of Yuan-2.0 2B, uses a mixture-of-experts architecture with 32 experts, of which 2 are active. A new router network, Attention Router, is proposed and adopted for a more efficient selection of experts, which improves the accuracy compared to the model with a classical router network. Yuan 2.0-M32 is trained with 2000B tokens from scratch, and the training computation consumption is only 9.25% of that of a dense model at the same parameter scale. Yuan 2.0-M32 demonstrates competitive capability on coding, math, and various domains of expertise, with only 3.7B active parameters of 40B in total, and 7.4 GFlops forward computation per token, both of which are only 1/19 of Llama3-70B. Yuan 2.0-M32 surpasses Llama3-70B on the MATH and ARC-Challenge benchmarks, with accuracies of 55.89 and 95.8, respectively. The models and source code of Yuan 2.0-M32 are released on GitHub.
Submitted 29 May, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
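The abstract above describes a mixture-of-experts model in which only 2 of 32 experts are active per token. The sketch below illustrates generic top-k gating with a softmax over router logits, which is the classical scheme; it does not implement the Attention Router proposed in the paper. The function names, the scalar "experts", and the example logits are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(logits, k=2):
    """Pick the k experts with the highest router probability.

    Returns (expert_index, weight) pairs; weights are renormalized
    so that they sum to 1 over the selected experts.
    """
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, logits, k=2):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in top_k_route(logits, k))


# Toy setup: four scalar "experts" that scale their input (illustration only).
toy_experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
toy_logits = [0.0, 2.0, 1.0, -1.0]
```

With `toy_logits`, experts 1 and 2 are selected and the other two are never evaluated, which is where the compute savings of sparse activation come from. In a real model the logits would be produced per token by a learned router, and the experts would be feed-forward networks acting on vectors.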
-
Towards Clinical AI Fairness: Filling Gaps in the Puzzle
Authors:
Mingxuan Liu,
Yilin Ning,
Salinelat Teixayavong,
Xiaoxuan Liu,
Mayli Mertens,
Yuqing Shang,
Xin Li,
Di Miao,
Jie Xu,
Daniel Shu Wei Ting,
Lionel Tim-Ee Cheng,
Jasmine Chiat Ling Ong,
Zhen Ling Teo,
Ting Fang Tan,
Narrendar RaviChandran,
Fei Wang,
Leo Anthony Celi,
Marcus Eng Hock Ong,
Nan Liu
Abstract:
The ethical integration of Artificial Intelligence (AI) in healthcare necessitates addressing fairness, a concept that is highly context-specific across medical fields. Extensive studies have been conducted to expand the technical components of AI fairness, while numerous calls for AI fairness have been raised from healthcare. Despite this, a significant disconnect persists between technical advancements and their practical clinical applications, resulting in a lack of contextualized discussion of AI fairness in clinical settings. Through a detailed evidence gap analysis, our review systematically pinpoints several deficiencies concerning both healthcare data and the provided AI fairness solutions. We highlight the scarcity of research on AI fairness in many medical domains where AI technology is increasingly utilized. Additionally, our analysis highlights a substantial reliance on group fairness, aiming to ensure equality among demographic groups from a macro healthcare system perspective; in contrast, individual fairness, focusing on equity at a more granular level, is frequently overlooked. To bridge these gaps, our review advances actionable strategies for both the healthcare and AI research communities. Beyond applying existing AI fairness methods in healthcare, we further emphasize the importance of involving healthcare professionals to refine AI fairness concepts and methods to ensure contextually relevant and ethically sound AI applications in healthcare.
Submitted 28 May, 2024;
originally announced May 2024.
-
Enhancing Global Sensitivity and Uncertainty Quantification in Medical Image Reconstruction with Monte Carlo Arbitrary-Masked Mamba
Authors:
Jiahao Huang,
Liutao Yang,
Fanwen Wang,
Yang Nan,
Weiwen Wu,
Chengyan Wang,
Kuangyu Shi,
Angelica I. Aviles-Rivero,
Carola-Bibiane Schönlieb,
Daoqiang Zhang,
Guang Yang
Abstract:
Deep learning has been extensively applied in medical image reconstruction, where Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) represent the predominant paradigms, each possessing distinct advantages and inherent limitations: CNNs exhibit linear complexity with local sensitivity, whereas ViTs demonstrate quadratic complexity with global sensitivity. The emerging Mamba has shown superiority in learning visual representations, combining the advantages of linear scalability and global sensitivity. In this study, we introduce MambaMIR, an Arbitrary-Masked Mamba-based model with wavelet decomposition for joint medical image reconstruction and uncertainty estimation. A novel Arbitrary Scan Masking (ASM) mechanism "masks out" redundant information to introduce randomness for further uncertainty estimation. Compared to the commonly used Monte Carlo (MC) dropout, our proposed MC-ASM provides an uncertainty map without the need for hyperparameter tuning and mitigates the performance drop typically observed when applying dropout to low-level tasks. For further texture preservation and better perceptual quality, we incorporate the wavelet transformation into MambaMIR and explore its variant based on the Generative Adversarial Network, namely MambaMIR-GAN. Comprehensive experiments have been conducted for multiple representative medical image reconstruction tasks, demonstrating that the proposed MambaMIR and MambaMIR-GAN outperform other baseline and state-of-the-art methods in different reconstruction tasks, where MambaMIR achieves the best reconstruction fidelity and MambaMIR-GAN has the best perceptual quality. In addition, our MC-ASM provides uncertainty maps as an additional tool for clinicians, while mitigating the typical performance drop caused by the commonly used dropout.
Submitted 25 June, 2024; v1 submitted 27 May, 2024;
originally announced May 2024.
-
On The Implicit Large Eddy Simulation of Turbomachinery Flows Using The Flux Reconstruction Method
Authors:
Feng Wang
Abstract:
A high order flux reconstruction solver has been developed and validated to perform implicit large eddy simulations of industrially representative turbomachinery flows. The T106c low-pressure turbine and VKI LS89 high-pressure turbine cases are studied. The solver uses the Rusanov Riemann solver to compute the inviscid fluxes on the wall boundaries, and HLLC or Roe to evaluate inviscid fluxes for internal faces. The impact of Riemann solvers is demonstrated in terms of accuracy and non-linear stability for turbomachinery flows. It is found that HLLC is more robust than Roe, but both Riemann solvers produce very similar results if stable solutions can be obtained. For non-linear stabilization, a local modal filter, which combines a smooth indicator and a modal filter, is used to stabilize the solution. This approach requires a tuning parameter for the smoothness criterion. Detailed analysis has been provided to guide the selection of a suitable value for different spatial orders of accuracy. This local-modal filter is also compared with the recent positivity-preserving entropy filter in terms of accuracy and stability for the LS89 turbine case. The entropy filter could stabilize the computation but is more dissipative than the local modal filter. Regarding the spanwise spacing of the grid, the case of the LS89 turbine shows that a $z^+$ of approximately $45 - 60$ is suitable for obtaining a satisfactory prediction of the heat transfer coefficient of the mean flow. This would allow for a coarse grid spacing in the spanwise direction and a cost-effective ILES aerothermal simulation for turbomachinery flows.
Submitted 27 May, 2024;
originally announced May 2024.
-
Benchmarking General-Purpose In-Context Learning
Authors:
Fan Wang,
Chuan Lin,
Yang Cao,
Yu Kang
Abstract:
In-context learning (ICL) empowers generative models to address new tasks effectively and efficiently on the fly, without relying on any artificially crafted optimization techniques. In this paper, we study extending ICL to address a broader range of tasks with an extended learning horizon and higher improvement potential, namely General-Purpose In-Context Learning (GPICL). To this end, we introduce two lightweight benchmarks specifically crafted to train and evaluate GPICL functionalities. Each benchmark encompasses a vast number of tasks characterized by significant task variance, facilitating meta-training that minimizes inductive bias. These tasks are also crafted to promote long-horizon in-context learning through continuous generation and interaction. These characteristics require models to leverage context and interaction history to enhance their capabilities across domains such as language modeling, decision-making, and world modeling. Our experiments on the baseline models demonstrate that meta-training with minimal inductive bias and ICL from the ground up is feasible across all the domains we have discussed. Additionally, our findings indicate that the scale of parameters alone may not be crucial for ICL or GPICL, suggesting alternative approaches such as increasing the scale of contexts and memory states.
Submitted 26 June, 2024; v1 submitted 27 May, 2024;
originally announced May 2024.
-
PatchScaler: An Efficient Patch-Independent Diffusion Model for Super-Resolution
Authors:
Yong Liu,
Hang Dong,
Jinshan Pan,
Qingji Dong,
Kai Chen,
Rongxiang Zhang,
Lean Fu,
Fei Wang
Abstract:
Diffusion models significantly improve the quality of super-resolved images with their impressive content generation capabilities. However, the huge computational costs limit the applications of these methods. Recent efforts have explored reasonable inference acceleration to reduce the number of sampling steps, but the computational cost remains high as each step is performed on the entire image. This paper introduces PatchScaler, a patch-independent diffusion-based single image super-resolution (SR) method, designed to enhance the efficiency of the inference process. The proposed method is motivated by the observation that not all the image patches within an image need the same sampling steps for reconstructing high-resolution images. Based on this observation, we thus develop a Patch-adaptive Group Sampling (PGS) to divide feature patches into different groups according to the patch-level reconstruction difficulty and dynamically assign an appropriate sampling configuration for each group so that the inference speed can be better accelerated. In addition, to improve the denoising ability at each step of the sampling, we develop a texture prompt to guide the estimations of the diffusion model by retrieving high-quality texture priors from a patch-independent reference texture memory. Experiments show that our PatchScaler achieves favorable performance in both quantitative and qualitative evaluations with fast inference speed. Our code and model are available at https://github.com/yongliuy/PatchScaler.
Submitted 11 June, 2024; v1 submitted 27 May, 2024;
originally announced May 2024.
-
Standardizing the Gamma-ray burst as a standard candle and applying to the cosmological probes: constraints on the two-component dark energy model
Authors:
Jia-Lun Li,
Yu-Peng Yang,
Shuang-Xi Yi,
Jian-Ping Hu,
Yan-Kun Qu,
Fa-Yin Wang
Abstract:
As one of the most energetic and brightest events, gamma-ray bursts (GRBs) have been used as standard candles for cosmological probes. Based on the relevant features of GRB light curves, namely a plateau phase followed by a decay phase, we obtain X-ray samples of 31 GRBs and optical samples of 50 GRBs, which are thought to be caused by the same physical mechanism. We standardize GRBs using the two-dimensional fundamental plane relation between the rest-frame luminosity of the plateau emission ($L_{b,z}$) and the end time of the plateau ($T_{b,z}$), $L_{b,z}-T_{b,z}$, as well as the three-dimensional fundamental plane correlation including the peak energy ($E_{p,i}$), $L_{b,z}-T_{b,z}-E_{p,i}$. For the cosmological probes, we consider the $ω$CDM model, in which the dark energy consists of one component, and mainly focus on the $X_1X_2$CDM model, in which the dark energy is made up of two independent components. We obtain constraints on the related parameters of the cosmological models using the type Ia supernovae (SNe Ia) data and the selected X-ray and optical samples. For the $X_1X_2$CDM model, we find that the values of the equation-of-state parameters of the two dark energies, $ω_1$ and $ω_2$, are very close. We also compare the models using the Bayesian information criterion and find that the $ω$CDM model is favoured.
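For reference, the Bayesian information criterion used for this kind of model comparison has the standard form below. The symbols $k$, $N$, and $\mathcal{L}_{\max}$ are notation introduced here for illustration and are not taken from the abstract:

$$\mathrm{BIC} = k \ln N - 2 \ln \mathcal{L}_{\max}$$

where $k$ is the number of free parameters, $N$ is the number of data points, and $\mathcal{L}_{\max}$ is the maximum likelihood of the fit. The model with the lower BIC is preferred, so a model with more free parameters is favoured only if its improvement in $\ln \mathcal{L}_{\max}$ outweighs the $k \ln N$ penalty.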
Submitted 27 May, 2024;
originally announced May 2024.
-
A Study on Magnetic-sensitivity Wavelength Position of the Working Line Used by the Full-Disk Magnetograph onboard the Advanced Space based Solar Observatory (ASO-S/FMG)
Authors:
S. Liu,
J. T. Su,
X. Y. Bai,
Y. Y. Deng,
J. Chen,
Y. L. Song,
X. F. Wang,
H. Q. Xu,
X. Yang,
Shahid Idrees
Abstract:
Utilizing data from the Solar Magnetism and Activity Telescope (SMAT), analytical solutions of the polarized radiative transfer equations, and in-orbit test data from the Full-disk Magnetograph (FMG) onboard the Advanced Space based Solar Observatory (ASO-S), this study identifies the magnetically sensitive spectral positions of the Fe I $λ$5234.19 Å working line used by FMG. From the experimental data of SMAT, it is found that the most sensitive position is located at the line center for linear polarization (Stokes-Q/U), while it is about -0.07 Å away from the line center for circular polarization (Stokes-V). Moreover, both the theoretical analysis and the in-orbit test data analysis of FMG confirm the above results. Additionally, the theoretical analysis suggests the presence of distinct spectral pockets (centered at 0.08-0.15 Å) from the line, harboring intense magnetic sensitivity across all three Stokes parameters. Striking a balance between high sensitivity for both linear and circular polarization while capturing additional valuable information, a spectral position of -0.08 Å emerges as the preferred choice for routine FMG magnetic-field observations.
Submitted 26 May, 2024;
originally announced May 2024.
-
Augmented Risk Prediction for the Onset of Alzheimer's Disease from Electronic Health Records with Large Language Models
Authors:
Jiankun Wang,
Sumyeong Ahn,
Taykhoom Dalal,
Xiaodan Zhang,
Weishen Pan,
Qiannan Zhang,
Bin Chen,
Hiroko H. Dodge,
Fei Wang,
Jiayu Zhou
Abstract:
Alzheimer's disease (AD) is the fifth-leading cause of death among Americans aged 65 and older. Screening and early detection of AD and related dementias (ADRD) are critical for timely intervention and for identifying clinical trial participants. The widespread adoption of electronic health records (EHRs) offers an important resource for developing ADRD screening tools such as machine learning based predictive models. Recent advancements in large language models (LLMs) demonstrate their unprecedented capability of encoding knowledge and performing reasoning, which offers them strong potential for enhancing risk prediction. This paper proposes a novel pipeline that augments risk prediction by leveraging the few-shot inference power of LLMs to make predictions on cases where traditional supervised learning methods (SLs) may not excel. Specifically, we develop a collaborative pipeline that combines SLs and LLMs via a confidence-driven decision-making mechanism, leveraging the strengths of SLs in clear-cut cases and LLMs in more complex scenarios. We evaluate this pipeline using a real-world EHR data warehouse from Oregon Health & Science University (OHSU) Hospital, encompassing EHRs from over 2.5 million patients and more than 20 million patient encounters. Our results show that our proposed approach effectively combines the power of SLs and LLMs, offering significant improvements in predictive performance. This advancement holds promise for revolutionizing ADRD screening and early detection practices, with potential implications for better strategies of patient management and thus improving healthcare.
Submitted 25 May, 2024;
originally announced May 2024.
-
Diffusion-Reward Adversarial Imitation Learning
Authors:
Chun-Mao Lai,
Hsiang-Chun Wang,
Ping-Chun Hsieh,
Yu-Chiang Frank Wang,
Min-Hung Chen,
Shao-Hua Sun
Abstract:
Imitation learning aims to learn a policy from observing expert demonstrations without access to reward signals from environments. Generative adversarial imitation learning (GAIL) formulates imitation learning as adversarial learning, employing a generator policy learning to imitate expert behaviors and discriminator learning to distinguish the expert demonstrations from agent trajectories. Despite its encouraging results, GAIL training is often brittle and unstable. Inspired by the recent dominance of diffusion models in generative modeling, this work proposes Diffusion-Reward Adversarial Imitation Learning (DRAIL), which integrates a diffusion model into GAIL, aiming to yield more precise and smoother rewards for policy learning. Specifically, we propose a diffusion discriminative classifier to construct an enhanced discriminator; then, we design diffusion rewards based on the classifier's output for policy learning. We conduct extensive experiments in navigation, manipulation, and locomotion, verifying DRAIL's effectiveness compared to prior imitation learning methods. Moreover, additional experimental results demonstrate the generalizability and data efficiency of DRAIL. Visualized learned reward functions of GAIL and DRAIL suggest that DRAIL can produce more precise and smoother rewards.
Submitted 25 May, 2024;
originally announced May 2024.
-
Rethinking Early-Fusion Strategies for Improved Multispectral Object Detection
Authors:
Xue Zhang,
Si-Yuan Cao,
Fang Wang,
Runmin Zhang,
Zhe Wu,
Xiaohan Zhang,
Xiaokai Bai,
Hui-Liang Shen
Abstract:
Most recent multispectral object detectors employ a two-branch structure to extract features from RGB and thermal images. While the two-branch structure achieves better performance than a single-branch structure, it overlooks inference efficiency. This conflict is increasingly acute, as recent works solely pursue higher performance rather than both performance and efficiency. In this paper, we address this issue by improving the performance of efficient single-branch structures. We revisit the reasons causing the performance gap between these structures. For the first time, we reveal the information interference problem in the naive early-fusion strategy adopted by previous single-branch structures. In addition, we find that the domain gap between multispectral images and the weak feature representation of the single-branch structure are also key obstacles to performance. Focusing on these three problems, we propose corresponding solutions, including a novel shape-priority early-fusion strategy, a weakly supervised learning method, and a core knowledge distillation technique. Experiments demonstrate that single-branch networks equipped with these three contributions achieve significant performance enhancements while retaining high efficiency. Our code will be available at https://github.com/XueZ-phd/Efficient-RGB-T-Early-Fusion-Detection.
Submitted 24 May, 2024;
originally announced May 2024.
-
Sequence Length Scaling in Vision Transformers for Scientific Images on Frontier
Authors:
Aristeidis Tsaris,
Chengming Zhang,
Xiao Wang,
Junqi Yin,
Siyan Liu,
Moetasim Ashfaq,
Ming Fan,
Jong Youl Choi,
Mohamed Wahib,
Dan Lu,
Prasanna Balaprakash,
Feiyi Wang
Abstract:
Vision Transformers (ViTs) are pivotal for foundational models in scientific imagery, including Earth science applications, due to their capability to process large sequence lengths. While transformers for text have inspired scaling sequence lengths in ViTs, adapting these techniques for ViTs introduces unique challenges. We develop distributed sequence parallelism for ViTs, enabling them to handle up to 1M tokens. Our approach, leveraging DeepSpeed-Ulysses and Long-Sequence-Segmentation with model sharding, is the first to apply sequence parallelism in ViT training, achieving a 94% batch scaling efficiency on 2,048 AMD-MI250X GPUs. Evaluating sequence parallelism in ViTs, particularly in models up to 10B parameters, highlighted substantial bottlenecks. We countered these with hybrid sequence, pipeline, and tensor parallelism, together with flash attention strategies, to scale beyond single GPU memory limits. Our method significantly enhances climate modeling accuracy by 20% in temperature predictions, marking the first training of a transformer model on a full-attention matrix over 188K sequence length.
Submitted 17 April, 2024;
originally announced May 2024.
-
Self-distilled Dynamic Fusion Network for Language-based Fashion Retrieval
Authors:
Yiming Wu,
Hangfei Li,
Fangfang Wang,
Yilong Zhang,
Ronghua Liang
Abstract:
In the domain of language-based fashion image retrieval, pinpointing the desired fashion item using both a reference image and its accompanying textual description is an intriguing challenge. Existing approaches lean heavily on static fusion techniques, intertwining image and text. Despite their commendable advancements, these approaches are still limited by a lack of flexibility. In response, we propose a Self-distilled Dynamic Fusion Network to compose the multi-granularity features dynamically by considering the consistency of routing path and modality-specific information simultaneously. Two new modules are included in our proposed method: (1) Dynamic Fusion Network with Modality Specific Routers. The dynamic network enables a flexible determination of the routing for each reference image and modification text, taking into account their distinct semantics and distributions. (2) Self Path Distillation Loss. A stable path decision for queries benefits the optimization of feature extraction as well as routing, and we approach this by progressively refining the path decision with previous path information. Extensive experiments demonstrate the effectiveness of our proposed model compared to existing methods.
Submitted 24 May, 2024;
originally announced May 2024.
-
High-field magnetoelectric coupling and successive magnetic transitions in Mn-doped polar antiferromagnet Ni3TeO6
Authors:
J. H. Zhang,
L. Lin,
C. Dong,
Y. T. Chang,
J. F. Wang,
C. L. Lu,
P. Z. Chen,
W. J. Zhai,
G. Z. Zhou,
L. Huang,
Y. S. Tang,
S. H. Zheng,
M. F. Liu,
X. H. Zhou,
Z. B. Yan,
J. -M. Liu
Abstract:
Among the 3d transition metal ion doped variants of polar Ni3TeO6, Mn-doped Ni3TeO6 has stimulated great interest due to its high magnetic ordering temperature and complex magnetic phases, but the mechanism of magnetoelectric (ME) coupling is far from understood. Herein we report our systematic investigation of the chemical control of magnetism, metamagnetic transition, and ME properties of Ni3-xMnxTeO6 single crystals in high magnetic field (H) up to 52 T. We present a previously unreported weak ferromagnetic behavior appearing in the ab plane below 9.5 K in addition to the incommensurate helical and commensurate collinear antiferromagnetic states. In the low-field region, a spin-flop type metamagnetic transition without any hysteresis occurs at Hc1 for H // c, while another metamagnetic transition accompanied by a change in electric polarization is observed at Hc2 in the high-field region both for H // c and H // ab above 30 K, which can be attributed to the sudden rotation of magnetic moments at Ni2 sites. The ME measurements reveal that a first-order ME effect is observed in the low-T and low-H regions, while a second-order ME coupling term appears above 30 K in the magnetic field range of Hc1 < H < Hc2 for H // c and H < Hc2 for H // ab, both becoming significant with increasing temperature. Eventually, they are dominated by the second-order ME effect near the antiferromagnetic transition temperature. The present work demonstrates that Ni3-xMnxTeO6 is an exotic magnetoelectric material compared with Ni3TeO6 and its derivatives, thereby providing insights to better understand the magnetism and ME coupling in Ni3TeO6 and its derivatives.
Submitted 29 May, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
3D Unsupervised Learning by Distilling 2D Open-Vocabulary Segmentation Models for Autonomous Driving
Authors:
Boyi Sun,
Yuhang Liu,
Xingxia Wang,
Bin Tian,
Long Chen,
Fei-Yue Wang
Abstract:
Point cloud data labeling is considered a time-consuming and expensive task in autonomous driving, whereas unsupervised learning can avoid it by learning point cloud representations from unannotated data. In this paper, we propose UOV, a novel 3D Unsupervised framework assisted by 2D Open-Vocabulary segmentation models. It consists of two stages: In the first stage, we innovatively integrate high-quality textual and image features of 2D open-vocabulary models and propose the Tri-Modal contrastive Pre-training (TMP). In the second stage, spatial mapping between point clouds and images is utilized to generate pseudo-labels, enabling cross-modal knowledge distillation. Besides, we introduce the Approximate Flat Interaction (AFI) to address the noise during alignment and label confusion. To validate the superiority of UOV, extensive experiments are conducted on multiple related datasets. We achieved a record-breaking 47.73% mIoU on the annotation-free point cloud segmentation task in nuScenes, surpassing the previous best model by 10.70% mIoU. Meanwhile, the performance of fine-tuning with 1% data on nuScenes and SemanticKITTI reached a remarkable 51.75% mIoU and 48.14% mIoU, outperforming all previous pre-trained models.
Submitted 24 May, 2024;
originally announced May 2024.
-
Talk to Parallel LiDARs: A Human-LiDAR Interaction Method Based on 3D Visual Grounding
Authors:
Yuhang Liu,
Boyi Sun,
Guixu Zheng,
Yishuo Wang,
Jing Wang,
Fei-Yue Wang
Abstract:
LiDAR sensors play a crucial role in various applications, especially in autonomous driving. Current research primarily focuses on optimizing perceptual models with point cloud data as input, while the exploration of deeper cognitive intelligence remains relatively limited. To address this challenge, parallel LiDARs have emerged as a novel theoretical framework for the next-generation intelligent LiDAR systems, which tightly integrate physical, digital, and social systems. To endow LiDAR systems with cognitive capabilities, we introduce the 3D visual grounding task into parallel LiDARs and present a novel human-computer interaction paradigm for LiDAR systems. We propose Talk2LiDAR, a large-scale benchmark dataset tailored for 3D visual grounding in autonomous driving. Additionally, we present a two-stage baseline approach and an efficient one-stage method named BEVGrounding, which significantly improves grounding accuracy by fusing coarse-grained sentence and fine-grained word embeddings with visual features. Our experiments on Talk2Car-3D and Talk2LiDAR datasets demonstrate the superior performance of BEVGrounding, laying a foundation for further research in this domain.
Submitted 24 May, 2024;
originally announced May 2024.
-
Mamba-R: Vision Mamba ALSO Needs Registers
Authors:
Feng Wang,
Jiahao Wang,
Sucheng Ren,
Guoyizhe Wei,
Jieru Mei,
Wei Shao,
Yuyin Zhou,
Alan Yuille,
Cihang Xie
Abstract:
Similar to Vision Transformers, this paper identifies artifacts also present within the feature maps of Vision Mamba. These artifacts, corresponding to high-norm tokens emerging in low-information background areas of images, appear much more severe in Vision Mamba -- they exist prevalently even with the tiny-sized model and activate extensively across background regions. To mitigate this issue, we follow the prior solution of introducing register tokens into Vision Mamba. To better cope with Mamba blocks' uni-directional inference paradigm, two key modifications are introduced: 1) evenly inserting registers throughout the input token sequence, and 2) recycling registers for final decision predictions. We term this new architecture Mamba-R. Qualitative observations suggest, compared to vanilla Vision Mamba, Mamba-R's feature maps appear cleaner and more focused on semantically meaningful regions. Quantitatively, Mamba-R attains stronger performance and scales better. For example, on the ImageNet benchmark, our base-size Mamba-R attains 82.9% accuracy, significantly outperforming Vim-B's 81.8%; furthermore, we provide the first successful scaling to the large model size (i.e., with 341M parameters), attaining a competitive accuracy of 83.2% (84.5% if finetuned with 384x384 inputs). Additional validation on the downstream semantic segmentation task also supports Mamba-R's efficacy.
Submitted 23 May, 2024;
originally announced May 2024.
-
Local Causal Discovery for Structural Evidence of Direct Discrimination
Authors:
Jacqueline Maasch,
Kyra Gan,
Violet Chen,
Agni Orfanoudaki,
Nil-Jana Akpinar,
Fei Wang
Abstract:
Fairness is a critical objective in policy design and algorithmic decision-making. Identifying the causal pathways of unfairness requires knowledge of the underlying structural causal model, which may be incomplete or unavailable. This limits the practicality of causal fairness analysis in complex or low-knowledge domains. To mitigate this practicality gap, we advocate for developing efficient causal discovery methods for fairness applications. To this end, we introduce local discovery for direct discrimination (LD3): a polynomial-time algorithm that recovers structural evidence of direct discrimination. LD3 performs a linear number of conditional independence tests with respect to variable set size. Moreover, we propose a graphical criterion for identifying the weighted controlled direct effect (CDE), a qualitative measure of direct discrimination. We prove that this criterion is satisfied by the knowledge returned by LD3, increasing the accessibility of the weighted CDE as a causal fairness measure. Taking liver transplant allocation as a case study, we highlight the potential impact of LD3 for modeling fairness in complex decision systems. Results on real-world data demonstrate more plausible causal relations than baselines, which took 197x to 5870x longer to execute.
Submitted 23 May, 2024;
originally announced May 2024.