Search | arXiv e-print repository

arXiv:2406.19240 [pdf, other]

Data Preparation for Deep Learning based Code Smell Detection: A Systematic Literature Review

Authors: Fengji Zhang, Zexian Zhang, Jacky Wai Keung, Xiangru Tang, Zhen Yang, Xiao Yu, Wenhua Hu

Abstract: Code Smell Detection (CSD) plays a crucial role in improving software quality and maintainability. And Deep Learning (DL) techniques have emerged as a promising approach for CSD due to their superior performance. However, the effectiveness of DL-based CSD methods heavily relies on the quality of the training data. Despite its importance, little attention has been paid to analyzing the data prepara… ▽ More Code Smell Detection (CSD) plays a crucial role in improving software quality and maintainability. And Deep Learning (DL) techniques have emerged as a promising approach for CSD due to their superior performance. However, the effectiveness of DL-based CSD methods heavily relies on the quality of the training data. Despite its importance, little attention has been paid to analyzing the data preparation process. This systematic literature review analyzes the data preparation techniques used in DL-based CSD methods. We identify 36 relevant papers published by December 2023 and provide a thorough analysis of the critical considerations in constructing CSD datasets, including data requirements, collection, labeling, and cleaning. We also summarize seven primary challenges and corresponding solutions in the literature. Finally, we offer actionable recommendations for preparing and accessing high-quality CSD data, emphasizing the importance of data diversity, standardization, and accessibility. This survey provides valuable insights for researchers and practitioners to harness the full potential of DL techniques in CSD. △ Less

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.18442 [pdf, other]

Correlation of the L-mode density limit with edge collisionality

Authors: Andrew Maris, Cristina Rea, Alessandro Pau, Wenhui Hu, Bingjia Xiao, Robert Granetz, Earl Marmar, the EUROfusion Tokamak Exploitation team, the Alcator C-Mod team, the ASDEX Upgrade team, the DIII-D team, the EAST team, the TCV team

Abstract: The "density limit" is one of the fundamental bounds on tokamak operating space, and is commonly estimated via the empirical Greenwald scaling. This limit has garnered renewed interest in recent years as it has become clear that ITER and many tokamak pilot plant concepts must operate near or above the widely-used Greenwald limit to achieve their objectives. Evidence has also grown that the Greenwa… ▽ More The "density limit" is one of the fundamental bounds on tokamak operating space, and is commonly estimated via the empirical Greenwald scaling. This limit has garnered renewed interest in recent years as it has become clear that ITER and many tokamak pilot plant concepts must operate near or above the widely-used Greenwald limit to achieve their objectives. Evidence has also grown that the Greenwald scaling - in its remarkable simplicity - may not capture the full complexity of the disruptive density limit. In this study, we assemble a multi-machine database to quantify the effectiveness of the Greenwald limit as a predictor of the L-mode density limit and identify alternative stability metrics. We find that a two-parameter dimensionless boundary in the plasma edge, $ν_{*\rm, edge}^{\rm limit} = 3.0 β_{T,{\rm edge}}^{-0.4}$, achieves significantly higher accuracy (true negative rate of 97.7% at a true positive rate of 95%) than the Greenwald limit (true negative rate 86.1% at a true positive rate of 95%) across a multi-machine dataset including metal- and carbon-wall tokamaks (AUG, C-Mod, DIII-D, and TCV). The collisionality boundary presented here can be applied for density limit avoidance in current devices and in ITER, where it can be measured and responded to in real time. △ Less

Submitted 26 June, 2024; originally announced June 2024.

Comments: 27 pages, 9 figures

arXiv:2406.18078 [pdf, other]

Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction

Authors: Yice Zhang, Jie Zeng, Weiming Hu, Ziyi Wang, Shiwei Chen, Ruifeng Xu

Abstract: Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review, which is the most representative and challenging task in aspect-based sentiment analysis. A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods. To tackle this issue, we propose a self-tra… ▽ More Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review, which is the most representative and challenging task in aspect-based sentiment analysis. A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods. To tackle this issue, we propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels, aiming to filter out mismatches and thereby enhance the effectiveness of self-training. We highlight two critical aspects to ensure the scorer's effectiveness and reliability: the quality of the training dataset and its model architecture. To this end, we create a human-annotated comparison dataset and train a generative model on it using ranking-based objectives. Extensive experiments on public ASQP datasets reveal that using our scorer can greatly and consistently improve the effectiveness of self-training. Moreover, we explore the possibility of replacing humans with large language models for comparison dataset annotation, and experiments demonstrate its feasibility. We release our code and data at https://github.com/HITSZ-HLT/ST-w-Scorer-ABSA . △ Less

Submitted 26 June, 2024; originally announced June 2024.

Comments: Accepted to ACL 2024 Main Conference

arXiv:2406.17006 [pdf, other]

Probing the nature of the $χ_{c1}(3872)$ state using radiative decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1094 additional authors not shown)

Abstract: The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the~nature of the~$χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an~integrated luminosity of~9fb$^{-1}$. Using the~$B^+\rightarrow χ_{c1}(3872)K^+$decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and… ▽ More The radiative decays $χ_{c1}(3872)\rightarrowψ(2S)γ$ and $χ_{c1}(3872)\rightarrow J/ψγ$ are used to probe the~nature of the~$χ_{c1}(3872)$ state using proton-proton collision data collected with the LHCb detector, corresponding to an~integrated luminosity of~9fb$^{-1}$. Using the~$B^+\rightarrow χ_{c1}(3872)K^+$decay, the $χ_{c1}(3872)\rightarrow ψ(2S)γ$ process is observed for the first time and the ratio of its partial width to that of the $χ_{c1}(3872)\rightarrow J/ψγ$ decay is measured to be $$ \frac{Γ_{χ_{c1}(3872)\rightarrow ψ(2S)γ}} {Γ_{χ_{c1}(3872)\rightarrow J/ψγ}} = 1.67 \pm 0.21 \pm 0.12 \pm0.04 , $$ where the first uncertainty is statistical, the second systematic and the third is due to the uncertainties on the branching fractions of the $ψ(2S)$ and $J/ψ$ mesons. The measured ratio makes the interpretation of the $χ_{c1}(3872)$ state as a~pure $D^0\bar{D}^{*0}+\bar{D}^0D^{*0}$ molecule questionable and strongly indicates a sizeable compact charmonium or tetraquark component within the $χ_{c1}(3872)$ state. △ Less

Submitted 24 June, 2024; originally announced June 2024.

Comments: 31 pages, 2 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-015.html (LHCb public pages)

Report number: LHCb-PAPER-2024-015, CERN-EP-2025-157

arXiv:2406.16334 [pdf, other]

CONCERTO: Instrument model of Fourier transform spectroscopy, white-noise components

Authors: Alessandro Fasano, Peter Ade, Manuel Aravena, Emilio Barria, Alexandre Beelen, Alain Benoit, Matthieu Béthermin, Julien Bounmy, Olivier Bourrion, Guillaume Bres, Martino Calvo, Andrea Catalano, Carlos De Breuck, François-Xavier Désert, Cédric Dubois, Carlos Durán, Thomas Fenouillet, Jose Garcia, Gregory Garde, Johannes Goupy, Christophe Hoarau, Wenkai Hu, Guilaine Lagache, Jean-Charles Lambert, Florence Levy-Bertrand , et al. (12 additional authors not shown)

Abstract: Modern astrophysics relies on intricate instrument setups to meet the demands of sensitivity, sky coverage, and multi-channel observations. An example is the CONCERTO project, employing advanced technology like kinetic inductance detectors and a Martin-Puplett interferometer. This instrument, installed at the APEX telescope atop the Chajnantor plateau, began commissioning observations in April 202… ▽ More Modern astrophysics relies on intricate instrument setups to meet the demands of sensitivity, sky coverage, and multi-channel observations. An example is the CONCERTO project, employing advanced technology like kinetic inductance detectors and a Martin-Puplett interferometer. This instrument, installed at the APEX telescope atop the Chajnantor plateau, began commissioning observations in April 2021. Following a successful commissioning phase that concluded in June 2021, CONCERTO was offered to the scientific community for observations, with a final observing run in December 2022. CONCERTO boasts an 18.5 arcmin field of view and a spectral resolution down to 1.45 GHz in the 130-310 GHz electromagnetic band. We developed a comprehensive instrument model of CONCERTO inspired by Fourier transform spectrometry principles to optimize performance and address systematic errors. This model integrates instrument noises, subsystem characteristics, and celestial signals, leveraging both physical data and simulations. Our methodology involves delineating simulation components, executing on-sky simulations, and comparing results with real observations. The resulting instrument model is pivotal, enabling a precise error correction and enhancing the reliability of astrophysical insights obtained from observational data. In this work, we focus on the description of three white-noise noise components included in the instrument model that characterize the white-noise level: the photon, the generation-recombination, and the amplifier noises. △ Less

Submitted 24 June, 2024; originally announced June 2024.

Comments: 8 pages, 1 figure, Proceeding of the SPIE conference Millimeter, Submillimeter, and Far-Infrared Detectors and Instrumentation for Astronomy XII, SPIE Astronomical Telescopes + Instrumentation 2024

arXiv:2406.16291 [pdf, other]

Integrated Study of X-ray Spectrum and Time Lags for HBL Mrk 421 within the Framework of the Multiple-Zone Leptonic Model

Authors: Wen Hu, Jia-Lai Kang, Zhen-Yi Cai, Jun-Xian Wang, Zhen-Bo Su, Guang-Cheng Xiao

Abstract: We present the timing analysis of 10 archived \XMM observations with an exposure of $>40$ ks of Markarian 421. Mrk 421 is the brightest high-frequency-peaked BL Lac object (HBL) emitting in X-rays produced by electrons accelerated in the innermost regions of a relativistic jet pointing toward us. For each observation, we construct averaged X-ray spectra in 0.5--10 keV band, as well as 100 s binned… ▽ More We present the timing analysis of 10 archived \XMM observations with an exposure of $>40$ ks of Markarian 421. Mrk 421 is the brightest high-frequency-peaked BL Lac object (HBL) emitting in X-rays produced by electrons accelerated in the innermost regions of a relativistic jet pointing toward us. For each observation, we construct averaged X-ray spectra in 0.5--10 keV band, as well as 100 s binned light curves (LCs) in various subbands. During these observations, the source exhibited various intensity states differing by close to an order of magnitude in flux, with the fractional variability amplitude increasing with energy through the X-ray band. Bayesian power spectral density analysis reveals that the X-ray variability can be characterized by a colored noise, with an index ranging from $\sim-1.9$ to $-3.0$. Moreover, both the standard cross-correlation function and cross-spectral methods indicate that the amount of time lags increases with the energy difference between two compared LCs. A time-dependent two-zone jet model is developed to extract physical information from the X-ray emission of Mrk 421. In the model, we assume that the jet emission mostly comprises a quasi-stationary component and a highly variable one. Our results show that the two-zone model can simultaneously provide a satisfactory description for both the X-ray spectra and time lags observed in different epochs, with the model parameters constrained in a fully acceptable interval. We suggest that shocks within the jets may be the primary energy dissipation process responsible for triggering the rapid variability, although magnetic reconnection cannot be excluded. △ Less

Submitted 23 June, 2024; originally announced June 2024.

Comments: 33 pages, 12 figures, 6 tables; Accepted for publication in ApJ supplement series

arXiv:2406.15572 [pdf, other]

CONCERTO at APEX -- On-sky performance in continuum

Authors: W. Hu, A. Beelen, G. Lagache, A. Fasano, A. Lundgren, P. Ade, M. Aravena, E. Barria, A. Benoit, M. Bethermin, J. Bounmy, O. Bourrion, G. Bres, C. De Breuck, M. Calvo, A. Catalano, F. -X. Desert, C. Dubois, C. A Duran, T. Fenouillet, J. Garcia, G. Garde, J. Goupy, C. Hoarau, J. -C. Lambert , et al. (14 additional authors not shown)

Abstract: We present the data-processing algorithms and the performance of CONCERTO (CarbON CII line in post-rEionisation and ReionisaTiOn epoch) in continuum by analysing the data from the commissioning and scientific observations. The beam pattern is characterized by an effective FWHM of 31.9 $\pm$ 0.6" and 34.4 $\pm$ 1.0" for high-frequency (HF) and low-frequency (LF) bands. The main beam is slightly elo… ▽ More We present the data-processing algorithms and the performance of CONCERTO (CarbON CII line in post-rEionisation and ReionisaTiOn epoch) in continuum by analysing the data from the commissioning and scientific observations. The beam pattern is characterized by an effective FWHM of 31.9 $\pm$ 0.6" and 34.4 $\pm$ 1.0" for high-frequency (HF) and low-frequency (LF) bands. The main beam is slightly elongated with a mean eccentricity of 0.46. Two error beams of $\sim$65" and $\sim$130" are characterized, enabling the estimate of a main beam efficiency of $\sim$0.52. The field of view is accurately reconstructed and presents coherent distortions between the HF and LF arrays. LEKID parameters were robustly determined for 80% of the read tones. Cross-talks between LEKIDs are the first cause of flagging, followed by an excess of eccentricity for $\sim$10% of the LEKIDs, all located in a given region of the field of view. On the 44 scans of Uranus selected for the absolute photometric calibration, 72.5% and 78.2% of the LEKIDs are selected as valid detectors with a probability >70%. By comparing Uranus measurements with a model, we obtain calibration factors of 19.5$\pm$0.6 [Hz/Jy] and 25.6$\pm$0.9 [Hz/Jy] for HF and LF. The point-source continuum measurement uncertainties are 3.0% and 3.4% for HF and LF bands. The RMS of CONCERTO maps is verified to evolve as proportional to the inverse square root of integration time. The measured NEFDs for HF and LF are 115$\pm$2 mJy/beam$\cdot$s$^{1/2}$ and 95$\pm$1 mJy/beam$\cdot$s$^{1/2}$, obtained using CONCERTO data on the COSMOS field for a mean precipitable water vapour and elevation of 0.81 mm and 55.7 deg. CONCERTO demonstrates unique capabilities in fast dual-band spectral map** with a $\sim$18.5' instantaneous field-of-view. CONCERTO's performance in continuum is perfectly in line with expectations. △ Less

Submitted 21 June, 2024; originally announced June 2024.

Comments: 23pages, 22 figures

arXiv:2406.14473 [pdf, other]

Data-Centric AI in the Age of Large Language Models

Authors: Xinyi Xu, Zhaoxuan Wu, Rui Qiao, Arun Verma, Yao Shu, **gtan Wang, Xinyuan Niu, Zhenfeng He, Jiangwei Chen, Zijian Zhou, Gregory Kang Ruey Lau, Hieu Dao, Lucas Agussurja, Rachael Hwee Ling Sim, Xiaoqiang Lin, Wenyang Hu, Zhongxiang Dai, Pang Wei Koh, Bryan Kian Hsiang Low

Abstract: This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We start by making the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs, and yet it receives disproportionally low attention from the research community. We identify four specific… ▽ More This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs). We start by making the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs, and yet it receives disproportionally low attention from the research community. We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization. In each scenario, we underscore the importance of data, highlight promising research directions, and articulate the potential impacts on the research community and, where applicable, the society as a whole. For instance, we advocate for a suite of data-centric benchmarks tailored to the scale and complexity of data for LLMs. These benchmarks can be used to develop new data curation methods and document research efforts and results, which can help promote openness and transparency in AI and LLM research. △ Less

Submitted 20 June, 2024; originally announced June 2024.

Comments: Preprint

arXiv:2406.13511 [pdf, other]

Slice-Level Scheduling for High Throughput and Load Balanced LLM Serving

Authors: Ke Cheng, Wen Hu, Zhi Wang, Hongen Peng, Jianguo Li, Sheng Zhang

Abstract: Large language models (LLMs) iteratively generate text token by token, with memory usage increasing with the length of generated token sequences. The unpredictability of generation lengths makes it difficult to estimate the time and memory needed to process requests, posing a challenge for effective request scheduling. Conventional sequence-level scheduling (SLS) serves requests in a first-come fi… ▽ More Large language models (LLMs) iteratively generate text token by token, with memory usage increasing with the length of generated token sequences. The unpredictability of generation lengths makes it difficult to estimate the time and memory needed to process requests, posing a challenge for effective request scheduling. Conventional sequence-level scheduling (SLS) serves requests in a first-come first-served (FCFS) manner with static batching where requests with short generation lengths are delayed until those with long ones have finished generation, which hurts computational efficiency. Besides, to avoid out-of-memory (OOM) errors, SLS batches requests with a small batch size, which limits throughput. Recently proposed iteration-level scheduling (ILS) enhances computational efficiency with continuous batching to return completed requests timely and dynamically add new requests for processing. However, many ILS schedulers limit the number of parallel-processing requests to avoid OOM errors while achieving a fast inference speed, which compromises throughput. Moreover, existing SLS and ILS schedulers fail to balance the workload across multiple deployed LLM instances. To tackle these challenges, we propose slice-level scheduling (SCLS). By splitting the predefined maximal generation length limit into slices and serving batches slice by slice, it provides a precise range of serving time and memory usage for batched requests, laying the foundation for effective scheduling. Experiments confirm that compared with SLS and ILS schedulers, SCLS can improve throughput by up to 315.8% and greatly mitigate load imbalance with proposed batching and offloading algorithms. △ Less

Submitted 19 June, 2024; originally announced June 2024.

Comments: 13 pages, 22 figures

arXiv:2406.13409 [pdf, other]

doi 10.1145/3581783.3612007

PetalView: Fine-grained Location and Orientation Extraction of Street-view Images via Cross-view Local Search with Supplementary Materials

Authors: Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Xian**g Han, Yifang Yin, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann

Abstract: Satellite-based street-view information extraction by cross-view matching refers to a task that extracts the location and orientation information of a given street-view image query by using one or multiple geo-referenced satellite images. Recent work has initiated a new research direction to find accurate information within a local area covered by one satellite image centered at a location prior (… ▽ More Satellite-based street-view information extraction by cross-view matching refers to a task that extracts the location and orientation information of a given street-view image query by using one or multiple geo-referenced satellite images. Recent work has initiated a new research direction to find accurate information within a local area covered by one satellite image centered at a location prior (e.g., from GPS). It can be used as a standalone solution or complementary step following a large-scale search with multiple satellite candidates. However, these existing works require an accurate initial orientation (angle) prior (e.g., from IMU) and/or do not efficiently search through all possible poses. To allow efficient search and to give accurate prediction regardless of the existence or the accuracy of the angle prior, we present PetalView extractors with multi-scale search. The PetalView extractors give semantically meaningful features that are equivalent across two drastically different views, and the multi-scale search strategy efficiently inspects the satellite image from coarse to fine granularity to provide sub-meter and sub-degree precision extraction. Moreover, when an angle prior is given, we propose a learnable prior angle mixer to utilize this information. Our method obtains the best performance on the VIGOR dataset and successfully improves the performance on KITTI dataset test 1 set with the recall within 1 meter (r@1m) for location estimation to 68.88% and recall within 1 degree (r@1d) 21.10% when no angle prior is available, and with angle prior achieves stable estimations at r@1m and r@1d above 70% and 21%, up to a 40-degree noise level. △ Less

Submitted 19 June, 2024; originally announced June 2024.

Comments: This paper has been accepted by ACM Multimedia 2023. This version contains additional supplementary materials

Journal ref: Proceedings of the 31st ACM International Conference on Multimedia (2023) 56-66

arXiv:2406.12970 [pdf, other]

Warm and Fuzzy Dark Matter: Free Streaming of Wave Dark Matter

Authors: Rayne Liu, Wayne Hu, Huangyu Xiao

Abstract: Wave or fuzzy dark matter that is produced with relativistic wavenumbers exhibits free streaming effects analogous to warm or hot particle dark matter with relativistic momenta. Axions produced after inflation provide such a warm or mildly relativistic candidate, where the enhanced suppression and observational bounds are only moderately stronger than that from wave propagation of initially cold a… ▽ More Wave or fuzzy dark matter that is produced with relativistic wavenumbers exhibits free streaming effects analogous to warm or hot particle dark matter with relativistic momenta. Axions produced after inflation provide such a warm or mildly relativistic candidate, where the enhanced suppression and observational bounds are only moderately stronger than that from wave propagation of initially cold axions. More generally, the free streaming dam** also impacts isocurvature fluctuations from generation in causally disconnected patches. As coherent spatial fluctuations free stream away they leave incoherent and transient superpositions in their wakes. These multiple wave momentum streams are the wave analogue of particle phase space fluctuations or directional collisionless dam** of massive neutrinos or hot dark matter. The observable impact on both adiabatic and isocurvature fluctuations of fuzzy dark matter can differ from their cold dark matter counterparts due to free streaming depending on how warm or hot is their momentum distribution. △ Less

Submitted 18 June, 2024; originally announced June 2024.

Comments: 16 pages, 11 figures

Report number: FERMILAB-PUB-24-0296-T

arXiv:2406.12881 [pdf, other]

Towards Unlocking Insights from Logbooks Using AI

Authors: Antonin Sulc, Alex Bien, Annika Eichler, Daniel Ratner, Florian Rehm, Frank Mayet, Gregor Hartmann, Hayden Hoschouer, Henrik Tuennermann, Jan Kaiser, Jason St. John, Jennefer Maldonado, Kyle Hazelwood, Raimund Kammering, Thorsten Hellert, Tim Wilksen, Verena Kain, Wan-Lin Hu

Abstract: Electronic logbooks contain valuable information about activities and events concerning their associated particle accelerator facilities. However, the highly technical nature of logbook entries can hinder their usability and automation. As natural language processing (NLP) continues advancing, it offers opportunities to address various challenges that logbooks present. This work explores jointly t… ▽ More Electronic logbooks contain valuable information about activities and events concerning their associated particle accelerator facilities. However, the highly technical nature of logbook entries can hinder their usability and automation. As natural language processing (NLP) continues advancing, it offers opportunities to address various challenges that logbooks present. This work explores jointly testing a tailored Retrieval Augmented Generation (RAG) model for enhancing the usability of particle accelerator logbooks at institutes like DESY, BESSY, Fermilab, BNL, SLAC, LBNL, and CERN. The RAG model uses a corpus built on logbook contributions and aims to unlock insights from these logbooks by leveraging retrieval over facility datasets, including discussion about potential multimodal sources. Our goals are to increase the FAIR-ness (findability, accessibility, interoperability, and reusability) of logbooks by exploiting their information content to streamline everyday use, enable macro-analysis for root cause analysis, and facilitate problem-solving automation. △ Less

Submitted 25 May, 2024; originally announced June 2024.

Comments: 5 pages, 1 figure, 15th International Particle Accelerator Conference

arXiv:2406.12111 [pdf, other]

Precision measurement of the $Ξ^-_b$ baryon lifetime

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1064 additional authors not shown)

Abstract: A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second sys… ▽ More A sample of $pp$ collision data, corresponding to an integrated luminosity of 5.5 fb$^{-1}$ and collected by the LHCb experiment during Run 2, is used to measure the ratio of the lifetime of the $Ξ^-_b$ baryon to that of the $Λ^0_b$ baryon, $r_τ\equivτ_{Ξ^-_b}/τ_{Λ^0_b}$. The value ${r_τ^{\rm Run\,2}=1.076\pm0.013\pm0.006}$ is obtained, where the first uncertainty is statistical and the second systematic. This value is averaged with the corresponding value from Run 1 to obtain ${r_τ^{\rm Run\,1,2} = 1.078\pm0.012\pm0.007}$. Multiplying by the world-average value of the $Λ^0_b$ lifetime yields $τ_{Ξ^-_b}^{\rm Run~1,2} = 1.578\pm0.018\pm0.010\pm0.011$ ps, where the uncertainties are statistical, systematic, and due to the limited knowledge of the $Λ^0_b$ lifetime. This measurement improves the precision of the current world average of the $Ξ^-_b$ lifetime by about a factor of two, and is in good agreement with the most recent theoretical predictions. △ Less

Submitted 17 June, 2024; originally announced June 2024.

Comments: 12 pages, 5 figures. All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2014-010.html (LHCb public pages)

Report number: LHCb-PAPER-2024-010, CERN-EP-2024-139

arXiv:2406.10765 [pdf, other]

PWDFT-SW: Extending the Limit of Plane-Wave DFT Calculations to 16K Atoms on the New Sunway Supercomputer

Authors: Qingcai Jiang, Zhenwei Cao, Junshi Chen, Xinming Qin, Wei Hu, Hong An, **long Yang

Abstract: First-principles density functional theory (DFT) with plane wave (PW) basis set is the most widely used method in quantum mechanical material simulations due to its advantages in accuracy and universality. However, a perceived drawback of PW-based DFT calculations is their substantial computational cost and memory usage, which currently limits their ability to simulate large-scale complex systems… ▽ More First-principles density functional theory (DFT) with plane wave (PW) basis set is the most widely used method in quantum mechanical material simulations due to its advantages in accuracy and universality. However, a perceived drawback of PW-based DFT calculations is their substantial computational cost and memory usage, which currently limits their ability to simulate large-scale complex systems containing thousands of atoms. This situation is exacerbated in the new Sunway supercomputer, where each process is limited to a mere 16 GB of memory. Herein, we present a novel parallel implementation of plane wave density functional theory on the new Sunway supercomputer (PWDFT-SW). PWDFT-SW fully extracts the benefits of Sunway supercomputer by extensively refactoring and calibrating our algorithms to align with the system characteristics of the Sunway system. Through extensive numerical experiments, we demonstrate that our methods can substantially decrease both computational costs and memory usage. Our optimizations translate to a speedup of 64.8x for a physical system containing 4,096 silicon atoms, enabling us to push the limit of PW-based DFT calculations to large-scale systems containing 16,384 carbon atoms. △ Less

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.08835 [pdf, other]

A Single-Step Non-Autoregressive Automatic Speech Recognition Architecture with High Accuracy and Inference Speed

Authors: Ziyang Zhuang, Chenfeng Miao, Kun Zou, Shuai Gong, Ming Fang, Tao Wei, Zijian Li, Wei Hu, Shaojun Wang, **g Xiao

Abstract: Non-autoregressive (NAR) automatic speech recognition (ASR) models predict tokens independently and simultaneously, bringing high inference speed. However, there is still a gap in the accuracy of the NAR models compared to the autoregressive (AR) models. To further narrow the gap between the NAR and AR models, we propose a single-step NAR ASR architecture with high accuracy and inference speed, ca… ▽ More Non-autoregressive (NAR) automatic speech recognition (ASR) models predict tokens independently and simultaneously, bringing high inference speed. However, there is still a gap in the accuracy of the NAR models compared to the autoregressive (AR) models. To further narrow the gap between the NAR and AR models, we propose a single-step NAR ASR architecture with high accuracy and inference speed, called EfficientASR. It uses an Index Map** Vector (IMV) based alignment generator to generate alignments during training, and an alignment predictor to learn the alignments for inference. It can be trained end-to-end (E2E) with cross-entropy loss combined with alignment loss. The proposed EfficientASR achieves competitive results on the AISHELL-1 and AISHELL-2 benchmarks compared to the state-of-the-art (SOTA) models. Specifically, it achieves character error rates (CER) of 4.26%/4.62% on the AISHELL-1 dev/test dataset, which outperforms the SOTA AR Conformer with about 30x inference speedup. △ Less

Submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.05785 [pdf, other]

A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions

Authors: Daizong Liu, Yang Liu, Wencan Huang, Wei Hu

Abstract: Text-guided 3D visual grounding (T-3DVG), which aims to locate a specific object that semantically corresponds to a language query from a complicated 3D scene, has drawn increasing attention in the 3D research community over the past few years. Compared to 2D visual grounding, this task presents great potential and challenges due to its closer proximity to the real world and the complexity of data… ▽ More Text-guided 3D visual grounding (T-3DVG), which aims to locate a specific object that semantically corresponds to a language query from a complicated 3D scene, has drawn increasing attention in the 3D research community over the past few years. Compared to 2D visual grounding, this task presents great potential and challenges due to its closer proximity to the real world and the complexity of data collection and 3D point cloud source processing. In this survey, we attempt to provide a comprehensive overview of the T-3DVG progress, including its fundamental elements, recent research advances, and future research directions. To the best of our knowledge, this is the first systematic survey on the T-3DVG task. Specifically, we first provide a general structure of the T-3DVG pipeline with detailed components in a tutorial style, presenting a complete background overview. Then, we summarize the existing T-3DVG approaches into different categories and analyze their strengths and weaknesses. We also present the benchmark datasets and evaluation metrics to assess their performances. Finally, we discuss the potential limitations of existing T-3DVG and share some insights on several promising research directions. The latest papers are continually collected at https://github.com/liudaizong/Awesome-3D-Visual-Grounding. △ Less

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05608 [pdf]

Janus graphene nanoribbons with a single ferromagnetic zigzag edge

Authors: Shaotang Song, Yu Teng, Weichen Tang, Zhen Xu, Yuanyuan He, Jiawei Ruan, Takahiro Kojima, Wen** Hu, Franz J Giessibl, Hiroshi Sakaguchi, Steven G Louie, Jiong Lu

Abstract: Topological design of pi-electrons in zigzag-edged graphene nanoribbons (ZGNRs) leads to a wealth of magnetic quantum phenomena and exotic quantum phases. Symmetric ZGNRs typically exhibit antiferromagnetically coupled spin-ordered edge states. Eliminating cross-edge magnetic coupling in ZGNRs not only enables the realization of a new class of ferromagnetic quantum spin chains, enabling the explor… ▽ More Topological design of pi-electrons in zigzag-edged graphene nanoribbons (ZGNRs) leads to a wealth of magnetic quantum phenomena and exotic quantum phases. Symmetric ZGNRs typically exhibit antiferromagnetically coupled spin-ordered edge states. Eliminating cross-edge magnetic coupling in ZGNRs not only enables the realization of a new class of ferromagnetic quantum spin chains, enabling the exploration of quantum spin physics and entanglement of multiple qubits in the 1D limit, but also establishes a long-sought carbon-based ferromagnetic transport channel, pivotal for ultimate scaling of GNR-based quantum electronics. However, designing such GNRs entails overcoming daunting challenges, including simultaneous breaking of structural and spin symmetries, and designing elegant precursors for asymmetric fabrication of reactive zigzag edges. Here, we report a general approach for designing and fabricating such ferromagnetic GNRs in the form of Janus GNRs with two distinct edge configurations. Guided by Lieb's theorem and topological classification theory, we devised two JGNRs by asymmetrically introduced a topological defect array of benzene motifs to one zigzag edge, while kee** the opposing zigzag edge unchanged. This breaks structural symmetry and creates a sublattice imbalance within each unit cell, initiating a spin symmetry breaking. Three Z-shape precursors are designed to fabricate one parent ZGNR and two JGNRs with an optimal lattice spacing of the defect array for a complete quench of the magnetic edge states at the defective edge. Characterization via scanning probe microscopy/spectroscopy and first-principles density functional theory confirms the successful fabrication of Janus GNRs with ferromagnetic ground state delocalised along the pristine zigzag edge. △ Less

Submitted 8 June, 2024; originally announced June 2024.

Comments: 19 pages, 4 figures

arXiv:2406.04983 [pdf, other]

CityCraft: A Real Crafter for 3D City Generation

Authors: Jie Deng, Wenhao Chai, Junsheng Huang, Zhonghan Zhao, Qixuan Huang, Mingyan Gao, Jianshu Guo, Shengyu Hao, Wenhao Hu, Jenq-Neng Hwang, Xi Li, Gaoang Wang

Abstract: City scene generation has gained significant attention in autonomous driving, smart city development, and traffic simulation. It helps enhance infrastructure planning and monitoring solutions. Existing methods have employed a two-stage process involving city layout generation, typically using Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), or Transformers, followed by neur… ▽ More City scene generation has gained significant attention in autonomous driving, smart city development, and traffic simulation. It helps enhance infrastructure planning and monitoring solutions. Existing methods have employed a two-stage process involving city layout generation, typically using Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), or Transformers, followed by neural rendering. These techniques often exhibit limited diversity and noticeable artifacts in the rendered city scenes. The rendered scenes lack variety, resembling the training images, resulting in monotonous styles. Additionally, these methods lack planning capabilities, leading to less realistic generated scenes. In this paper, we introduce CityCraft, an innovative framework designed to enhance both the diversity and quality of urban scene generation. Our approach integrates three key stages: initially, a diffusion transformer (DiT) model is deployed to generate diverse and controllable 2D city layouts. Subsequently, a Large Language Model(LLM) is utilized to strategically make land-use plans within these layouts based on user prompts and language guidelines. Based on the generated layout and city plan, we utilize the asset retrieval module and Blender for precise asset placement and scene construction. Furthermore, we contribute two new datasets to the field: 1)CityCraft-OSM dataset including 2D semantic layouts of urban areas, corresponding satellite images, and detailed annotations. 2) CityCraft-Buildings dataset, featuring thousands of diverse, high-quality 3D building assets. CityCraft achieves state-of-the-art performance in generating realistic 3D cities. △ Less

Submitted 7 June, 2024; originally announced June 2024.

Comments: 20 pages, 9 figures

arXiv:2406.04785 [pdf, other]

Enabling Efficient Batch Serving for LMaaS via Generation Length Prediction

Authors: Ke Cheng, Wen Hu, Zhi Wang, Peng Du, Jianguo Li, Sheng Zhang

Abstract: Nowadays, large language models (LLMs) are published as a service and can be accessed by various applications via APIs, also known as language-model-as-a-service (LMaaS). Without knowing the generation length of requests, existing serving systems serve requests in a first-come, first-served (FCFS) manner with a fixed batch size, which leads to two problems that affect batch serving efficiency. Fir… ▽ More Nowadays, large language models (LLMs) are published as a service and can be accessed by various applications via APIs, also known as language-model-as-a-service (LMaaS). Without knowing the generation length of requests, existing serving systems serve requests in a first-come, first-served (FCFS) manner with a fixed batch size, which leads to two problems that affect batch serving efficiency. First, the generation lengths of requests in a batch vary, and requests with short generation lengths must wait for requests with long generation lengths to finish during the batch serving procedure. Second, requests with longer generation lengths consume more memory during serving. Without knowing the generation lengths of batched requests, the batch size is always set small to avoid the out-of-memory (OOM) error, thus preventing the GPU from being fully utilized. In this paper, we find that a significant number of popular applications in the LMaaS scenario have a positive correlation between the generation length and the length of raw user input. Based on this observation, we propose Magnus, which can accurately predict the request generation length with the user input length, application-level, and user-level semantic features. Accordingly, Magnus can achieve high request throughput by batching requests of similar generation lengths together with adaptive batch sizes. Besides, Magnus can also schedule batches with the highest response ratio next (HRRN) policy to reduce request response time. Experiments conducted on our testbed show that Magnus improves request throughput by up to 234\% and reduces response time by up to 89.7\% compared to baselines. △ Less

Submitted 7 June, 2024; originally announced June 2024.

Comments: 12 pages, 14 figures

arXiv:2406.03646 [pdf, other]

CLASSY X: Highlighting Differences Between Partial Covering and Semi-Analytic Modeling in the Estimate of Galactic Outflow Properties

Authors: M. Huberty, C. Carr, C. Scarlata, T. Heckman, A. Henry, X. Xu, K. Ariano-Cordoba, D. Berg, S. Charlot, J. Chisholm, S. Gazagnes, M. Hayes, W. Hu, B. James, R. M. Jennings, C. Leitherer, C. L. Martin, M. Mingozzi, E. Skillman, Y. Sugahara

Abstract: Feedback driven massive outflows play a crucial role in galaxy evolution by regulating star formation and influencing the dynamics of surrounding media. Extracting outflow properties from spectral lines is a notoriously difficult process for a number of reasons, including the possibility that a substantial fraction of the outflow is carried by dense gas in a very narrow range in velocity. This gas… ▽ More Feedback driven massive outflows play a crucial role in galaxy evolution by regulating star formation and influencing the dynamics of surrounding media. Extracting outflow properties from spectral lines is a notoriously difficult process for a number of reasons, including the possibility that a substantial fraction of the outflow is carried by dense gas in a very narrow range in velocity. This gas can hide in spectra with insufficient resolution. Empirically motivated analysis based on the Apparent Optical Depth method, commonly used in the literature, neglects the contribution of this gas, and may therefore underestimate the true gas column density. More complex semi-analytical line transfer (e.g., SALT) models, on the other hand, allow for the presence of this gas by modeling the radial density and velocity of the outflows as power laws. Here we compare the two approaches to quantify the uncertainties in the inferences of outflow properties based on 1-D "down-the-barrel" using the UV spectra of the CLASSY galaxy sample. We find that empirical modeling may significantly underestimate the column densities relative to SALT analysis, particularly in the optically thick regime. We use simulations to show that the main reason for this discrepancy is the presence of large amount of dense material at low velocities, which can be hidden by the finite spectral resolution of the data. The SALT models in turn could over-estimate the column densities if the assumed power laws of the density profiles strong are not a property of actual outflows. △ Less

Submitted 5 June, 2024; originally announced June 2024.

Comments: Submitted to ApJ, Comments welcome

arXiv:2406.03387 [pdf, other]

Measurement of the branching fraction ratios $R(D^{+})$ and $R(D^{*+})$ using muonic $τ$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1063 additional authors not shown)

Abstract: The branching fraction ratios of $\overline{B}^0\to D^+τ^-\overlineν_τ$ and $\overline{B}^0\to D^{*+}τ^-\overlineν_τ$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining… ▽ More The branching fraction ratios of $\overline{B}^0\to D^+τ^-\overlineν_τ$ and $\overline{B}^0\to D^{*+}τ^-\overlineν_τ$ decays are measured with respect to their muonic counterparts, using a data sample corresponding to an integrated luminosity of 2.0 fb$^{-1}$ collected by the LHCb experiment in proton-proton collisions at $\sqrt{s} = 13$ TeV. The reconstructed final states are formed by combining $D^+$ mesons with $τ^-\toμ^-\overlineν_μν_τ$ candidates, where the $D^+$ is reconstructed via the $D^+\to K^-π^+π^+$ decay. The results are \begin{align*} R(D^{+}) &= 0.249 \pm 0.043 \pm 0.047, R(D^{*+}) &= 0.402 \pm 0.081\pm 0.085, \end{align*} where the first uncertainties are statistical and the second systematic. The two measurements have a correlation coefficient of $-0.39$ and are compatible with the Standard Model. △ Less

Submitted 5 June, 2024; originally announced June 2024.

Comments: All figures and tables, along with machine-readable versions and any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/LHCbProjectPublic/LHCb-PAPER-2024-007.html (LHCb public pages)

Report number: LHCb-PAPER-2024-007, CERN-EP-2024-125

arXiv:2406.03156 [pdf, other]

Observation of new charmonium(-like) states in $B^+ \to D^{*\pm} D^{\mp} K^+$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A study of resonant structures in $B^{+}\rightarrow{D^{\ast+}D^{-}K^{+}}$ and $B^{+}\rightarrow{D^{\ast-}D^{+}K^{+}}$ decays is performed, using proton-proton collision data at centre-of-mass energies of $\sqrt{s}=7, 8$, and $13$ TeV recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. A simultaneous amplitude fit is performed to the two channels with contribu… ▽ More A study of resonant structures in $B^{+}\rightarrow{D^{\ast+}D^{-}K^{+}}$ and $B^{+}\rightarrow{D^{\ast-}D^{+}K^{+}}$ decays is performed, using proton-proton collision data at centre-of-mass energies of $\sqrt{s}=7, 8$, and $13$ TeV recorded by the LHCb experiment, corresponding to an integrated luminosity of 9 fb$^{-1}$. A simultaneous amplitude fit is performed to the two channels with contributions from resonances decaying to $D^{\ast-}D^{+}$ and $D^{\ast+}D^{-}$ states linked by $C$ parity. This procedure allows the $C$-parities of resonances in the $D^{\ast\pm}D^{\mp}$ mass spectra to be determined. Four charmonium(-like) states are observed decaying into $D^{\ast\pm}D^{\mp}$: $η_c(3945)$, $h_c(4000)$, $χ_{c1}(4010)$ and $h_c(4300)$, with quantum numbers $J^{PC}$ equal to $0^{-+}$, $1^{+-}$, $1^{++}$ and $1^{+-}$, respectively. At least three of these states have not been observed previously. In addition, the existence of the $T_{\bar{c}\bar{s}0}^{*}(2870)^{0}$ and $T_{\bar{c}\bar{s}1}^{*}(2900)^{0}$ resonances in the $D^-K^+$ mass spectrum, already observed in the $B^+ \to D^+ D^- K^+$ decay, is confirmed in a different production channel. △ Less

Submitted 5 June, 2024; originally announced June 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2023-047.html (LHCb public pages)

Report number: LHCb-PAPER-2023-047, CERN-EP-2024-096

arXiv:2406.02087 [pdf, ps, other]

Boundedness of variation, oscillation and maximal differential transform on BMO space

Authors: Wenting Hu, Kai Wu, Dongyong Yang, Chao Zhang

Abstract: In this paper, we prove that the oscillation operator, variation operator and maximal differential transform associated with the approximate identities are bounded from ${\rm BMO}({\mathbb R}^n)$ to its subspace ${\rm BLO}({\mathbb R}^n)$. In this paper, we prove that the oscillation operator, variation operator and maximal differential transform associated with the approximate identities are bounded from ${\rm BMO}({\mathbb R}^n)$ to its subspace ${\rm BLO}({\mathbb R}^n)$. △ Less

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.00403 [pdf, other]

Dual-perspective Cross Contrastive Learning in Graph Transformers

Authors: Zelin Yao, Chuang Liu, Xueqi Ma, Mukun Chen, Jia Wu, Xiantao Cai, Bo Du, Wenbin Hu

Abstract: Graph contrastive learning (GCL) is a popular method for leaning graph representations by maximizing the consistency of features across augmented views. Traditional GCL methods utilize single-perspective i.e. data or model-perspective) augmentation to generate positive samples, restraining the diversity of positive samples. In addition, these positive samples may be unreliable due to uncontrollabl… ▽ More Graph contrastive learning (GCL) is a popular method for leaning graph representations by maximizing the consistency of features across augmented views. Traditional GCL methods utilize single-perspective i.e. data or model-perspective) augmentation to generate positive samples, restraining the diversity of positive samples. In addition, these positive samples may be unreliable due to uncontrollable augmentation strategies that potentially alter the semantic information. To address these challenges, this paper proposed a innovative framework termed dual-perspective cross graph contrastive learning (DC-GCL), which incorporates three modifications designed to enhance positive sample diversity and reliability: 1) We propose dual-perspective augmentation strategy that provide the model with more diverse training data, enabling the model effective learning of feature consistency across different views. 2) From the data perspective, we slightly perturb the original graphs using controllable data augmentation, effectively preserving their semantic information. 3) From the model perspective, we enhance the encoder by utilizing more powerful graph transformers instead of graph neural networks. Based on the model's architecture, we propose three pruning-based strategies to slightly perturb the encoder, providing more reliable positive samples. These modifications collectively form the DC-GCL's foundation and provide more diverse and reliable training inputs, offering significant improvements over traditional GCL methods. Extensive experiments on various benchmarks demonstrate that DC-GCL consistently outperforms different baselines on various datasets and tasks. △ Less

Submitted 1 June, 2024; originally announced June 2024.

Comments: 12 pages, 5 figures, submitted to IEEE TKDE

arXiv:2406.00235 [pdf, other]

Amplitude analysis of the radiative decay $B^0_s\to K^+K^-γ$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1061 additional authors not shown)

Abstract: A search for radiative decay of $B^0_s$ mesons to orbitally excited $K^+K^-$ states is performed using proton proton collisions recorded by the \mbox{LHCb}\xspace experiment, corresponding to an integrated luminosity of 9~fb$^{-1}$. The dikaon spectrum in the mass range $m_{KK}<2400$~{\ensuremath{\,\text{Me\kern -0.1em V\!/}c^2}\xspace} is dominated by the $φ(1020)$ resonance that accounts for alm… ▽ More A search for radiative decay of $B^0_s$ mesons to orbitally excited $K^+K^-$ states is performed using proton proton collisions recorded by the \mbox{LHCb}\xspace experiment, corresponding to an integrated luminosity of 9~fb$^{-1}$. The dikaon spectrum in the mass range $m_{KK}<2400$~{\ensuremath{\,\text{Me\kern -0.1em V\!/}c^2}\xspace} is dominated by the $φ(1020)$ resonance that accounts for almost 70$\%$ of the decay rate. Considering the possible contributions of $f_2{(1270)}$, $f'_2{(1525)}$ and $f_2{(2010)}$ meson states, the overall tensor contribution to the amplitude is measured to be \begin{equation} {\cal F}_{\{f_2\}}=16.8\pm 0.5\mathrm{~(stat.)}\pm0.7\mathrm{~(syst.)}\%,\nonumber \end{equation} mostly dominated by the $f'_2(1525)$ state. Several statistically equivalent solutions are obtained for the detailed resonant structure depending on whether the smaller amplitudes interfere destructively or constructively with the dominant amplitude. The preferred solution that corresponds to the lowest values of the fit fractions along with constructive interference leads to the relative branching ratio measurement \begin{equation} \frac{{\cal B}(B^0_s\to f'_2γ)}{{\cal B}(B^0_s\toφγ)}= 19.4^{+0.9}_{-0.8}\mathrm{~(stat.)}{}^{+1.4}_{-0.5}\mathrm{~(syst.)}\pm0.5\mathrm{~(\cal{B})}\%\nonumber, \end{equation} where the last uncertainty is due to the ratio of measured branching fractions to the $K^+K^-$ final state. This result represents the first observation of the radiative $B^0_s\to f'_2(1525)γ$ decay, which is the second radiative transition observed in the $B^0_s$ sector. △ Less

Submitted 31 May, 2024; originally announced June 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-002.html (LHCb public pages)

Report number: LHCb-PAPER-2024-002, CERN-EP-2024-115

arXiv:2405.20279 [pdf, other]

CV-VAE: A Compatible Video VAE for Latent Generative Video Models

Authors: Sijie Zhao, Yong Zhang, Xiaodong Cun, Shaoshu Yang, Muyao Niu, Xiaoyu Li, Wenbo Hu, Ying Shan

Abstract: Spatio-temporal compression of videos, utilizing networks such as Variational Autoencoders (VAE), plays a crucial role in OpenAI's SORA and numerous other video generative models. For instance, many LLM-like video models learn the distribution of discrete tokens derived from 3D VAEs within the VQVAE framework, while most diffusion-based video models capture the distribution of continuous latent ex… ▽ More Spatio-temporal compression of videos, utilizing networks such as Variational Autoencoders (VAE), plays a crucial role in OpenAI's SORA and numerous other video generative models. For instance, many LLM-like video models learn the distribution of discrete tokens derived from 3D VAEs within the VQVAE framework, while most diffusion-based video models capture the distribution of continuous latent extracted by 2D VAEs without quantization. The temporal compression is simply realized by uniform frame sampling which results in unsmooth motion between consecutive frames. Currently, there lacks of a commonly used continuous video (3D) VAE for latent diffusion-based video models in the research community. Moreover, since current diffusion-based approaches are often implemented using pre-trained text-to-image (T2I) models, directly training a video VAE without considering the compatibility with existing T2I models will result in a latent space gap between them, which will take huge computational resources for training to bridge the gap even with the T2I models as initialization. To address this issue, we propose a method for training a video VAE of latent video models, namely CV-VAE, whose latent space is compatible with that of a given image VAE, e.g., image VAE of Stable Diffusion (SD). The compatibility is achieved by the proposed novel latent space regularization, which involves formulating a regularization loss using the image VAE. Benefiting from the latent space compatibility, video models can be trained seamlessly from pre-trained T2I or video models in a truly spatio-temporally compressed latent space, rather than simply sampling video frames at equal intervals. With our CV-VAE, existing video models can generate four times more frames with minimal finetuning. Extensive experiments are conducted to demonstrate the effectiveness of the proposed video VAE. △ Less

Submitted 30 May, 2024; originally announced May 2024.

Comments: Project Page: https://ailab-cvc.github.io/cvvae/index.html

arXiv:2405.19958 [pdf, other]

Multi-Aspect Controllable Text Generation with Disentangled Counterfactual Augmentation

Authors: Yi Liu, Xiangyu Liu, Xiangrong Zhu, Wei Hu

Abstract: Multi-aspect controllable text generation aims to control the generated texts in attributes from multiple aspects (e.g., "positive" from sentiment and "sport" from topic). For ease of obtaining training samples, existing works neglect attribute correlations formed by the intertwining of different attributes. Particularly, the stereotype formed by imbalanced attribute correlations significantly aff… ▽ More Multi-aspect controllable text generation aims to control the generated texts in attributes from multiple aspects (e.g., "positive" from sentiment and "sport" from topic). For ease of obtaining training samples, existing works neglect attribute correlations formed by the intertwining of different attributes. Particularly, the stereotype formed by imbalanced attribute correlations significantly affects multi-aspect control. In this paper, we propose MAGIC, a new multi-aspect controllable text generation method with disentangled counterfactual augmentation. We alleviate the issue of imbalanced attribute correlations during training using counterfactual feature vectors in the attribute latent space by disentanglement. During inference, we enhance attribute correlations by target-guided counterfactual augmentation to further improve multi-aspect control. Experiments show that MAGIC outperforms state-of-the-art baselines in both imbalanced and balanced attribute correlation scenarios. Our source code and data are available at https://github.com/nju-websoft/MAGIC. △ Less

Submitted 30 May, 2024; originally announced May 2024.

Comments: Accepted in the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

arXiv:2405.19782 [pdf, other]

Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion

Authors: Wei Cheng, Yuhan Wu, Wei Hu

Abstract: Recent years have witnessed the deployment of code language models (LMs) in various code intelligence tasks such as code completion. Yet, it is challenging for pre-trained LMs to generate correct completions in private repositories. Previous studies retrieve cross-file context based on import relations or text similarity, which is insufficiently relevant to completion targets. In this paper, we pr… ▽ More Recent years have witnessed the deployment of code language models (LMs) in various code intelligence tasks such as code completion. Yet, it is challenging for pre-trained LMs to generate correct completions in private repositories. Previous studies retrieve cross-file context based on import relations or text similarity, which is insufficiently relevant to completion targets. In this paper, we propose a dataflow-guided retrieval augmentation approach, called DraCo, for repository-level code completion. DraCo parses a private repository into code entities and establishes their relations through an extended dataflow analysis, forming a repo-specific context graph. Whenever triggering code completion, DraCo precisely retrieves relevant background knowledge from the repo-specific context graph and generates well-formed prompts to query code LMs. Furthermore, we construct a large Python dataset, ReccEval, with more diverse completion targets. Our experiments demonstrate the superior accuracy and applicable efficiency of DraCo, improving code exact match by 3.43% and identifier F1-score by 3.27% on average compared to the state-of-the-art approach. △ Less

Submitted 30 May, 2024; originally announced May 2024.

Comments: Accepted in the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

arXiv:2405.19315 [pdf, other]

Matryoshka Query Transformer for Large Vision-Language Models

Authors: Wenbo Hu, Zi-Yi Dou, Liunian Harold Li, Amita Kamath, Nanyun Peng, Kai-Wei Chang

Abstract: Large Vision-Language Models (LVLMs) typically encode an image into a fixed number of visual tokens (e.g., 576) and process these tokens with a language model. Despite their strong performance, LVLMs face challenges in adapting to varying computational constraints. This raises the question: can we achieve flexibility in the number of visual tokens to suit different tasks and computational resource… ▽ More Large Vision-Language Models (LVLMs) typically encode an image into a fixed number of visual tokens (e.g., 576) and process these tokens with a language model. Despite their strong performance, LVLMs face challenges in adapting to varying computational constraints. This raises the question: can we achieve flexibility in the number of visual tokens to suit different tasks and computational resources? We answer this with an emphatic yes. Inspired by Matryoshka Representation Learning, we introduce the Matryoshka Query Transformer (MQT), capable of encoding an image into m visual tokens during inference, where m can be any number up to a predefined maximum. This is achieved by employing a query transformer with M latent query tokens to compress the visual embeddings. During each training step, we randomly select m <= M latent query tokens and train the model using only these first m tokens, discarding the rest. Combining MQT with LLaVA, we train a single model once, and flexibly and drastically reduce the number of inference-time visual tokens while maintaining similar or better performance compared to training independent models for each number of tokens. Our model, MQT-LLAVA, matches LLaVA-1.5 performance across 11 benchmarks using a maximum of 256 tokens instead of LLaVA's fixed 576. Reducing to 16 tokens (8x less TFLOPs) only sacrifices the performance by 2.4 points on MMBench. On certain tasks such as ScienceQA and MMMU, we can even go down to only 2 visual tokens with performance drops of just 3% and 6% each. Our exploration of the trade-off between the accuracy and computational cost brought about by the number of visual tokens facilitates future research to achieve the best of both worlds. △ Less

Submitted 6 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: Preprint. Our code and model are publicly available at https://github.com/gordonhu608/MQT-LLaVA

arXiv:2405.18435 [pdf, other]

QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag , et al. (55 additional authors not shown)

Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de… ▽ More Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020 and 2021. The challenge focuses on the uncertainty quantification of medical image segmentation which considers the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations features various modalities such as MRI and CT; various organs such as the brain, prostate, kidney, and pancreas; and different image dimensions 2D-vs-3D. A total of 24 teams submitted different solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble model techniques. The obtained results indicate the importance of the ensemble models, as well as the need for further research to develop efficient 3D methods for uncertainty quantification methods in 3D segmentation tasks. △ Less

Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

Comments: initial technical report

arXiv:2405.17875 [pdf, other]

BO4IO: A Bayesian optimization approach to inverse optimization with uncertainty quantification

Authors: Yen-An Lu, Wei-Shou Hu, Joel A. Paulson, Qi Zhang

Abstract: This work addresses data-driven inverse optimization (IO), where the goal is to estimate unknown parameters in an optimization model from observed decisions that can be assumed to be optimal or near-optimal solutions to the optimization problem. The IO problem is commonly formulated as a large-scale bilevel program that is notoriously difficult to solve. Deviating from traditional exact solution m… ▽ More This work addresses data-driven inverse optimization (IO), where the goal is to estimate unknown parameters in an optimization model from observed decisions that can be assumed to be optimal or near-optimal solutions to the optimization problem. The IO problem is commonly formulated as a large-scale bilevel program that is notoriously difficult to solve. Deviating from traditional exact solution methods, we propose a derivative-free optimization approach based on Bayesian optimization, which we call BO4IO, to solve general IO problems. We treat the IO loss function as a black box and approximate it with a Gaussian process model. Using the predicted posterior function, an acquisition function is minimized at each iteration to query new candidate solutions and sequentially converge to the optimal parameter estimates. The main advantages of using Bayesian optimization for IO are two-fold: (i) it circumvents the need of complex reformulations of the bilevel program or specialized algorithms and can hence enable computational tractability even when the underlying optimization problem is nonconvex or involves discrete variables, and (ii) it allows approximations of the profile likelihood, which provide uncertainty quantification on the IO parameter estimates. We apply the proposed method to three computational case studies, covering different classes of forward optimization problems ranging from convex nonlinear to nonconvex mixed-integer nonlinear programs. Our extensive computational results demonstrate the efficacy and robustness of BO4IO to accurately estimate unknown model parameters from small and noisy datasets. In addition, the proposed profile likelihood analysis has proven to be effective in providing good approximations of the confidence intervals on the parameter estimates and assessing the identifiability of the unknown parameters. △ Less

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.17811 [pdf, other]

Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh

Authors: Xiangjun Gao, Xiaoyu Li, Yiyu Zhuang, Qi Zhang, Wenbo Hu, Chaopeng Zhang, Yao Yao, Ying Shan, Long Quan

Abstract: Neural 3D representations such as Neural Radiance Fields (NeRF), excel at producing photo-realistic rendering results but lack the flexibility for manipulation and editing which is crucial for content creation. Previous works have attempted to address this issue by deforming a NeRF in canonical space or manipulating the radiance field based on an explicit mesh. However, manipulating NeRF is not hi… ▽ More Neural 3D representations such as Neural Radiance Fields (NeRF), excel at producing photo-realistic rendering results but lack the flexibility for manipulation and editing which is crucial for content creation. Previous works have attempted to address this issue by deforming a NeRF in canonical space or manipulating the radiance field based on an explicit mesh. However, manipulating NeRF is not highly controllable and requires a long training and inference time. With the emergence of 3D Gaussian Splatting (3DGS), extremely high-fidelity novel view synthesis can be achieved using an explicit point-based 3D representation with much faster training and rendering speed. However, there is still a lack of effective means to manipulate 3DGS freely while maintaining rendering quality. In this work, we aim to tackle the challenge of achieving manipulable photo-realistic rendering. We propose to utilize a triangular mesh to manipulate 3DGS directly with self-adaptation. This approach reduces the need to design various algorithms for different types of Gaussian manipulation. By utilizing a triangle shape-aware Gaussian binding and adapting method, we can achieve 3DGS manipulation and preserve high-fidelity rendering after manipulation. Our approach is capable of handling large deformations, local manipulations, and soft body simulations while kee** high-quality rendering. Furthermore, we demonstrate that our method is also effective with inaccurate meshes extracted from 3DGS. Experiments conducted demonstrate the effectiveness of our method and its superiority over baseline approaches. △ Less

Submitted 28 May, 2024; originally announced May 2024.

Comments: Project page here: https://gaoxiangjun.github.io/mani_gs/

arXiv:2405.17347 [pdf, other]

Comprehensive analysis of local and nonlocal amplitudes in the $B^0\rightarrow K^{*0}μ^+μ^-$ decay

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1070 additional authors not shown)

Abstract: A comprehensive study of the local and nonlocal amplitudes contributing to the decay $B^0\rightarrow K^{*0}(\to K^+π^-) μ^+μ^-$ is performed by analysing the phase-space distribution of the decay products. The analysis is based on \proton\proton collision data corresponding to an integrated luminosity of 8.4fb$^{-1}$ collected by the LHCb experiment. This measurement employs for the first time a m… ▽ More A comprehensive study of the local and nonlocal amplitudes contributing to the decay $B^0\rightarrow K^{*0}(\to K^+π^-) μ^+μ^-$ is performed by analysing the phase-space distribution of the decay products. The analysis is based on \proton\proton collision data corresponding to an integrated luminosity of 8.4fb$^{-1}$ collected by the LHCb experiment. This measurement employs for the first time a model of both one-particle and two-particle nonlocal amplitudes, and utilises the complete dimuon mass spectrum without any veto regions around the narrow charmonium resonances. In this way it is possible to explicitly isolate the local and nonlocal contributions and capture the interference between them. The results show that interference with nonlocal contributions, although larger than predicted, only has a minor impact on the Wilson Coefficients determined from the fit to the data. For the local contributions, the Wilson Coefficient $C_9$, responsible for vector dimuon currents, exhibits a $2.1σ$ deviation from the Standard Model expectation. The Wilson Coefficients $C_{10}$, $C_{9}'$ and $C_{10}'$ are all in better agreement than $C_{9}$ with the Standard Model and the global significance is at the level of $1.5σ$. The model used also accounts for nonlocal contributions from $B^{0}\to K^{*0}\left[τ^+τ^-\to μ^+μ^-\right]$ rescattering, resulting in the first direct measurement of the $b sττ$ vector effective-coupling $C_{9τ}$. △ Less

Submitted 27 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-011.html (LHCb public pages)

Report number: LHCb-PAPER-2024-011, CERN-EP-2024-122

arXiv:2405.16209 [pdf]

Analytical photoresponses of Schottky contact MoS2 phototransistors

Authors: Jianyong Wei, Yumeng Liu, Yizhuo Wang, Kai Li, Zhentao Lian, Maosong Xie, Xinhan Yang, Seyed Saleh Mousavi Khaleghi, Fuxing Dai, Weida Hu, Xuejiao Gao, Rui Yang, Ya** Dan

Abstract: High-gain photodetectors based on two-dimensional (2D) semiconductors, in particular those in photoconductive mode, have been extensively investigated in the past decade. However, the classical photoconductive theory was derived on two misplaced assumptions. In this work, we established an explicit analytical device model for Schottky contact MoS2 phototransistors that fits well with experimental… ▽ More High-gain photodetectors based on two-dimensional (2D) semiconductors, in particular those in photoconductive mode, have been extensively investigated in the past decade. However, the classical photoconductive theory was derived on two misplaced assumptions. In this work, we established an explicit analytical device model for Schottky contact MoS2 phototransistors that fits well with experimental data. From the fitting results, we found that the Richardson constant of the MoS2 Schottky contact is temperature dependent, indicating that the Schottky contacts for the 2D material is best described by the mixed thermionic emission and diffusion model. Based on this device model, we further established an analytical photoresponse for the few-layer MoS2 phototransistors, from which we found the voltage distribution on the two Schottky contacts and the channel, and extracted the minority carrier recombination lifetimes. The lifetimes are comparable with the values found from transient photoluminescence measurements, which therefore validates our analytical photoresponses for Schottky contact 2D semiconducting phototransistors. △ Less

Submitted 25 May, 2024; originally announced May 2024.

Comments: 15 pages, 6 figures

arXiv:2405.16122 [pdf, other]

Prompt Optimization with EASE? Efficient Ordering-aware Automated Selection of Exemplars

Authors: Zhaoxuan Wu, Xiaoqiang Lin, Zhongxiang Dai, Wenyang Hu, Yao Shu, See-Kiong Ng, Patrick Jaillet, Bryan Kian Hsiang Low

Abstract: Large language models (LLMs) have shown impressive capabilities in real-world applications. The capability of in-context learning (ICL) allows us to adapt an LLM to downstream tasks by including input-label exemplars in the prompt without model fine-tuning. However, the quality of these exemplars in the prompt greatly impacts performance, highlighting the need for an effective automated exemplar s… ▽ More Large language models (LLMs) have shown impressive capabilities in real-world applications. The capability of in-context learning (ICL) allows us to adapt an LLM to downstream tasks by including input-label exemplars in the prompt without model fine-tuning. However, the quality of these exemplars in the prompt greatly impacts performance, highlighting the need for an effective automated exemplar selection method. Recent studies have explored retrieval-based approaches to select exemplars tailored to individual test queries, which can be undesirable due to extra test-time computation and an increased risk of data exposure. Moreover, existing methods fail to adequately account for the impact of exemplar ordering on the performance. On the other hand, the impact of the instruction, another essential component in the prompt given to the LLM, is often overlooked in existing exemplar selection methods. To address these challenges, we propose a novel method named EASE, which leverages the hidden embedding from a pre-trained language model to represent ordered sets of exemplars and uses a neural bandit algorithm to optimize the sets of exemplars while accounting for exemplar ordering. Our EASE can efficiently find an ordered set of exemplars that performs well for all test queries from a given task, thereby eliminating test-time computation. Importantly, EASE can be readily extended to jointly optimize both the exemplars and the instruction. Through extensive empirical evaluations (including novel tasks), we demonstrate the superiority of EASE over existing methods, and reveal practical insights about the impact of exemplar selection on ICL, which may be of independent interest. Our code is available at https://github.com/ZhaoxuanWu/EASE-Prompt-Optimization. △ Less

Submitted 25 May, 2024; originally announced May 2024.

Comments: 23 pages, 1 figure, 23 tables

arXiv:2405.14628 [pdf, other]

Online robust estimation and bootstrap inference for function-on-scalar regression

Authors: Guanghui Cheng, Wenjuan Hu, Ruitao Lin, Chen Wang

Abstract: We propose a novel and robust online function-on-scalar regression technique via geometric median to learn associations between functional responses and scalar covariates based on massive or streaming datasets. The online estimation procedure, developed using the average stochastic gradient descent algorithm, offers an efficient and cost-effective method for analyzing sequentially augmented datase… ▽ More We propose a novel and robust online function-on-scalar regression technique via geometric median to learn associations between functional responses and scalar covariates based on massive or streaming datasets. The online estimation procedure, developed using the average stochastic gradient descent algorithm, offers an efficient and cost-effective method for analyzing sequentially augmented datasets, eliminating the need to store large volumes of data in memory. We establish the almost sure consistency, $L_p$ convergence, and asymptotic normality of the online estimator. To enable efficient and fast inference of the parameters of interest, including the derivation of confidence intervals, we also develop an innovative two-step online bootstrap procedure to approximate the limiting error distribution of the robust online estimator. Numerical studies under a variety of scenarios demonstrate the effectiveness and efficiency of the proposed online learning method. A real application analyzing PM$_{2.5}$ air-quality data is also included to exemplify the proposed online approach. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14545 [pdf, other]

A Cross-Field Fusion Strategy for Drug-Target Interaction Prediction

Authors: Hongzhi Zhang, Xiuwen Gong, Shirui Pan, Jia Wu, Bo Du, Wenbin Hu

Abstract: Drug-target interaction (DTI) prediction is a critical component of the drug discovery process. In the drug development engineering field, predicting novel drug-target interactions is extremely crucial.However, although existing methods have achieved high accuracy levels in predicting known drugs and drug targets, they fail to utilize global protein information during DTI prediction. This leads to… ▽ More Drug-target interaction (DTI) prediction is a critical component of the drug discovery process. In the drug development engineering field, predicting novel drug-target interactions is extremely crucial.However, although existing methods have achieved high accuracy levels in predicting known drugs and drug targets, they fail to utilize global protein information during DTI prediction. This leads to an inability to effectively predict interaction the interactions between novel drugs and their targets. As a result, the cross-field information fusion strategy is employed to acquire local and global protein information. Thus, we propose the siamese drug-target interaction SiamDTI prediction method, which utilizes a double channel network structure for cross-field supervised learning.Experimental results on three benchmark datasets demonstrate that SiamDTI achieves higher accuracy levels than other state-of-the-art (SOTA) methods on novel drugs and targets.Additionally, SiamDTI's performance with known drugs and targets is comparable to that of SOTA approachs. The code is available at https://anonymous.4open.science/r/DDDTI-434D. △ Less

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.14536 [pdf, other]

Regressor-free Molecule Generation to Support Drug Response Prediction

Authors: Kun Li, Xiuwen Gong, Shirui Pan, Jia Wu, Bo Du, Wenbin Hu

Abstract: Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results are heavily dependent on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, enabling sampling within the IC50 classification range. However, these methods fail to ensure the samplin… ▽ More Drug response prediction (DRP) is a crucial phase in drug discovery, and the most important metric for its evaluation is the IC50 score. DRP results are heavily dependent on the quality of the generated molecules. Existing molecule generation methods typically employ classifier-based guidance, enabling sampling within the IC50 classification range. However, these methods fail to ensure the sampling space range's effectiveness, generating numerous ineffective molecules. Through experimental and theoretical study, we hypothesize that conditional generation based on the target IC50 score can obtain a more effective sampling space. As a result, we introduce regressor-free guidance molecule generation to ensure sampling within a more effective space and support DRP. Regressor-free guidance combines a diffusion model's score estimation with a regression controller model's gradient based on number labels. To effectively map regression labels between drugs and cell lines, we design a common-sense numerical knowledge graph that constrains the order of text representations. Experimental results on the real-world dataset for the DRP task demonstrate our method's effectiveness in drug discovery. The code is available at:https://anonymous.4open.science/r/RMCD-DBD1. △ Less

Submitted 23 May, 2024; originally announced May 2024.

Comments: 22 pages, 7 figures, 9 tables,

arXiv:2405.13568 [pdf, other]

CPE-Identifier: Automated CPE identification and CVE summaries annotation with Deep Learning and NLP

Authors: Wanyu Hu, Vrizlynn L. L. Thing

Abstract: With the drastic increase in the number of new vulnerabilities in the National Vulnerability Database (NVD) every year, the workload for NVD analysts to associate the Common Platform Enumeration (CPE) with the Common Vulnerabilities and Exposures (CVE) summaries becomes increasingly laborious and slow. The delay causes organisations, which depend on NVD for vulnerability management and security me… ▽ More With the drastic increase in the number of new vulnerabilities in the National Vulnerability Database (NVD) every year, the workload for NVD analysts to associate the Common Platform Enumeration (CPE) with the Common Vulnerabilities and Exposures (CVE) summaries becomes increasingly laborious and slow. The delay causes organisations, which depend on NVD for vulnerability management and security measurement, to be more vulnerable to zero-day attacks. Thus, it is essential to come out with a technique and tool to extract the CPEs in the CVE summaries accurately and quickly. In this work, we propose the CPE-Identifier system, an automated CPE annotating and extracting system, from the CVE summaries. The system can be used as a tool to identify CPE entities from new CVE text inputs. Moreover, we also automate the data generating and labeling processes using deep learning models. Due to the complexity of the CVE texts, new technical terminologies appear frequently. To identify novel words in future CVE texts, we apply Natural Language Processing (NLP) Named Entity Recognition (NER), to identify new technical jargons in the text. Our proposed model achieves an F1 score of 95.48%, an accuracy score of 99.13%, a precision of 94.83%, and a recall of 96.14%. We show that it outperforms prior works on automated CVE-CPE labeling by more than 9% on all metrics. △ Less

Submitted 22 May, 2024; originally announced May 2024.

Comments: International Conference on Information Systems Security and Privacy 2024

arXiv:2405.13103 [pdf, other]

Search for the lepton-flavor violating decay $B^0_s\toφμ^\pmτ^\mp$

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper l… ▽ More A search for the lepton-flavor violating decays $B^0_s\toφμ^\pmτ^\mp$ is presented, using a sample of proton-proton collisions at center-of-mass energies of 7, 8, and 13 TeV, collected with the LHCb detector and corresponding to a total integrated luminosity of $9\,\text{fb}^{-1}$. The $τ$ leptons are selected using decays with three charged pions. No significant excess is observed, and an upper limit on the branching fraction is determined to be ${\cal B}( B^0_s\toφμ^\pmτ^\mp) < 1.0\times 10^{-5}$ at 90% confidence level. △ Less

Submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-006.html (LHCb public pages)

Report number: LHCb-PAPER-2024-006, CERN-EP-2024-114

arXiv:2405.12688 [pdf, other]

Study of $b$-hadron decays to $Λ_c^+ h^- h^{\prime -}$ final states

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1072 additional authors not shown)

Abstract: Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and… ▽ More Decays of $Ξ_b^-$ and $Ω_b^-$ baryons to $Λ_c^+ h^- h^{\prime -}$ final states, with $h^- h^{\prime -}$ being $π^-π^-$, $K^-π^-$ and $K^-K^-$ meson pairs, are searched for using data collected with the LHCb detector. The data sample studied corresponds to an integrated luminosity of $8.7\,\mathrm{fb}^{-1}$ of $pp$ collisions collected at centre-of-mass energies $\sqrt{s} = 7$, $8$ and $13\,\mathrm{Te\kern -0.1em V}$. The products of the relative branching fractions and fragmentation fractions for each signal mode, relative to the $B^- \to Λ_c^+ \overline{p} π^-$ mode, are measured, with $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$, $Ξ_{b}^- \toΛ_{c}^+ K^- K^-$ and $Ω_{b}^- \toΛ_{c}^+ K^- K^-$ decays being observed at over $5\,σ$ significance. The $Ξ_{b}^- \toΛ_{c}^+ K^- π^-$ mode is also used to measure the $Ξ_{b}^-$ production asymmetry, which is found to be consistent with zero. In addition, the $B^- \to Λ_{c}^+ \overline{p} K^-$ decay is observed for the first time, and its branching fraction is measured relative to that of the $B^- \to Λ_{c}^+ \overline{p} π^-$ mode. △ Less

Submitted 22 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://cern.ch/lhcbproject/Publications/p/LHCb-PAPER-2024-013.html

Report number: CERN-EP-2024-116, LHCb-PAPER-2024-013

arXiv:2405.11456 [pdf, other]

Biometrics-Based Authenticated Key Exchange with Multi-Factor Fuzzy Extractor

Authors: Hong Yen Tran, Jiankun Hu, Wen Hu

Abstract: Existing fuzzy extractors and similar methods provide an effective way for extracting a secret key from a user's biometric data, but are susceptible to impersonation attack: once a valid biometric sample is captured, the scheme is no longer secure. We propose a novel multi-factor fuzzy extractor that integrates both a user's secret (e.g., a password) and a user's biometrics in the generation and r… ▽ More Existing fuzzy extractors and similar methods provide an effective way for extracting a secret key from a user's biometric data, but are susceptible to impersonation attack: once a valid biometric sample is captured, the scheme is no longer secure. We propose a novel multi-factor fuzzy extractor that integrates both a user's secret (e.g., a password) and a user's biometrics in the generation and reconstruction process of a cryptographic key. We then employ this multi-factor fuzzy extractor to construct personal identity credentials which can be used in a new multi-factor authenticated key exchange protocol that possesses multiple important features. First, the protocol provides mutual authentication. Second, the user and service provider can authenticate each other without the involvement of the identity authority. Third, the protocol can prevent user impersonation from a compromised identity authority. Finally, even when both a biometric sample and the secret are captured, the user can re-register to create a new credential using a new secret (reusable/reissued identity credentials). Most existing works on multi-factor authenticated key exchange only have a subset of these features. We formally prove that the proposed protocol is semantically secure. Our experiments carried out on the finger vein dataset SDUMLA achieved a low equal error rate (EER) of 0.04%, a reasonable averaged computation time of 0.93 seconds for the user and service provider to authenticate and establish a shared session key, and a small communication overhead of only 448 bytes. △ Less

Submitted 19 May, 2024; originally announced May 2024.

Comments: 17 pages

arXiv:2405.11324 [pdf, other]

Transverse polarization measurement of $Λ$ hyperons in $p$Ne collisions at $\sqrt{s_{NN}}$ = 68.4 GeV with the $\mbox{LHCb}$ detector

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1065 additional authors not shown)

Abstract: A measurement of the transverse polarization of the $Λ$ and $\barΛ$ hyperons in $p$Ne fixed-target collisions at $\sqrt{s_{NN}}$ = 68.4 GeV is presented using data collected by the LHCb detector. The polarization is studied using the decay $Λ\rightarrow p π^-$ together with its charge conjugated process, the integrated values measured are… ▽ More A measurement of the transverse polarization of the $Λ$ and $\barΛ$ hyperons in $p$Ne fixed-target collisions at $\sqrt{s_{NN}}$ = 68.4 GeV is presented using data collected by the LHCb detector. The polarization is studied using the decay $Λ\rightarrow p π^-$ together with its charge conjugated process, the integrated values measured are $$ P_Λ = 0.029 \pm 0.019 \, (\rm{stat}) \pm 0.012 \, (\rm{syst}) \, , $$ $$ P_{\barΛ} = 0.003 \pm 0.023 \, (\rm{stat}) \pm 0.014 \,(\rm{syst}) \,. $$ Furthermore, the results are shown as a function of the Feynman~$x$~variable, transverse momentum, pseudorapidity and rapidity of the hyperons, and are compared with previous measurements. △ Less

Submitted 24 May, 2024; v1 submitted 18 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lbfence.cern.ch/alcm/public/analysis/full-details/3120 (LHCb public pages)

Report number: CERN-EP-2024-121, LHCb-PAPER-2024-009

arXiv:2405.11001 [pdf, other]

Using physics-based simulation towards eliminating empiricism in extraterrestrial terramechanics applications

Authors: Wei Hu, Pei Li, Arno Rogg, Alexander Schepelmann, Colin Creager, Samuel Chandler, Ken Kamrin, Dan Negrut

Abstract: Recently, there has been a surge of international interest in extraterrestrial exploration targeting the Moon, Mars, the moons of Mars, and various asteroids. This contribution discusses how current state-of-the-art Earth-based testing for designing rovers and landers for these missions currently leads to overly optimistic conclusions about the behavior of these devices upon deployment on the targ… ▽ More Recently, there has been a surge of international interest in extraterrestrial exploration targeting the Moon, Mars, the moons of Mars, and various asteroids. This contribution discusses how current state-of-the-art Earth-based testing for designing rovers and landers for these missions currently leads to overly optimistic conclusions about the behavior of these devices upon deployment on the targeted celestial bodies. The key misconception is that gravitational offset is necessary during the \textit{terramechanics} testing of rover and lander prototypes on Earth. The body of evidence supporting our argument is tied to a small number of studies conducted during parabolic flights and insights derived from newly revised scaling laws. We argue that what has prevented the community from fully diagnosing the problem at hand is the absence of effective physics-based models capable of simulating terramechanics under low gravity conditions. We developed such a physics-based simulator and utilized it to gauge the mobility of early prototypes of the Volatiles Investigating Polar Exploration Rover (VIPER), which is slated to depart for the Moon in November 2024. This contribution discusses the results generated by this simulator, how they correlate with physical test results from the NASA-Glenn SLOPE lab, and the fallacy of the gravitational offset in rover and lander testing. The simulator developed is open sourced and made publicly available for unfettered use; it can support principled studies that extend beyond trafficability analysis to provide insights into in-situ resource utilization activities, e.g., digging, bulldozing, and berming in low gravity. △ Less

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.10642 [pdf, other]

Hi-GMAE: Hierarchical Graph Masked Autoencoders

Authors: Chuang Liu, Zelin Yao, Yibing Zhan, Xueqi Ma, Dapeng Tao, Jia Wu, Wenbin Hu, Shirui Pan, Bo Du

Abstract: Graph Masked Autoencoders (GMAEs) have emerged as a notable self-supervised learning approach for graph-structured data. Existing GMAE models primarily focus on reconstructing node-level information, categorizing them as single-scale GMAEs. This methodology, while effective in certain contexts, tends to overlook the complex hierarchical structures inherent in many real-world graphs. For instance,… ▽ More Graph Masked Autoencoders (GMAEs) have emerged as a notable self-supervised learning approach for graph-structured data. Existing GMAE models primarily focus on reconstructing node-level information, categorizing them as single-scale GMAEs. This methodology, while effective in certain contexts, tends to overlook the complex hierarchical structures inherent in many real-world graphs. For instance, molecular graphs exhibit a clear hierarchical organization in the form of the atoms-functional groups-molecules structure. Hence, the inability of single-scale GMAE models to incorporate these hierarchical relationships often leads to their inadequate capture of crucial high-level graph information, resulting in a noticeable decline in performance. To address this limitation, we propose Hierarchical Graph Masked AutoEncoders (Hi-GMAE), a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs. First, Hi-GMAE constructs a multi-scale graph hierarchy through graph pooling, enabling the exploration of graph structures across different granularity levels. To ensure masking uniformity of subgraphs across these scales, we propose a novel coarse-to-fine strategy that initiates masking at the coarsest scale and progressively back-projects the mask to the finer scales. Furthermore, we integrate a gradual recovery strategy with the masking process to mitigate the learning challenges posed by completely masked subgraphs. Diverging from the standard graph neural network (GNN) used in GMAE models, Hi-GMAE modifies its encoder and decoder into hierarchical structures. This entails using GNN at the finer scales for detailed local graph analysis and employing a graph transformer at coarser scales to capture global information. Our experiments on 15 graph datasets consistently demonstrate that Hi-GMAE outperforms 17 state-of-the-art self-supervised competitors. △ Less

Submitted 17 May, 2024; originally announced May 2024.

Comments: 10 pages, 6 figures, 3 tables

arXiv:2405.10288 [pdf, other]

Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction

Authors: Jianhao Chen, Haoyuan Ouyang, Junyang Ren, Wentao Ding, Wei Hu, Yuzhong Qu

Abstract: Facts extraction is pivotal for constructing knowledge graphs. Recently, the increasing demand for temporal facts in downstream tasks has led to the emergence of the task of temporal fact extraction. In this paper, we specifically address the extraction of temporal facts from natural language text. Previous studies fail to handle the challenge of establishing time-to-fact correspondences in comple… ▽ More Facts extraction is pivotal for constructing knowledge graphs. Recently, the increasing demand for temporal facts in downstream tasks has led to the emergence of the task of temporal fact extraction. In this paper, we specifically address the extraction of temporal facts from natural language text. Previous studies fail to handle the challenge of establishing time-to-fact correspondences in complex sentences. To overcome this hurdle, we propose a timeline-based sentence decomposition strategy using large language models (LLMs) with in-context learning, ensuring a fine-grained understanding of the timeline associated with various facts. In addition, we evaluate the performance of LLMs for direct temporal fact extraction and get unsatisfactory results. To this end, we introduce TSDRE, a method that incorporates the decomposition capabilities of LLMs into the traditional fine-tuning of smaller pre-trained language models (PLMs). To support the evaluation, we construct ComplexTRED, a complex temporal fact extraction dataset. Our experiments show that TSDRE achieves state-of-the-art results on both HyperRED-Temporal and ComplexTRED datasets. △ Less

Submitted 18 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

Comments: Accepted to ACL2024 main conference

arXiv:2405.07303 [pdf, other]

Search for solar axions by Primakoff effect with the full dataset of the CDEX-1B Experiment

Authors: L. T. Yang, S. K. Liu, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar , et al. (61 additional authors not shown)

Abstract: We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China **** Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axio… ▽ More We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China **** Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axions with mass up to 100 eV/$c^2$. Within the hadronic model of KSVZ, our results exclude axion mass $>5.3~\rm{eV}/c^2$ at 95\% C.L. △ Less

Submitted 12 May, 2024; originally announced May 2024.

Comments: 7 pages, 5 figures

arXiv:2405.06556 [pdf, other]

Search for time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays

Authors: LHCb collaboration, R. Aaij, A. S. W. Abdelmotteleb, C. Abellan Beteta, F. Abudinén, T. Ackernley, A. A. Adefisoye, B. Adeva, M. Adinolfi, P. Adlarson, C. Agapopoulou, C. A. Aidala, Z. Ajaltouni, S. Akar, K. Akiba, P. Albicocco, J. Albrecht, F. Alessio, M. Alexander, Z. Aliouche, P. Alvarez Cartelle, R. Amalric, S. Amato, J. L. Amey, Y. Amhis , et al. (1062 additional authors not shown)

Abstract: A measurement of time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays using a $pp$ collision data sample collected by the LHCb experiment in 2012 and from 2015 to 2018, corresponding to an integrated luminosity of 7.7$\,\mathrm{fb}^{-1}$, is presented. The initial flavour of each $D^0$ candidate is determined from the charge of the pion produced in the… ▽ More A measurement of time-dependent $CP$ violation in $D^0 \rightarrow π^+ π^- π^0$ decays using a $pp$ collision data sample collected by the LHCb experiment in 2012 and from 2015 to 2018, corresponding to an integrated luminosity of 7.7$\,\mathrm{fb}^{-1}$, is presented. The initial flavour of each $D^0$ candidate is determined from the charge of the pion produced in the $D^*(2010)^+ \rightarrow D^0 π^+$ decay. The decay $D^0 \rightarrow K^- π^+ π^0$ is used as a control channel to validate the measurement procedure. The gradient of the time-dependent $CP$ asymmetry, $ΔY$, in $D^0 \rightarrow π^+ π^- π^0$ decays is measured to be \begin{equation*} ΔY = (-1.3 \pm 6.3 \pm 2.4) \times 10^{-4}, \end{equation*} where the first uncertainty is statistical and the second is systematic, which is compatible with $CP$ conservation. △ Less

Submitted 10 May, 2024; originally announced May 2024.

Comments: All figures and tables, along with any supplementary material and additional information, are available at https://lhcbproject.web.cern.ch/Publications/p/LHCb-PAPER-2024-003.html (LHCb public pages)

Report number: LHCb-PAPER-2024-003, CERN-EP-2024-111

arXiv:2405.06424 [pdf, other]

Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation

Authors: JoonHo Lee, Jae Oh Woo, Juree Seok, Parisa Hassanzadeh, Wooseok Jang, JuYoun Son, Sima Didari, Baruch Gutow, Heng Hao, Hankyu Moon, Wenjun Hu, Yeong-Dae Kwon, Taehee Lee, Seungjai Min

Abstract: Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for t… ▽ More Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for the quality of paired responses based on Bayesian approximation. Trained with preference datasets, our uncertainty-enabled proxy not only scores rewards for responses but also evaluates their inherent uncertainty. Empirical results demonstrate significant benefits of incorporating the proposed proxy into language model training. Our method boosts the instruction following capability of language models by refining data curation for training and improving policy optimization objectives, thereby surpassing existing methods by a large margin on benchmarks such as Vicuna and MT-bench. These findings highlight that our proposed approach substantially advances language model training and paves a new way of harnessing uncertainty within language models. △ Less

Submitted 19 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Comments: Accepted to ICML 2024

arXiv:2405.05708 [pdf]

Characteristic-Mode Based Conformal Design of Ultra-Wideband Antenna Array

Authors: Zhan Chen, Wei Hu, Yuchen Gao, Qi Luo, Xiangbo Wang, Steven Gao

Abstract: An innovative design method of conformal array antennas is presented by utilizing characteristic mode analysis (CMA) in this work. A single-layer continuous perfect electric conductor under bending conditions is conducted by CMA to evaluate the variations in operating performance. By using this method, the design process of a conformal array is simplified. The results indicate that the operating p… ▽ More An innovative design method of conformal array antennas is presented by utilizing characteristic mode analysis (CMA) in this work. A single-layer continuous perfect electric conductor under bending conditions is conducted by CMA to evaluate the variations in operating performance. By using this method, the design process of a conformal array is simplified. The results indicate that the operating performance of the antenna with single-layer metal radiation structure remains stable within a certain range of curvature. Subsequently, an infinite array element using single-layer metal radiation structure is designed, operating in ultra-wideband and dual polarization. Following, an 8 * 8 ultra-wideband dual-polarized cylindrical-conformal array (UDCA) is developed by wrap** the planar arrays to a cylindric surface, which has a stable operating performance even at a curvature radius as small as 100 mm. Finally, a physical prototype is cost-effectively fabricated by novel manufacturing solutions that stack three-layer conformal substrate. The experimental result demonstrates that the proposed UDCA with a 1.2λ curvature radius operates at 3.6~9.6 GHz (90.9%) and achieves 60° wide-angle scanning in two principal planes, which provides a practical and promising solution for conformal array applications. The insights derived from the CMA offer a direction for further advancement in conformal antenna research. △ Less

Submitted 9 May, 2024; originally announced May 2024.

Showing 1–50 of 1,674 results for author: Hu, W