-
Additive manufacturing in ceramics: targeting lightweight mirror applications in the visible, ultraviolet and X-ray
Authors:
Carolyn Atkins,
Younes Chahid,
Gregory Lister,
Rhys Tuck,
David Isherwood,
Nan Yu,
Rongyan Sun,
Itsuki Noto,
Kazuya Yamamura,
Marta Civitani,
Gabriele Vecchi,
Giovanni Pareschi,
Simon G. Alcock,
Ioana-Theodora Nistea,
Murilo Bazan Da Silva
Abstract:
Additive manufacturing (AM; 3D printing) has clear benefits in the production of lightweight mirrors for astronomy: it can create optimised lightweight structures and combine multiple components into one. New capabilities in AM ceramics, silicon carbide infiltrated with silicon and fused silica, offer the possibility to combine the design benefits of AM with a material suitable for visible, ultraviolet and X-ray applications. This paper will introduce the printing methods and post-processing steps to convert AM ceramic samples into reflective mirrors. Surface roughness measurements after abrasive polishing of the AM ceramics will be presented.
Submitted 7 July, 2024;
originally announced July 2024.
-
The S-PLUS Ultra-Short Survey: first data release
Authors:
Hélio D. Perottoni,
Vinicius M. Placco,
Felipe Almeida-Fernandes,
Fábio R. Herpich,
Silvia Rossi,
Timothy C. Beers,
Rodolfo Smiljanic,
João A. S. Amarante,
Guilherme Limberg,
Ariel Werle,
Helio J. Rocha-Pinto,
Leandro Beraldo e Silva,
Simone Daflon,
Alvaro Alvarez-Candal,
Gustavo B Oliveira Schwarz,
William Schoenell,
Tiago Ribeiro,
Antonio Kanaan
Abstract:
This paper presents the first public data release of the S-PLUS Ultra-Short Survey (USS), a photometric survey with short exposure times, covering approximately 9300 deg$^{2}$ of the Southern sky. The USS utilizes the Javalambre 12-band magnitude system, including narrow-, medium-, and broad-band filters targeting prominent stellar spectral features. The primary objective of the USS is to identify bright, extremely metal-poor (EMP; [Fe/H] $\leq -3$) and ultra metal-poor (UMP; [Fe/H] $\leq -4$) stars for further analysis using medium- and high-resolution spectroscopy. This paper provides an overview of the survey observations, calibration method, data quality, and data products. Additionally, it presents the selection of EMP and UMP candidates. The data from the USS were reduced and calibrated using the same methods as presented in the S-PLUS DR2. An additional step was introduced, accounting for the offset between the observed magnitudes of the USS and the predicted magnitudes from the very low-resolution Gaia XP spectra. This first release contains data for 163 observed fields totaling $\sim$324 deg$^{2}$ along the Celestial Equator. The magnitudes obtained from the USS are well-calibrated, showing a difference of $\sim 15$ mmag compared to the magnitudes predicted by the GaiaXPy toolkit. By combining colors and magnitudes, 140 candidates for EMP or UMP have been identified for follow-up studies. The S-PLUS USS DR1 is an important milestone in the search for bright metal-poor stars, with magnitudes in the range $10 < r \leq 14$. The USS is an ongoing survey; in the near future, it will provide many more bright metal-poor candidate stars for spectroscopic follow-up.
Submitted 6 July, 2024;
originally announced July 2024.
-
xTower: A Multilingual LLM for Explaining and Correcting Translation Errors
Authors:
Marcos Treviso,
Nuno M. Guerreiro,
Sweta Agrawal,
Ricardo Rei,
José Pombal,
Tania Vaz,
Helena Wu,
Beatriz Silva,
Daan van Stigt,
André F. T. Martins
Abstract:
While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies. Understanding these errors can potentially help improve the translation quality and user experience. This paper introduces xTower, an open large language model (LLM) built on top of TowerBase designed to provide free-text explanations for translation errors in order to guide the generation of a corrected translation. The quality of the explanations generated by xTower is assessed via both intrinsic and extrinsic evaluation. Intrinsically, we ask expert translators to evaluate the quality of the explanations across two dimensions: relatedness to the error span being explained and helpfulness in understanding the error and improving translation quality. Extrinsically, we test xTower across various experimental setups in generating translation corrections, demonstrating significant improvements in translation quality. Our findings highlight xTower's potential not only to produce plausible and helpful explanations of automatic translations, but also to leverage them to suggest corrected translations.
Submitted 27 June, 2024;
originally announced June 2024.
-
Semi-supervised classification of dental conditions in panoramic radiographs using large language model and instance segmentation: A real-world dataset evaluation
Authors:
Bernardo Silva,
Jefferson Fontinele,
Carolina Letícia Zilli Vieira,
João Manuel R. S. Tavares,
Patricia Ramos Cury,
Luciano Oliveira
Abstract:
Dental panoramic radiographs offer vast diagnostic opportunities, but training supervised deep learning networks for automatic analysis of those radiology images is hampered by a shortage of labeled data. Here, a different perspective on this problem is introduced. A semi-supervised learning framework is proposed to classify thirteen dental conditions on panoramic radiographs, with a particular emphasis on teeth. Large language models were explored to annotate the most common dental conditions based on dental reports. Additionally, a masked autoencoder was employed to pre-train the classification neural network, and a Vision Transformer was used to leverage the unlabeled data. The analyses were validated using two of the most extensive datasets in the literature, comprising 8,795 panoramic radiographs and 8,029 paired reports and images. Encouragingly, the results consistently met or surpassed the baseline metrics for the Matthews correlation coefficient. A comparison of the proposed solution with human practitioners, supported by statistical analysis, highlighted its effectiveness and performance limitations; based on the degree of agreement among specialists, the solution demonstrated an accuracy level comparable to that of a junior specialist.
Submitted 25 June, 2024;
originally announced June 2024.
-
Position: Benchmarking is Limited in Reinforcement Learning Research
Authors:
Scott M. Jordan,
Adam White,
Bruno Castro da Silva,
Martha White,
Philip S. Thomas
Abstract:
Novel reinforcement learning algorithms, or improvements on existing ones, are commonly justified by evaluating their performance on benchmark environments and are compared to an ever-changing set of standard algorithms. However, despite numerous calls for improvements, experimental practices continue to produce misleading or unsupported claims. One reason for the ongoing substandard practices is that conducting rigorous benchmarking experiments requires substantial computational time. This work investigates the sources of increased computation costs in rigorous experiment designs. We show that conducting rigorous performance benchmarks will likely have computational costs that are often prohibitive. As a result, we argue for using an additional experimentation paradigm to overcome the limitations of benchmarking.
Submitted 23 June, 2024;
originally announced June 2024.
-
Quantum Mechanics of Particles Constrained to Spiral Curves with Application to Polyene Chains
Authors:
Eduardo V. S. Anjos,
Antonio C. Pavão,
Luiz C. B. da Silva,
Cristiano C. Bastos
Abstract:
Context: Due to advances in synthesizing lower-dimensional materials, there is the challenge of finding the wave equation that effectively describes quantum particles moving on 1D and 2D domains. Jensen and Koppe and Da Costa independently introduced a confining potential formalism showing that the effective constrained dynamics is subject to a scalar geometry-induced potential; for the confinement to a curve, the potential depends on the curve's curvature function.
Method: To characterize the $π$ electrons in polyenes, we follow two approaches. First, we utilize a weakened Coulomb potential associated with a spiral curve. The solution to the Schrödinger equation with Dirichlet boundary conditions yields Bessel functions, and the spectrum is obtained analytically. In the second approach, we employ the particle-in-a-box model, incorporating effective mass corrections. The $π$-$π^*$ transitions of polyenes calculated with both approaches agree well with experiment, although the wave functions differ.
Submitted 6 June, 2024; v1 submitted 5 June, 2024;
originally announced June 2024.
-
Segmentation of dense and multi-species bacterial colonies achieved using models trained on synthetic microscopy images
Authors:
Vincent Hickl,
Abid Khan,
René M. Rossi,
Bruno F. B. Silva,
Katharina Maniura-Weber
Abstract:
The spread of microbial infections is governed by the self-organization of bacteria on surfaces. Limitations of live imaging techniques make collective behaviors in clinically relevant systems challenging to quantify. Here, novel experimental and image analysis techniques for high-fidelity single-cell segmentation of bacterial colonies are developed. Machine learning-based segmentation models are trained solely using synthetic microscopy images that are processed to look realistic using state-of-the-art image-to-image translation methods, requiring no biophysical modeling. Accurate single-cell segmentation is achieved for densely packed single-species colonies and multi-species colonies of common pathogenic bacteria, even under suboptimal imaging conditions and for both brightfield and confocal laser scanning microscopy. The resulting data provide quantitative insights into the self-organization of bacteria on soft surfaces. Thanks to their high adaptability and relatively simple implementation, these methods promise to greatly facilitate quantitative descriptions of bacterial infections in varied environments.
Submitted 14 June, 2024; v1 submitted 20 May, 2024;
originally announced May 2024.
-
The Analysis of Criminal Recidivism: A Hierarchical Model-Based Approach for the Analysis of Zero-Inflated, Spatially Correlated recurrent events Data
Authors:
Alisson C. C. Silva,
Fábio N. Demarqui,
Bráulio F. Silva,
Marcos O. Prates
Abstract:
The life course perspective in criminology has become prominent in recent years, offering valuable insights into various patterns of criminal offending and pathways. The study of criminal trajectories aims to understand the onset, persistence, and desistance in crime, providing intriguing explanations about these moments in life. Central to this analysis is the identification of patterns in the frequency of criminal victimization and recidivism, along with the factors that contribute to them. Specifically, this work introduces a new class of models that overcome limitations in traditional methods used to analyze criminal recidivism. These models are designed for recurrent events data characterized by an excess of zeros and spatial correlation. They extend the Non-Homogeneous Poisson Process, incorporating spatial dependence in the model through random effects, enabling the analysis of associations among individuals within the same spatial stratum. To deal with the excess of zeros in the data, a zero-inflated Poisson mixed model was incorporated. In addition to parametric models following the Power Law process for baseline intensity functions, we propose flexible semi-parametric versions approximating the intensity function using Bernstein Polynomials. The Bayesian approach offers advantages such as incorporating external evidence and modeling specific correlations between random effects and observed data. The performance of these models was evaluated in a simulation study with various scenarios, and we applied them to analyze criminal recidivism data in the Metropolitan Region of Belo Horizonte, Brazil. The results provide a detailed analysis of high-risk areas for recurrent crimes and the behavior of recidivism rates over time. This research significantly enhances our understanding of criminal trajectories, paving the way for more effective strategies in combating criminal recidivism.
Submitted 4 May, 2024;
originally announced May 2024.
-
Early flash-ionization lines in SN 2024ggi revealed by high-resolution spectroscopy
Authors:
Thallis Pessi,
Régis Cartier,
Emilio Hueichapan,
Danielle de Brito Silva,
Jose L. Prieto,
Ricardo R. Muñoz,
Gustavo E. Medina,
Paula Diaz
Abstract:
We present an analysis of very early high-resolution spectroscopic observations of the nearby core-collapse (CC) supernova (SN) 2024ggi, a Type II SN that occurred in the galaxy NGC 3621, at a distance of 7.11 Mpc ($z\approx0.002435$). These observations represent the earliest high-resolution spectroscopy of a CCSN ever made. We analyze the very early-phase spectroscopic evolution of SN 2024ggi obtained in a short interval at 26.6 and 33.8 h after the SN first light. Observations were obtained with the high-resolution spectrograph MIKE ($R\approx22600-28000$) at the 6.5 m Magellan Clay Telescope, located at the Las Campanas Observatory, during the night of 2024-04-12 UT. We constrain emission line features in the early-phase spectroscopic evolution of SN 2024ggi. We analyze the evolution of the main spectroscopic features and the occurrence of high-ionization emission lines by estimating their full width at half maximum (FWHM), equivalent width (EW), and blueshift velocities. We then compare our results to other early-time observations of CCSNe. The spectra show strong and narrow features of Balmer emission lines and of high-ionization species of HeI, HeII, NIII, CIII, together with relatively broader emission features of NIV and CIV. Some of these features become broader or disappear in the interval of 8 h, indicating the rapid changes in the early evolution of CCSNe flash-ionization features. The HeII, CIV, NIV and Balmer emission lines have asymmetric Lorentzian profiles, with the HeII $\lambda4686$ broad component showing blue wings that extend up to $\sim-1000$ km s$^{-1}$. We also measure a CSM expansion velocity of $\sim 79 \ \textrm{km} \ \textrm{s}^{-1}$ from the blueshift in the H$α$ emission profile, and a total extinction in the line of sight of $E(B-V)=0.16$ mag. Finally, we note many similarities of SN 2024ggi to the early evolution of SN 2023ixf.
Submitted 3 May, 2024;
originally announced May 2024.
-
The Pristine Inner Galaxy Survey (PIGS) IX. The largest detailed chemical analysis of very metal-poor stars in the Sagittarius dwarf galaxy
Authors:
Federico Sestito,
Sara Vitali,
Paula Jofre,
Kim A. Venn,
David S. Aguado,
Claudia Aguilera-Gómez,
Anke Ardern-Arentsen,
Danielle de Brito Silva,
Raymond Carlberg,
Camilla J. L. Eldridge,
Felipe Gran,
Vanessa Hill,
Pascale Jablonka,
Georges Kordopatis,
Nicolas F. Martin,
Tadafumi Matsuno,
Samuel Rusterucci,
Else Starkenburg,
Akshara Viswanathan
Abstract:
The most metal-poor stars provide valuable insights into the early chemical enrichment history of a system, carrying the chemical imprints of the first generations of supernovae. The most metal-poor region of the Sagittarius dwarf galaxy remains inadequately observed and characterised. To date, only $\sim4$ stars with [Fe/H]~$<-2.0$ have been chemically analysed with high-resolution spectroscopy. In this study, we present the most extensive chemical abundance analysis of 12 low-metallicity stars with metallicities down to [Fe/H]~$=-3.26$ and located in the main body of Sagittarius. These targets, selected from the Pristine Inner Galaxy Survey, were observed using the MIKE high-resolution spectrograph at the Magellan-Clay telescope, which allowed us to measure up to 17 chemical species. The chemical composition of these stars reflects the imprint of a variety of type~II supernovae (SNe~II). A combination of low- to intermediate-mass high-energy SNe and hypernovae ($\sim10-70\,M_\odot$) is required to account for the abundance patterns of the lighter elements up to the Fe-peak. The trend of the heavy elements suggests the involvement of compact binary merger events and fast-rotating (up to $\sim300$ km s$^{-1}$) intermediate-mass to massive metal-poor stars ($\sim25-120\,M_\odot$) that are the sources of rapid and slow processes, respectively. Additionally, asymptotic giant branch stars contribute to a wide dispersion of [Ba/Mg] and [Ba/Eu]. The absence of an $α-$knee in our data indicates that type Ia supernovae did not contribute in the very metal-poor region ([Fe/H]~$\leq-2.0$). However, they might have started to pollute the interstellar medium at [Fe/H]~$>-2.0$, given the relatively low [Co/Fe] in this metallicity region.
Submitted 2 July, 2024; v1 submitted 30 April, 2024;
originally announced May 2024.
-
An adaptive hierarchical ensemble Kalman filter with reduced basis models
Authors:
Francesco A. B. Silva,
Cecilia Pagliantini,
Karen Veroy
Abstract:
The use of model order reduction techniques in combination with ensemble-based methods for estimating the state of systems described by nonlinear partial differential equations has been of great interest in recent years in the data assimilation community. Methods such as the multi-fidelity ensemble Kalman filter (MF-EnKF) and the multi-level ensemble Kalman filter (ML-EnKF) are recognized as state-of-the-art techniques. However, in many cases, the construction of low-fidelity models in an offline stage, before solving the data assimilation problem, prevents them from being both accurate and computationally efficient. In our work, we investigate the use of adaptive reduced basis techniques in which the approximation space is modified online based on the information that is extracted from a limited number of full order solutions and that is carried by the past models. This makes it possible to ensure both good accuracy and low cost for the employed models and thus improve the performance of the multi-fidelity and multi-level methods.
Submitted 15 April, 2024;
originally announced April 2024.
-
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
Authors:
Shreyas Chaudhari,
Pranjal Aggarwal,
Vishvak Murahari,
Tanmay Rajpurohit,
Ashwin Kalyan,
Karthik Narasimhan,
Ameet Deshpande,
Bruno Castro da Silva
Abstract:
State-of-the-art large language models (LLMs) have become indispensable tools for various tasks. However, training LLMs to serve as effective assistants for humans requires careful consideration. A promising approach is reinforcement learning from human feedback (RLHF), which leverages human feedback to update the model in accordance with human preferences and mitigate issues like toxicity and hallucinations. Yet, an understanding of RLHF for LLMs is largely entangled with initial design choices that popularized the method and current research focuses on augmenting those choices rather than fundamentally improving the framework. In this paper, we analyze RLHF through the lens of reinforcement learning principles to develop an understanding of its fundamentals, dedicating substantial focus to the core component of RLHF -- the reward model. Our study investigates modeling choices, caveats of function approximation, and their implications on RLHF training algorithms, highlighting the underlying assumptions made about the expressivity of reward. Our analysis improves the understanding of the role of reward models and methods for their training, concurrently revealing limitations of the current methodology. We characterize these limitations, including incorrect generalization, model misspecification, and the sparsity of feedback, along with their impact on the performance of a language model. The discussion and analysis are substantiated by a categorical review of current literature, serving as a reference for researchers and practitioners to understand the challenges of RLHF and build upon existing efforts.
Submitted 15 April, 2024; v1 submitted 12 April, 2024;
originally announced April 2024.
-
Control of the Schrödinger equation in $\mathbb{R}^3$: The critical case
Authors:
Pablo Braz e Silva,
Roberto de A. Capistrano-Filho,
Jackellyny Dassy do Nascimento Carvalho,
David dos Santos Ferreira
Abstract:
This article deals with the $\dot{H}^{1}$-level exact controllability for the defocusing critical nonlinear Schrödinger equation in $\mathbb{R}^3$. Firstly, we show the problem under consideration to be well-posed using Strichartz estimates. Moreover, through the Hilbert uniqueness method, we prove the linear Schrödinger equation to be controllable. Finally, we use a perturbation argument and show local exact controllability for the critical nonlinear Schrödinger equation.
Submitted 11 April, 2024;
originally announced April 2024.
-
Quantum tomography of structured light patterns from simple intensity measurements
Authors:
M. Gil de Oliveira,
A. L. S. Santos Junior,
P. M. R. Lima,
A. C. Barbosa,
B. Pinheiro da Silva,
S. Padua,
A. Z. Khoury
Abstract:
We study the tomography of spatial qudits encoded on structured light photons. While direct position measurements with cameras do not provide an informationally complete Positive Operator Valued Measure (POVM) in the space of fixed order modes, we complement this POVM with an astigmatic transformation. The enlarged POVM is informationally complete, allowing full characterization of the spatial quantum state from simple intensity measurements in both the intense and in the low photocount regimes. For intense light, the standard technique of linear inversion is used. For the low photocount regime, we employ Bayesian mean inference, and study how the quality of the tomographic reconstruction behaves as we increase the photocounts. In both cases, we also perform the tomography using a convolutional neural network, which displays an increased flexibility in exchange for a slightly lower quality reconstruction in some of the cases. These methods will be useful for classical and quantum communication with structured light.
Submitted 8 April, 2024;
originally announced April 2024.
-
Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning
Authors:
Nick Mecklenburg,
Yiyou Lin,
Xiaoxiao Li,
Daniel Holstein,
Leonardo Nunes,
Sara Malvar,
Bruno Silva,
Ranveer Chandra,
Vijay Aski,
Pavan Kumar Reddy Yannam,
Tolga Aktas,
Todd Hendry
Abstract:
In recent years, Large Language Models (LLMs) have shown remarkable performance in generating human-like text, proving to be a valuable asset across various applications. However, adapting these models to incorporate new, out-of-domain knowledge remains a challenge, particularly for facts and events that occur after the model's knowledge cutoff date. This paper investigates the effectiveness of Supervised Fine-Tuning (SFT) as a method for knowledge injection in LLMs, specifically focusing on the domain of recent sporting events. We compare different dataset generation strategies -- token-based and fact-based scaling -- to create training data that helps the model learn new information. Our experiments on GPT-4 demonstrate that while token-based scaling can lead to improvements in Q&A accuracy, it may not provide uniform coverage of new knowledge. Fact-based scaling, on the other hand, offers a more systematic approach to ensure even coverage across all facts. We present a novel dataset generation process that leads to more effective knowledge ingestion through SFT, and our results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge. This study contributes to the understanding of domain adaptation for LLMs and highlights the potential of SFT in enhancing the factuality of LLM responses in specific knowledge domains.
Submitted 2 April, 2024; v1 submitted 29 March, 2024;
originally announced April 2024.
-
Electrically Switchable Circular Photogalvanic Effect in Methylammonium Lead Iodide Microcrystals
Authors:
Yuqing Zhu,
Ziyi Song,
Rodrigo Becerra Silva,
Bob Minyu Wang,
Henry Clark Travaglini,
Andrew C Grieder,
Yuan **,
Liang Z. Tan,
Dong Yu
Abstract:
We investigate the circular photogalvanic effect (CPGE) in single-crystalline methylammonium lead iodide microcrystals under a static electric field. The external electric field can enhance the magnitude of the helicity-dependent photocurrent (HDPC) by two orders of magnitude and flip its sign, which we attribute to magnetic shift currents induced by the Rashba-Edelstein effect. This HDPC induced by the static electric field may be viewed as an unusually strong third-order photoresponse, which produces a current two orders of magnitude larger than the second-order injection current. Furthermore, the HDPC is highly nonlocal and can be created by photoexcitation outside the device channel, indicating a spin diffusion length up to 50 $μ$m at 78 K.
Submitted 22 March, 2024;
originally announced March 2024.
-
Curves and surfaces making a constant angle with a parallel transported direction in Riemannian spaces
Authors:
Luiz C. B. da Silva,
Gilson S. Ferreira Jr,
José D. da Silva
Abstract:
In the last two decades, much effort has been dedicated to studying curves and surfaces according to their angle with a given direction. However, most findings were obtained using a case-by-case approach, and it is often unclear what are consequences of the specificities of the ambient manifold and what could be generic. In this work, we propose a theoretical framework to unify parts of these findings. We study curves and surfaces by prescribing the angle they make with a parallel transported vector field. We show that the characterization of Euclidean helices in terms of their curvature and torsion is also valid in any Riemannian manifold. Among other properties, we prove that surfaces making a constant angle with a parallel transported direction are extrinsically flat ruled surfaces. We also investigate the relation between their geodesics and the so-called slant helices; we prove that surfaces of constant angle are the rectifying surface of a slant helix, i.e., the ruled surface with rulings given by the Darboux vector field of the directrix. We characterize rectifying surfaces of constant angle; in other words, when their geodesics are slant helices. As a corollary, we show that if every geodesic of a surface of constant angle is a slant helix, then the ambient manifold is flat. Finally, we characterize surfaces in the product of a Riemannian surface with the real line making a constant angle with the vertical real direction.
Submitted 15 March, 2024;
originally announced March 2024.
-
Exploring Optical Flow Inclusion into nnU-Net Framework for Surgical Instrument Segmentation
Authors:
Marcos Fernández-Rodríguez,
Bruno Silva,
Sandro Queirós,
Helena R. Torres,
Bruno Oliveira,
Pedro Morais,
Lukas R. Buschle,
Jorge Correia-Pinto,
Estevão Lima,
João L. Vilaça
Abstract:
Surgical instrument segmentation in laparoscopy is essential for computer-assisted surgical systems. Despite the Deep Learning progress in recent years, the dynamic setting of laparoscopic surgery still presents challenges for precise segmentation. The nnU-Net framework excelled in semantic segmentation analyzing single frames without temporal information. The framework's ease of use, including its ability to be automatically configured, and its low expertise requirements, have made it a popular base framework for comparisons. Optical flow (OF) is a tool commonly used in video tasks to estimate motion and represent it in a single frame, containing temporal information. This work seeks to employ OF maps as an additional input to the nnU-Net architecture to improve its performance in the surgical instrument segmentation task, taking advantage of the fact that instruments are the main moving objects in the surgical field. With this new input, the temporal component would be indirectly added without modifying the architecture. Using CholecSeg8k dataset, three different representations of movement were estimated and used as new inputs, comparing them with a baseline model. Results showed that the use of OF maps improves the detection of classes with high movement, even when these are scarce in the dataset. To further improve performance, future work may focus on implementing other OF-preserving augmentations.
Submitted 15 March, 2024;
originally announced March 2024.
-
A multi-cohort study on prediction of acute brain dysfunction states using selective state space models
Authors:
Brandon Silva,
Miguel Contreras,
Sabyasachi Bandyopadhyay,
Yuanfang Ren,
Ziyuan Guan,
Jeremy Balch,
Kia Khezeli,
Tezcan Ozrazgat Baslanti,
Ben Shickel,
Azra Bihorac,
Parisa Rashidi
Abstract:
Assessing acute brain dysfunction (ABD), including delirium and coma in the intensive care unit (ICU), is a critical challenge due to its prevalence and severe implications for patient outcomes. Current diagnostic methods rely on infrequent clinical observations, which can only determine a patient's ABD status after onset. Our research attempts to solve these problems by harnessing Electronic Health Records (EHR) data to develop automated methods for ABD prediction for patients in the ICU. Existing models solely predict a single state (e.g., either delirium or coma), require at least 24 hours of observation data to make predictions, do not dynamically predict fluctuating ABD conditions during ICU stay (typically a one-time prediction), and use small sample size, proprietary single-hospital datasets. Our research fills these gaps in the existing literature by dynamically predicting delirium, coma, and mortality for 12-hour intervals throughout an ICU stay and validating on two public datasets. Our research also introduces the concept of dynamically predicting critical transitions from non-ABD to ABD and between different ABD states in real time, which could be clinically more informative for the hospital staff. We compared the predictive performance of two state-of-the-art neural network models, the MAMBA selective state space model and the Longformer Transformer model. Using the MAMBA model, we achieved a mean area under the receiver operating characteristic curve (AUROC) of 0.95 on outcome prediction of ABD for 12-hour intervals. The model achieves a mean AUROC of 0.79 when predicting transitions between ABD states. Our study uses a curated dataset from the University of Florida Health Shands Hospital for internal validation and two publicly available datasets, MIMIC-IV and eICU, for external validation, demonstrating robustness across ICU stays from 203 hospitals and 140,945 patients.
Submitted 11 March, 2024;
originally announced March 2024.
-
Leveraging Computer Vision in the Intensive Care Unit (ICU) for Examining Visitation and Mobility
Authors:
Scott Siegel,
Jiaqing Zhang,
Sabyasachi Bandyopadhyay,
Subhash Nerella,
Brandon Silva,
Tezcan Baslanti,
Azra Bihorac,
Parisa Rashidi
Abstract:
Despite the importance of closely monitoring patients in the Intensive Care Unit (ICU), many aspects are still assessed in a limited manner due to the time constraints imposed on healthcare providers. For example, although excessive visitations during rest hours can potentially exacerbate the risk of circadian rhythm disruption and delirium, it is not captured in the ICU. Likewise, while mobility can be an important indicator of recovery or deterioration in ICU patients, it is only captured sporadically or not captured at all. In the past few years, the computer vision field has found application in many domains by reducing the human burden. Using computer vision systems in the ICU can also potentially enable non-existing assessments or enhance the frequency and accuracy of existing assessments while reducing the staff workload. In this study, we leverage a state-of-the-art noninvasive computer vision system based on depth imaging to characterize ICU visitations and patients' mobility. We then examine the relationship between visitation and several patient outcomes, such as pain, acuity, and delirium. We found an association between deteriorating patient acuity and the incidence of delirium with increased visitations. In contrast, self-reported pain, reported using the Defense and Veteran Pain Rating Scale (DVPRS), was correlated with decreased visitations. Our findings highlight the feasibility and potential of using noninvasive autonomous systems to monitor ICU patients.
Submitted 10 March, 2024;
originally announced March 2024.
-
Anomalous Second Harmonic Generation of Twisted Gaussian Schell Model Beams
Authors:
M. Gil de Oliveira,
A. L. S. Santos Junior,
A. C. Barbosa,
B. Pinheiro da Silva,
G. H. dos Santos,
G. Cañas,
P. H. Souto Ribeiro,
S. P. Walborn,
A. Z. Khoury
Abstract:
We investigate theoretically and experimentally the optical second harmonic generation (SHG) with a twisted Gaussian Schell model (TGSM) beam as the fundamental field. We use Type-II phase matching and analyze the cross spectral density (CSD) of the SHG output beam when the input fundamental is prepared with a TGSM structure. We analyze two synthetization methods for preparing the TGSM fundamental beam and we find that for one method the SHG is also a TGSM beam. For the other method, we find that the SHG is not a TGSM beam and presents an anomalous CSD possessing a dip instead of a peak in the transverse spatial structure. Moreover, we show that the dip depth is directly related to the twisted phase parameter, being absent for a non twisted GSM beam. Our results show that the SHG from a fundamental TGSM beam can result in a doubled frequency TGSM or in a non-TGSM beam depending on the synthetization method.
Submitted 6 March, 2024;
originally announced March 2024.
-
Iterative Occlusion-Aware Light Field Depth Estimation using 4D Geometrical Cues
Authors:
Rui Lourenço,
Lucas Thomaz,
Eduardo A. B. Silva,
Sergio M. M. Faria
Abstract:
Light field cameras and multi-camera arrays have emerged as promising solutions for accurately estimating depth by passively capturing light information. This is possible because the 3D information of a scene is embedded in the 4D light field geometry. Commonly, depth estimation methods extract this information relying on gradient information, heuristic-based optimisation models, or learning-based approaches. This paper focuses mainly on explicitly understanding and exploiting 4D geometrical cues for light field depth estimation. Thus, a novel method is proposed, based on a non-learning-based optimisation approach for depth estimation that explicitly considers surface normal accuracy and occlusion regions by utilising a fully explainable 4D geometric model of the light field. The 4D model performs depth/disparity estimation by determining the orientations and analysing the intersections of key 2D planes in 4D space, which are the images of 3D-space points in the 4D light field. Experimental results show that the proposed method outperforms both learning-based and non-learning-based state-of-the-art methods in terms of surface normal angle accuracy, achieving a Median Angle Error on planar surfaces, on average, 26.3\% lower than the state-of-the-art, and still being competitive with state-of-the-art methods in terms of Mean Squared Error $\times$ 100 and Badpix 0.07.
Submitted 4 March, 2024;
originally announced March 2024.
-
New topological subsystem codes from semi-regular tessellations
Authors:
Eduardo Brandani da Silva,
Evandro Mazetto Brizola
Abstract:
In this work, we present new constructions for topological subsystem codes using semi-regular Euclidean and hyperbolic tessellations. They give us new families of codes, and we also provide a new family of codes obtained through an already existing construction, due to Sarvepalli and Brown. We also prove new results that allow us to obtain the parameters of these new codes.
Submitted 28 February, 2024;
originally announced February 2024.
-
Azimuthal metallicity variations, spiral structure, and the failure of radial actions based on assuming axisymmetry
Authors:
Victor P. Debattista,
Tigran Khachaturyants,
Joao A. S. Amarante,
Christopher Carr,
Leandro Beraldo e Silva,
Chervin F. P. Laporte
Abstract:
We study azimuthal variations in the mean stellar metallicity, <[Fe/H]>, in a self-consistent, isolated simulation in which all stars form out of gas. We find <[Fe/H]> variations comparable to those observed in the Milky Way and which are coincident with the spiral density waves. The azimuthal variations are present in young and old stars and therefore are not a result of recently formed stars. Similar variations are present in the mean age and alpha-abundance. We measure the pattern speeds of the <[Fe/H]>-variations and find that they match those of the spirals, indicating that they are at the origin of the metallicity patterns. Because younger stellar populations are not just more [Fe/H]-rich and alpha-poor but also dynamically cooler, we expect them to more strongly support spirals, which is indeed the case in the simulation. However, if we measure the radial action, J_R, using the Stackel axisymmetric approximation, we find that the spiral ridges are traced by regions of high J_R, contrary to expectations. Assuming that the passage of stars through the spirals leads to unphysical variations in the measured J_R, we obtain an improved estimate of J_R by averaging over a 1 Gyr time interval. This time-averaged J_R is a much better tracer of the spiral structure, with minima at the spiral ridges. We conclude that the errors incurred by the axisymmetric approximation introduce correlated deviations large enough to render the instantaneous radial action inadequate for tracing spirals.
Submitted 13 February, 2024;
originally announced February 2024.
-
FeCuNbSiB thin films with sub-Oersted coercivity
Authors:
J. M. Alves,
D. E. Gonzalez-Chavez,
N. R. Checca,
B. G. Silva,
R. L. Sommer
Abstract:
Nanocrystalline FeCuNbSiB thin films were fabricated through magnetron sputtering followed by heat treatment, resulting in samples characterized by low coercivity and high effective magnetization. Comprehensive microstructural analysis, employing X-ray diffraction and transmission electron microscopy techniques such as selected area electron diffraction, high-resolution imaging, and Fourier transform, was conducted. Magnetic properties were investigated using an alternating gradient field magnetometer and broadband ferromagnetic resonance. The structural analysis revealed a well-defined microstructure of nanograins within an amorphous matrix in all of our films. However, the coercivity of the 80 nm films was not as low as that observed for the 160 nm films.
Submitted 26 January, 2024;
originally announced January 2024.
-
Attention Modules Improve Modern Image-Level Anomaly Detection: A DifferNet Case Study
Authors:
André Luiz B. Vieira e Silva,
Francisco Simões,
Danny Kowerko,
Tobias Schlosser,
Felipe Battisti,
Veronica Teichrieb
Abstract:
Within (semi-)automated visual inspection, learning-based approaches for assessing visual defects, including deep neural networks, enable the processing of otherwise small defect patterns in pixel size on high-resolution imagery. The emergence of these often rarely occurring defect patterns explains the general need for labeled data corpora. To not only alleviate this issue but to furthermore advance the current state of the art in unsupervised visual inspection, this contribution proposes a DifferNet-based solution enhanced with attention modules utilizing SENet and CBAM as backbone - AttentDifferNet - to improve the detection and classification capabilities on three different visual inspection and anomaly detection datasets: MVTec AD, InsPLAD-fault, and Semiconductor Wafer. In comparison to the current state of the art, it is shown that AttentDifferNet achieves improved results, which are, in turn, highlighted throughout our quantitative as well as qualitative evaluation, indicated by a general improvement in AUC of 94.34 vs. 92.46, 96.67 vs. 94.69, and 90.20 vs. 88.74%. As our variants to AttentDifferNet show great prospects in the context of currently investigated approaches, a baseline is formulated, emphasizing the importance of attention for anomaly detection.
Submitted 12 January, 2024;
originally announced January 2024.
-
RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
Authors:
Angels Balaguer,
Vinamra Benara,
Renato Luiz de Freitas Cunha,
Roberto de M. Estevão Filho,
Todd Hendry,
Daniel Holstein,
Jennifer Marsman,
Nick Mecklenburg,
Sara Malvar,
Leonardo O. Nunes,
Rafael Padilha,
Morris Sharp,
Bruno Silva,
Swati Sharma,
Vijay Aski,
Ranveer Chandra
Abstract:
There are two common ways in which developers are incorporating proprietary and domain-specific data when building applications of Large Language Models (LLMs): Retrieval-Augmented Generation (RAG) and Fine-Tuning. RAG augments the prompt with the external data, while fine-tuning incorporates the additional knowledge into the model itself. However, the pros and cons of both approaches are not well understood. In this paper, we propose a pipeline for fine-tuning and RAG, and present the tradeoffs of both for multiple popular LLMs, including Llama2-13B, GPT-3.5, and GPT-4. Our pipeline consists of multiple stages, including extracting information from PDFs, generating questions and answers, using them for fine-tuning, and leveraging GPT-4 for evaluating the results. We propose metrics to assess the performance of different stages of the RAG and fine-tuning pipeline. We conduct an in-depth study on an agricultural dataset. Agriculture as an industry has not seen much penetration of AI, and we study a potentially disruptive application - what if we could provide location-specific insights to a farmer? Our results show the effectiveness of our dataset generation pipeline in capturing geographic-specific knowledge, and the quantitative and qualitative benefits of RAG and fine-tuning. We see an accuracy increase of over 6 p.p. when fine-tuning the model and this is cumulative with RAG, which increases accuracy by 5 p.p. further. In one particular experiment, we also demonstrate that the fine-tuned model leverages information from across geographies to answer specific questions, increasing answer similarity from 47% to 72%. Overall, the results point to how systems built using LLMs can be adapted to respond and incorporate knowledge across a dimension that is critical for a specific industry, paving the way for further applications of LLMs in other industrial domains.
Submitted 30 January, 2024; v1 submitted 16 January, 2024;
originally announced January 2024.
-
Exploration of the Muon $g-2$ and Light Dark Matter explanations in NA64 with the CERN SPS high energy muon beam
Authors:
Yu. M. Andreev,
D. Banerjee,
B. Banto Oberhauser,
J. Bernhard,
P. Bisio,
N. Charitonidis,
P. Crivelli,
E. Depero,
A. V. Dermenev,
S. V. Donskov,
R. R. Dusaev,
T. Enik,
V. N. Frolov,
R. B. Galleguillos Silva,
A. Gardikiotis,
S. V. Gertsenberger,
S. Girod,
S. N. Gninenko,
M. Hoesgen,
V. A. Kachanov,
Y. Kambar,
A. E. Karneyeu,
E. A. Kasianova,
G. Kekelidze,
B. Ketzer,
et al. (32 additional authors not shown)
Abstract:
We report on a search for a new $Z'$ ($L_μ-L_τ$) vector boson performed at the NA64 experiment employing a high energy muon beam and a missing energy-momentum technique. Muons from the M2 beamline at the CERN Super Proton Synchrotron with a momentum of 160 GeV/c are directed to an active target. A signal event is a single scattered muon with momentum $<$ 80 GeV/c in the final state, accompanied by missing energy, i.e. no detectable activity in the downstream calorimeters. For a total statistic of $(1.98\pm0.02)\times10^{10}$ muons on target, no event is observed in the expected signal region. This allows us to set new limits on part of the remaining $(m_{Z'},\ g_{Z'})$ parameter space which could provide an explanation for the muon $(g-2)_μ$ anomaly. Additionally, our study excludes part of the parameter space suggested by the thermal Dark Matter relic abundance. Our results pave the way to explore Dark Sectors and light Dark Matter with muon beams in a unique and complementary way to other experiments.
Submitted 3 January, 2024;
originally announced January 2024.
-
From Past to Future: Rethinking Eligibility Traces
Authors:
Dhawal Gupta,
Scott M. Jordan,
Shreyas Chaudhari,
Bo Liu,
Philip S. Thomas,
Bruno Castro da Silva
Abstract:
In this paper, we introduce a fresh perspective on the challenges of credit assignment and policy evaluation. First, we delve into the nuances of eligibility traces and explore instances where their updates may result in unexpected credit assignment to preceding states. From this investigation emerges the concept of a novel value function, which we refer to as the \emph{bidirectional value function}. Unlike traditional state value functions, bidirectional value functions account for both future expected returns (rewards anticipated from the current state onward) and past expected returns (cumulative rewards from the episode's start to the present). We derive principled update equations to learn this value function and, through experimentation, demonstrate its efficacy in enhancing the process of policy evaluation. In particular, our results indicate that the proposed learning approach can, in certain challenging contexts, perform policy evaluation more rapidly than TD($λ$) -- a method that learns forward value functions, $v^π$, \emph{directly}. Overall, our findings present a new perspective on eligibility traces and potential advantages associated with the novel value function it inspires, especially for policy evaluation.
Submitted 20 December, 2023;
originally announced December 2023.
-
Disentangling stress and strain effects in ferroelectric HfO2
Authors:
Tingfeng Song,
Veniero Lenzi,
José P. B. Silva,
Luís Marques,
Ignasi Fina,
Florencio Sánchez
Abstract:
Ferroelectric HfO2 films are usually polycrystalline and contain a mixture of polar and nonpolar phases. This challenges the understanding and control of polar phase stabilization and ferroelectric properties. Several factors such as dopants, oxygen vacancies, or stress, among others, have been investigated and shown to have a crucial role on optimizing the ferroelectric response. Stress generated during deposition or annealing of thin films is a main factor determining the formed crystal phases and influences the lattice strain of the polar orthorhombic phase. It is difficult to discriminate between stress and strain effects on polycrystalline ferroelectric HfO2 films, and the direct impact of orthorhombic lattice strain on ferroelectric polarization has yet to be determined experimentally. Here, we analyze the crystalline phases and lattice strain of several series of doped HfO2 epitaxial films. We conclude that stress has a critical influence on metastable orthorhombic phase stabilization and ferroelectric polarization. On the contrary, the lattice deformation effects are much smaller than those caused by variations in the orthorhombic phase content. The experimental results are confirmed by density functional theory calculations on HfO2 and Hf0.5Zr0.5O2 ferroelectric phases.
Submitted 13 December, 2023;
originally announced December 2023.
-
Stochastic modelling of football matches
Authors:
Luiz Fernando G. N. Maia,
Teemu Pennanen,
Moacyr A. H. B. da Silva,
Rodrigo S. Targino
Abstract:
This paper develops a general framework for stochastic modeling of goals and other events in football (soccer) matches. The events are modelled as Cox processes (doubly stochastic Poisson processes) where the event intensities may depend on all the modeled events as well as external factors. The model has a strictly concave log-likelihood function which facilitates its fitting to observed data. Besides event times, the model describes the random lengths of stoppage times which can have a strong influence on the final score of a match. The model is illustrated on eight years of data from Campeonato Brasileiro de Futebol Série A. We find that dynamic regressors significantly improve the in-game predictive power of the model. In particular, a) when a team receives a red card, its goal intensity decreases by more than 30%; b) the goal rate of a team increases by 10% if it is losing by one goal and by 20% if it is losing by two goals; and c) when the goal difference at the end of the second half is less than or equal to one, the stoppage time is on average more than one minute longer than in matches with a difference of two goals.
Submitted 7 December, 2023;
originally announced December 2023.
-
Characterisation of high velocity stars in the S-PLUS internal fourth data release
Authors:
F. Quispe-Huaynasi,
F. Roig,
V. M. Placco,
L. Beraldo e Silva,
S. Daflon,
C. B. Pereira,
A. Kanaan,
C. Mendes de Oliveira,
T. Ribeiro,
W. Schoenell
Abstract:
In general, the atypical high velocity of some stars in the Galaxy can only be explained by invoking acceleration mechanisms related to extreme astrophysical events in the Milky Way. Using astrometric data from Gaia and the photometric information in 12 filters of the S-PLUS, we performed a kinematic, dynamical, and chemical analysis of 64 stars with galactocentric velocities higher than 400 $\mathrm{km\,s}^{-1}$. All the stars are gravitationally bound to the Galaxy and exhibit halo kinematics. Some of the stars could be remnants of structures such as the Sequoia and the Gaia-Sausage/Enceladus. Supported by orbital and chemical analysis, we identified Gaia DR3 5401875170994688896 as a star likely to have originated at the centre of the Galaxy. Application of a machine learning technique to the S-PLUS photometric data allows us to obtain very good estimates of magnesium abundances for this sample of high velocity stars.
Submitted 20 November, 2023;
originally announced November 2023.
-
Attention Modules Improve Image-Level Anomaly Detection for Industrial Inspection: A DifferNet Case Study
Authors:
André Luiz Buarque Vieira e Silva,
Francisco Simões,
Danny Kowerko,
Tobias Schlosser,
Felipe Battisti,
Veronica Teichrieb
Abstract:
Within (semi-)automated visual industrial inspection, learning-based approaches for assessing visual defects, including deep neural networks, enable the processing of otherwise small defect patterns in pixel size on high-resolution imagery. The emergence of these often rarely occurring defect patterns explains the general need for labeled data corpora. To alleviate this issue and advance the current state of the art in unsupervised visual inspection, this work proposes a DifferNet-based solution enhanced with attention modules: AttentDifferNet. It improves image-level detection and classification capabilities on three visual anomaly detection datasets for industrial inspection: InsPLAD-fault, MVTec AD, and Semiconductor Wafer. In comparison to the state of the art, AttentDifferNet achieves improved results, which are, in turn, highlighted throughout our quali-quantitative study. Our quantitative evaluation shows an average improvement - compared to DifferNet - of 1.77 +/- 0.25 percentage points in overall AUROC considering all three datasets, reaching SOTA results in InsPLAD-fault, an industrial inspection in-the-wild dataset. As our variants to AttentDifferNet show great prospects in the context of currently investigated approaches, a baseline is formulated, emphasizing the importance of attention for industrial anomaly detection both in the wild and in controlled environments.
Submitted 7 November, 2023; v1 submitted 5 November, 2023;
originally announced November 2023.
-
APRICOT-Mamba: Acuity Prediction in Intensive Care Unit (ICU): Development and Validation of a Stability, Transitions, and Life-Sustaining Therapies Prediction Model
Authors:
Miguel Contreras,
Brandon Silva,
Benjamin Shickel,
Tezcan Ozrazgat-Baslanti,
Yuanfang Ren,
Ziyuan Guan,
Jeremy Balch,
Jiaqing Zhang,
Sabyasachi Bandyopadhyay,
Kia Khezeli,
Azra Bihorac,
Parisa Rashidi
Abstract:
The acuity state of patients in the intensive care unit (ICU) can quickly change from stable to unstable. Early detection of deteriorating conditions can result in providing timely interventions and improved survival rates. In this study, we propose APRICOT-M (Acuity Prediction in Intensive Care Unit-Mamba), a 150k-parameter state space-based neural network to predict acuity state, transitions, and the need for life-sustaining therapies in real-time in ICU patients. The model uses data obtained in the prior four hours in the ICU and patient information obtained at admission to predict the acuity outcomes in the next four hours. We validated APRICOT-M externally on data from hospitals not used in development (75,668 patients from 147 hospitals), temporally on data from a period not used in development (12,927 patients from one hospital from 2018-2019), and prospectively on data collected in real-time (215 patients from one hospital from 2021-2023) using three large datasets: the University of Florida Health (UFH) dataset, the electronic ICU Collaborative Research Database (eICU), and the Medical Information Mart for Intensive Care (MIMIC)-IV. The area under the receiver operating characteristic curve (AUROC) of APRICOT-M for mortality (external 0.94-0.95, temporal 0.97-0.98, prospective 0.96-1.00) and acuity (external 0.95-0.95, temporal 0.97-0.97, prospective 0.96-0.96) shows comparable results to state-of-the-art models. Furthermore, APRICOT-M can predict transitions to instability (external 0.81-0.82, temporal 0.77-0.78, prospective 0.68-0.75) and need for life-sustaining therapies, including mechanical ventilation (external 0.82-0.83, temporal 0.87-0.88, prospective 0.67-0.76), and vasopressors (external 0.81-0.82, temporal 0.73-0.75, prospective 0.66-0.74). This tool allows for real-time acuity monitoring in critically ill patients and can help clinicians make timely interventions.
Submitted 8 March, 2024; v1 submitted 3 November, 2023;
originally announced November 2023.
-
InsPLAD: A Dataset and Benchmark for Power Line Asset Inspection in UAV Images
Authors:
André Luiz Buarque Vieira e Silva,
Heitor de Castro Felix,
Franscisco Paulo Magalhães Simões,
Veronica Teichrieb,
Michel Mozinho dos Santos,
Hemir Santiago,
Virginia Sgotti,
Henrique Lott Neto
Abstract:
Power line maintenance and inspection are essential to avoid power supply interruptions, reducing its high social and financial impacts yearly. Automating power line visual inspections remains a relevant open problem for the industry due to the lack of public real-world datasets of power line components and their various defects to foster new research. This paper introduces InsPLAD, a Power Line Asset Inspection Dataset and Benchmark containing 10,607 high-resolution Unmanned Aerial Vehicles colour images. The dataset contains seventeen unique power line assets captured from real-world operating power lines. Additionally, five of those assets present six defects: four of which are corrosion, one is a broken component, and one is a bird's nest presence. All assets were labelled according to their condition, whether normal or the defect name found on an image level. We thoroughly evaluate state-of-the-art and popular methods for three image-level computer vision tasks covered by InsPLAD: object detection, through the AP metric; defect classification, through Balanced Accuracy; and anomaly detection, through the AUROC metric. InsPLAD offers various vision challenges from uncontrolled environments, such as multi-scale objects, multi-size class instances, multiple objects per image, intra-class variation, cluttered background, distinct points of view, perspective distortion, occlusion, and varied lighting conditions. To the best of our knowledge, InsPLAD is the first large real-world dataset and benchmark for power line asset inspection with multiple components and defects for various computer vision tasks, with a potential impact to improve state-of-the-art methods in the field. It will be publicly available in its entirety on a repository with a thorough description. It can be found at https://github.com/andreluizbvs/InsPLAD.
Submitted 3 December, 2023; v1 submitted 2 November, 2023;
originally announced November 2023.
-
Abnormal Singular Foliations and the Sard Conjecture for generic co-rank one distributions
Authors:
A Belotto da Silva,
A Parusiński,
L Rifford
Abstract:
Given a smooth totally nonholonomic distribution on a smooth manifold, we construct a singular distribution capturing essential abnormal lifts which is locally generated by vector fields with controlled divergence. Then, as an application, we prove the Sard Conjecture for rank 3 distribution in dimension 4 and generic distributions of corank 1.
Submitted 31 October, 2023;
originally announced October 2023.
-
Behavior Alignment via Reward Function Optimization
Authors:
Dhawal Gupta,
Yash Chandak,
Scott M. Jordan,
Philip S. Thomas,
Bruno Castro da Silva
Abstract:
Designing reward functions for efficiently guiding reinforcement learning (RL) agents toward specific behaviors is a complex task. This is challenging since it requires the identification of reward structures that are not sparse and that avoid inadvertently inducing undesirable behaviors. Naively modifying the reward structure to offer denser and more frequent feedback can lead to unintended outcomes and promote behaviors that are not aligned with the designer's intended goal. Although potential-based reward shaping is often suggested as a remedy, we systematically investigate settings where deploying it often significantly impairs performance. To address these issues, we introduce a new framework that uses a bi-level objective to learn \emph{behavior alignment reward functions}. These functions integrate auxiliary rewards reflecting a designer's heuristics and domain knowledge with the environment's primary rewards. Our approach automatically determines the most effective way to blend these types of feedback, thereby enhancing robustness against heuristic reward misspecification. Remarkably, it can also adapt an agent's policy optimization process to mitigate suboptimalities resulting from limitations and biases inherent in the underlying RL algorithms. We evaluate our method's efficacy on a diverse set of tasks, from small-scale experiments to high-dimensional control challenges. We investigate heuristic auxiliary rewards of varying quality -- some of which are beneficial and others detrimental to the learning process. Our results show that our framework offers a robust and principled way to integrate designer-specified heuristics. It not only addresses key shortcomings of existing approaches but also consistently leads to high-performing solutions, even when given misaligned or poorly-specified auxiliary reward functions.
Submitted 31 October, 2023; v1 submitted 29 October, 2023;
originally announced October 2023.
-
On the evolutionary history of a simulated disc galaxy as seen by phylogenetic trees
Authors:
Danielle de Brito Silva,
Paula Jofré,
Patricia B. Tissera,
Keaghan J. Yaxley,
Jenny Gonzalez Jara,
Camilla J. L. Eldridge,
Emanuel Sillero,
Robert M. Yates,
Xia Hua,
Payel Das,
Claudia Aguilera-Gómez,
Evelyn J. Johnston,
Alvaro Rojas-Arriagada,
Robert Foley,
Gerard Gilmore
Abstract:
Phylogenetic methods have long been used in biology, and more recently have been extended to other fields - for example, linguistics and technology - to study evolutionary histories. Galaxies also have an evolutionary history, and fall within this broad phylogenetic framework. Under the hypothesis that chemical abundances can be used as a proxy for the interstellar medium's DNA, phylogenetic methods allow us to reconstruct hierarchical similarities and differences among stars - essentially a tree of evolutionary relationships and thus history. In this work, we apply phylogenetic methods to a simulated disc galaxy obtained with a chemo-dynamical code to test the approach. We found that at least 100 stellar particles are required to reliably portray the evolutionary history of a selected stellar population in this simulation, and that the overall evolutionary history is reliably preserved when the typical uncertainties in the chemical abundances are smaller than 0.08 dex. The results show that the shape of the trees is strongly affected by the age-metallicity relation, as well as the star formation history of the galaxy. We found that regions with low star formation rates produce shorter trees than regions with high star formation rates. Our analysis demonstrates that phylogenetic methods can shed light on the process of galaxy evolution.
Submitted 18 October, 2023;
originally announced October 2023.
-
Quantum work: Reconciling quantum mechanics and thermodynamics
Authors:
Thales Augusto Barbosa Pinto Silva,
David Gelbwaser-Klimovsky
Abstract:
It has been recently claimed that no protocol for measuring quantum work can satisfy standard required physical principles, casting doubts on the compatibility between quantum mechanics, thermodynamics, and the classical limit. In this Letter, we present a solution for this incompatibility. We demonstrate that the standard formulation of these principles fails to address the classical limit properly. By proposing changes in this direction, we prove that all the essential principles can be satisfied when work is defined as a quantum observable, reconciling quantum work statistics and thermodynamics.
Submitted 16 May, 2024; v1 submitted 17 October, 2023;
originally announced October 2023.
-
Gaia FGK Benchmark Stars: fundamental Teff and log g of the third version
Authors:
Caroline Soubiran,
Orlagh Creevey,
Nadege Lagarde,
Nathalie Brouillet,
Paula Jofre,
Laia Casamiquela,
Ulrike Heiter,
Claudia Aguilera Gomez,
Sara Vitali,
Clare Worley,
Danielle de Brito Silva
Abstract:
Context. Large spectroscopic surveys devoted to the study of the Milky Way, including Gaia, use automated pipelines to massively determine the atmospheric parameters of millions of stars. The Gaia FGK Benchmark Stars are reference stars with Teff and log g derived through fundamental relations, independently of spectroscopy, to be used as anchors for the parameter scale. The first and second versions of the sample have been extensively used for that purpose, and more generally to help constrain stellar models. Aims. We provide the third version of the Gaia FGK Benchmark Stars, an extended set intended to improve the calibration of spectroscopic surveys, and their interconnection. Methods. We have compiled about 200 candidates which have precise measurements of angular diameters and parallaxes. We determined their bolometric fluxes by fitting their spectral energy distribution. Masses were determined using two sets of stellar evolution models. In a companion paper we describe the determination of metallicities and detailed abundances. Results. We provide a new set of 192 Gaia FGK Benchmark Stars with their fundamental Teff and log g, and with uncertainties lower than 2% for most stars. Compared to the previous versions, the homogeneity and accuracy of the fundamental parameters are significantly improved thanks to the high quality of the Gaia data reflecting on distances and bolometric fluxes.
Submitted 18 October, 2023; v1 submitted 17 October, 2023;
originally announced October 2023.
-
GPT-4 as an Agronomist Assistant? Answering Agriculture Exams Using Large Language Models
Authors:
Bruno Silva,
Leonardo Nunes,
Roberto Estevão,
Vijay Aski,
Ranveer Chandra
Abstract:
Large language models (LLMs) have demonstrated remarkable capabilities in natural language understanding across various domains, including healthcare and finance. For some tasks, LLMs achieve similar or better performance than trained human beings, so it is reasonable to employ human exams (e.g., certification tests) to assess the performance of LLMs. We present a comprehensive evaluation of popular LLMs, such as Llama 2 and GPT, on their ability to answer agriculture-related questions. In our evaluation, we also employ RAG (Retrieval-Augmented Generation) and ER (Ensemble Refinement) techniques, which combine information retrieval, generation capabilities, and prompting strategies to improve the LLMs' performance. To demonstrate the capabilities of LLMs, we selected agriculture exams and benchmark datasets from three of the largest agriculture producer countries: Brazil, India, and the USA. Our analysis highlights GPT-4's ability to achieve a passing score on exams to earn credits for renewing agronomist certifications, answering 93% of the questions correctly and outperforming earlier general-purpose models, which achieved 88% accuracy. In one of our experiments, GPT-4 obtained the highest performance when compared to human subjects. This performance suggests that GPT-4 could potentially pass major graduate education admission tests or even earn credits for renewing agronomy certificates. We also explore the models' capacity to address general agriculture-related questions and generate crop management guidelines for Brazilian and Indian farmers, utilizing robust datasets from the Brazilian Agency of Agriculture (Embrapa) and graduate program exams from India. The results suggest that GPT-4, ER, and RAG can contribute meaningfully to agricultural education, assessment, and crop management practice, offering valuable insights to farmers and agricultural professionals.
Submitted 12 October, 2023; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Reducing the False Positive Rate Using Bayesian Inference in Autonomous Driving Perception
Authors:
Gledson Melotti,
Johann J. S. Bastos,
Bruno L. S. da Silva,
Tiago Zanotelli,
Cristiano Premebida
Abstract:
Object recognition is a crucial step in perception systems for autonomous and intelligent vehicles, as evidenced by the numerous research works on the topic. In this paper, object recognition is explored by using multisensory and multimodality approaches, with the intention of reducing the false positive rate (FPR). The reduction of the FPR becomes increasingly important in perception systems since the misclassification of an object can potentially cause accidents. In particular, this work presents a strategy through Bayesian inference to reduce the FPR considering the likelihood function as a cumulative distribution function from Gaussian kernel density estimations, and the prior probabilities as cumulative functions of normalized histograms. The validation of the proposed methodology is performed on the KITTI dataset using deep networks (DenseNet, NasNet, and EfficientNet), and recent 3D point cloud networks (PointNet and PointNet++), by considering three object-categories (cars, cyclists, pedestrians) and the RGB and LiDAR sensor modalities.
Submitted 22 October, 2023; v1 submitted 9 September, 2023;
originally announced October 2023.
-
Promoting Astronomy Education: The Helix Nebula and Interdisciplinary Image Reading
Authors:
Vinicius Sanches,
Fabiene Barbosa da Silva
Abstract:
The observation of space seems to have always inspired wonder in people's collective consciousness, generating a series of historical myths. More recently, especially with the development of better tools alongside the constant refinement of the scientific method, Astronomy has consolidated into a growing field of Physics. Yet representing such a field accurately for beginner students poses a challenge. Appropriate images and descriptions must be chosen, which proves to be a large part of that challenge. Here we apply a technique named Interdisciplinary Image Reading, aimed at minimizing the problem and thereby promoting better Astronomy Education.
Submitted 4 October, 2023;
originally announced October 2023.
-
Strength in Diversity: Multi-Branch Representation Learning for Vehicle Re-Identification
Authors:
Eurico Almeida,
Bruno Silva,
Jorge Batista
Abstract:
This paper presents an efficient and lightweight multi-branch deep architecture to improve vehicle re-identification (V-ReID). While most V-ReID work uses a combination of complex multi-branch architectures to extract robust and diversified embeddings towards re-identification, we advocate that simple and lightweight architectures can be designed to fulfill the Re-ID task without compromising performance.
We propose a combination of Grouped-convolution and Loss-Branch-Split strategies to design a multi-branch architecture that improves feature diversity and feature discriminability. We combine a ResNet50 global branch architecture with a BotNet self-attention branch architecture, both designed within a Loss-Branch-Split (LBS) strategy. We argue that specialized loss-branch-splitting helps to improve re-identification tasks by generating specialized re-identification features. A lightweight solution using grouped convolution is also proposed to mimic the learning of loss-splitting into multiple embeddings while significantly reducing the model size. In addition, we designed an improved solution to leverage additional metadata, such as camera ID and pose information, that uses 97% fewer parameters, further improving re-identification performance.
In comparison to state-of-the-art (SoTA) methods, our approach outperforms competing solutions in Veri-776 by achieving 85.6% mAP and 97.7% CMC1 and obtains competitive results in Veri-Wild with 88.1% mAP and 96.3% CMC1. Overall, our work provides important insights into improving vehicle re-identification and presents a strong basis for other retrieval tasks. Our code is available at https://github.com/videturfortuna/vehicle_reid_itsc2023.
Submitted 2 October, 2023;
originally announced October 2023.
-
AI driven B-cell Immunotherapy Design
Authors:
Bruna Moreira da Silva,
David B. Ascher,
Nicholas Geard,
Douglas E. V. Pires
Abstract:
Antibodies, a prominent class of approved biologics, play a crucial role in detecting foreign antigens. The effectiveness of antigen neutralisation and elimination hinges upon the strength, sensitivity, and specificity of the paratope-epitope interaction, which demands resource-intensive experimental techniques for characterisation. In recent years, artificial intelligence and machine learning methods have made significant strides, revolutionising the prediction of protein structures and their complexes. The past decade has also witnessed the evolution of computational approaches aiming to support immunotherapy design. This review focuses on the progress of machine learning-based tools and their frameworks in the domain of B-cell immunotherapy design, encompassing linear and conformational epitope prediction, paratope prediction, and antibody design. We mapped the most commonly used data sources, evaluation metrics, and method availability and thoroughly assessed their significance and limitations, discussing the main challenges ahead.
Submitted 3 September, 2023;
originally announced September 2023.
-
Production of bound states of magnetic monopoles in high energy collisions at LHC
Authors:
João Vitor Bulhões da Silva,
Werner Krambeck Sauter
Abstract:
In this work, we present the studies carried out for the production of the monopolium at the LHC in ultraperipheral collisions for the processes $pp$ and $PbPb$. The monopolium is described by the bound state of a monopole-antimonopole pair, and we assume the study of the monopole in this characteristic state because the coupling constant is very large, which allows us to suggest that this exotic particle can be produced in the bound state. The monopolium is defined by a wave function arising from the numerical solution of the Schrödinger equation for the modified Cornell potential. We used the photon fusion production mechanism, with the Weizsäcker-Williams and Drees-Zeppenfeld expressions to describe the lead and proton equivalent photon distributions. We estimate a high monopolium production rate for $pp$ collisions with $\sqrt{s}=14$ TeV and $PbPb$ collisions with $\sqrt{s}=5.5$ TeV at the LHC.
Submitted 19 January, 2024; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Disfavoring the Schroedinger-Newton equation
Authors:
Joao V. B. da Silva,
Gabriel H. S. Aguiar,
George E. A. Matsas
Abstract:
The main goal of this brief report is to provide some new insight into how promising the Schroedinger-Newton equation would be to explain the emergence of classicality. Based on the similarity of the Newton and Coulomb potentials, we add an electric self-interacting term to the Schroedinger-Newton equation for the hydrogen atom. Our results rule out the possibility that single electrons self-interact through their electromagnetic field. Next, we use the hydrogen atom to get insight into the intrinsic difficulty of testing the Schroedinger-Newton equation itself and conclude that the Planck scale must be approached before sound constraints are established. Although our results cannot be used to rule out the Schroedinger-Newton equation at all, they might be seen as disfavoring it if we rely on the resemblance between the gravitational and electromagnetic interactions at low energies.
Submitted 10 July, 2023;
originally announced July 2023.
-
Search for Light Dark Matter with NA64 at CERN
Authors:
Yu. M. Andreev,
D. Banerjee,
B. Banto Oberhauser,
J. Bernhard,
P. Bisio,
A. Celentano,
N. Charitonidis,
A. G. Chumakov,
D. Cooke,
P. Crivelli,
E. Depero,
A. V. Dermenev,
S. V. Donskov,
R. R. Dusaev,
T. Enik,
V. N. Frolov,
R. B. Galleguillos Silva,
A. Gardikiotis,
S. V. Gertsenberger,
S. Girod,
S. N. Gninenko,
M. Hösgen,
V. A. Kachanov,
Y. Kambar,
A. E. Karneyeu
, et al. (38 additional authors not shown)
Abstract:
Thermal dark matter models with particle $χ$ masses below the electroweak scale can provide an explanation for the observed relic dark matter density. This would imply the existence of a new feeble interaction between the dark and ordinary matter. We report on a new search for the sub-GeV $χ$ production through the interaction mediated by a new vector boson, called the dark photon $A'$, in collisions of 100 GeV electrons with the active target of the NA64 experiment at the CERN SPS. With $9.37\times10^{11}$ electrons on target collected during 2016-2022 runs NA64 probes for the first time the well-motivated region of parameter space of benchmark thermal scalar and fermionic dark matter models. No evidence for dark matter production has been found. This allows us to set the most sensitive limits on the $A'$ couplings to photons for masses $m_{A'} \lesssim 0.35$ GeV, and to exclude scalar and Majorana dark matter with the $χ-A'$ coupling $α_D \leq 0.1$ for masses $0.001 \lesssim m_χ\lesssim 0.1$ GeV and $3m_χ\leq m_{A'}$.
Submitted 5 July, 2023;
originally announced July 2023.
-
Transformers in Healthcare: A Survey
Authors:
Subhash Nerella,
Sabyasachi Bandyopadhyay,
Jiaqing Zhang,
Miguel Contreras,
Scott Siegel,
Aysegul Bumin,
Brandon Silva,
Jessica Sena,
Benjamin Shickel,
Azra Bihorac,
Kia Khezeli,
Parisa Rashidi
Abstract:
With Artificial Intelligence (AI) increasingly permeating various aspects of society, including healthcare, the adoption of the Transformers neural network architecture is rapidly changing many applications. Transformer is a type of deep learning architecture initially developed to solve general-purpose Natural Language Processing (NLP) tasks and has subsequently been adapted in many fields, including healthcare. In this survey paper, we provide an overview of how this architecture has been adopted to analyze various forms of data, including medical imaging, structured and unstructured Electronic Health Records (EHR), social media, physiological signals, and biomolecular sequences. Those models could help in clinical diagnosis, report generation, data reconstruction, and drug/protein synthesis. We identified relevant studies using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We also discuss the benefits and limitations of using transformers in healthcare and examine issues such as computational cost, model interpretability, fairness, alignment with human values, ethical implications, and environmental impact.
Submitted 30 June, 2023;
originally announced July 2023.
-
PyKoopman: A Python Package for Data-Driven Approximation of the Koopman Operator
Authors:
Shaowu Pan,
Eurika Kaiser,
Brian M. de Silva,
J. Nathan Kutz,
Steven L. Brunton
Abstract:
PyKoopman is a Python package for the data-driven approximation of the Koopman operator associated with a dynamical system. The Koopman operator is a principled linear embedding of nonlinear dynamics and facilitates the prediction, estimation, and control of strongly nonlinear dynamics using linear systems theory. In particular, PyKoopman provides tools for data-driven system identification for unforced and actuated systems that build on the equation-free dynamic mode decomposition (DMD) and its variants. In this work, we provide a brief description of the mathematical underpinnings of the Koopman operator, an overview and demonstration of the features implemented in PyKoopman (with code examples), practical advice for users, and a list of potential extensions to PyKoopman. Software is available at http://github.com/dynamicslab/pykoopman
Submitted 22 June, 2023;
originally announced June 2023.