-
Decentralized multi-agent reinforcement learning algorithm using a cluster-synchronized laser network
Authors:
Shun Kotoku,
Takatomo Mihana,
André Röhm,
Ryoichi Horisaki
Abstract:
Multi-agent reinforcement learning (MARL) studies crucial principles that are applicable to a variety of fields, including wireless networking and autonomous driving. We propose a photonic-based decision-making algorithm to address one of the most fundamental problems in MARL, called the competitive multi-armed bandit (CMAB) problem. Our numerical simulations demonstrate that chaotic oscillations and cluster synchronization of optically coupled lasers, along with our proposed decentralized coupling adjustment, efficiently balance exploration and exploitation while facilitating cooperative decision-making without explicitly sharing information among agents. Our study demonstrates how decentralized reinforcement learning can be achieved by exploiting complex physical processes controlled by simple algorithms.
Submitted 12 July, 2024;
originally announced July 2024.
-
Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports
Authors:
Malte Tölle,
Lukas Burger,
Halvar Kelm,
Florian André,
Peter Bannas,
Gerhard Diller,
Norbert Frey,
Philipp Garthe,
Stefan Groß,
Anja Hennemuth,
Lars Kaderali,
Nina Krüger,
Andreas Leha,
Simon Martin,
Alexander Meyer,
Eike Nagel,
Stefan Orwat,
Clemens Scherer,
Moritz Seiffert,
Jan Moritz Seliger,
Stefan Simm,
Tim Friede,
Tim Seidler,
Sandy Engelhardt
Abstract:
Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in the emerging multi-modal learning paradigms, where dataset harmonization including a uniform data representation and filtering options are of paramount importance.
Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration and interactive filtering capabilities that simplifies the process of assembling multi-modal datasets.
Results: In this study, we extend our prior work by showing its applicability to more and divergent data types, as well as streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We prove its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data includes DICOM data (i.e. computed tomography images, electrocardiography scans) as well as annotations (i.e. calcification segmentations, pointsets and pacemaker dependency), and metadata (i.e. prosthesis and diagnoses).
Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. Utilizing the inherent DICOM reference system, arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. The graphical interface as well as example structured report templates will be made publicly available.
Submitted 12 July, 2024;
originally announced July 2024.
-
High-Resolution Hyperspectral Video Imaging Using A Hexagonal Camera Array
Authors:
Frank Sippel,
Jürgen Seiler,
André Kaup
Abstract:
Retrieving the reflectance spectrum from objects is an essential task for many classification and detection problems, since many materials and processes have a unique spectral behaviour. In many cases, it is highly desirable to capture hyperspectral images due to the high spectral flexibility. Often, it is even necessary to capture hyperspectral videos or at least to be able to record a hyperspectral image at once, also called snapshot hyperspectral imaging, to avoid spectral smearing. For this task, a high-resolution snapshot hyperspectral camera array using a hexagonal shape is introduced. The hexagonal array for hyperspectral imaging uses off-the-shelf hardware, which enables high flexibility regarding employed cameras, lenses and filters. Hence, the spectral range can be easily varied by mounting a different set of filters. Moreover, the concept of using off-the-shelf hardware enables low prices in comparison to other approaches with highly specialized hardware. Since classical industrial cameras are used in this hyperspectral camera array, the spatial and temporal resolution is very high, while recording 37 hyperspectral channels in the range from 400 nm to 760 nm in 10 nm steps. A registration process is required for near-field imaging, which maps the peripheral camera views to the center view. It is shown that this combination using a hyperspectral camera array and the corresponding image registration pipeline is superior in comparison to other popular snapshot approaches. For this evaluation, a synthetic hyperspectral database is rendered. On the synthetic data, the novel approach outperforms its best competitor by more than 3 dB in reconstruction quality. This synthetic data is also used to show the superiority of the hexagonal shape in comparison to an orthogonal-spaced one. Moreover, a real-world high resolution hyperspectral video database is provided.
Submitted 12 July, 2024;
originally announced July 2024.
-
In-plane staging in lithium-ion intercalation of bilayer graphene
Authors:
Thomas Astles,
James G. McHugh,
Rui Zhang,
Qian Guo,
Madeleine Howe,
Zefei Wu,
Kornelia Indykiewicz,
Alex Summerfield,
Zachary A. H. Goodwin,
Sergey Slizovskiy,
Daniil Domaretskiy,
Andre K. Geim,
Vladimir Falko,
Irina V. Grigorieva
Abstract:
The ongoing efforts to optimize Li-ion batteries led to the interest in intercalation of nanoscale layered compounds, including bilayer graphene. Its lithium intercalation has been demonstrated recently but the mechanisms underpinning the storage capacity remain poorly understood. Here, using magnetotransport measurements, we report in-operando intercalation dynamics of bilayer graphene. Unexpectedly, we find four distinct intercalation stages that correspond to well-defined Li-ion densities. We refer to these stages as 'in-plane', with no in-plane analogues in bulk graphite. The fully intercalated bilayers represent a stoichiometric compound C14LiC14 with a Li density of 2.7x10^{14} cm^{-2}, notably lower than fully intercalated graphite. Combining the experimental findings and DFT calculations, we show that the critical step in bilayer intercalation is a transition from AB to AA stacking which occurs at a density of 0.9x10^{14} cm^{-2}. Our findings reveal the mechanism and limits for electrochemical intercalation of bilayer graphene and suggest possible avenues for increasing the Li storage capacity.
Submitted 10 July, 2024;
originally announced July 2024.
-
PaliGemma: A versatile 3B VLM for transfer
Authors:
Lucas Beyer,
Andreas Steiner,
André Susano Pinto,
Alexander Kolesnikov,
Xiao Wang,
Daniel Salz,
Maxim Neumann,
Ibrahim Alabdulmohsin,
Michael Tschannen,
Emanuele Bugliarello,
Thomas Unterthiner,
Daniel Keysers,
Skanda Koppula,
Fangyu Liu,
Adam Grycner,
Alexey Gritsenko,
Neil Houlsby,
Manoj Kumar,
Keran Rong,
Julian Eisenschlos,
Rishabh Kabra,
Matthias Bauer,
Matko Bošnjak,
Xi Chen,
Matthias Minderer
, et al. (10 additional authors not shown)
Abstract:
PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model. It is trained to be a versatile and broadly knowledgeable base model that is effective to transfer. It achieves strong performance on a wide variety of open-world tasks. We evaluate PaliGemma on almost 40 diverse tasks including standard VLM benchmarks, but also more specialized tasks such as remote-sensing and segmentation.
Submitted 10 July, 2024;
originally announced July 2024.
-
Federated Foundation Model for Cardiac CT Imaging
Authors:
Malte Tölle,
Philipp Garthe,
Clemens Scherer,
Jan Moritz Seliger,
Andreas Leha,
Nina Krüger,
Stefan Simm,
Simon Martin,
Sebastian Eble,
Halvar Kelm,
Moritz Bednorz,
Florian André,
Peter Bannas,
Gerhard Diller,
Norbert Frey,
Stefan Groß,
Anja Hennemuth,
Lars Kaderali,
Alexander Meyer,
Eike Nagel,
Stefan Orwat,
Moritz Seiffert,
Tim Friede,
Tim Seidler,
Sandy Engelhardt
Abstract:
Federated learning (FL) is a renowned technique for utilizing decentralized data while preserving privacy. However, real-world applications often involve inherent challenges such as partially labeled datasets, where not all clients possess expert annotations of all labels of interest, leaving large portions of unlabeled data unused. In this study, we conduct the largest federated cardiac CT imaging analysis to date, focusing on partially labeled datasets ($n=8,124$) of Transcatheter Aortic Valve Implantation (TAVI) patients over eight hospital clients. Transformer architectures, which are the major building blocks of current foundation models, have shown superior performance when trained on larger cohorts than traditional CNNs. However, when trained on small task-specific labeled sample sizes, it is currently not feasible to exploit their underlying attention mechanism for improved performance. Therefore, we developed a two-stage semi-supervised learning strategy that distills knowledge from several task-specific CNNs (landmark detection and segmentation of calcification) into a single transformer model by utilizing large amounts of unlabeled data typically residing unused in hospitals to mitigate these issues. This method not only improves the predictive accuracy and generalizability of transformer-based architectures but also facilitates the simultaneous learning of all partial labels within a single transformer model across the federation. Additionally, we show that our transformer-based model extracts more meaningful features for further downstream tasks than the UNet-based one by only training the last layer to also solve segmentation of coronary arteries. We make the code and weights of the final model openly available, which can serve as a foundation model for further research in cardiac CT imaging.
Submitted 10 July, 2024;
originally announced July 2024.
-
SVT-AV1 Encoding Bitrate Estimation Using Motion Search Information
Authors:
Lena Eichermüller,
Gaurang Chaudhari,
Ioannis Katsavounidis,
Zhijun Lei,
Hassene Tmar,
Christian Herglotz,
André Kaup
Abstract:
Enabling high compression efficiency while keeping encoding energy consumption at a low level requires prioritization of which videos need more sophisticated encoding techniques. However, the effects vary highly based on the content, and information on how well a video can be compressed is required. This can be measured by estimating the encoded bitstream size prior to encoding. We found that the errors between estimated motion vectors from Motion Search, an algorithm that predicts temporal changes in videos, correlate well with the encoded bitstream size. Combining Motion Search with Random Forests, the encoding bitrate can be estimated with a Pearson correlation of above 0.96.
Submitted 8 July, 2024;
originally announced July 2024.
-
CANDID DAC: Leveraging Coupled Action Dimensions with Importance Differences in DAC
Authors:
Philipp Bordne,
M. Asif Hasan,
Eddie Bergman,
Noor Awad,
André Biedenkapp
Abstract:
High-dimensional action spaces remain a challenge for dynamic algorithm configuration (DAC). Interdependencies and varying importance between action dimensions are further known key characteristics of DAC problems. We argue that these Coupled Action Dimensions with Importance Differences (CANDID) represent aspects of the DAC problem that are not yet fully explored. To address this gap, we introduce a new white-box benchmark within the DACBench suite that simulates the properties of CANDID. Further, we propose sequential policies as an effective strategy for managing these properties. Such policies factorize the action space and mitigate exponential growth by learning a policy per action dimension. At the same time, these policies accommodate the interdependence of action dimensions by fostering implicit coordination. We show this in an experimental study of value-based policies on our new benchmark. This study demonstrates that sequential policies significantly outperform independent learning of factorized policies in CANDID action spaces. In addition, they overcome the scalability limitations associated with learning a single policy across all action dimensions. The code used for our experiments is available under https://github.com/PhilippBordne/candidDAC.
Submitted 8 July, 2024;
originally announced July 2024.
-
How Effective are State Space Models for Machine Translation?
Authors:
Hugo Pitorro,
Pavlo Vasylenko,
Marcos Treviso,
André F. T. Martins
Abstract:
Transformers are the current architecture of choice for NLP, but their attention layers do not scale well to long contexts. Recent works propose to replace attention with linear recurrent layers -- this is the case for state space models, which enjoy efficient training and inference. However, it remains unclear whether these models are competitive with transformers in machine translation (MT). In this paper, we provide a rigorous and comprehensive experimental comparison between transformers and linear recurrent models for MT. Concretely, we experiment with RetNet, Mamba, and hybrid versions of Mamba which incorporate attention mechanisms. Our findings demonstrate that Mamba is highly competitive with transformers on sentence and paragraph-level datasets, where in the latter both models benefit from shifting the training distribution towards longer sequences. Further analysis shows that integrating attention into Mamba improves translation quality, robustness to sequence length extrapolation, and the ability to recall named entities.
Submitted 7 July, 2024;
originally announced July 2024.
-
Interpreting the Residual Stream of ResNet18
Authors:
André Longon
Abstract:
A mechanistic understanding of the computations learned by deep neural networks (DNNs) is far from complete. In the domain of visual object recognition, prior research has illuminated inner workings of InceptionV1, but DNNs with different architectures have remained largely unexplored. This work investigates ResNet18 with a particular focus on its residual stream, an architectural mechanism which InceptionV1 lacks. We observe that for a given block, channel features of the stream are updated along a spectrum: either the input feature skips to the output, the block feature overwrites the output, or the output is some mixture between the input and block features. Furthermore, we show that many residual stream channels compute scale invariant representations through a mixture of the input's smaller-scale feature with the block's larger-scale feature. This not only mounts evidence for the universality of scale equivariance, but also presents how the residual stream further implements scale invariance. Collectively, our results begin an interpretation of the residual stream in visual object recognition, finding it to be a flexible feature manager and a medium to build scale invariant representations.
Submitted 7 July, 2024;
originally announced July 2024.
-
Electronic Correlations in Multielectron Silicon Quantum Dots
Authors:
Dylan H. Liang,
MengKe Feng,
Philip Y. Mai,
Jesus D. Cifuentes,
Andrew S. Dzurak,
Andre Saraiva
Abstract:
Silicon quantum computing has the potential to revolutionize technology with capabilities to solve real-life problems that are computationally complex or even intractable for modern computers [1] by offering sufficient high quality qubits to perform complex error-corrected calculations. Silicon metal-oxide-semiconductor based quantum dots present a promising pathway for realizing practical quantum computers. To improve certain qubit properties, it is a common strategy to incorporate multiple electrons in the same dot in order to form qubits in higher confined orbital states. Theoretical modelling is an essential part of understanding the quantum behaviour of these electrons, providing a basis for validating the physical working of device models as well as providing insights into experimental data.
Hartree-Fock theory is an imperative tool for the electronic structure modelling of multi-electron quantum dots due to its ability to simulate a large number of electrons with manageable computation load. However, an efficient calculation of the self-consistent field becomes hard because dot formations in silicon are characterized by strong electron-electron interactions and conduction band valleys, besides the relatively high comparative effective mass, which add to create a behaviour dominated by repulsion between electrons rather than a well established shell structure. In this paper, we present a Hartree-Fock-based method that accounts for these complexities for the modelling of silicon quantum dots. With this method, we first establish the significance of including electron-electron interactions and valley degree of freedom and their implications. We then explore a simple case of anisotropic dots and observe the impact of anisotropy on dot formations.
Submitted 5 July, 2024;
originally announced July 2024.
-
Over the Edge of Chaos? Excess Complexity as a Roadblock to Artificial General Intelligence
Authors:
Teo Susnjak,
Timothy R. McIntosh,
Andre L. C. Barczak,
Napoleon H. Reyes,
Tong Liu,
Paul Watters,
Malka N. Halgamuge
Abstract:
In this study, we explored the progression trajectories of artificial intelligence (AI) systems through the lens of complexity theory. We challenged the conventional linear and exponential projections of AI advancement toward Artificial General Intelligence (AGI) underpinned by transformer-based architectures, and posited the existence of critical points, akin to phase transitions in complex systems, where AI performance might plateau or regress into instability upon exceeding a critical complexity threshold. We employed agent-based modelling (ABM) to simulate hypothetical scenarios of AI systems' evolution under specific assumptions, using benchmark performance as a proxy for capability and complexity. Our simulations demonstrated how increasing the complexity of the AI system could exceed an upper criticality threshold, leading to unpredictable performance behaviours. Additionally, we developed a practical methodology for detecting these critical thresholds using simulation data and stochastic gradient descent to fine-tune detection thresholds. This research offers a novel perspective on AI advancement that has a particular relevance to Large Language Models (LLMs), emphasising the need for a tempered approach to extrapolating AI's growth potential and underscoring the importance of developing more robust and comprehensive AI performance benchmarks.
Submitted 4 July, 2024;
originally announced July 2024.
-
Fast Radio Bursts and Artificial Neural Networks: a cosmological-model-independent estimation of the Hubble Constant
Authors:
Jéferson A. S. Fortunato,
David J. Bacon,
Wiliam S. Hipólito-Ricaldi,
David Wands
Abstract:
Fast Radio Bursts (FRBs) have emerged as powerful cosmological probes in recent years offering valuable insights into cosmic expansion. These predominantly extragalactic transients encode information on the expansion of the Universe through their dispersion measure, reflecting interactions with the intervening medium along the line of sight. In this study, we introduce a novel method for reconstructing the late-time cosmic expansion rate and estimating the Hubble constant, solely derived from FRBs measurements coupled with their redshift information while employing Artificial Neural Networks. Our approach yields a Hubble constant estimate of $H_0 = 67.3\pm6.6\rm \ km \ s^{-1} \ Mpc^{-1}$. With a dataset comprising 23 localised data points, we demonstrate a precision of $\sim10\%$. However, our forecasts using simulated datasets indicate that in the future it could be possible to achieve precision comparable to the SH0ES collaboration or the Planck satellite. Our findings underscore the potential of FRBs as alternative, independent tools for probing cosmic dynamics.
Submitted 3 July, 2024;
originally announced July 2024.
-
Spanner for the $0/1/\infty$ weighted region problem
Authors:
Joachim Gudmundsson,
Zijin Huang,
André van Renssen,
Sampson Wong
Abstract:
We consider the problem of computing an approximate weighted shortest path in a weighted subdivision, with weights assigned from the set $\{0, 1, \infty\}$. We present a data structure $B$, which stores a set of convex, non-overlapping regions. These include zero-cost regions (0-regions) with a weight of $0$ and obstacles with a weight of $\infty$, all embedded in a plane with a weight of $1$. The data structure $B$ can be constructed in expected time $O(N + (n/\varepsilon^3)(\log(n/\varepsilon) + \log N))$, where $n$ is the total number of regions, $N$ represents the total complexity of the regions, and $1 + \varepsilon$ is the approximation factor, for any $0 < \varepsilon < 1$. Using $B$, one can compute an approximate weighted shortest path from any point $s$ to any point $t$ in $O(N + n/\varepsilon^3 + (n/\varepsilon^2) \log(n/\varepsilon) + (\log N)/\varepsilon)$ time. In the special case where the 0-regions and obstacles are polygons (not necessarily convex), $B$ contains a $(1 + \varepsilon)$-spanner of the input vertices.
Submitted 2 July, 2024;
originally announced July 2024.
-
Object Proxy Patterns for Accelerating Distributed Applications
Authors:
J. Gregory Pauloski,
Valerie Hayot-Sasson,
Logan Ward,
Alexander Brace,
André Bauer,
Kyle Chard,
Ian Foster
Abstract:
Workflow and serverless frameworks have empowered new approaches to distributed application design by abstracting compute resources. However, their typically limited or one-size-fits-all support for advanced data flow patterns leaves optimization to the application programmer -- optimization that becomes more difficult as data become larger. The transparent object proxy, which provides wide-area references that can resolve to data regardless of location, has been demonstrated as an effective low-level building block in such situations. Here we propose three high-level proxy-based programming patterns -- distributed futures, streaming, and ownership -- that make the power of the proxy pattern usable for more complex and dynamic distributed program structures. We motivate these patterns via careful review of application requirements and describe implementations of each pattern. We evaluate our implementations through a suite of benchmarks and by applying them in three substantial scientific applications, in which we demonstrate substantial improvements in runtime, throughput, and memory usage.
Submitted 1 July, 2024;
originally announced July 2024.
-
A chiral quark model analysis of the $\bar KN$ interaction
Authors:
M. Conde-Correa,
T. Aguilar,
A. Capelo-Astudillo,
A. Duenas-Vidal,
J. Segovia,
P. G. Ortega
Abstract:
In this work we analyze the $\bar KN$ interaction in the framework of a constituent quark model. The near-threshold elastic and charge exchange cross sections are evaluated, finding a good agreement with the experimental data. Furthermore, the possible existence of $\bar KN$ bound states is explored, finding two poles in the isoscalar $J^P=\frac{1}{2}^-$ sector that can be interpreted as the experimental $\Lambda(1405)$ state.
Submitted 1 July, 2024;
originally announced July 2024.
-
Fault-tolerant noise guessing decoding of quantum random codes
Authors:
Diogo Cruz,
Francisco A. Monteiro,
André Roque,
Bruno C. Coutinho
Abstract:
This work addresses the open question of implementing fault-tolerant QRLCs with feasible computational overhead. We present a new decoder for quantum random linear codes (QRLCs) capable of dealing with imperfect decoding operations. A first approach, introduced by Cruz et al., only considered channel errors, and perfect gates at the decoder. Here, we analyze the fault-tolerant characteristics of QRLCs with a new noise-guessing decoding technique, when considering preparation, measurement, and gate errors in the syndrome extraction procedure, while also accounting for error degeneracy. Our findings indicate a threshold error rate ($\pth$) of approximately $\pnum$ in the asymptotic limit, while considering realistic noise levels in the mentioned physical procedures.
Submitted 1 July, 2024;
originally announced July 2024.
-
TransferAttn: Transferable-guided Attention Is All You Need for Video Domain Adaptation
Authors:
André Sacilotti,
Samuel Felipe dos Santos,
Nicu Sebe,
Jurandy Almeida
Abstract:
Unsupervised domain adaptation (UDA) in videos is a challenging task that remains underexplored compared to image-based UDA techniques. Although vision transformers (ViT) achieve state-of-the-art performance in many computer vision tasks, their use in video domain adaptation has received little attention. Our key idea is to use the transformer layers as a feature encoder and incorporate spatial and temporal transferability relationships into the attention mechanism. A Transferable-guided Attention (TransferAttn) framework is then developed to exploit the capacity of the transformer to adapt cross-domain knowledge from different backbones. To improve the transferability of ViT, we introduce a novel and effective module named Domain Transferable-guided Attention Block (DTAB). DTAB compels ViT to focus on the spatio-temporal transferability relationship among video frames by changing the self-attention mechanism to a transferability attention mechanism. Extensive experiments on UCF-HMDB, Kinetics-Gameplay, and Kinetics-NEC Drone datasets with different backbones, like ResNet101, I3D, and STAM, verify the effectiveness of TransferAttn compared with state-of-the-art approaches. Also, we demonstrate that DTAB yields performance gains when applied to other state-of-the-art transformer-based UDA methods from both video and image domains. The code will be made freely available.
Submitted 1 July, 2024;
originally announced July 2024.
-
Deep Dive into MRI: Exploring Deep Learning Applications in 0.55T and 7T MRI
Authors:
Ana Carolina Alves,
André Ferreira,
Behrus Puladi,
Jan Egger,
Victor Alves
Abstract:
The development of magnetic resonance imaging (MRI) for medical imaging has provided a leap forward in diagnosis, providing a safe, non-invasive alternative to techniques involving ionising radiation exposure for diagnostic purposes. It was described by Bloch and Purcell in 1946, and it was not until 1980 that the first clinical application of MRI became available. Since that time MRI has gone through many advances and has altered the way diagnostic procedures are performed. Due to its ability to improve constantly, MRI has become a commonly used practice among several specialisations in medicine. In particular, 0.55T and 7T MRI technologies have demonstrated enhanced preservation of image detail and advanced tissue characterisation. This review examines the integration of deep learning (DL) techniques into these MRI modalities, disseminating and exploring the study applications. It highlights how DL contributes to 0.55T and 7T MRI data, showcasing the potential of DL in improving and refining these technologies. The review ends with a brief overview of how MRI technology will evolve in the coming years.
Submitted 1 July, 2024;
originally announced July 2024.
-
Herschel Gould Belt Survey in Taurus. II: A census of dense cores and filaments in the TMC1 region
Authors:
Jason Kirk,
Derek Ward-Thompson,
James Di Francesco,
Philippe André,
David Bresnahan,
Vera Könyves,
Kenneth Marsh,
Matt Griffin,
Nicola Schneider,
A. Men'shchikov,
Pedro Palmeirim,
Sylvain Bontemps,
Doris Arzoumanian,
Milena Benedettini,
Stefania Pezzuto
Abstract:
We present a catalogue of dense cores and filaments in a $3.8^{\circ}\times2.4^{\circ}$ field around the TMC1 region of the Taurus Molecular Cloud. The catalogue was created using photometric data from the Herschel SPIRE and PACS instruments in the 70 $μ$m, 160 $μ$m, 250 $μ$m, 350 $μ$m, and 500 $μ$m continuum bands. Extended structure in the region was reconstructed from a Herschel column density map. Power spectra and PDFs of this structure are presented. The PDF splits into log-normal and power-law forms, with the high-density power-law component associated primarily with the central part of TMC1. The total mass in the mapped region is 2000 M$_\odot$, of which 34% is above an extinction of $A_V \sim 3$ mag -- a level that appears as a break in the PDF and as the minimum column density at which dense cores are found. A total of 35 dense filaments were extracted from the column density map. These have a characteristic FWHM width of 0.07 pc, but the TMC1 filament itself has a mean FWHM of $\sim$ 0.13 pc. The thermally supercritical filaments in the region are aligned orthogonal to the prevailing magnetic field direction. Derived properties for the supercritical TMC1 filament support the scenario of it being relatively young. A catalogue of 44 robust and candidate prestellar cores is created and is assessed to be complete down to 0.1 M$_\odot$. The combined prestellar CMF for the TMC1 and L1495 regions is well fit by a single log-normal distribution and is comparable to the standard IMF.
Submitted 1 July, 2024;
originally announced July 2024.
-
Around first-order rigidity of Coxeter groups
Authors:
Simon André,
Gianluca Paolini
Abstract:
By the work of Sela, for any free group $F$, the Coxeter group $W_3 = \mathbb{Z}/2\mathbb{Z} \ast \mathbb{Z}/2\mathbb{Z} \ast \mathbb{Z}/2\mathbb{Z}$ is elementarily equivalent to $W_3 \ast F$, and so Coxeter groups are not closed under elementary equivalence among finitely generated groups. In this paper we show that if we restrict to models which are generated by finitely many torsion elements (finitely torsion-generated), then we can recover striking rigidity results. Our main result is that if $(W, S)$ is a Coxeter system whose irreducible components are either spherical, or affine or (Gromov) hyperbolic, and $G$ is finitely torsion-generated and elementarily equivalent to $W$, then $G$ is itself a Coxeter group. This combines results of the second author et al. from [MPS22, PS23] with the following main hyperbolic result: if $W$ is a Coxeter hyperbolic group and $G$ is $\mathrm{AE}$-equivalent to $W$ and finitely torsion-generated, then $G$ belongs to a finite collection of Coxeter groups (modulo isomorphism). Furthermore, we show that there are two hyperbolic Coxeter groups $W$ and $W'$ which are non-isomorphic but $\mathrm{AE}$-equivalent. We also show that, on the other hand, if we restrict to certain specific classes of Coxeter groups then we can recover the strongest possible form of first-order rigidity, which we call first-order torsion-rigidity, namely the Coxeter group $W$ is the only finitely torsion-generated model of its theory. Crucially, we show that this form of rigidity holds for the following classes of Coxeter groups: even hyperbolic Coxeter groups and free products of one-ended or finite hyperbolic Coxeter groups. We conjecture that the same kind of phenomena occur for the whole class of Coxeter groups. In this direction, we prove that if $W$ and $W'$ are even Coxeter groups which are elementarily equivalent, then they are isomorphic.
Submitted 1 July, 2024;
originally announced July 2024.
-
Steady-state properties of multi-orbital systems using quantum Monte Carlo
Authors:
Andre Erpenbeck,
Thomas Blommel,
Lei Zhang,
Wei-Ting Lin,
Guy Cohen,
Emanuel Gull
Abstract:
A precise dynamical characterization of quantum impurity models with multiple interacting orbitals is challenging. In quantum Monte Carlo methods, this is embodied by sign problems. A dynamical sign problem makes it exponentially difficult to simulate long times. A multi-orbital sign problem generally results in a prohibitive computational cost for systems with multiple impurity degrees of freedom even in static equilibrium calculations. Here, we present a numerically exact inchworm method that simultaneously alleviates both sign problems, enabling simulation of multi-orbital systems directly in the equilibrium or nonequilibrium steady-state. The method combines ideas from the recently developed steady-state inchworm Monte Carlo framework [Phys. Rev. Lett. 130, 186301 (2023)] with other ideas from the equilibrium multi-orbital inchworm algorithm [Phys. Rev. Lett. 124, 206405 (2020)]. We verify our method by comparison with analytical limits and numerical results from previous methods.
Submitted 30 June, 2024;
originally announced July 2024.
-
A Medical Low-Back Pain Physical Rehabilitation Dataset for Human Body Movement Analysis
Authors:
Sao Mai Nguyen,
Maxime Devanne,
Olivier Remy-Neris,
Mathieu Lempereur,
André Thepaut
Abstract:
While automatic monitoring and coaching of exercises are showing encouraging results in non-medical applications, they still have limitations such as errors and limited use contexts. To allow the development and assessment of physical rehabilitation by an intelligent tutoring system, we identify in this article four challenges to address and propose a medical dataset of clinical patients carrying out low-back pain rehabilitation exercises. The dataset includes 3D Kinect skeleton positions and orientations, RGB videos, 2D skeleton data, and medical annotations to assess the correctness, and error classification and localisation of body part and timespan. Alongside this dataset, we follow a complete research path, from data collection to processing, and finally a small benchmark. We evaluated two baseline movement recognition algorithms on the dataset, representing two different approaches: the probabilistic approach with a Gaussian Mixture Model (GMM), and the deep learning approach with a Long Short-Term Memory (LSTM) network.
This dataset is valuable because it includes rehabilitation-relevant motions in a clinical setting with patients in their rehabilitation program, using a cost-effective, portable, and convenient sensor, and because it shows the potential for improvement on these challenges.
Submitted 29 June, 2024;
originally announced July 2024.
-
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
Authors:
Peiqin Lin,
André F. T. Martins,
Hinrich Schütze
Abstract:
Recent studies have highlighted the potential of exploiting parallel corpora to enhance multilingual large language models, improving performance in both bilingual tasks, e.g., machine translation, and general-purpose tasks, e.g., text classification. Building upon these findings, our comprehensive study aims to identify the most effective strategies for leveraging parallel corpora. We investigate the impact of parallel corpora quality and quantity, training objectives, and model size on the performance of multilingual large language models enhanced with parallel corpora across diverse languages and tasks. Our analysis reveals several key insights: (i) filtering noisy translations is essential for effectively exploiting parallel corpora, while language identification and short sentence filtering have little effect; (ii) even a corpus containing just 10K parallel sentences can yield results comparable to those obtained from much larger datasets; (iii) employing only the machine translation objective yields the best results among various training objectives and their combinations; (iv) larger multilingual language models benefit more from parallel corpora than smaller models due to their stronger capacity for cross-task transfer. Our study offers valuable insights into the optimal utilization of parallel corpora to enhance multilingual large language models, extending the generalizability of previous findings from limited languages and tasks to a broader range of scenarios.
Submitted 29 June, 2024;
originally announced July 2024.
-
Multicriteria Optimization and Decision Making: Principles, Algorithms and Case Studies
Authors:
Michael Emmerich,
André Deutz
Abstract:
Real-world decision and optimization problems often involve constraints and conflicting criteria. For example, choosing a travel method must balance speed, cost, environmental footprint, and convenience. Similarly, designing an industrial process must consider safety, environmental impact, and cost efficiency. Ideal solutions where all objectives are optimally met are rare; instead, we seek good compromises and aim to avoid lose-lose scenarios. Multicriteria optimization offers computational techniques to compute Pareto optimal solutions, aiding decision analysis and decision making. This reader offers an introduction to this topic and has been developed on the basis of the revised edition of the reader for the MSc computer science course "Multicriteria Optimization and Decision Analysis" at the Leiden Institute of Advanced Computer Science, Leiden University, The Netherlands. This course was taught annually by the first author from 2007 to 2023 as a single semester course with lectures and practicals. Our aim was to make the material accessible to MSc students who do not study mathematics as their core discipline by introducing basic numerical analysis concepts when necessary and providing numerical examples for interesting cases. The introduction is organized in a unique didactic manner developed by the authors, starting from simpler concepts such as linear programming and single-point methods, and advancing from these to more difficult concepts such as optimality conditions for nonlinear optimization and set-oriented solution algorithms. In addition, we focus on the mathematical modeling and foundations rather than on specific algorithms, though not excluding the discussion of some representative examples of solution algorithms.
Submitted 2 July, 2024; v1 submitted 29 June, 2024;
originally announced July 2024.
-
Constraints on the energy spectrum of the diffuse cosmic neutrino flux from the ANTARES neutrino telescope
Authors:
ANTARES Collaboration,
A. Albert,
S. Alves,
M. André,
M. Ardid,
S. Ardid,
J. -J. Aubert,
J. Aublin,
B. Baret,
S. Basa,
Y. Becherini,
B. Belhorma,
M. Bendahman,
F. Benfenati,
V. Bertin,
S. Biagi,
J. Boumaaza,
M. Bouta,
M. C. Bouwhuis,
H. Brânzaş,
R. Bruijn,
J. Brunner,
J. Busto,
B. Caiffi,
D. Calvo
, et al. (117 additional authors not shown)
Abstract:
High-significance evidence for the existence of a high-energy diffuse flux of cosmic neutrinos has emerged in the last decade from several observations by the IceCube Collaboration. The ANTARES neutrino telescope took data for 15 years in the Mediterranean Sea, from 2007 to 2022, and collected a high-purity all-flavour neutrino sample. The search for a diffuse cosmic neutrino signal using this dataset is presented in this article. This final analysis did not provide a statistically significant observation of the cosmic diffuse flux: this is converted into limits on the properties of the cosmic neutrino spectrum. In particular, given the sensitivity of the ANTARES neutrino telescope between 1 and 50 TeV, constraints on single-power-law hypotheses are derived for the cosmic diffuse flux below 20 TeV.
Submitted 29 June, 2024;
originally announced July 2024.
-
Fully-Adaptive Dynamic Connectivity of Square Intersection Graphs
Authors:
Ivor van der Hoog,
André Nusser,
Eva Rotenberg,
Frank Staals
Abstract:
A classical problem in computational geometry and graph algorithms is: given a dynamic set S of geometric shapes in the plane, efficiently maintain the connectivity of the intersection graph of S. Previous papers studied the setting where, before the updates, the data structure receives some parameter P. Then, updates could insert and delete disks as long as at all times the disks have a diameter that lies in a fixed range [1/P, 1]. The state-of-the-art for storing disks in a dynamic connectivity data structure is a data structure that uses O(Pn) space and that has O(P log^4 n) expected amortized update time. Connectivity queries between disks are supported in O(log n / loglog n) time. The state-of-the-art for Euclidean disks immediately implies a data structure for connectivity between axis-aligned squares that have their diameter in the fixed range [1/P, 1], with an improved update time of O(P log^4 n) amortized time.
We restrict our attention to axis-aligned squares, and study fully-dynamic square intersection graph connectivity. Our result is fully-adaptive to the aspect ratio, spending time proportional to the current aspect ratio ψ, as opposed to some previously given maximum P. Our focus on squares allows us to simplify and streamline the connectivity pipeline from previous work. When n is the number of squares and ψ is the aspect ratio after insertion (or before deletion), our data structure answers connectivity queries in O(log n / loglog n) time. We can update connectivity information in O(ψ log^4 n + log^6 n) amortized time. We also improve space usage from O(P n log n) to O(n log^3 n log ψ) -- while generalizing to a fully-adaptive aspect ratio -- which yields a space usage that is near-linear in n for any polynomially bounded ψ.
Submitted 28 June, 2024;
originally announced June 2024.
-
xTower: A Multilingual LLM for Explaining and Correcting Translation Errors
Authors:
Marcos Treviso,
Nuno M. Guerreiro,
Sweta Agrawal,
Ricardo Rei,
José Pombal,
Tania Vaz,
Helena Wu,
Beatriz Silva,
Daan van Stigt,
André F. T. Martins
Abstract:
While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies. Understanding these errors can potentially help improve the translation quality and user experience. This paper introduces xTower, an open large language model (LLM) built on top of TowerBase designed to provide free-text explanations for translation errors in order to guide the generation of a corrected translation. The quality of the explanations generated by xTower is assessed via both intrinsic and extrinsic evaluation. We ask expert translators to evaluate the quality of the explanations across two dimensions: relatedness towards the error span being explained and helpfulness in error understanding and improving translation quality. Extrinsically, we test xTower across various experimental setups in generating translation corrections, demonstrating significant improvements in translation quality. Our findings highlight xTower's potential towards not only producing plausible and helpful explanations of automatic translations, but also leveraging them to suggest corrected translations.
Submitted 27 June, 2024;
originally announced June 2024.
-
On the Energy Consumption of Rotary Wing and Fixed Wing UAVs in Flying Networks
Authors:
Pedro Ribeiro,
André Coelho,
Rui Campos
Abstract:
Unmanned Aerial Vehicles (UAVs) are increasingly used to enable wireless communications. Due to their characteristics, such as the ability to hover and carry cargo, UAVs can serve as communications nodes, including Wi-Fi Access Points and Cellular Base Stations. In previous work, we proposed the Sustainable multi-UAV Performance-aware Placement (SUPPLY) algorithm, which focuses on the energy-efficient placement of multiple UAVs acting as Flying Access Points (FAPs). Additionally, we developed the Multi-UAV Energy Consumption (MUAVE) simulator to evaluate the UAV energy consumption, specifically when using the SUPPLY algorithm. However, MUAVE was initially designed to compute the energy consumption for rotary-wing UAVs only.
In this paper, we propose eMUAVE, an enhanced version of the MUAVE simulator that allows the evaluation of the energy consumption for both rotary-wing and fixed-wing UAVs. Our energy consumption evaluation using eMUAVE considers reference and random networking scenarios. The results show that fixed-wing UAVs can be employed in the majority of networking scenarios. However, rotary-wing UAVs are typically more energy-efficient than fixed-wing UAVs when following the trajectories defined by SUPPLY.
Submitted 27 June, 2024;
originally announced June 2024.
-
Resonant sub-Neptunes are puffier
Authors:
Adrien Leleu,
Jean-Baptiste Delisle,
Remo Burn,
André Izidoro,
Stéphane Udry,
Xavier Dumusque,
Christophe Lovis,
Sarah Millholland,
Léna Parc,
François Bouchy,
Vincent Bourrier,
Yann Alibert,
João Faria,
Christoph Mordasini,
Damien Ségransan
Abstract:
A systematic, population-level discrepancy exists between the densities of exoplanets whose masses have been measured with transit timing variations (TTVs) versus those measured with radial velocities (RVs). Since the TTV planets are predominantly nearly resonant, it is still unclear whether the discrepancy is attributable to detection biases or to astrophysical differences between the nearly resonant and non-resonant planet populations. We defined a controlled, unbiased sample of 36 sub-Neptunes characterised by Kepler, TESS, HARPS, and ESPRESSO. We found that their density depends mostly on the resonant state of the system, with a low probability (of $0.002_{-0.001}^{+0.010}$) that the mass of (nearly) resonant planets is drawn from the same underlying population as the bulk of sub-Neptunes. Increasing the sample to 133 sub-Neptunes reveals finer details: the densities of resonant planets are similar and lower than those of non-resonant planets, and both the mean and spread in density increase for planets that are away from resonance. This trend is also present in RV-characterised planets alone. In addition, TTVs and RVs have consistent density distributions for a given distance to resonance. We also show that systems closer to resonances tend to be more co-planar than their spread-out counterparts. These observational trends are also found in synthetic populations, where planets that survived in their original resonant configuration retain a lower density, whereas less compact systems have undergone post-disc giant collisions that increased the planets' density while expanding their orbits. Our findings reinforce the claim that resonant systems are archetypes of planetary systems at their birth.
Submitted 27 June, 2024;
originally announced June 2024.
-
An LLM-based Knowledge Synthesis and Scientific Reasoning Framework for Biomedical Discovery
Authors:
Oskar Wysocki,
Magdalena Wysocka,
Danilo Carvalho,
Alex Teodor Bogatu,
Danilo Miranda Gusicuma,
Maxime Delmas,
Harriet Unsworth,
Andre Freitas
Abstract:
We present BioLunar, developed using the Lunar framework, as a tool for supporting biological analyses, with a particular emphasis on molecular-level evidence enrichment for biomarker discovery in oncology. The platform integrates Large Language Models (LLMs) to facilitate complex scientific reasoning across distributed evidence spaces, enhancing the capability for harmonizing and reasoning over heterogeneous data sources. Demonstrating its utility in cancer research, BioLunar leverages modular design, reusable data access and data analysis components, and a low-code user interface, enabling researchers of all programming levels to construct LLM-enabled scientific workflows. By facilitating automatic scientific discovery and inference from heterogeneous evidence, BioLunar exemplifies the potential of the integration between LLMs, specialised databases and biomedical tools to support expert-level knowledge synthesis and discovery.
Submitted 26 June, 2024;
originally announced June 2024.
-
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
Authors:
Anna Bavaresco,
Raffaella Bernardi,
Leonardo Bertolazzi,
Desmond Elliott,
Raquel Fernández,
Albert Gatt,
Esam Ghaleb,
Mario Giulianelli,
Michael Hanna,
Alexander Koller,
André F. T. Martins,
Philipp Mondorf,
Vera Neplenbroek,
Sandro Pezzelle,
Barbara Plank,
David Schlangen,
Alessandro Suglia,
Aditya K Surikuchi,
Ece Takmaz,
Alberto Testoni
Abstract:
There is an increasing trend towards evaluating NLP models with LLM-generated judgments instead of human judgments. In the absence of a comparison against human data, this raises concerns about the validity of these evaluations; in case they are conducted with proprietary models, this also raises concerns over reproducibility. We provide JUDGE-BENCH, a collection of 20 NLP datasets with human annotations, and comprehensively evaluate 11 current LLMs, covering both open-weight and proprietary models, for their ability to replicate the annotations. Our evaluations show that each LLM exhibits a large variance across datasets in its correlation to human judgments. We conclude that LLMs are not yet ready to systematically replace human judges in NLP.
Submitted 26 June, 2024;
originally announced June 2024.
-
Multi-Step Reconstruction of Radio-Interferometric Images
Authors:
S. Wang,
S. Prunet,
S. Mignot,
A. Ferrari
Abstract:
The advent of large aperture arrays, such as the Square Kilometer Array (SKA), currently under construction, allows for observing the universe in the radio spectrum at unprecedented resolution and sensitivity. However, these telescopes produce data on the scale of exabytes, introducing a slew of hardware and software design challenges. This paper proposes a multi-step image reconstruction method that allows for partitioning visibility data by baseline length. This enables more flexible data distribution and parallelization, aiding in processing radio-astronomical observations within given constraints. The multi-step reconstruction is separated into two steps, first reconstructing a low-resolution image with only short-baseline visibilities, and then using this image together with the long-baseline visibilities to reconstruct the full-resolution image. The proposed method only operates in the minor cycle, and can be easily integrated into existing imaging pipelines. We show that our proposed method allows for partitioning visibilities by baseline without introducing significant additional drawbacks, having roughly the same computational cost and producing images of comparable quality to a method in the same framework that processes all baselines simultaneously.
Submitted 26 June, 2024;
originally announced June 2024.
-
Transformer Normalisation Layers and the Independence of Semantic Subspaces
Authors:
Stephen Menary,
Samuel Kaski,
Andre Freitas
Abstract:
Recent works have shown that transformers can solve contextual reasoning tasks by internally executing computational graphs called circuits. Circuits often use attention to logically match information from subspaces of the representation, e.g. using position-in-sequence to identify the previous token. In this work, we consider a semantic subspace to be any independent subspace of the latent representation that can fully determine an attention distribution. We show that Pre-Norm, the placement of normalisation layer used by state-of-the-art transformers, violates this ability unless the model learns a strict representation structure of orthogonal spheres. This is because it causes linear subspaces to interfere through their common normalisation factor. Theoretically, we analyse circuit stability by modelling this interference as random noise on the $L_2$-norms of the query/key/value vectors, predicting a phenomenon of circuit collapse when sparse-attention shifts to a different token. Empirically, we investigate the sensitivity of real-world models trained for mathematical addition, observing a 1% rate of circuit collapse when the norms are artificially perturbed by $\lesssim$10%. We contrast Pre-Norm with QKV-Norm, which places normalisation after the attention head's linear operators. Theoretically this relaxes the representational constraints. Empirically we observe comparable in-distribution but worse out-of-distribution performance.
Submitted 25 June, 2024;
originally announced June 2024.
-
Quantum-Inspired Fluid Simulation of 2D Turbulence with GPU Acceleration
Authors:
Leonhard Hölscher,
Pooja Rao,
Lukas Müller,
Johannes Klepsch,
Andre Luckow,
Tobias Stollenwerk,
Frank K. Wilhelm
Abstract:
Tensor network algorithms can efficiently simulate complex quantum many-body systems by utilizing knowledge of their structure and entanglement. These methodologies have been adapted recently for solving the Navier-Stokes equations, which describe a spectrum of fluid phenomena, from the aerodynamics of vehicles to weather patterns. Within this quantum-inspired paradigm, velocity is encoded as matrix product states (MPS), effectively harnessing the analogy between interscale correlations of fluid dynamics and entanglement in quantum many-body physics. This particular tensor structure is also called quantics tensor train (QTT). By utilizing NVIDIA's cuQuantum library to perform parallel tensor computations on GPUs, our adaptation speeds up simulations by up to 12.1 times. This allows us to study the algorithm in terms of its applicability, scalability, and performance. By simulating two qualitatively different but commonly encountered 2D flow problems at high Reynolds numbers up to $1\times10^7$ using a fourth-order time stepping scheme, we find that the algorithm has a potential advantage over direct numerical simulations in the turbulent regime as the requirements for grid resolution increase drastically. In addition, we derive the scaling $\chi=\mathcal{O}(\text{poly}(1/\epsilon))$ for the maximum bond dimension $\chi$ of MPS representing turbulent flow fields, with an error $\epsilon$, based on the spectral distribution of turbulent kinetic energy. Our findings motivate further exploration of related quantum algorithms and other tensor network methods.
Submitted 25 June, 2024;
originally announced June 2024.
-
Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity
Authors:
Chih-Hsuan Yang,
Benjamin Feuer,
Zaki Jubery,
Zi K. Deng,
Andre Nakkab,
Md Zahid Hasan,
Shivani Chiranjeevi,
Kelly Marshall,
Nirmal Baishnab,
Asheesh K Singh,
Arti Singh,
Soumik Sarkar,
Nirav Merchant,
Chinmay Hegde,
Baskar Ganapathysubramanian
Abstract:
We introduce Arboretum, the largest publicly accessible dataset designed to advance AI for biodiversity applications. This dataset, curated from the iNaturalist community science platform and vetted by domain experts to ensure accuracy, includes 134.6 million images, surpassing existing datasets in scale by an order of magnitude. The dataset encompasses image-language paired data for a diverse set of species from birds (Aves), spiders/ticks/mites (Arachnida), insects (Insecta), plants (Plantae), fungus/mushrooms (Fungi), snails (Mollusca), and snakes/lizards (Reptilia), making it a valuable resource for multimodal vision-language AI models for biodiversity assessment and agriculture research. Each image is annotated with scientific names, taxonomic details, and common names, enhancing the robustness of AI model training.
We showcase the value of Arboretum by releasing a suite of CLIP models trained using a subset of 40 million captioned images. We introduce several new benchmarks for rigorous assessment, report accuracy for zero-shot learning, and evaluations across life stages, rare species, confounding species, and various levels of the taxonomic hierarchy.
We anticipate that Arboretum will spur the development of AI models that can enable a variety of digital tools ranging from pest control strategies and crop monitoring to worldwide biodiversity assessment and environmental conservation. These advancements are critical for ensuring food security, preserving ecosystems, and mitigating the impacts of climate change. Arboretum is publicly available, easily accessible, and ready for immediate use.
Please see the project website (https://baskargroup.github.io/Arboretum/) for links to our data, models, and code.
Submitted 25 June, 2024;
originally announced June 2024.
-
GreenFaaS: Maximizing Energy Efficiency of HPC Workloads with FaaS
Authors:
Alok Kamatar,
Valerie Hayot-Sasson,
Yadu Babuji,
Andre Bauer,
Gourav Rattihalli,
Ninad Hogade,
Dejan Milojicic,
Kyle Chard,
Ian Foster
Abstract:
Application energy efficiency can be improved by executing each application component on the compute element that consumes the least energy while also satisfying time constraints. In principle, the function as a service (FaaS) paradigm should simplify such optimizations by abstracting away compute location, but existing FaaS systems do not provide for user transparency over application energy consumption or task placement. Here we present GreenFaaS, a novel open source framework that bridges this gap between energy-efficient applications and FaaS platforms. GreenFaaS can be deployed by end users or providers across systems to monitor energy use, provide task-specific feedback, and schedule tasks in an energy-aware manner. We demonstrate that intelligent placement of tasks can both reduce energy consumption and improve performance. For a synthetic workload, GreenFaaS reduces the energy-delay product by 45% compared to alternatives. Furthermore, running a molecular design application through GreenFaaS can reduce energy consumption by 21% and runtime by 63% by better matching tasks with machines.
Submitted 25 June, 2024;
originally announced June 2024.
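The GreenFaaS abstract above reports its synthetic-workload result as an energy-delay product (EDP). As a minimal sketch of that metric, assuming the usual definition EDP = energy x runtime: the function names and the baseline numbers below are hypothetical, and combining the separately reported 21% energy and 63% runtime reductions into one EDP figure is an illustration only, not a result from the paper.

```python
def energy_delay_product(energy_joules: float, runtime_seconds: float) -> float:
    """Return energy * runtime; lower values are better."""
    if energy_joules < 0 or runtime_seconds < 0:
        raise ValueError("energy and runtime must be non-negative")
    return energy_joules * runtime_seconds


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fraction by which `candidate` improves on `baseline` (0.45 means 45%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - candidate) / baseline


# Hypothetical baseline: 1000 J consumed over 100 s.
baseline_edp = energy_delay_product(1000.0, 100.0)

# Hypothetical candidate: 21% less energy and 63% less runtime than the baseline.
candidate_edp = energy_delay_product(1000.0 * (1 - 0.21), 100.0 * (1 - 0.63))

edp_reduction = relative_reduction(baseline_edp, candidate_edp)
print(f"EDP reduction: {edp_reduction:.1%}")
```

Because EDP multiplies the two quantities, a placement that saves both energy and time improves the metric by more than either saving alone.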
-
LumberChunker: Long-Form Narrative Document Segmentation
Authors:
André V. Duarte,
João Marques,
Miguel Graça,
Miguel Freire,
Lei Li,
Arlindo L. Oliveira
Abstract:
Modern NLP tasks increasingly rely on dense retrieval methods to access up-to-date and relevant contextual information. We are motivated by the premise that retrieval benefits from segments that can vary in size such that a content's semantic independence is better captured. We propose LumberChunker, a method leveraging an LLM to dynamically segment documents, which iteratively prompts the LLM to identify the point within a group of sequential passages where the content begins to shift. To evaluate our method, we introduce GutenQA, a benchmark with 3000 "needle in a haystack" type of question-answer pairs derived from 100 public domain narrative books available on Project Gutenberg. Our experiments show that LumberChunker not only outperforms the most competitive baseline by 7.37% in retrieval performance (DCG@20) but also that, when integrated into a RAG pipeline, LumberChunker proves to be more effective than other chunking methods and competitive baselines, such as the Gemini 1.5M Pro. Our Code and Data are available at https://github.com/joaodsmarques/LumberChunker
Submitted 25 June, 2024;
originally announced June 2024.
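The LumberChunker abstract above reports retrieval quality as DCG@20. The following is a minimal sketch of discounted cumulative gain at a cutoff k, assuming the common textbook form with a log2 rank discount; the abstract does not say which DCG variant the authors use, and the function name and relevance values are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 20) -> float:
    """Discounted cumulative gain over the top-k results.

    `relevances[i]` is the relevance of the result at rank i + 1.
    Each gain is divided by log2(rank + 1), so later ranks count less.
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


# Hypothetical ranking: relevant chunks at ranks 1 and 3.
example = dcg_at_k([1, 0, 1, 0], k=20)
print(example)  # 1/log2(2) + 1/log2(4) = 1.5
```

With binary relevance and a single relevant passage per question, the score depends only on the rank at which that passage is retrieved.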
-
Optical ionization effects in kHz laser wakefield acceleration with few-cycle pulses
Authors:
Joséphine Monzac,
Slava Smartsev,
Julius Huijts,
Lucas Rovige,
Igor A. Andriyash,
Aline Vernier,
Vidmantas Tomkus,
Valdas Girdauskas,
Gediminas Raciukaitis,
Miglė Mackevičiūtė,
Valdemar Stankevic,
Antoine Cavagna,
Jaismeen Kaur,
André Kalouguine,
Rodrigo Lopez-Martens,
Jérôme Faure
Abstract:
We present significant advances in Laser Wakefield Acceleration (LWFA) operating at a 1 kHz repetition rate, employing a sub-TW, few-femtosecond laser and a continuously flowing hydrogen gas target. We conducted the first comprehensive study assessing how the nature of the gas within the target influences accelerator performance. This work confirms and elucidates the superior performance of hydrogen in kHz LWFA. Our system generates quasi-monoenergetic electron bunches with energies up to 10 MeV, bunch charges of 2 pC, and angular divergences of 15 mrad. Notably, our novel scheme relying on differential pumping enables continuous operation at kHz repetition rates, contrasting with previous systems that operated in burst mode to achieve similar beam properties. Particle-in-cell simulations explain hydrogen's superior performance: the ionization effects in nitrogen and helium distort the laser pulse, negatively impacting accelerator performance. These effects are strongly mitigated in hydrogen plasma, thereby enhancing beam quality. This analysis represents a significant step forward in optimizing and understanding kHz LWFA. It underscores the critical role of hydrogen and the imperative need to develop hydrogen-compatible target systems capable of managing high repetition rates, as exemplified by our differential pumping system. These advances lay the groundwork for further developments in high-repetition-rate LWFA technology.
Submitted 25 June, 2024;
originally announced June 2024.
-
Towards a copilot in BIM authoring tool using a large language model-based agent for intelligent human-machine interaction
Authors:
Changyu Du,
Stavros Nousias,
André Borrmann
Abstract:
Facing increasingly complex BIM authoring software and the accompanying expensive learning costs, designers often seek to interact with the software in a more intelligent and lightweight manner. They aim to automate modeling workflows, avoiding obstacles and difficulties caused by software usage, thereby focusing on the design process itself. To address this issue, we proposed an LLM-based autonomous agent framework that can function as a copilot in the BIM authoring tool, answering software usage questions, understanding the user's design intentions from natural language, and autonomously executing modeling tasks by invoking the appropriate tools. In a case study based on the BIM authoring software Vectorworks, we implemented a software prototype to integrate the proposed framework seamlessly into the BIM authoring scenario. We evaluated the planning and reasoning capabilities of different LLMs within this framework when faced with complex instructions. Our work demonstrates the significant potential of LLM-based agents in design automation and intelligent interaction.
Submitted 2 June, 2024;
originally announced June 2024.
-
Analysis of surface tension in terms of force gradient per unit area
Authors:
Andre Schiltz
Abstract:
Conventionally, surface tension is expressed as a force per unit length or as an energy per unit area. In this paper, we propose a thought experiment which consists of replacing the surface tension by an equivalent force gradient following fluid mechanics principles. Such a system of equivalent forces allows the effects of tensioactivity to be analyzed in terms of force per unit area or in terms of energy per unit volume. This theoretical tool offers an alternative view of tensioactivity forces at interfaces and allows the well-known capillarity equations to be rewritten by calculating the equilibrium of forces at steady state. The new equations are applied to known phenomena such as the meniscus, the capillary tube, the Wilhelmy blade, and the equilibrium of drops and semi-drops. The paper concludes by extending our thought experiment with some hypotheses that would make it possible to interpret such equivalent stress gradients.
Submitted 24 June, 2024;
originally announced June 2024.
-
Hecke operators on rational functions II
Authors:
André Rosenbaum Coelho,
Caio Simon de Oliveira,
Sinai Robins
Abstract:
We study the action of the Hecke operators $U_n$ on the space $\mathcal R$ of rational functions of one variable, over $\mathbb C$. The main goal is to give a complete classification of the eigenfunctions of $U_n$, answering many questions that were raised in \cite{GilRobins}. We accomplish this by introducing certain number-theoretic directed graphs, called Zolotarev Graphs, which extend the well-known permutations due to Zolotarev.
We develop the theory of the Zolotarev graphs, and discover certain strong relations between these graphs and the kernel of $U_n$ acting on a subspace of $\mathcal R$. We decompose the eigenfunctions of $U_n$ into certain natural finite-dimensional vector spaces of rational functions, which we call the eigenspaces. In this context, we prove that the dimension of each eigenspace is equal to the number of nodes of its corresponding Zolotarev graph, belonging to a cycle. We prove that the number of leaves of this Zolotarev graph equals the dimension of the kernel of $U_n$. We then give a novel number-theoretic formula for the number of cycles of fixed length, in each Zolotarev graph. We also study the simultaneous eigenfunctions for all of the $U_n$, and give explicit bases for all of them. Finally, we prove that the classical Artin Conjecture on primitive roots is equivalent to the conjecture that infinitely many of these eigenspaces have dimension $1$.
Submitted 22 June, 2024;
originally announced June 2024.
-
Impact of surface treatments on the electron affinity of nitrogen-doped ultrananocrystalline diamond
Authors:
Andre Chambers,
Daniel J. McCloskey,
Nikolai Dontschuk,
Hassan N. Al Hashem,
Billy J. Murdoch,
Alastair Stacey,
Steven Prawer,
Arman Ahnood
Abstract:
In recent years, various forms of nanocrystalline diamond (NCD) have emerged as an attractive group of diamond/graphite mixed-phase materials for a range of applications from electron emission sources to electrodes for neural interfacing. To tailor their properties for different uses, NCD surfaces can be terminated with various chemical functionalities, in particular hydrogen and oxygen, which shift the band edge positions and electron affinity values. While the band edge positions of chemically terminated single crystal diamond are well understood, the same is not true for nanocrystalline diamond, which has uncontrolled crystallographic surfaces with a variety of chemical states as well as graphitic grain boundary regions. In this work, the relative band edge positions of as-grown, hydrogen terminated, and oxygen terminated nitrogen-doped ultrananocrystalline diamond (N-UNCD) are determined using ultraviolet photoelectron spectroscopy (UPS), while the band bending is investigated using photoelectrochemical measurements. In contrast to the widely reported negative electron affinity of hydrogen terminated single crystal diamond, our work demonstrates that hydrogen terminated N-UNCD exhibits a positive electron affinity owing to the increased surface and bulk defect densities. These findings elucidate the marked differences in electrochemical properties of hydrogen and oxygen terminated N-UNCD, such as the dramatic changes in electrochemical capacitance.
Submitted 20 June, 2024;
originally announced June 2024.
-
Towards optical neuromodulation using nitrogen-doped ultrananocrystalline diamond photoelectrodes
Authors:
Samira Falahatdoost,
Andre Chambers,
Alastair Stacey,
Steven Prawer,
Arman Ahnood
Abstract:
Nitrogen-doped ultrananocrystalline diamond (N-UNCD) is a form of diamond electrode with near-infrared photoresponsivity, making it well suited for physiological applications. N-UNCD's photoresponsivity is strongly influenced by its surface. While it is known that oxygen treatment provides a higher photoresponsivity, a better understanding of its surface processes is needed to tailor the material for optical neuromodulation. This work examines the impact of various oxygen treatment methods, with the aim of creating oxygen-rich surfaces with different chemical and structural properties. Surface characterisation methods along with electrochemical and photoelectrochemical measurements and modelling were used to investigate the films. It was found that oxygen furnace annealing resulted in orders of magnitude improvement in the near-infrared photoresponsivity, to 3.75 +/- 0.05 uA/W. This translates to an approximately 200-fold increase in the photocurrent compared to the untreated surface. This enhancement in photocurrent is largely due to the changes in the chemical species present at the surface. The photocurrent is estimated to be sufficient for extra-cellular stimulation of brain neurons within the safe optical exposure limit, positioning N-UNCD as an excellent candidate to be used in next-generation photoelectrodes for photobiomodulation.
Submitted 20 June, 2024;
originally announced June 2024.
-
RadEx: A Framework for Structured Information Extraction from Radiology Reports based on Large Language Models
Authors:
Daniel Reichenpfader,
Jonas Knupp,
André Sander,
Kerstin Denecke
Abstract:
Annually and globally, over three billion radiography examinations and computer tomography scans result in mostly unstructured radiology reports containing free text. Despite the potential benefits of structured reporting, its adoption is limited by factors such as established processes, resource constraints and potential loss of information. However, structured information would be necessary for various use cases, including automatic analysis, clinical trial matching, and prediction of health outcomes. This study introduces RadEx, an end-to-end framework comprising 15 software components and ten artifacts to develop systems that perform automated information extraction from radiology reports. It covers the complete process from annotating training data to extracting information by offering a consistent generic information model and setting boundaries for model development. Specifically, RadEx allows clinicians to define relevant information for clinical domains (e.g., mammography) and to create report templates. The framework supports both generative and encoder-only models and the decoupling of information extraction from template filling enables independent model improvements. Developing information extraction systems according to the RadEx framework facilitates implementation and maintenance as components are easily exchangeable, while standardized artifacts ensure interoperability between components.
Submitted 14 June, 2024;
originally announced June 2024.
-
Universality of scaled particle spectra in ultrarelativistic heavy-ion collisions
Authors:
Cicero D. Muncinelli,
Fernando G. Gardim,
David D. Chinellato,
Gabriel S. Denicol,
Andre V. Giannini,
Matthew Luzum,
Jorge Noronha,
Tiago Nunes da Silva,
Jun Takahashi,
Giorgio Torrieri
Abstract:
We propose a new observable derived from a centrality-dependent scaling of transverse particle spectra. By removing the global scales of total particle number and mean transverse momentum, we isolate the shape of the spectrum. In hydrodynamic simulations, while the multiplicity and mean transverse momentum fluctuate significantly, the scaled spectrum is found to be almost constant even at an event-by-event level and after resonance decays. This universality survives when averaging over events in each centrality bin before scaling. We then investigate the presence of this scaling in experimental data from the ALICE collaboration in Pb-Pb, Xe-Xe, and p-Pb collisions. We find a remarkable universality in the experimentally observed scaled spectra at low transverse momentum, compatible with hydrodynamic predictions. The data show a minor breaking of universality at large transverse momentum and hints of evolution with the system size that are not seen in simulations. Our results motivate further theoretical and experimental investigations of this new observable to bring to light the collective and non-collective behavior encoded in the transverse particle spectrum of different collision systems.
Submitted 21 June, 2024;
originally announced June 2024.
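The abstract above describes removing the total particle number and the mean transverse momentum from a spectrum to isolate its shape. One way to sketch that idea is shown below, assuming the scaled variable x = p_T / <p_T> and the scaled spectrum U(x) = (<p_T> / N) dN/dp_T. This specific form, the function names, and the exponential toy spectra are assumptions made for illustration; the abstract does not give the authors' exact definition.

```python
import math
from typing import List, Sequence, Tuple


def trapezoid(y: Sequence[float], x: Sequence[float]) -> float:
    """Trapezoidal-rule integral of y over the grid x."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )


def scale_spectrum(
    pt: Sequence[float], dn_dpt: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (x, U) with x = pT/<pT> and U = (<pT>/N) * dN/dpT."""
    n_total = trapezoid(dn_dpt, pt)
    mean_pt = trapezoid([p * y for p, y in zip(pt, dn_dpt)], pt) / n_total
    x = [p / mean_pt for p in pt]
    u = [mean_pt * y / n_total for y in dn_dpt]
    return x, u


def toy_spectrum(
    n_total: float, slope: float, points: int = 200, x_max: float = 10.0
) -> Tuple[List[float], List[float]]:
    """Hypothetical exponential spectrum dN/dpT = (N/T) exp(-pT/T)."""
    pt = [slope * x_max * i / (points - 1) for i in range(points)]
    dn = [n_total / slope * math.exp(-p / slope) for p in pt]
    return pt, dn


# Two toy "events" that differ in multiplicity and mean pT but share a shape.
x_a, u_a = scale_spectrum(*toy_spectrum(100.0, 0.3))
x_b, u_b = scale_spectrum(*toy_spectrum(2500.0, 0.6))

max_diff = max(abs(a - b) for a, b in zip(u_a, u_b))
print(f"largest difference between scaled spectra: {max_diff:.2e}")
```

In this toy setup the two scaled curves coincide up to rounding, which is the kind of collapse the abstract refers to as universality; real spectra would need a common x grid or interpolation before comparison.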
-
Lieb-Thirring inequalities on the spheres and $SO(3)$
Authors:
André Pedroso Kowacs,
Michael Ruzhansky
Abstract:
In this paper, we obtain new upper bounds for the Lieb-Thirring inequality on the spheres of any dimension greater than $2$. As far as we have checked, our results improve previous results found in the literature for all dimensions greater than $2$. We also prove and exhibit an explicit new upper bound for the Lieb-Thirring inequality on $SO(3)$. We also discuss these estimates in the case of general compact Lie groups. Originally developed for estimating the sums of moments of negative eigenvalues of the Schrödinger operator in $L^2(\mathbb{R}^n)$, these inequalities have applications in quantum mechanics and other fields.
Submitted 21 June, 2024;
originally announced June 2024.
-
RouteFinder: Towards Foundation Models for Vehicle Routing Problems
Authors:
Federico Berto,
Chuanbo Hua,
Nayeli Gast Zepeda,
André Hottung,
Niels Wouda,
Leon Lan,
Kevin Tierney,
Jinkyoo Park
Abstract:
Vehicle Routing Problems (VRPs) are optimization problems with significant real-world implications in logistics, transportation, and supply chain management. Despite the recent progress made in learning to solve individual VRP variants, there is a lack of a unified approach that can effectively tackle a wide range of tasks, which is crucial for real-world impact. This paper introduces RouteFinder, a framework for developing foundation models for VRPs. Our key idea is that a foundation model for VRPs should be able to model variants by treating each variant as a subset of a larger VRP problem, equipped with different attributes. We introduce a parallelized environment that can handle any combination of attributes at the same time in a batched manner, and an efficient sampling procedure to train on a mix of problems at each optimization step that can greatly improve convergence robustness. We also introduce novel Global Feature Embeddings that project instance-wise attributes efficiently onto the latent space and help the model understand different VRP variants. Finally, we introduce Efficient Adapter Layers, a simple yet effective technique to finetune pre-trained RouteFinder models to solve novel variants with previously unseen attributes outside of the original feature space. We validate our approach through extensive experiments on 24 VRP variants, demonstrating competitive results over recent multi-task learning models. We make our code openly available at https://github.com/ai4co/routefinder.
Submitted 21 June, 2024;
originally announced June 2024.
-
Modelling the evolution of the Galactic disc scale height traced by open clusters
Authors:
Sandro Moreira,
André Moitinho,
André Silva,
Duarte Almeida
Abstract:
Context. The scale height of the spatial distribution of open clusters (OCs) in the Milky Way exhibits a well known increase with age which is usually interpreted as evidence for dynamical heating of the disc or of the disc having been thicker in the past.
Aims. We address the increase of the scale height with age of the OC population from a different angle. We propose that the apparent thickening of the disc can be largely explained as a consequence of a stronger disruption of OCs near the Galactic plane by disc phenomena, namely encounters with giant molecular clouds (GMCs).
Methods. We present a computational model that forms OCs with different initial masses and follows their orbits while subjecting them to different disruption mechanisms. To set up the model and infer its parameters, we use and analyse a Gaia-based OC catalogue (Dias et al. 2021). We investigate both the spatial and age distributions of the OC population and discuss the completeness of the sample. The simulation results are then compared to the observations.
Results. Consistent with previous studies, the observations reveal that the scale height (SH) of the spatial distribution of OCs increases with age. We find that it is very likely that the OC sample is incomplete even for the solar neighbourhood. The model simulations successfully reproduce the SH increase with age and the total number of OCs that survive with age up to 1 Gyr. For older OCs, the predicted SH from the model starts deviating from the observations, although it remains within the uncertainties of the observations. This may be related to the effects of incompleteness and/or simplifications in the model.
Conclusions. A selective disruption of OCs near the galactic plane through GMC encounters is able to explain the SH evolution of the OC population.
Submitted 20 June, 2024;
originally announced June 2024.
-
A two-dimensional optomechanical crystal for quantum transduction
Authors:
Felix M. Mayor,
Sultan Malik,
André G. Primo,
Samuel Gyger,
Wentao Jiang,
Thiago P. M. Alegre,
Amir H. Safavi-Naeini
Abstract:
Integrated optomechanical systems are one of the leading platforms for manipulating, sensing, and distributing quantum information. The temperature increase due to residual optical absorption sets the ultimate limit on performance for these applications. In this work, we demonstrate a two-dimensional optomechanical crystal geometry, named b-dagger, that alleviates this problem through increased thermal anchoring to the surrounding material. Our mechanical mode operates at 7.4 GHz, well within the operation range of standard cryogenic microwave hardware and piezoelectric transducers. The enhanced thermalization combined with the large optomechanical coupling rates, $g_0/2\pi \approx 880~\mathrm{kHz}$, and high optical quality factors, $Q_\text{opt} = 2.4 \times 10^5$, enables the ground-state cooling of the acoustic mode to phononic occupancies as low as $n_\text{m} = 0.35$ from an initial temperature of 3 kelvin, as well as entering the optomechanical strong-coupling regime. Finally, we perform pulsed sideband asymmetry of our devices at a temperature below 10 millikelvin and demonstrate ground-state operation ($n_\text{m} < 0.45$) for repetition rates as high as 3 MHz. Our results extend the boundaries of optomechanical system capabilities and establish a robust foundation for the next generation of microwave-to-optical transducers with entanglement rates overcoming the decoherence rates of state-of-the-art superconducting qubits.
Submitted 20 June, 2024;
originally announced June 2024.