-
ACR: A Benchmark for Automatic Cohort Retrieval
Authors:
Dung Ngoc Thai,
Victor Ardulov,
Jose Ulises Mena,
Simran Tiwari,
Gleb Erofeev,
Ramy Eskander,
Karim Tarabishy,
Ravi B Parikh,
Wael Salloum
Abstract:
Identifying patient cohorts is fundamental to numerous healthcare tasks, including clinical trial recruitment and retrospective studies. Current cohort retrieval methods in healthcare organizations rely on automated queries of structured data combined with manual curation, which are time-consuming, labor-intensive, and often yield low-quality results. Recent advancements in large language models (LLMs) and information retrieval (IR) offer promising avenues to revolutionize these systems. Major challenges include managing extensive eligibility criteria and handling the longitudinal nature of unstructured Electronic Medical Records (EMRs) while ensuring that the solution remains cost-effective for real-world application. This paper introduces a new task, Automatic Cohort Retrieval (ACR), and evaluates the performance of LLMs and commercial, domain-specific neuro-symbolic approaches. We provide a benchmark task, a query dataset, an EMR dataset, and an evaluation framework. Our findings underscore the necessity for efficient, high-quality ACR systems capable of longitudinal reasoning across extensive patient databases.
Submitted 1 July, 2024; v1 submitted 20 June, 2024;
originally announced June 2024.
-
On the Convergence of the Sinkhorn-Knopp Algorithm with Sparse Cost Matrices
Authors:
Jose Rafael Espinosa Mena
Abstract:
Matrix scaling problems with sparse cost matrices arise frequently in various domains, such as optimal transport, image processing, and machine learning. The Sinkhorn-Knopp algorithm is a popular iterative method for solving these problems, but its convergence properties in the presence of sparsity have not been thoroughly analyzed. This paper presents a theoretical analysis of the convergence rate of the Sinkhorn-Knopp algorithm specifically for sparse cost matrices. We derive novel bounds on the convergence rate that explicitly depend on the sparsity pattern and the degree of nonsparsity of the cost matrix. These bounds provide new insights into the behavior of the algorithm and highlight the potential for exploiting sparsity to develop more efficient solvers. We also explore connections between our sparse convergence results and existing convergence results for dense matrices, showing that our bounds generalize the dense case. Our analysis reveals that the convergence rate improves as the matrix becomes less sparse and as the minimum entry of the cost matrix increases relative to its maximum entry. These findings have important practical implications, suggesting that the Sinkhorn-Knopp algorithm may be particularly well-suited for large-scale matrix scaling problems with sparse cost matrices arising in real-world applications. Future research directions include investigating tighter bounds based on more sophisticated sparsity patterns, developing algorithm variants that actively exploit sparsity, and empirically validating the benefits of our theoretical results on real-world datasets. This work advances our understanding of the Sinkhorn-Knopp algorithm for an important class of matrix scaling problems and lays the foundation for designing more efficient and scalable solutions in practice.
Submitted 24 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Multifractal Analysis of the Sinkhorn Algorithm: Unveiling the Intricate Structure of Optimal Transport Maps
Authors:
Jose Rafael Espinosa Mena
Abstract:
The Sinkhorn algorithm has emerged as a powerful tool for solving optimal transport problems, finding applications in various domains such as machine learning, image processing, and computational biology. Despite its widespread use, the intricate structure and scaling properties of the coupling matrices generated by the Sinkhorn algorithm remain largely unexplored. In this paper, we delve into the multifractal properties of these coupling matrices, aiming to unravel their complex behavior and shed light on the underlying dynamics of the Sinkhorn algorithm. We prove the existence of the multifractal spectrum and the singularity spectrum for the Sinkhorn coupling matrices. Furthermore, we derive bounds on the generalized dimensions, providing a comprehensive characterization of their scaling properties. Our findings not only deepen our understanding of the Sinkhorn algorithm but also pave the way for novel applications and algorithmic improvements in the realm of optimal transport.
Submitted 24 May, 2024;
originally announced May 2024.
-
Robust Joint Estimation of Galaxy Redshift and Spectral Templates using Online Dictionary Learning
Authors:
Sean Bryan,
Ayan Barekzai,
Delondrae Carter,
Philip Mauskopf,
Julian Mena,
Danielle Rivera,
Abel S. Uriarte,
Pao-Yu Wang
Abstract:
We present a novel approach to analyzing astronomical spectral survey data using our non-linear extension of an online dictionary learning algorithm. Current and upcoming surveys such as SPHEREx will use spectral data to build a 3D map of the universe by estimating the redshifts of millions of galaxies. Existing algorithms rely on hand-curated external templates and have limited performance due to model mismatch error. Our algorithm addresses this limitation by jointly estimating both the underlying spectral features in common across the entire dataset, as well as the redshift of each galaxy. Our online approach scales well to large datasets since we only process a single spectrum in memory at a time. Our algorithm performs better than a state-of-the-art existing algorithm when analyzing a mock SPHEREx dataset, achieving an NMAD standard deviation of 0.18% and a catastrophic error rate of 0.40% when analyzing noiseless data. Our algorithm also performs well over a wide range of signal-to-noise ratios (SNR), delivering sub-percent NMAD and catastrophic error above a median SNR of 20. We released our algorithm publicly at github.com/HyperspectralDictionaryLearning/BryanEtAl2023.
Submitted 24 November, 2023;
originally announced November 2023.
-
Joule-Thomson expansion of AdS black holes in quasi-topological electromagnetism
Authors:
José Barrientos,
José Mena
Abstract:
We study the Joule-Thomson expansion in Einstein-Maxwell theory supplemented with the so-called quasitopological electromagnetism, this in the extended phase space thermodynamic approach. We compute the Joule-Thomson coefficient and depict all relevant inversion and isenthalpic curves in the temperature-pressure plane, determining in this manner the corresponding cooling and heating regions. In contrast with previous related works we show the existence of three branches for the inversion curves which depends on suitable selections of the parameter space of the theory, thus departing from the usual van der Waals behavior which exhibits up to two branches.
Submitted 29 August, 2022; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Exact scalar (quasi-)normal modes of black holes and solitons in gauged SUGRA
Authors:
Monserrat Aguayo,
Ankai Hernández,
José Mena,
Julio Oliva,
Marcelo Oyarzo
Abstract:
In this paper we identify a new family of black holes and solitons that lead to the exact integration of scalar probes, even in the presence of a non-minimal coupling with the Ricci scalar which has a non-trivial profile. The backgrounds are planar and spherical black holes as well as solitons of $SU\left( 2\right) \times SU\left( 2\right) $ $\mathcal{N}=4$ gauged supergravity in four dimensions. On these geometries, we compute the spectrum of (quasi-)normal modes for the non-minimally coupled scalar field. We find that the equation for the radial dependence can be integrated in terms of hypergeometric functions leading to an exact expression for the frequencies. The solutions do not asymptote to a constant curvature spacetime, nevertheless the asymptotic region acquires an extra conformal Killing vector. For the black hole, the scalar probe is purely ingoing at the horizon, and requiring that the solutions lead to an extremum of the action principle we impose a Dirichlet boundary condition at infinity. Surprisingly, the quasinormal modes do not depend on the radius of the black hole, therefore this family of geometries can be interpreted as isospectral with regard to the wave operator non-minimally coupled to the Ricci scalar. We find both purely damped modes, as well as exponentially growing unstable modes depending on the values of the non-minimal coupling parameter. For the solitons we show that the same integrability property is achieved separately in a non-supersymmetric solution as well as for the supersymmetric one. Imposing regularity at the origin and a well-defined extremum for the action principle we obtain the spectra that can also lead to purely oscillatory modes as well as to unstable scalar probes, depending on the values of the non-minimal coupling.
Submitted 2 January, 2022;
originally announced January 2022.
-
Verification and Optimization of Cyber-Physical Systems: Preprint for FedCSIS
Authors:
Reza Soltani,
Eun-Young Kang,
Juan Esteban Heredia Mena
Abstract:
Optimizing CPS behavior in terms of energy consumption can have a significant impact on system reliability. The environment influences the system's behavior, and neglecting the environmental behavior has an indirect negative impact on optimizing the system's behavior. In this work, to increase the system's flexibility, the behavior of the environment is modeled dynamically to apply the disorderliness of its behavior. The resulting models are formally verified. By examining the past environmental behavior and predicting its future behavior, energy optimization is done more dynamically. The verification results acquired using UPPAAL-SMC show that the optimization of system behavior by predicting the environmental behavior has been successful. Our approach is demonstrated using a case study within an I4 setting.
Submitted 3 September, 2021;
originally announced September 2021.
-
Combined numerical and experimental estimation of the fracture toughness and failure analysis of single lap shear test for dissimilar welds
Authors:
Norberto Jimenez Mena,
Thaneshan Sapanathan,
Pascal J. Jacques,
Aude Simar
Abstract:
The single lap shear test is widely used to measure the strength of dissimilar welds even though such a test brings limited understanding of the intrinsic weld toughness. The present study proposes a numerical finite element (FE) analysis and experimental characterization of dissimilar joints presenting various microstructures (thickness of the intermetallic layer (IML) and hardness profile). For this purpose, Friction Melt Bonding (FMB) and Friction Stir Welding (FSW) were used to join aluminum AA6061 and Dual Phase steel (DP980). The FE simulations allowed calculating the evolution of the J-integral near the notch tip. It shows that crack initiation depends significantly on the plastic properties of the welded metallic alloys around the notch tip and the width of the welded zone, both of which are significantly different for FSW and FMB processes. Nevertheless, a similar weld fracture toughness J_C of approximately 1 kJ m$^{-2}$ is estimated from the analysis for both FMB and FSW. This is three orders of magnitude higher than the fracture toughness of the intermetallic layer, revealing that the plastic dissipation in the Al and steel plates around the crack tip has a major effect on the weld toughness.
Submitted 27 April, 2021;
originally announced April 2021.
-
Dark Energy Survey Year 1 Results: Cosmological Constraints from Cluster Abundances and Weak Lensing
Authors:
DES Collaboration,
Tim Abbott,
Michel Aguena,
Alex Alarcon,
Sahar Allam,
Steve Allen,
James Annis,
Santiago Avila,
David Bacon,
Alberto Bermeo,
Gary Bernstein,
Emmanuel Bertin,
Sunayana Bhargava,
Sebastian Bocquet,
David Brooks,
Dillon Brout,
Elizabeth Buckley-Geer,
David Burke,
Aurelio Carnero Rosell,
Matias Carrasco Kind,
Jorge Carretero,
Francisco Javier Castander,
Ross Cawthon,
Chihway Chang,
Xinyi Chen
, et al. (107 additional authors not shown)
Abstract:
We perform a joint analysis of the counts and weak lensing signal of redMaPPer clusters selected from the Dark Energy Survey (DES) Year 1 dataset. Our analysis uses the same shear and source photometric redshifts estimates as were used in the DES combined probes analysis. Our analysis results in surprisingly low values for $S_8 =σ_8(Ω_{\rm m}/0.3)^{0.5}= 0.65\pm 0.04$, driven by a low matter density parameter, $Ω_{\rm m}=0.179^{+0.031}_{-0.038}$, with $σ_8-Ω_{\rm m}$ posteriors in $2.4σ$ tension with the DES Y1 3x2pt results, and in $5.6σ$ with the Planck CMB analysis. These results include the impact of post-unblinding changes to the analysis, which did not improve the level of consistency with other data sets compared to the results obtained at the unblinding. The fact that multiple cosmological probes (supernovae, baryon acoustic oscillations, cosmic shear, galaxy clustering and CMB anisotropies), and other galaxy cluster analyses all favor significantly higher matter densities suggests the presence of systematic errors in the data or an incomplete modeling of the relevant physics. Cross checks with X-ray and microwave data, as well as independent constraints on the observable--mass relation from SZ selected clusters, suggest that the discrepancy resides in our modeling of the weak lensing signal rather than the cluster abundance. Repeating our analysis using a higher richness threshold ($λ\ge 30$) significantly reduces the tension with other probes, and points to one or more richness-dependent effects not captured by our model.
Submitted 25 February, 2020;
originally announced February 2020.
-
Dirichlet uncertainty wrappers for actionable algorithm accuracy accountability and auditability
Authors:
José Mena,
Oriol Pujol,
Jordi Vitrià
Abstract:
Nowadays, the use of machine learning models is becoming a utility in many applications. Companies deliver pre-trained models encapsulated as application programming interfaces (APIs) that developers combine with third party components and their own models and data to create complex data products to solve specific problems. The complexity of such products and the lack of control and knowledge of the internals of each component used cause unavoidable effects, such as lack of transparency, difficulty in auditability, and emergence of potential uncontrolled risks. They are effectively black-boxes. Accountability of such solutions is a challenge for the auditors and the machine learning community. In this work, we propose a wrapper that given a black-box model enriches its output prediction with a measure of uncertainty. By using this wrapper, we make the black-box auditable for the accuracy risk (risk derived from low quality or uncertain decisions) and at the same time we provide an actionable mechanism to mitigate that risk in the form of decision rejection; we can choose not to issue a prediction when the risk or uncertainty in that decision is significant. Based on the resulting uncertainty measure, we advocate for a rejection system that selects the more confident predictions, discarding those more uncertain, leading to an improvement in the trustability of the resulting system. We showcase the proposed technique and methodology in a practical scenario where a simulated sentiment analysis API based on natural language processing is applied to different domains. Results demonstrate the effectiveness of the uncertainty computed by the wrapper and its high correlation to bad quality predictions and misclassifications.
Submitted 29 December, 2019;
originally announced December 2019.
-
Canadian Hydrogen Intensity Mapping Experiment (CHIME) Pathfinder
Authors:
Kevin Bandura,
Graeme E. Addison,
Mandana Amiri,
J. Richard Bond,
Duncan Campbell-Wilson,
Liam Connor,
Jean-Francois Cliche,
Greg Davis,
Meiling Deng,
Nolan Denman,
Matt Dobbs,
Mateus Fandino,
Kenneth Gibbs,
Adam Gilbert,
Mark Halpern,
David Hanna,
Adam D. Hincks,
Gary Hinshaw,
Carolin Hofer,
Peter Klages,
Tom L. Landecker,
Kiyoshi Masui,
Juan Mena,
Laura B. Newburgh,
Ue-Li Pen
, et al. (9 additional authors not shown)
Abstract:
A pathfinder version of CHIME (the Canadian Hydrogen Intensity Mapping Experiment) is currently being commissioned at the Dominion Radio Astrophysical Observatory (DRAO) in Penticton, BC. The instrument is a hybrid cylindrical interferometer designed to measure the large scale neutral hydrogen power spectrum across the redshift range 0.8 to 2.5. The power spectrum will be used to measure the baryon acoustic oscillation (BAO) scale across this poorly probed redshift range where dark energy becomes a significant contributor to the evolution of the Universe. The instrument revives the cylinder design in radio astronomy with a wide field survey as a primary goal. Modern low-noise amplifiers and digital processing remove the necessity for the analog beamforming that characterized previous designs. The Pathfinder consists of two cylinders 37\,m long by 20\,m wide oriented north-south for a total collecting area of 1,500 square meters. The cylinders are stationary with no moving parts, and form a transit instrument with an instantaneous field of view of $\sim$100\,degrees by 1-2\,degrees. Each CHIME Pathfinder cylinder has a feedline with 64 dual polarization feeds placed every $\sim$30\,cm which Nyquist sample the north-south sky over much of the frequency band. The signals from each dual-polarization feed are independently amplified, filtered to 400-800\,MHz, and directly sampled at 800\,MSps using 8 bits. The correlator is an FX design, where the Fourier transform channelization is performed in FPGAs, which are interfaced to a set of GPUs that compute the correlation matrix. The CHIME Pathfinder is a 1/10th scale prototype version of CHIME and is designed to detect the BAO feature and constrain the distance-redshift relation.
Submitted 9 June, 2014;
originally announced June 2014.
-
A Radio-Frequency-over-Fiber link for large-array radio astronomy applications
Authors:
Juan Mena,
Kevin Bandura,
Jean-Francois Cliche,
Matt Dobbs,
Adam Gilbert,
Qing Yang Tang
Abstract:
A prototype 425-850 MHz Radio-Frequency-over-Fiber (RFoF) link for the Canadian Hydrogen Intensity Mapping Experiment (CHIME) is presented. The design is based on a directly modulated Fabry-Perot (FP) laser, operating at ambient temperature, and a single-mode fiber. The dynamic performance, gain stability, and phase stability of the RFoF link are characterized. Tests on a two-element interferometer built at the Dominion Radio Astrophysical Observatory for CHIME prototyping demonstrate that RFoF can be successfully used as a cost-effective solution for analog signal transport on the CHIME telescope and other large-array radio astronomy applications.
Submitted 2 October, 2013; v1 submitted 25 August, 2013;
originally announced August 2013.