-
Parameter-Efficient Active Learning for Foundational models
Authors:
Athmanarayanan Lakshmi Narayanan,
Ranganath Krishnan,
Amrutha Machireddy,
Mahesh Subedar
Abstract:
Foundational vision transformer models have shown impressive few shot performance on many vision tasks. This research presents a novel investigation into the application of parameter efficient fine-tuning methods within an active learning (AL) framework, to advance the sampling selection process in extremely budget constrained classification tasks. The focus on image datasets, known for their out-of-distribution characteristics, adds a layer of complexity and relevance to our study. Through a detailed evaluation, we illustrate the improved AL performance on these challenging datasets, highlighting the strategic advantage of merging parameter efficient fine tuning methods with foundation models. This contributes to the broader discourse on optimizing AL strategies, presenting a promising avenue for future exploration in leveraging foundation models for efficient and effective data annotation in specialized domains.
Submitted 14 June, 2024; v1 submitted 13 June, 2024;
originally announced June 2024.
-
Observation of sequential three-body dissociation of camphor molecule -- a native frame approach
Authors:
S. De,
S. Mandal,
Sanket Sen,
Arnab Sen,
R. Gopal,
L. Ben Ltaief,
S. Turchini,
D. Catone,
N. Zema,
M. Coreno,
R. Richter,
M. Mudrich,
V. Sharma,
S. R. Krishnan
Abstract:
The three-body dissociation dynamics of the dicationic camphor molecule (C$_{10}$H$_{16}$O$^{2+}$) resulting from Auger decay are investigated using soft X-ray synchrotron radiation. A photoelectron-photoion-photoion coincidence (PEPIPICO) method, a combination of a velocity map imaging (VMI) spectrometer and a time-of-flight (ToF) spectrometer, is employed to measure the 3D momenta of ions detected in coincidence. The ion mass spectra and the ion-ion coincidence map at photon energies of 287.9 eV (below the C 1s ionization potential) and 292.4 eV (above the C 1s ionization potential for skeletal carbon) reveal that fragmentation depends on the final dicationic state rather than the initial excitation. Using the native frame method, three new fragmentation channels are discussed: (1) CH$_2$CO$^+$ + C$_7$H$_{11}^+$ + CH$_3$, (2) CH$_3^+$ + C$_7$H$_{11}^+$ + CH$_2$CO, and (3) C$_2$H$_5^+$ + C$_6$H$_9^+$ + CH$_2$CO. The dominating nature of sequential decay with deferred charge separation is clearly evidenced in all three channels. The results are discussed based on the experimental angular distributions and momentum distributions, corroborated by geometry optimization of the ground, monocationic, and dicationic camphor molecule.
Submitted 31 May, 2024;
originally announced June 2024.
-
InterpreTabNet: Distilling Predictive Signals from Tabular Data by Salient Feature Interpretation
Authors:
Jacob Si,
Wendy Yusi Cheng,
Michael Cooper,
Rahul G. Krishnan
Abstract:
Tabular data are omnipresent in various sectors of industries. Neural networks for tabular data such as TabNet have been proposed to make predictions while leveraging the attention mechanism for interpretability. However, the inferred attention masks are often dense, making it challenging to come up with rationales about the predictive signal. To remedy this, we propose InterpreTabNet, a variant of the TabNet model that models the attention mechanism as a latent variable sampled from a Gumbel-Softmax distribution. This enables us to regularize the model to learn distinct concepts in the attention masks via a KL Divergence regularizer. It prevents overlapping feature selection by promoting sparsity which maximizes the model's efficacy and improves interpretability to determine the important features when predicting the outcome. To assist in the interpretation of feature interdependencies from our model, we employ a large language model (GPT-4) and use prompt engineering to map from the learned feature mask onto natural language text describing the learned signal. Through comprehensive experiments on real-world datasets, we demonstrate that InterpreTabNet outperforms previous methods for interpreting tabular data while attaining competitive accuracy.
Submitted 11 June, 2024; v1 submitted 1 June, 2024;
originally announced June 2024.
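The InterpreTabNet abstract above describes attention masks sampled from a Gumbel-Softmax distribution. As a rough illustration only, and not the authors' implementation, the sketch below draws one relaxed sample with NumPy. The function name `gumbel_softmax_sample`, the temperature value, and the example logits are all assumptions made for this example.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed one-hot sample from a Gumbel-Softmax distribution."""
    # Gumbel(0, 1) noise via the inverse-CDF trick: g = -log(-log(u)).
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    # Perturb the logits and apply a temperature-scaled softmax.
    scores = (logits + gumbel_noise) / temperature
    scores = scores - scores.max()  # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

rng = np.random.default_rng(0)
feature_logits = np.array([2.0, 0.5, 0.1, -1.0])  # hypothetical per-feature scores
mask = gumbel_softmax_sample(feature_logits, temperature=0.5, rng=rng)
```

The returned `mask` is a non-negative vector that sums to one. Lower temperatures push it toward a one-hot selection, which is the kind of sparsity the abstract refers to.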
-
Measurement of enhanced spin-orbit coupling strength for donor-bound electron spins in silicon
Authors:
Radha Krishnan,
Beng Yee Gan,
Yu-Ling Hsueh,
A. M. Saffat-Ee Huq,
Jonathan Kenny,
Rajib Rahman,
Teck Seng Koh,
Michelle Y. Simmons,
Bent Weber
Abstract:
While traditionally considered a deleterious effect in quantum dot spin qubits, the spin-orbit interaction is recently being revisited as it allows for rapid coherent control by on-chip AC electric fields. For electrons in bulk silicon, spin-orbit coupling (SOC) is intrinsically weak; however, it can be enhanced at surfaces and interfaces, or through atomic placement. Here we show that the strength of the SOC can be locally enhanced by more than two orders of magnitude in the many-body wave functions of multi-donor quantum dots compared to a single donor, reaching strengths so far only reported for holes or two-donor systems with certain symmetry. Our findings may provide a pathway towards all-electrical control of donor-bound spins in silicon using electric dipole spin resonance (EDSR).
Submitted 24 April, 2024;
originally announced April 2024.
-
Sequential Decision Making with Expert Demonstrations under Unobserved Heterogeneity
Authors:
Vahid Balazadeh,
Keertana Chidambaram,
Viet Nguyen,
Rahul G. Krishnan,
Vasilis Syrgkanis
Abstract:
We study the problem of online sequential decision-making given auxiliary demonstrations from experts who made their decisions based on unobserved contextual information. These demonstrations can be viewed as solving related but slightly different tasks than what the learner faces. This setting arises in many application domains, such as self-driving cars, healthcare, and finance, where expert demonstrations are made using contextual information, which is not recorded in the data available to the learning agent. We model the problem as a zero-shot meta-reinforcement learning setting with an unknown task distribution and a Bayesian regret minimization objective, where the unobserved tasks are encoded as parameters with an unknown prior. We propose the Experts-as-Priors algorithm (ExPerior), a non-parametric empirical Bayes approach that utilizes the principle of maximum entropy to establish an informative prior over the learner's decision-making problem. This prior enables the application of any Bayesian approach for online decision-making, such as posterior sampling. We demonstrate that our strategy surpasses existing behaviour cloning and online algorithms for multi-armed bandits and reinforcement learning, showcasing the utility of our approach in leveraging expert demonstrations across different decision-making setups.
Submitted 10 April, 2024;
originally announced April 2024.
-
A Geometric Explanation of the Likelihood OOD Detection Paradox
Authors:
Hamidreza Kamkari,
Brendan Leigh Ross,
Jesse C. Cresswell,
Anthony L. Caterini,
Rahul G. Krishnan,
Gabriel Loaiza-Ganem
Abstract:
Likelihood-based deep generative models (DGMs) commonly exhibit a puzzling behaviour: when trained on a relatively complex dataset, they assign higher likelihood values to out-of-distribution (OOD) data from simpler sources. Adding to the mystery, OOD samples are never generated by these DGMs despite having higher likelihoods. This two-pronged paradox has yet to be conclusively explained, making likelihood-based OOD detection unreliable. Our primary observation is that high-likelihood regions will not be generated if they contain minimal probability mass. We demonstrate how this seeming contradiction of large densities yet low probability mass can occur around data confined to low-dimensional manifolds. We also show that this scenario can be identified through local intrinsic dimension (LID) estimation, and propose a method for OOD detection which pairs the likelihoods and LID estimates obtained from a pre-trained DGM. Our method can be applied to normalizing flows and score-based diffusion models, and obtains results which match or surpass state-of-the-art OOD detection benchmarks using the same DGM backbones. Our code is available at https://github.com/layer6ai-labs/dgm_ood_detection.
Submitted 11 June, 2024; v1 submitted 27 March, 2024;
originally announced March 2024.
-
HEAL: Brain-inspired Hyperdimensional Efficient Active Learning
Authors:
Yang Ni,
Zhuowen Zou,
Wenjun Huang,
Hanning Chen,
William Youngwoo Chung,
Samuel Cho,
Ranganath Krishnan,
Pietro Mercati,
Mohsen Imani
Abstract:
Drawing inspiration from the outstanding learning capability of our human brains, Hyperdimensional Computing (HDC) emerges as a novel computing paradigm, and it leverages high-dimensional vector presentation and operations for brain-like lightweight Machine Learning (ML). Practical deployments of HDC have significantly enhanced the learning efficiency compared to current deep ML methods on a broad spectrum of applications. However, boosting the data efficiency of HDC classifiers in supervised learning remains an open question. In this paper, we introduce Hyperdimensional Efficient Active Learning (HEAL), a novel Active Learning (AL) framework tailored for HDC classification. HEAL proactively annotates unlabeled data points via uncertainty and diversity-guided acquisition, leading to a more efficient dataset annotation and lowering labor costs. Unlike conventional AL methods that only support classifiers built upon deep neural networks (DNN), HEAL operates without the need for gradient or probabilistic computations. This allows it to be effortlessly integrated with any existing HDC classifier architecture. The key design of HEAL is a novel approach for uncertainty estimation in HDC classifiers through a lightweight HDC ensemble with prior hypervectors. Additionally, by exploiting hypervectors as prototypes (i.e., compact representations), we develop an extra metric for HEAL to select diverse samples within each batch for annotation. Our evaluation shows that HEAL surpasses a diverse set of baselines in AL quality and achieves notably faster acquisition than many BNN-powered or diversity-guided AL methods, recording 11 times to 40,000 times speedup in acquisition runtime per batch.
Submitted 17 February, 2024;
originally announced February 2024.
-
Measurement Scheduling for ICU Patients with Offline Reinforcement Learning
Authors:
Zongliang Ji,
Anna Goldenberg,
Rahul G. Krishnan
Abstract:
Scheduling laboratory tests for ICU patients presents a significant challenge. Studies show that 20-40% of lab tests ordered in the ICU are redundant and could be eliminated without compromising patient safety. Prior work has leveraged offline reinforcement learning (Offline-RL) to find optimal policies for ordering lab tests based on patient information. However, new ICU patient datasets have since been released, and various advancements have been made in Offline-RL methods. In this study, we first introduce a preprocessing pipeline for the newly-released MIMIC-IV dataset geared toward time-series tasks. We then explore the efficacy of state-of-the-art Offline-RL methods in identifying better policies for ICU patient lab test scheduling. Besides assessing methodological performance, we also discuss the overall suitability and practicality of using Offline-RL frameworks for scheduling laboratory tests in ICU settings.
Submitted 11 February, 2024;
originally announced February 2024.
-
Symmetry breaking and spin-orbit coupling for individual vacancy-induced in-gap states in MoS2 monolayers
Authors:
Thasneem Aliyar,
Hongyang Ma,
Radha Krishnan,
Gagandeep Singh,
Bi Qi Chong,
Yitao Wang,
Ivan Verzhbitskiy,
Calvin Pei Yu Wong,
Kuan Eng Johnson Goh,
Ze Xiang Shen,
Teck Seng Koh,
Rajib Rahman,
Bent Weber
Abstract:
Spins confined to point defects in atomically-thin semiconductors constitute well-defined atomic-scale quantum systems that are being explored as single photon emitters and spin qubits. Here, we investigate the in-gap electronic structure of individual sulphur vacancies in molybdenum disulphide (MoS2) monolayers using resonant tunneling scanning probe spectroscopy in the Coulomb blockade regime. Spectroscopic mapping of defect wavefunctions reveals an interplay of local symmetry breaking by a charge-state dependent Jahn-Teller lattice distortion that, when combined with strong (~100 meV) spin-orbit coupling, leads to a locking of an unpaired spin-1/2 magnetic moment to the lattice at low temperature, susceptible to lattice strain. Our results provide new insights into spin and electronic structure of vacancy induced in-gap states towards their application as electrically and optically addressable quantum systems.
Submitted 20 February, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Evaluation of Mean Shift, ComBat, and CycleGAN for Harmonizing Brain Connectivity Matrices Across Sites
Authors:
Hanliang Xu,
Nancy R. Newlin,
Michael E. Kim,
Chenyu Gao,
Praitayini Kanakaraj,
Aravind R. Krishnan,
Lucas W. Remedios,
Nazirah Mohd Khairi,
Kimberly Pechman,
Derek Archer,
Timothy J. Hohman,
Angela L. Jefferson,
The BIOCARD Study Team,
Ivana Isgum,
Yuankai Huo,
Daniel Moyer,
Kurt G. Schilling,
Bennett A. Landman
Abstract:
Connectivity matrices derived from diffusion MRI (dMRI) provide an interpretable and generalizable way of understanding the human brain connectome. However, dMRI suffers from inter-site and between-scanner variation, which impedes analysis across datasets to improve robustness and reproducibility of results. To evaluate different harmonization approaches on connectivity matrices, we compared graph measures derived from these matrices before and after applying three harmonization techniques: mean shift, ComBat, and CycleGAN. The sample comprises 168 age-matched, sex-matched normal subjects from two studies: the Vanderbilt Memory and Aging Project (VMAP) and the Biomarkers of Cognitive Decline Among Normal Individuals (BIOCARD). First, we plotted the graph measures and used coefficient of variation (CoV) and the Mann-Whitney U test to evaluate different methods' effectiveness in removing site effects on the matrices and the derived graph measures. ComBat effectively eliminated site effects for global efficiency and modularity and outperformed the other two methods. However, all methods exhibited poor performance when harmonizing average betweenness centrality. Second, we tested whether our harmonization methods preserved correlations between age and graph measures. All methods except for CycleGAN in one direction improved correlations between age and global efficiency and between age and modularity from insignificant to significant with p-values less than 0.05.
Submitted 24 January, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
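The harmonization entry above uses the coefficient of variation (CoV) to judge how well site effects are removed, with mean shift as the simplest of the three methods. The sketch below is one plausible reading of those two ideas, assuming mean shift means moving each site's mean onto the pooled mean. The function names and the toy values are hypothetical and are not taken from the paper.

```python
import numpy as np

def coefficient_of_variation(values):
    """CoV = sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return values.std(ddof=1) / values.mean()

def mean_shift_harmonize(site_a, site_b):
    """Shift each site so its mean matches the pooled mean (assumed definition)."""
    site_a = np.asarray(site_a, dtype=float)
    site_b = np.asarray(site_b, dtype=float)
    pooled_mean = np.concatenate([site_a, site_b]).mean()
    return (site_a - site_a.mean() + pooled_mean,
            site_b - site_b.mean() + pooled_mean)

# Toy graph-measure values for two sites (made-up numbers).
site_a = np.array([0.50, 0.52, 0.48, 0.51])
site_b = np.array([0.60, 0.63, 0.58, 0.61])

cov_before = coefficient_of_variation(np.concatenate([site_a, site_b]))
harm_a, harm_b = mean_shift_harmonize(site_a, site_b)
cov_after = coefficient_of_variation(np.concatenate([harm_a, harm_b]))
```

With these toy values the pooled CoV drops after harmonization, because the between-site offset no longer contributes to the spread. Within-site variation is left untouched, which is why mean shift is only a baseline.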
-
Fragmentation of Water Clusters Formed in Helium Nanodroplets by Charge Transfer and Penning Ionization
Authors:
S. De,
A. R. Abid,
J. D. Asmussen,
L. Ben Ltaief,
K. Sishodia,
A. Ulmer,
H. B. Pedersen,
S. R. Krishnan,
M. Mudrich
Abstract:
Helium nanodroplets ("HNDs") are widely used for forming tailor-made clusters and molecular complexes in a cold, transparent, and weakly-interacting matrix. Characterization of embedded species by mass spectrometry is often complicated by fragmentation and trapping of ions in the HNDs. Here, we systematically study fragment ion mass spectra of HND-aggregated water and oxygen clusters following their ionization by charge transfer ionization ("CTI") and Penning ionization ("PEI"). While the efficiency of PEI of embedded clusters is lower than for CTI by about a factor of 10, both the mean sizes of detected water clusters and the relative yields of unprotonated cluster ions are significantly larger, making PEI a "soft ionization" scheme. However, the tendency of ions to remain bound to HNDs leads to a reduced detection efficiency for large HNDs containing $>10^4$ helium atoms. These results are instrumental for determining optimal conditions for mass spectrometry and photoionization spectroscopy of molecular complexes and clusters aggregated in HNDs.
Submitted 12 January, 2024;
originally announced January 2024.
-
MultiResFormer: Transformer with Adaptive Multi-Resolution Modeling for General Time Series Forecasting
Authors:
Linfeng Du,
Ji Xin,
Alex Labach,
Saba Zuberi,
Maksims Volkovs,
Rahul G. Krishnan
Abstract:
Transformer-based models have greatly pushed the boundaries of time series forecasting recently. Existing methods typically encode time series data into $\textit{patches}$ using one or a fixed set of patch lengths. This, however, could result in a lack of ability to capture the variety of intricate temporal dependencies present in real-world multi-periodic time series. In this paper, we propose MultiResFormer, which dynamically models temporal variations by adaptively choosing optimal patch lengths. Concretely, at the beginning of each layer, time series data is encoded into several parallel branches, each using a detected periodicity, before going through the transformer encoder block. We conduct extensive evaluations on long- and short-term forecasting datasets comparing MultiResFormer with state-of-the-art baselines. MultiResFormer outperforms patch-based Transformer baselines on long-term forecasting tasks and also consistently outperforms CNN baselines by a large margin, while using far fewer parameters than these baselines.
Submitted 8 February, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Spacewalk-18: A Benchmark for Multimodal and Long-form Procedural Video Understanding
Authors:
Rohan Myer Krishnan,
Zitian Tang,
Zhiqiu Yu,
Chen Sun
Abstract:
Learning from videos is an emerging research area that enables robots to acquire skills from human demonstrations, such as procedural videos. To do this, video-language models must be able to obtain structured understandings, such as the temporal segmentation of a demonstration into sequences of actions and skills, and to generalize the understandings to novel domains. In pursuit of this goal, we introduce Spacewalk-18, a benchmark containing two tasks: (1) step recognition and (2) intra-video retrieval over a dataset of temporally segmented and labeled tasks in International Space Station spacewalk recordings. In tandem, the two tasks quantify a model's ability to make use of: (1) out-of-domain visual information; (2) a high temporal context window; and (3) multimodal (e.g. visual and speech) domains. This departs from existing benchmarks for procedural video understanding, which typically deal with short context lengths and can be solved with a single modality. Spacewalk-18, with its inherent multimodal and long-form complexity, exposes the high difficulty of task recognition and segmentation. We find that state-of-the-art methods perform poorly on our benchmark, but improvements can be obtained by incorporating information from longer-range temporal context across different modalities. Our experiments underscore the need to develop new approaches to these tasks. Data, model, and code will be released at https://brown-palm.github.io/Spacewalk-18/.
Submitted 21 March, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Structured Neural Networks for Density Estimation and Causal Inference
Authors:
Asic Q. Chen,
Ruian Shi,
Xiang Gao,
Ricardo Baptista,
Rahul G. Krishnan
Abstract:
Injecting structure into neural networks enables learning functions that satisfy invariances with respect to subsets of inputs. For instance, when learning generative models using neural networks, it is advantageous to encode the conditional independence structure of observed variables, often in the form of Bayesian networks. We propose the Structured Neural Network (StrNN), which injects structure through masking pathways in a neural network. The masks are designed via a novel relationship we explore between neural network architectures and binary matrix factorization, to ensure that the desired independencies are respected. We devise and study practical algorithms for this otherwise NP-hard design problem based on novel objectives that control the model architecture. We demonstrate the utility of StrNN in three applications: (1) binary and Gaussian density estimation with StrNN, (2) real-valued density estimation with Structured Autoregressive Flows (StrAFs) and Structured Continuous Normalizing Flows (StrCNF), and (3) interventional and counterfactual analysis with StrAFs for causal inference. Our work opens up new avenues for learning neural networks that enable data-efficient generative modeling and the use of normalizing flows for causal effect estimation.
Submitted 3 November, 2023;
originally announced November 2023.
-
Observation of interatomic Coulombic decay induced by double excitation of helium in nanodroplets
Authors:
B. Bastian,
J. D. Asmussen,
L. Ben Ltaief,
H. B. Pedersen,
K. Sishodia,
S. De,
S. R. Krishnan,
C. Medina,
N. Pal,
R. Richter,
N. Sisourat,
M. Mudrich
Abstract:
Interatomic Coulombic decay (ICD) plays a crucial role in weakly bound complexes exposed to intense or high-energy radiation. So far, neutral or ionic atoms or molecules have been prepared in singly excited electron or hole states which can transfer energy to neighboring centers and cause ionization and radiation damage. Here we demonstrate that a doubly excited atom, despite its extremely short lifetime, can decay by ICD, as evidenced by high-resolution photoelectron spectra of He nanodroplets excited to the 2s2p+ state. We find that ICD proceeds by relaxation into excited He$^*$He$^+$ atom-pair states, in agreement with calculations. The ability to induce ICD by resonant excitation far above the single-ionization threshold opens opportunities for controlling radiation damage to a high degree of element specificity and spectral selectivity.
Submitted 24 October, 2023;
originally announced October 2023.
-
Inter-vendor harmonization of Computed Tomography (CT) reconstruction kernels using unpaired image translation
Authors:
Aravind R. Krishnan,
Kaiwen Xu,
Thomas Li,
Chenyu Gao,
Lucas W. Remedios,
Praitayini Kanakaraj,
Ho Hin Lee,
Shunxing Bao,
Kim L. Sandler,
Fabien Maldonado,
Ivana Isgum,
Bennett A. Landman
Abstract:
The reconstruction kernel in computed tomography (CT) generation determines the texture of the image. Consistency in reconstruction kernels is important as the underlying CT texture can impact measurements during quantitative image analysis. Harmonization (i.e., kernel conversion) minimizes differences in measurements due to inconsistent reconstruction kernels. Existing methods investigate harmonization of CT scans in single or multiple manufacturers. However, these methods require paired scans of hard and soft reconstruction kernels that are spatially and anatomically aligned. Additionally, a large number of models need to be trained across different kernel pairs within manufacturers. In this study, we adopt an unpaired image translation approach to investigate harmonization between and across reconstruction kernels from different manufacturers by constructing a multipath cycle generative adversarial network (GAN). We use hard and soft reconstruction kernels from the Siemens and GE vendors from the National Lung Screening Trial dataset. We use 50 scans from each reconstruction kernel and train a multipath cycle GAN. To evaluate the effect of harmonization on the reconstruction kernels, we harmonize 50 scans each from Siemens hard kernel, GE soft kernel and GE hard kernel to a reference Siemens soft kernel (B30f) and evaluate percent emphysema. We fit a linear model by considering the age, smoking status, sex and vendor and perform an analysis of variance (ANOVA) on the emphysema scores. Our approach minimizes differences in emphysema measurement and highlights the impact of age, sex, smoking status and vendor on emphysema quantification.
Submitted 26 January, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
-
The framework of the auxiliary group: two birds with one stone
Authors:
R. Krishnan
Abstract:
Flavon models in the literature assume constraints on the components of the vacuum expectation values (vevs) of flavons, and typically, these constraints are not fully determined by the residual symmetry group of the set of vevs. This poses a problem because the general potential of the flavons cannot have a minimum that leads to such constraints unless additional mechanisms involving supersymmetry, extra dimensions etc., are invoked. In this paper, we show that the framework of the auxiliary group naturally results in vevs satisfying the required constraints, and using it, we construct a type-1 seesaw model with two right-handed neutrinos, which predicts the ratio of the light neutrino masses $m_2/m_3=(\sqrt{2}-1)/(\sqrt{2}+1)$ and $\text{TM}_1$ mixing with $\sin\theta_{13}=\frac{1}{\sqrt{3}}\sin\frac{\pi}{12}$ and $\sin\delta_\text{CP}=-1$. Our framework posits auxiliary group transformations which act on the flavons but not on the fermions. We construct the general renormalizable potential without invoking additional mechanisms and show that it has a minimum that leads to the required constraints.
Submitted 20 September, 2023;
originally announced September 2023.
-
Understanding the soil water dynamics during excess and deficit rainfall conditions over the Core monsoon zone of India
Authors:
Mangesh M. Goswami,
Milind Mujumdar,
Bhupendra Bahadur Singh,
Madhusudan Ingale,
Naresh Ganeshi,
Manish Ranalkar,
Trenton E. Franz,
Prashant Srivastav,
Dev Niyogi,
R. Krishnan,
S. N. Patil
Abstract:
Observations of soil moisture (SM) during excess and deficit monsoon seasons between 2000 and 2021 present a unique opportunity to understand the soil water dynamics (SWD) over the core monsoon zone (CMZ) of India. This study aims to analyse SWD by investigating the SM variability, SM memory (SMM), and the coupling between the surface and subsurface SM levels. Particularly intriguing are instances of concurrent monsoonal extremes, which give rise to complex SWD patterns. Usually, it is noted that depleted convective activity and the persistence of higher temperatures during the pre-monsoon season lead to lower SM, while monsoon rains and post-monsoon showers support the prevalence of higher SM conditions. The long persistent dry spells during deficit monsoon years enhance the Bowen ratio (BR) due to the high sensible heat fluxes. On the other hand, the availability of large latent heat flux during excess monsoon and post-monsoon seasons tends to decrease the BR. This enhancement or reduction in BR is due to evapotranspiration (ET), which influences the SWD by modulating the surface-subsurface SM coupling. The surface and subsurface SM coupling analysis for CMZ exhibits significant distinction in the evolution of wet and dry extremes. SM variations and persistence time scale are used as an indicator of SMM, and analysed for both surface and subsurface SM observation levels. Evidently, subsurface SM exhibits remarkably prolonged memory timescales, approximately twice that of surface SM. Furthermore, we dissect SWD linked to wet and dry extremes by analysing annual soil water balance (SWB). Our findings reveal augmented (diminished) ET during deficit (excess) years, subjected to a higher (lower) number of break events. In essence, our study underscores the significance of surface-subsurface SM observations in unravelling the intricate tapestry of SWD.
Submitted 29 August, 2023;
originally announced August 2023.
-
Spectroscopically resolved resonant interatomic Coulombic decay in photoexcited large He nanodroplets
Authors:
L. Ben Ltaief,
K. Sishodia,
R. Richter,
B. Bastian,
J. D. Asmussen,
S. Mandal,
N. Pal,
C. Medina,
S. R. Krishnan,
K. von Haeften,
M. Mudrich
Abstract:
Interatomic Coulombic decay (ICD) processes play a crucial role in weakly bound complexes exposed to intense or high-energy radiation. Using large helium nanodroplets, we demonstrate that ICD is efficient even when the droplets are irradiated by weak synchrotron radiation at relatively low photon energies. Below the ionization threshold, resonant excitation of multiple centers efficiently induces resonant ICD as previously observed for intense pulses [A. C. LaForge et al., PRX 11, 021011 (2021)]. More surprisingly, we observe ICD even above the ionization threshold due to recombination of photoelectrons and ions into excited states which subsequently decay by ICD. This demonstrates the importance of secondary processes, in particular electron scattering and recombination, in inducing ICD in extended condensed phase systems. In addition, we show that ICD can serve as a diagnostic tool for monitoring the relaxation dynamics of highly-excited and ionized weakly-bound nanosystems.
Submitted 28 August, 2023;
originally announced August 2023.
-
Order-based Structure Learning with Normalizing Flows
Authors:
Hamidreza Kamkari,
Vahid Balazadeh,
Vahid Zehtab,
Rahul G. Krishnan
Abstract:
Estimating the causal structure of observational data is a challenging combinatorial search problem that scales super-exponentially with graph size. Existing methods use continuous relaxations to make this problem computationally tractable but often restrict the data-generating process to additive noise models (ANMs) through explicit or implicit assumptions. We present Order-based Structure Learning with Normalizing Flows (OSLow), a framework that relaxes these assumptions using autoregressive normalizing flows. We leverage the insight that searching over topological orderings is a natural way to enforce acyclicity in structure discovery and propose a novel, differentiable permutation learning method to find such orderings. Through extensive experiments on synthetic and real-world data, we demonstrate that OSLow outperforms prior baselines and improves performance on the observational Sachs and SynTReN datasets as measured by structural Hamming distance and structural intervention distance, highlighting the importance of relaxing the ANM assumption made by existing methods.
Submitted 17 February, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
Spin-valley locking for in-gap quantum dots in a MoS2 transistor
Authors:
Radha Krishnan,
Sangram Biswas,
Yu-Ling Hsueh,
Hongyang Ma,
Rajib Rahman,
Bent Weber
Abstract:
Spins confined to atomically-thin semiconductors are being actively explored as quantum information carriers. In transition metal dichalcogenides (TMDCs), the hexagonal crystal lattice gives rise to an additional valley degree of freedom with spin-valley locking and potentially enhanced spin life- and coherence times. However, realizing well-separated single-particle levels, and achieving transparent electrical contact to address them has remained challenging. Here, we report well-defined spin states in a few-layer MoS$_2$ transistor, characterized with a spectral resolution of $\sim 50~\mu$eV at ${T_\textrm{el} = 150}$~mK. Ground state magnetospectroscopy confirms a finite Berry-curvature induced coupling of spin and valley, reflected in a pronounced Zeeman anisotropy, with a large out-of-plane $g$-factor of ${g_\perp \simeq 8}$. A finite in-plane $g$-factor (${g_\parallel \simeq 0.55-0.8}$) allows us to quantify spin-valley locking and estimate the spin-orbit splitting ${2\Delta_{\rm SO} \sim 100~\mu}$eV. The demonstration of spin-valley locking is an important milestone towards realizing spin-valley quantum bits.
Submitted 23 June, 2023;
originally announced June 2023.
-
Secondary ionization of pyrimidine nucleobases and their microhydrated derivatives in helium nanodroplets
Authors:
Jakob D. Asmussen,
Abdul R. Abid,
Akgash Sundaralingam,
Björn Bastian,
Keshav Sishodia,
Subhendu De,
Ltaief Ben Ltaief,
Sivarama R. Krishnan,
Henrik B. Pedersen,
Marcel Mudrich
Abstract:
Radiation damage in biological systems by ionizing radiation is predominantly caused by secondary processes such as charge and energy transfer leading to the breaking of bonds in DNA. Here, we study the fragmentation of cytosine (Cyt) and thymine (Thy) molecules, clusters and microhydrated derivatives induced by direct and indirect ionization initiated by extreme-ultraviolet (XUV) irradiation. Photofragmentation mass spectra and photoelectron spectra of free Cyt and Thy molecules are compared with mass and electron spectra of Cyt/Thy clusters and microhydrated Cyt/Thy molecules formed by aggregation in superfluid helium (He) nanodroplets. Penning ionization after resonant excitation of the He droplets is generally found to cause less fragmentation compared to direct photoionization and charge-transfer ionization after photoionization of the He droplets. When Cyt/Thy molecules and oligomers are complexed with water molecules, their fragmentation is efficiently suppressed. However, a similar suppression of fragmentation is observed when homogeneous Cyt/Thy clusters are formed in He nanodroplets, indicating a general trend. Penning ionization electron spectra (PIES) of Cyt/Thy are broad and nearly featureless but PIES of their microhydrated derivatives point at a sequential ionization process ending in unfragmented microsolvated Cyt/Thy cations.
Submitted 21 June, 2023;
originally announced June 2023.
-
Copula-Based Deep Survival Models for Dependent Censoring
Authors:
Ali Hossein Gharari Foomani,
Michael Cooper,
Russell Greiner,
Rahul G. Krishnan
Abstract:
A survival dataset describes a set of instances (e.g. patients) and provides, for each, either the time until an event (e.g. death), or the censoring time (e.g. when lost to follow-up - which is a lower bound on the time until the event). We consider the challenge of survival prediction: learning, from such data, a predictive model that can produce an individual survival distribution for a novel instance. Many contemporary methods of survival prediction implicitly assume that the event and censoring distributions are independent conditional on the instance's covariates - a strong assumption that is difficult to verify (as we observe only one outcome for each instance) and which can induce significant bias when it does not hold. This paper presents a parametric model of survival that extends modern non-linear survival analysis by relaxing the assumption of conditional independence. On synthetic and semi-synthetic data, our approach significantly improves estimates of survival distributions compared to the standard that assumes conditional independence in the data.
Submitted 20 June, 2023;
originally announced June 2023.
-
Homogeneous linear intrinsic constraints in the stationary manifold of a $G$-invariant potential
Authors:
R. Krishnan
Abstract:
Given a $G$-invariant potential $\mathcal{V}$ of a scalar multiplet $\varphi$, there may exist a set of homogeneous linear equations that constrain the components of a stationary point of $\mathcal{V}$ independently of the coefficients of the terms in $\mathcal{V}$. We call them homogeneous linear intrinsic constraints (HLICs). HLICs in a stationary point manifest as HLICs in the corresponding vacuum alignment of $\varphi$, which plays a central role in predictive phenomenological models. We discover that a group $\tilde{H}$ generates HLICs if the terms in $\mathcal{V}$ satisfy a condition, which we call the compatibility condition. In this paper, we also develop a procedure, which involves splitting $\mathcal{V}$ into smaller parts, to establish the existence of specific stationary points using arguments based on symmetries without the need for explicitly extremizing the potential. Using this procedure, we obtain $\tilde{H}$ as a direct product of the symmetry groups associated with the various irreducible multiplets (irreps) in $\varphi$. This results from considering the potentials of the irreps separately and verifying if the cross terms are compatible with $\tilde{H}$.
Submitted 22 September, 2023; v1 submitted 12 June, 2023;
originally announced June 2023.
-
Clinical Camel: An Open Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding
Authors:
Augustin Toma,
Patrick R. Lawler,
Jimmy Ba,
Rahul G. Krishnan,
Barry B. Rubin,
Bo Wang
Abstract:
We present Clinical Camel, an open large language model (LLM) explicitly tailored for clinical research. Fine-tuned from LLaMA-2 using QLoRA, Clinical Camel achieves state-of-the-art performance across medical benchmarks among openly available medical LLMs. Leveraging efficient single-GPU training, Clinical Camel surpasses GPT-3.5 in five-shot evaluations on all assessed benchmarks, including 64.3% on the USMLE Sample Exam (compared to 58.5% for GPT-3.5), 77.9% on PubMedQA (compared to 60.2%), 60.7% on MedQA (compared to 53.6%), and 54.2% on MedMCQA (compared to 51.0%). In addition to these benchmarks, Clinical Camel demonstrates its broader capabilities, such as synthesizing plausible clinical notes. This work introduces dialogue-based knowledge encoding, a novel method to synthesize conversational data from dense medical texts. While benchmark results are encouraging, extensive and rigorous human evaluation across diverse clinical scenarios is imperative to ascertain safety before implementation. By openly sharing Clinical Camel, we hope to foster transparent and collaborative research, working towards the safe integration of LLMs within the healthcare domain. Significant challenges concerning reliability, bias, and the potential for outdated knowledge persist. Nonetheless, the transparency provided by an open approach reinforces the scientific rigor essential for future clinical applications.
Submitted 17 August, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
DuETT: Dual Event Time Transformer for Electronic Health Records
Authors:
Alex Labach,
Aslesha Pokhrel,
Xiao Shi Huang,
Saba Zuberi,
Seung Eun Yi,
Maksims Volkovs,
Tomi Poutanen,
Rahul G. Krishnan
Abstract:
Electronic health records (EHRs) recorded in hospital settings typically contain a wide range of numeric time series data that is characterized by high sparsity and irregular observations. Effective modelling for such data must exploit its time series nature, the semantic relationship between different types of observations, and information in the sparsity structure of the data. Self-supervised Transformers have shown outstanding performance in a variety of structured tasks in NLP and computer vision. But multivariate time series data contains structured relationships over two dimensions: time and recorded event type, and straightforward applications of Transformers to time series data do not leverage this distinct structure. The quadratic scaling of self-attention layers can also significantly limit the input sequence length without appropriate input engineering. We introduce the DuETT architecture, an extension of Transformers designed to attend over both time and event type dimensions, yielding robust representations from EHR data. DuETT uses an aggregated input where sparse time series are transformed into a regular sequence with fixed length; this lowers the computational complexity relative to previous EHR Transformer models and, more importantly, enables the use of larger and deeper neural networks. When trained with self-supervised prediction tasks that provide rich and informative signals for model pre-training, our model outperforms state-of-the-art deep learning models on multiple downstream tasks from the MIMIC-IV and PhysioNet-2012 EHR datasets.
Submitted 15 August, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Artificial Intelligence/Operations Research Workshop 2 Report Out
Authors:
John Dickerson,
Bistra Dilkina,
Yu Ding,
Swati Gupta,
Pascal Van Hentenryck,
Sven Koenig,
Ramayya Krishnan,
Radhika Kulkarni,
Catherine Gill,
Haley Griffin,
Maddy Hunter,
Ann Schwartz
Abstract:
This workshop Report Out focuses on the foundational elements of trustworthy AI and OR technology, and how to ensure all AI and OR systems implement these elements in their system designs. Four sessions on various topics within Trustworthy AI were held, these being Fairness, Explainable AI/Causality, Robustness/Privacy, and Human Alignment and Human-Computer Interaction. Following discussions of each of these topics, workshop participants also brainstormed challenge problems which require the collaboration of AI and OR researchers and will result in the integration of basic techniques from both fields to eventually benefit societal needs.
Submitted 10 April, 2023;
originally announced April 2023.
-
Zero-shot CT Field-of-view Completion with Unconditional Generative Diffusion Prior
Authors:
Kaiwen Xu,
Aravind R. Krishnan,
Thomas Z. Li,
Yuankai Huo,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman
Abstract:
Anatomically consistent field-of-view (FOV) completion to recover truncated body sections has important applications in quantitative analyses of computed tomography (CT) with limited FOV. The existing solution, based on conditional generative models, relies on the fidelity of synthetic truncation patterns in the training phase, which limits the generalizability of the method to potentially unknown types of truncation. In this study, we evaluate a zero-shot method based on a pretrained unconditional generative diffusion prior, where truncation patterns of arbitrary form can be specified in the inference phase. In an evaluation on simulated chest CT slices with synthetic FOV truncation, the method is capable of recovering anatomically consistent body sections and correcting the subcutaneous adipose tissue measurement error caused by FOV truncation. However, the correction accuracy is inferior to that of the conditionally trained counterpart.
Submitted 7 April, 2023;
originally announced April 2023.
-
Longitudinal Multimodal Transformer Integrating Imaging and Latent Clinical Signatures From Routine EHRs for Pulmonary Nodule Classification
Authors:
Thomas Z. Li,
John M. Still,
Kaiwen Xu,
Ho Hin Lee,
Leon Y. Cai,
Aravind R. Krishnan,
Riqiang Gao,
Mirza S. Khan,
Sanja Antic,
Michael Kammer,
Kim L. Sandler,
Fabien Maldonado,
Bennett A. Landman,
Thomas A. Lasko
Abstract:
The accuracy of predictive models for solitary pulmonary nodule (SPN) diagnosis can be greatly increased by incorporating repeat imaging and medical context, such as electronic health records (EHRs). However, clinically routine modalities such as imaging and diagnostic codes can be asynchronous and irregularly sampled over different time scales, which are obstacles to longitudinal multimodal learning. In this work, we propose a transformer-based multimodal strategy to integrate repeat imaging with longitudinal clinical signatures from routinely collected EHRs for SPN classification. We perform unsupervised disentanglement of latent clinical signatures and leverage time-distance scaled self-attention to jointly learn from clinical signature expressions and chest computed tomography (CT) scans. Our classifier is pretrained on 2,668 scans from a public dataset and 1,149 subjects with longitudinal chest CTs, billing codes, medications, and laboratory tests from EHRs of our home institution. Evaluation on 227 subjects with challenging SPNs revealed a significant AUC improvement over a longitudinal multimodal baseline (0.824 vs 0.752 AUC), as well as improvements over a single cross-section multimodal scenario (0.809 AUC) and a longitudinal imaging-only scenario (0.741 AUC). This work demonstrates significant advantages with a novel approach for co-learning longitudinal imaging and non-imaging phenotypes with transformers. Code available at https://github.com/MASILab/lmsignatures.
Submitted 29 June, 2023; v1 submitted 5 April, 2023;
originally announced April 2023.
-
Efficient Indirect Interatomic Coulombic Decay Induced by Photoelectron Impact Excitation in Large He Nanodroplets
Authors:
L. Ben Ltaief,
K. Sishodia,
S. Mandal,
S. De,
S. R. Krishnan,
C. Medina,
N. Pal,
R. Richter,
T. Fennel,
M. Mudrich
Abstract:
Ionization of matter by energetic radiation generally causes complex secondary reactions which are hard to decipher. Using large helium nanodroplets irradiated by XUV photons, we show that the full chain of processes ensuing primary photoionization can be tracked in detail by means of high-resolution electron spectroscopy. We find that elastic and inelastic scattering of photoelectrons efficiently induces interatomic Coulombic decay (ICD) in the droplets. This type of indirect ICD even becomes the dominant process of electron emission in nearly the entire XUV range in large droplets with radius $\gtrsim40~$nm. Indirect ICD processes induced by electron scattering likely play an important role in other condensed phase systems exposed to ionizing radiation as well, including biological matter.
Submitted 26 March, 2023;
originally announced March 2023.
-
Anamnesic Neural Differential Equations with Orthogonal Polynomial Projections
Authors:
Edward De Brouwer,
Rahul G. Krishnan
Abstract:
Neural ordinary differential equations (Neural ODEs) are an effective framework for learning dynamical systems from irregularly sampled time series data. These models provide a continuous-time latent representation of the underlying dynamical system where new observations at arbitrary time points can be used to update the latent representation of the dynamical system. Existing parameterizations for the dynamics functions of Neural ODEs limit the ability of the model to retain global information about the time series; specifically, a piece-wise integration of the latent process between observations can result in a loss of memory on the dynamic patterns of previously observed data points. We propose PolyODE, a Neural ODE that models the latent continuous-time process as a projection onto a basis of orthogonal polynomials. This formulation enforces long-range memory and preserves a global representation of the underlying dynamical system. Our construction is backed by favourable theoretical guarantees and in a series of experiments, we demonstrate that it outperforms previous works in the reconstruction of past and future data, and in downstream prediction tasks.
Submitted 3 March, 2023;
originally announced March 2023.
-
Changes in Commuter Behavior from COVID-19 Lockdowns in the Atlanta Metropolitan Area
Authors:
Tejas Santanam,
Anthony Trasatti,
Hanyu Zhang,
Connor Riley,
Pascal Van Hentenryck,
Ramayya Krishnan
Abstract:
This paper analyzes the impact of COVID-19 related lockdowns in the Atlanta, Georgia metropolitan area by examining commuter patterns in three periods: prior to, during, and after the pandemic lockdown. A cellular phone location dataset is utilized in a novel pipeline to infer the home and work locations of thousands of users from the Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. The coordinates derived from the clustering are put through a reverse geocoding process from which word embeddings are extracted in order to categorize the industry of each workplace based on the workplace name and Point of Interest (POI) mapping. Frequencies of commute from home locations to work locations are analyzed in and across all three time periods. Public health and economic factors are discussed to explain potential reasons for the observed changes in commuter patterns.
Submitted 26 February, 2023;
originally announced February 2023.
-
Iridium-doping as a strategy to realize visible light absorption and p-type behavior in BaTiO3
Authors:
Sujana Chandrappa,
Simon Joyson Galbao,
P S Sankara Rama Krishnan,
Namitha Anna Koshi,
Srewashi Das,
Stephen Nagaraju Myakala,
Seung Cheol Lee,
Arnab Dutta,
Alexey Cherevan,
Satadeep Bhattacharjee,
Dharmapura H K Murthy
Abstract:
BaTiO3 (BTO) is typically a strong n-type material with optoelectronic properties that are tuneable via doping and control of the synthesis conditions. It has a wide band gap that can only harness the ultraviolet region of the solar spectrum. Despite significant progress, achieving visible-light absorbing BTO with tuneable carrier concentration, a crucial requirement for many applications, has been challenging. In this work, a p-type BTO with visible-light absorption is realized via iridium doping. Detailed analysis using advanced spectroscopy tools and computational electronic structure analysis is used to rationalize the n- to p-type transition after Ir doping. The results offer mechanistic insight into the interplay between the dopant site occupancy, the dopant position within the band gap, and the defect chemistry affecting the carrier concentration. The p-type behavior is attributed to a decrease in the concentration of Ti3+ donor levels and of the mutually correlated oxygen vacancies upon Ir doping. Due to the formation of Ir3+ or Ir4+ in-gap energy levels within the forbidden region, optical transitions can be elicited from or to such levels, resulting in visible-light absorption. This newly developed Ir-doped BTO can be a promising p-type perovskite-oxide with imminent applications in solar fuel generation, spintronics and optoelectronics.
Submitted 15 February, 2023;
originally announced February 2023.
-
Reliable Multimodal Trajectory Prediction via Error Aligned Uncertainty Optimization
Authors:
Neslihan Kose,
Ranganath Krishnan,
Akash Dhamasia,
Omesh Tickoo,
Michael Paulitsch
Abstract:
Reliable uncertainty quantification in deep neural networks is crucial in safety-critical applications such as automated driving for trustworthy and informed decision-making. Assessing the quality of uncertainty estimates is challenging as ground truth for uncertainty estimates is not available. Ideally, in a well-calibrated model, uncertainty estimates should perfectly correlate with model error. We propose a novel error aligned uncertainty optimization method and introduce a trainable loss function to guide the models to yield good quality uncertainty estimates aligning with the model error. Our approach targets continuous structured prediction and regression tasks, and is evaluated on multiple datasets including a large-scale vehicle motion prediction task involving real-world distributional shifts. We demonstrate that our method improves average displacement error by 1.69% and 4.69%, and the uncertainty correlation with model error by 17.22% and 19.13% as quantified by Pearson correlation coefficient on two state-of-the-art baselines.
Submitted 9 December, 2022;
originally announced December 2022.
-
A Learning Based Hypothesis Test for Harmful Covariate Shift
Authors:
Tom Ginsberg,
Zhongyuan Liang,
Rahul G. Krishnan
Abstract:
The ability to quickly and accurately identify covariate shift at test time is a critical and often overlooked component of safe machine learning systems deployed in high-risk domains. While methods exist for detecting when predictions should not be made on out-of-distribution test examples, identifying distributional level differences between training and test time can help determine when a model should be removed from the deployment setting and retrained. In this work, we define harmful covariate shift (HCS) as a change in distribution that may weaken the generalization of a predictive model. To detect HCS, we use the discordance between an ensemble of classifiers trained to agree on training data and disagree on test data. We derive a loss function for training this ensemble and show that the disagreement rate and entropy represent powerful discriminative statistics for HCS. Empirically, we demonstrate the ability of our method to detect harmful covariate shift with statistical certainty on a variety of high-dimensional datasets. Across numerous domains and modalities, we show state-of-the-art performance compared to existing methods, particularly when the number of observed test samples is small.
Submitted 1 March, 2023; v1 submitted 5 December, 2022;
originally announced December 2022.
-
Learning predictive checklists from continuous medical data
Authors:
Yukti Makhija,
Edward De Brouwer,
Rahul G. Krishnan
Abstract:
Checklists, while being only recently introduced in the medical domain, have become highly popular in daily clinical practice due to their combined effectiveness and great interpretability. Checklists are usually designed by expert clinicians who manually collect and analyze available evidence. However, the increasing quantity of available medical data is calling for a partially automated checklist design. Recent works have taken a step in that direction by learning predictive checklists from categorical data. In this work, we propose to extend this approach to accommodate learning checklists from continuous medical data using a mixed-integer programming approach. We show that this extension outperforms a range of explainable machine learning baselines on the prediction of sepsis from intensive care clinical trajectories.
Submitted 13 November, 2022;
originally announced November 2022.
-
Partial Identification of Treatment Effects with Implicit Generative Models
Authors:
Vahid Balazadeh,
Vasilis Syrgkanis,
Rahul G. Krishnan
Abstract:
We consider the problem of partial identification, the estimation of bounds on the treatment effects from observational data. Although studied using discrete treatment variables or in specific causal graphs (e.g., instrumental variables), partial identification has been recently explored using tools from deep generative modeling. We propose a new method for partial identification of average treatment effects (ATEs) in general causal graphs using implicit generative models comprising continuous and discrete random variables. Since ATE with continuous treatment is generally non-regular, we leverage the partial derivatives of response functions to define a regular approximation of ATE, a quantity we call uniform average treatment derivative (UATD). We prove that our algorithm converges to tight bounds on ATE in linear structural causal models (SCMs). For nonlinear SCMs, we empirically show that using UATD leads to tighter and more stable bounds than methods that directly optimize the ATE.
Submitted 14 October, 2022;
originally announced October 2022.
-
HiCu: Leveraging Hierarchy for Curriculum Learning in Automated ICD Coding
Authors:
Weiming Ren,
Ruijing Zeng,
Tongzi Wu,
Tianshu Zhu,
Rahul G. Krishnan
Abstract:
There are several opportunities for automation in healthcare that can improve clinician throughput. One such example is assistive tools to document diagnosis codes when clinicians write notes. We study the automation of medical code prediction using curriculum learning, which is a training strategy for machine learning models that gradually increases the hardness of the learning tasks from easy to difficult. One of the challenges in curriculum learning is the design of curricula -- i.e., in the sequential design of tasks that gradually increase in difficulty. We propose Hierarchical Curriculum Learning (HiCu), an algorithm that uses graph structure in the space of outputs to design curricula for multi-label classification. We create curricula for multi-label classification models that predict ICD diagnosis and procedure codes from natural language descriptions of patients. By leveraging the hierarchy of ICD codes, which groups diagnosis codes based on various organ systems in the human body, we find that our proposed curricula improve the generalization of neural network-based predictive models across recurrent, convolutional, and transformer-based architectures. Our code is available at https://github.com/wren93/HiCu-ICD.
Submitted 3 August, 2022;
originally announced August 2022.
-
Machine Learning in Access Control: A Taxonomy and Survey
Authors:
Mohammad Nur Nobi,
Maanak Gupta,
Lopamudra Praharaj,
Mahmoud Abdelsalam,
Ram Krishnan,
Ravi Sandhu
Abstract:
An increasing body of work has recognized the importance of exploiting machine learning (ML) advancements to address the need for efficient automation in extracting access control attributes, policy mining, policy verification, access decisions, etc. In this work, we survey and summarize various ML approaches to solve different access control problems. We propose a novel taxonomy of the ML model's application in the access control domain. We highlight current limitations and open challenges such as lack of public real-world datasets, administration of ML-based access control systems, understanding a black-box ML model's decision, etc., and enumerate future research directions.
Submitted 4 July, 2022;
originally announced July 2022.
-
Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning
Authors:
Richard J. Chen,
Chengkuan Chen,
Yicong Li,
Tiffany Y. Chen,
Andrew D. Trister,
Rahul G. Krishnan,
Faisal Mahmood
Abstract:
Vision Transformers (ViTs) and their multi-scale and hierarchical variations have been successful at capturing image representations but their use has been generally studied for low-resolution images (e.g., 256x256, 384x384). For gigapixel whole-slide imaging (WSI) in computational pathology, WSIs can be as large as 150000x150000 pixels at 20X magnification and exhibit a hierarchical structure of visual tokens across varying resolutions: from 16x16 images capturing spatial patterns among cells, to 4096x4096 images characterizing interactions within the tissue microenvironment. We introduce a new ViT architecture called the Hierarchical Image Pyramid Transformer (HIPT), which leverages the natural hierarchical structure inherent in WSIs using two levels of self-supervised learning to learn high-resolution image representations. HIPT is pretrained across 33 cancer types using 10,678 gigapixel WSIs, 408,218 4096x4096 images, and 104M 256x256 images. We benchmark HIPT representations on 9 slide-level tasks, and demonstrate that: 1) HIPT with hierarchical pretraining outperforms current state-of-the-art methods for cancer subtyping and survival prediction, 2) self-supervised ViTs are able to model important inductive biases about the hierarchical structure of phenotypes in the tumor microenvironment.
Submitted 6 June, 2022;
originally announced June 2022.
-
Hierarchical Optimal Transport for Comparing Histopathology Datasets
Authors:
Anna Yeaton,
Rahul G. Krishnan,
Rebecca Mieloszyk,
David Alvarez-Melis,
Grace Huynh
Abstract:
Scarcity of labeled histopathology data limits the applicability of deep learning methods to under-profiled cancer types and labels. Transfer learning allows researchers to overcome the limitations of small datasets by pre-training machine learning models on larger datasets similar to the small target dataset. However, similarity between datasets is often determined heuristically. In this paper, we propose a principled notion of distance between histopathology datasets based on a hierarchical generalization of optimal transport distances. Our method does not require any training, is agnostic to model type, and preserves much of the hierarchical structure in histopathology datasets imposed by tiling. We apply our method to H&E stained slides from The Cancer Genome Atlas from six different cancer types. We show that our method outperforms a baseline distance in a cancer-type prediction task. Our results also show that our optimal transport distance predicts difficulty of transferability in a tumor vs. normal prediction setting.
Submitted 20 April, 2022; v1 submitted 18 April, 2022;
originally announced April 2022.
-
Mixture-of-experts VAEs can disregard variation in surjective multimodal data
Authors:
Jannik Wolff,
Tassilo Klein,
Moin Nabi,
Rahul G. Krishnan,
Shinichi Nakajima
Abstract:
Machine learning systems are often deployed in domains that entail data from multiple modalities, for example, phenotypic and genotypic characteristics describe patients in healthcare. Previous works have developed multimodal variational autoencoders (VAEs) that generate several modalities. We consider surjective data, where single datapoints from one modality (such as class labels) describe multiple datapoints from another modality (such as images). We theoretically and empirically demonstrate that multimodal VAEs with a mixture of experts posterior can struggle to capture variability in such surjective data.
Submitted 11 April, 2022;
originally announced April 2022.
-
Assessing the impact of soil moisture-temperature coupling on temperature extremes over the Indian region
Authors:
Naresh G Ganeshi,
Milind Mujumdar,
Takaya Yuhei,
Mangesh M Goswami,
Bhupendra Bahadur Singh,
R Krishnan,
Toru Terao
Abstract:
While previous model sensitivity studies have mainly focused on discerning the soil moisture-precipitation feedback processes over the Indian region, the present study investigates the impact of soil moisture-temperature (SM-T) coupling on the temperature extremes (ExT) using high-resolution (~60 km) model simulations. These simulations include the control and soil moisture (SM) sensitivity experiments (DRY-SM and WET-SM) initialized by perturbing (decreasing/increasing) SM from the historical (HIST: 1951-2010) and future 4K warming (FUT: 2051-2100) control runs. The analysis identifies the transitional regions of north-central India (NCI) as the hotspot of strong SM-T coupling. Over NCI, the HIST experiment shows an occurrence of 4-5 extreme events per year, with an average duration of 5-6 days per event and intensity exceeding 46°C. In contrast, FUT estimates indicate relatively severe, long-lasting, and more frequent extreme events. The SM sensitivity experiments reveal the significant influence of SM-T coupling on the ExT over NCI in both historical and future climates. We find that the DRY-SM results in significant enhancement of frequency, duration and intensity of ExT, in contrast to WET-SM. We note that the difference between DRY-SM and WET-SM 50-year return value of the block maxima GEV fit can reach up to 1.25°C and 3°C for historical and future climate, respectively. The enhanced (reduced) extreme temperature conditions in DRY-SM (WET-SM) simulation are caused by the intensification (abridgement) of sensible heat flux by limiting (intensifying) available total energy for evaporative cooling due to faster (slower) dissipation of positive soil moisture anomalies (also called soil moisture memory). In addition, the influence of SM on ExT over NCI is found to be larger during the post-monsoon season as compared to the pre-monsoon and monsoon seasons.
Submitted 8 April, 2022;
originally announced April 2022.
-
Toward Deep Learning Based Access Control
Authors:
Mohammad Nur Nobi,
Ram Krishnan,
Yufei Huang,
Mehrnoosh Shakarami,
Ravi Sandhu
Abstract:
A common trait of current access control approaches is the challenging need to engineer abstract and intuitive access control models. This entails designing access control information in the form of roles (RBAC), attributes (ABAC), or relationships (ReBAC) as the case may be, and subsequently, designing access control rules. This framework has its benefits but has significant limitations in the context of modern systems that are dynamic, complex, and large-scale, which make it difficult for a human administrator to maintain an accurate access control state in the system. This paper proposes Deep Learning Based Access Control (DLBAC) by leveraging significant advances in deep learning technology as a potential solution to this problem. We envision that DLBAC could complement and, in the long term, has the potential to even replace, classical access control models with a neural network that reduces the burden of access control model engineering and updates. Without loss of generality, we conduct a thorough investigation of a candidate DLBAC model, called DLBAC_alpha, using both real-world and synthetic datasets. We demonstrate the feasibility of the proposed approach by addressing issues related to accuracy, generalization, and explainability. We also discuss challenges and future research directions.
Submitted 28 March, 2022;
originally announced March 2022.
-
Synthesizing Fine-Grained Synchronization Protocols for Implicit Monitors (Extended Version)
Authors:
Kostas Ferles,
Benjamin Sepanski,
Rahul Krishnan,
James Bornholt,
Isil Dillig
Abstract:
A monitor is a widely-used concurrent programming abstraction that encapsulates all shared state between threads. Monitors can be classified as being either implicit or explicit depending on the primitives they provide. Implicit monitors are much easier to program but typically not as efficient. To address this gap, there has been recent research on automatically synthesizing explicit-signal monitors from an implicit specification, but prior work does not exploit all parallelization opportunities due to the use of a single lock for the entire monitor. This paper presents a new technique for synthesizing fine-grained explicit-synchronization protocols from implicit monitors. Our method is based on two key innovations: First, we present a new static analysis for inferring safe interleavings that allow violating mutual exclusion of monitor operations without changing its semantics. Second, we use the results of this static analysis to generate a MaxSAT instance whose models correspond to correct-by-construction synchronization protocols. We have implemented our approach in a tool called Cortado and evaluate it on monitors that contain parallelization opportunities. Our evaluation shows that Cortado can synthesize synchronization policies that are competitive with, or even better than, expert-written ones on these benchmarks.
Submitted 16 March, 2022; v1 submitted 1 March, 2022;
originally announced March 2022.
-
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology
Authors:
Richard J. Chen,
Rahul G. Krishnan
Abstract:
Tissue phenotyping is a fundamental task in learning objective characterizations of histopathologic biomarkers within the tumor-immune microenvironment in cancer pathology. However, whole-slide imaging (WSI) is a complex computer vision problem in which: 1) WSIs have enormous image resolutions, which precludes large-scale pixel-level efforts in data curation, and 2) diversity of morphological phenotypes results in inter- and intra-observer variability in tissue labeling. To address these limitations, current efforts have proposed using pretrained image encoders (transfer learning from ImageNet, self-supervised pretraining) in extracting morphological features from pathology, but have not been extensively validated. In this work, we conduct a search for good representations in pathology by training a variety of self-supervised models with validation on a variety of weakly-supervised and patch-level tasks. Our key finding is in discovering that Vision Transformers using DINO-based knowledge distillation are able to learn data-efficient and interpretable features in histology images wherein the different attention heads learn distinct morphological phenotypes. We make evaluation code and pretrained weights publicly available at: https://github.com/Richarizardd/Self-Supervised-ViT-Path.
Submitted 1 March, 2022;
originally announced March 2022.
-
Using Time-Series Privileged Information for Provably Efficient Learning of Prediction Models
Authors:
Rickard K. A. Karlsson,
Martin Willbo,
Zeshan Hussain,
Rahul G. Krishnan,
David Sontag,
Fredrik D. Johansson
Abstract:
We study prediction of future outcomes with supervised models that use privileged information during learning. The privileged information comprises samples of time series observed between the baseline time of prediction and the future outcome; this information is only available at training time, which differs from traditional supervised learning. Our question is when using this privileged data leads to more sample-efficient learning of models that use only baseline data for predictions at test time. We give an algorithm for this setting and prove that when the time series are drawn from a non-stationary Gaussian-linear dynamical system of fixed horizon, learning with privileged information is more efficient than learning without it. On synthetic data, we test the limits of our algorithm and theory, both when our assumptions hold and when they are violated. On three diverse real-world datasets, we show that our approach is generally preferable to classical learning, particularly when data is scarce. Finally, we relate our estimator to a distillation approach both theoretically and empirically.
Submitted 5 May, 2022; v1 submitted 28 October, 2021;
originally announced October 2021.
-
Robust Contrastive Active Learning with Feature-guided Query Strategies
Authors:
Ranganath Krishnan,
Nilesh Ahuja,
Alok Sinha,
Mahesh Subedar,
Omesh Tickoo,
Ravi Iyer
Abstract:
We introduce supervised contrastive active learning (SCAL) and propose efficient query strategies in active learning based on the feature similarity (featuresim) and principal component analysis based feature-reconstruction error (fre) to select informative data samples with diverse feature representations. We demonstrate that our proposed method achieves state-of-the-art accuracy and model calibration, and reduces sampling bias, in an active learning setup for balanced and imbalanced datasets on image classification tasks. We also evaluate the robustness of the model to distributional shift derived from different query strategies in an active learning setting. Using extensive experiments, we show that our proposed approach outperforms high performing compute-intensive methods by a big margin, resulting in 9.9% lower mean corruption error, 7.2% lower expected calibration error under dataset shift and 8.9% higher AUROC for out-of-distribution detection.
Submitted 14 August, 2022; v1 submitted 13 September, 2021;
originally announced September 2021.
-
Mitigating Sampling Bias and Improving Robustness in Active Learning
Authors:
Ranganath Krishnan,
Alok Sinha,
Nilesh Ahuja,
Mahesh Subedar,
Omesh Tickoo,
Ravi Iyer
Abstract:
This paper presents simple and efficient methods to mitigate sampling bias in active learning while achieving state-of-the-art accuracy and model robustness. We introduce supervised contrastive active learning by leveraging the contrastive loss for active learning under a supervised setting. We propose an unbiased query strategy that selects informative data samples of diverse feature representations with our methods: supervised contrastive active learning (SCAL) and deep feature modeling (DFM). We empirically demonstrate our proposed methods reduce sampling bias, achieve state-of-the-art accuracy and model calibration in an active learning setup with the query computation 26x faster than Bayesian active learning by disagreement and 11x faster than CoreSet. The proposed SCAL method outperforms by a big margin in robustness to dataset shift and out-of-distribution.
Submitted 13 September, 2021;
originally announced September 2021.
-
Experimental study to optimise the treatment efficacy of pharmaceutical effluents by combining electron beam irradiation with conventional techniques
Authors:
Pankaj Kumar,
Manisha Meena,
Anjali Bhagwan Kavar,
Pragya Nama,
Abhishek Pathak,
Raghava Varma,
Abhay Deshpande,
Tanuja Dixit,
R. Krishnan,
Chandrakant Nainwad
Abstract:
The inability of conventional methods to completely remove the contaminants from pharmaceutical effluents led us to study the effect of Electron Beam (EB) irradiation on real pharmaceutical wastewater. In this paper, the samples from different stages of existing treatment facilities of industry are irradiated with varying doses from 25 to 200 kGy. The study aimed to find a suitable combination of EB and conventional treatments for efficient degradation of complex pharmaceutical effluent. It has been successfully demonstrated that electron beam irradiation when combined with conventional techniques like coagulation before or after the irradiation improves the efficiency of the process, resulting in lower Chemical Oxygen Demand (COD). In this investigation, the maximum COD reduction was found to be around 65 percent.
Submitted 20 May, 2022; v1 submitted 27 August, 2021;
originally announced September 2021.