-
A generative flow for conditional sampling via optimal transport
Authors:
Jason Alfonso,
Ricardo Baptista,
Anupam Bhakta,
Noam Gal,
Alfin Hou,
Isa Lyubimova,
Daniel Pocklington,
Josef Sajonz,
Giulio Trigila,
Ryan Tsai
Abstract:
Sampling conditional distributions is a fundamental task for Bayesian inference and density estimation. Generative models, such as normalizing flows and generative adversarial networks, characterize conditional distributions by learning a transport map that pushes forward a simple reference (e.g., a standard Gaussian) to a target distribution. While these approaches successfully describe many non-Gaussian problems, their performance is often limited by parametric bias and the reliability of gradient-based (adversarial) optimizers to learn these transformations. This work proposes a non-parametric generative model that iteratively maps reference samples to the target. The model uses block-triangular transport maps, whose components are shown to characterize conditionals of the target distribution. These maps arise from solving an optimal transport problem with a weighted $L^2$ cost function, thereby extending the data-driven approach in [Trigila and Tabak, 2016] for conditional sampling. The proposed approach is demonstrated on a two-dimensional example and on a parameter inference problem involving nonlinear ODEs.
Submitted 9 July, 2023;
originally announced July 2023.
-
Deep Distance Sensitivity Oracles
Authors:
Davin Jeong,
Allison Gunby-Mann,
Sarel Cohen,
Maximilian Katzmann,
Chau Pham,
Arnav Bhakta,
Tobias Friedrich,
Sang Chin
Abstract:
One of the most fundamental graph problems is finding a shortest path from a source to a target node. While in its basic forms the problem has been studied extensively and efficient algorithms are known, it becomes significantly harder as soon as parts of the graph are susceptible to failure. Although one can recompute a shortest replacement path after every outage, this is rather inefficient in time, storage, or both. One way to overcome this problem is to shift the computational burden from the queries into a pre-processing step, where a data structure is computed that allows for fast querying of replacement paths, typically referred to as a Distance Sensitivity Oracle (DSO). While DSOs have been extensively studied in the theoretical computer science community, to the best of our knowledge this is the first work to construct DSOs using deep learning techniques. We show how to use deep learning to exploit a combinatorial structure of replacement paths. More specifically, we use the combinatorial structure of replacement paths as a concatenation of shortest paths and use deep learning to find the pivot nodes for stitching shortest paths into replacement paths.
Submitted 18 October, 2023; v1 submitted 2 November, 2022;
originally announced November 2022.
-
Bicycling As A Mode Of Transport In Dhaka City Status And Prospects
Authors:
S. M. Haroon,
A. K. Bhakta,
M. Shahabuddin,
N. Rahman,
M. R. Mahmud
Abstract:
This study examines the current status and prospects of the bicycle as a mode of commuting within Dhaka city. Bicycling is a highly sustainable mode of transport, but it is used very little by the commuters of Dhaka. Many factors affect the choice of a bicycle for commuting. This study aimed to find out what factors could motivate the commuters of Dhaka to use a bicycle as a mode of transport. To determine the motivators, a survey was administered among the commuters of Dhaka city in which the respondents were asked how certain factors would affect their choice to use a bicycle to commute. A Likert scale was used in the survey and the responses were analyzed, from which the top motivators were found. The motivators were then grouped using exploratory factor analysis to support possible policy making. Four factors were extracted using this method. The factors were named Additional Perks, General Benefits, Personal Benefits, and Infrastructural Benefits.
Submitted 31 January, 2022;
originally announced January 2022.
-
Comparing Machine Learning-Centered Approaches for Forecasting Language Patterns During Frustration in Early Childhood
Authors:
Arnav Bhakta,
Yeunjoo Kim,
Pamela Cole
Abstract:
When faced with self-regulation challenges, children have been known to use their language to inhibit their emotions and behaviors. Yet, to date, there has been a critical lack of evidence regarding what patterns in their speech children use during these moments of frustration. In this paper, eXtreme Gradient Boosting, Random Forest, Long Short-Term Memory Recurrent Neural Networks, and Elastic Net Regression have all been used to forecast these language patterns in children. Based on the results of a comparative analysis between these methods, the study reveals that when dealing with high-dimensional and dense data with very irregular and abnormal distributions, as is the case with self-regulation patterns in children, decision tree-based algorithms are able to outperform traditional regression and neural network methods where those methods fall short.
Submitted 29 October, 2021;
originally announced October 2021.
-
Creutzfeldt-Jakob Disease Prediction Using Machine Learning Techniques
Authors:
Arnav Bhakta,
Carolyn Byrne
Abstract:
Creutzfeldt-Jakob disease (CJD) is a rapidly progressive and fatal neurodegenerative disease that causes approximately 350 deaths in the United States every year. Specifically, it is a prion disease that is caused by a misfolded prion protein, termed $PrP^{Sc}$, which is the infectious form of the prion protein $PrP^{C}$. Rather than being recycled by the body, the $PrP^{Sc}$ aggregates in the brain as plaques, leading to neurodegeneration of surrounding cells and the spongiform characteristics of the pathology. However, there has been very little research into factors that can affect one's chances of acquiring $PrP^{Sc}$. In this paper, Elastic Net Regression, Long Short-Term Memory Recurrent Neural Network Architectures, and Random Forest have been used to predict Creutzfeldt-Jakob Disease Levels in the United States. New variables were created as data for the models to use on the basis of common factors that are known to affect CJD, such as soil, food, and water quality. Based on the root mean square error (RMSE), mean bias error (MBE), and mean absolute error (MAE) values, the study reveals the high impact of unhealthy lifestyle choices, CO$_{2}$ Levels, Pesticide Usage, and Potash K$_{2}$O Usage on CJD Levels. In doing so, the study highlights new avenues of research for CJD prevention and detection, as well as potential causes.
Submitted 10 August, 2021;
originally announced August 2021.