Search | arXiv e-print repository

Understanding and Modeling the Dynamics of Storm-time Atmospheric Neutral Density using Random Forests

Authors: Kyle R. Murphy, Alexa J. Halford, Vivian Liu, Jeffery Klenzing, Jonathon Smith, Katherine Garcia-Sage, Joshua Pettit, I. Jonathan Rae

Abstract: Atmospheric neutral density is a crucial component to accurately predict and track the motion of satellites. During periods of elevated solar and geomagnetic activity, atmospheric neutral density becomes highly variable and dynamic. This variability and enhanced dynamics make it difficult to accurately model neutral density, leading to increased errors which propagate from neutral density models through to orbit propagation models. In this paper we investigate the dynamics of neutral density during geomagnetic storms. We use a combination of solar and geomagnetic variables to develop three Random Forest machine learning models of neutral density. These models are based on (1) slow solar indices, (2) high-cadence solar irradiance, and (3) combined high-cadence solar irradiance and geomagnetic indices. Each model is validated using an out-of-sample dataset using analysis of residuals and typical metrics. During quiet times, all three models perform well; however, during geomagnetic storms, the combined high-cadence solar irradiance/geomagnetic model performs significantly better than the models based solely on solar activity. The combined model captures an additional 10% in the variability of density and has an error up to six times smaller during geomagnetic storms than the solar models. Overall, this work demonstrates the importance of including geomagnetic activity in the modeling of atmospheric density and serves as a proof of concept for using machine learning algorithms to model, and in the future forecast, atmospheric density for operational use.

Submitted 28 June, 2024; originally announced July 2024.

Comments: Submitted for publication to Space Weather

arXiv:2406.19635 [pdf, other]

Model Predictive Simulation Using Structured Graphical Models and Transformers

Authors: Xinghua Lou, Meet Dave, Shrinu Kushagra, Miguel Lazaro-Gredilla, Kevin Murphy

Abstract: We propose an approach to simulating trajectories of multiple interacting agents (road users) based on transformers and probabilistic graphical models (PGMs), and apply it to the Waymo SimAgents challenge. The transformer baseline is based on the MTR model, which predicts multiple future trajectories conditioned on the past trajectories and static road layout features. We then improve upon these generated trajectories using a PGM, which contains factors which encode prior knowledge, such as a preference for smooth trajectories, and avoidance of collisions with static obstacles and other moving agents. We perform (approximate) MAP inference in this PGM using the Gauss-Newton method. Finally we sample $K=32$ trajectories for each of the $N \sim 100$ agents for the next $T=8 Δ$ time steps, where $Δ=10$ is the sampling rate per second. Following the Model Predictive Control (MPC) paradigm, we only return the first element of our forecasted trajectories at each step, and then we replan, so that the simulation can constantly adapt to its changing environment. We therefore call our approach "Model Predictive Simulation" or MPS. We show that MPS improves upon the MTR baseline, especially in safety critical metrics such as collision rate. Furthermore, our approach is compatible with any underlying forecasting model, and does not require extra training, so we believe it is a valuable contribution to the community.

Submitted 27 June, 2024; originally announced June 2024.

Comments: Special Mention at the Waymo Sim Agents Challenge 2024
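
Illustrative note: the replanning loop described in the abstract above (forecast over a horizon, keep only the first element, then replan) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `forecast` is a hypothetical constant-velocity stand-in for the transformer and PGM pipeline, and the state is a made-up `(position, velocity)` pair.

```python
def forecast(state, horizon):
    """Hypothetical forecaster: constant-velocity rollout of (position, velocity).

    Stands in for whatever forecasting model is used; it is NOT the MTR model.
    """
    position, velocity = state
    return [(position + velocity * (k + 1), velocity) for k in range(horizon)]


def simulate(initial_state, num_steps, horizon):
    """Receding-horizon simulation: plan ahead, execute one step, replan."""
    state = initial_state
    executed = []
    for _ in range(num_steps):
        plan = forecast(state, horizon)  # forecast the next `horizon` steps
        state = plan[0]                  # keep only the first forecasted element
        executed.append(state)           # the rest of the plan is discarded
    return executed


trajectory = simulate((0.0, 1.0), num_steps=3, horizon=5)
print(trajectory)
```

Because the plan is recomputed from the latest state at every step, any forecaster with the same call shape could be swapped in without changing the loop.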

arXiv:2406.17863 [pdf, other]

What type of inference is planning?

Authors: Miguel Lázaro-Gredilla, Li Yang Ku, Kevin P. Murphy, Dileep George

Abstract: Multiple types of inference are available for probabilistic graphical models, e.g., marginal, maximum-a-posteriori, and even marginal maximum-a-posteriori. Which one do researchers mean when they talk about "planning as inference"? There is no consistency in the literature: different types are used, and their ability to do planning is further entangled with specific approximations or additional constraints. In this work we use the variational framework to show that all commonly used types of inference correspond to different weightings of the entropy terms in the variational problem, and that planning corresponds _exactly_ to a _different_ set of weights. This means that all the tricks of variational inference are readily applicable to planning. We develop an analogue of loopy belief propagation that allows us to perform approximate planning in factored state Markov decision processes without incurring intractability due to the exponentially large state space. The variational perspective shows that the previous types of inference for planning are only adequate in environments with low stochasticity, and allows us to characterize each type by its own merits, disentangling the type of inference from the additional approximations that its practical use requires. We validate these results empirically on synthetic MDPs and tasks posed in the International Planning Competition.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.09902 [pdf]

Flight-Scope: microscopy with microfluidics in microgravity

Authors: Thomas Wareing, Alexander Stokes, Katrina Crompton, Koren Murphy, Jack Dawson, Yusuf Furkan Ugurluoglu, Connor Richardson, Hongquan Li, Manu Prakash, Adam J. M. Wollman

Abstract: With the European Space Agency (ESA) and NASA working to return humans to the moon and onwards to Mars, it has never been more important to study the impact of altered gravity conditions on biological organisms. These include astronauts but also useful micro-organisms they may bring with them to produce food, medicine, and other useful compounds by synthetic biology. Parabolic flights are one of the most accessible microgravity research platforms but present their own challenges: relatively short periods of altered gravity (~20s) and aircraft vibration. Live imaging is necessary in these altered-gravity conditions to read out any real-time phenotypes. Here we present Flight-Scope, a new microscopy and microfluidics platform to study dynamic cellular processes during the short, altered gravity periods on parabolic flights. We demonstrated Flight-Scope's capability by performing live and dynamic imaging of fluorescent glucose uptake by yeast, S. cerevisiae, on board an ESA parabolic flight. Flight-Scope operated well in this challenging environment, opening the way for future microgravity experiments on biological organisms.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 22 pages, 5 figures

arXiv:2405.21042 [pdf, other]

Comparing information content of representation spaces for disentanglement with VAE ensembles

Authors: Kieran A. Murphy, Sam Dillavou, Dani S. Bassett

Abstract: Disentanglement is the endeavour to use machine learning to divide information about a dataset into meaningful fragments. In practice these fragments are representation (sub)spaces, often the set of channels in the latent space of a variational autoencoder (VAE). Assessments of disentanglement predominantly employ metrics that are coarse-grained at the model level, but this approach can obscure much about the process of information fragmentation. Here we propose to study the learned channels in aggregate, as the fragments of information learned by an ensemble of repeat training runs. Additionally, we depart from prior work where measures of similarity between individual subspaces neglected the nature of data embeddings as probability distributions. Instead, we view representation subspaces as communication channels that perform a soft clustering of the data; consequently, we generalize two classic information-theoretic measures of similarity between clustering assignments to compare representation spaces. We develop a lightweight method of estimation based on fingerprinting representation subspaces by their ability to distinguish dataset samples, allowing us to identify, analyze, and leverage meaningful structure in ensembles of VAEs trained on synthetic and natural datasets. Using this fully unsupervised pipeline we identify "hotspots" in the space of information fragments: groups of nearly identical representation subspaces that appear repeatedly in an ensemble of VAEs, particularly as regularization is increased. Finally, we leverage the proposed methodology to achieve ensemble learning with VAEs, boosting the information content of a set of weak learners -- a capability not possible with previous methods of assessing channel similarity.

Submitted 31 May, 2024; originally announced May 2024.

Comments: Code: https://github.com/murphyka/representation-space-info-comparison

arXiv:2405.19681 [pdf, other]

Bayesian Online Natural Gradient (BONG)

Authors: Matt Jones, Peter Chang, Kevin Murphy

Abstract: We propose a novel approach to sequential Bayesian inference based on variational Bayes. The key insight is that, in the online setting, we do not need to add the KL term to regularize to the prior (which comes from the posterior at the previous timestep); instead we can optimize just the expected log-likelihood, performing a single step of natural gradient descent starting at the prior predictive. We prove this method recovers exact Bayesian inference if the model is conjugate, and empirically outperforms other online VB methods in the non-conjugate setting, such as online learning for neural networks, especially when controlling for computational costs.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 41 pages, 11 figures

arXiv:2405.16852 [pdf, other]

EM Distillation for One-step Diffusion Models

Authors: Sirui Xie, Zhisheng Xiao, Diederik P Kingma, Tingbo Hou, Ying Nian Wu, Kevin Patrick Murphy, Tim Salimans, Ben Poole, Ruiqi Gao

Abstract: While diffusion models can learn complex distributions, sampling requires a computationally expensive iterative process. Existing distillation methods enable efficient sampling, but have notable limitations, such as performance degradation with very few sampling steps, reliance on training data access, or mode-seeking optimization that may fail to capture the full distribution. We propose EM Distillation (EMD), a maximum likelihood-based approach that distills a diffusion model to a one-step generator model with minimal loss of perceptual quality. Our approach is derived through the lens of Expectation-Maximization (EM), where the generator parameters are updated using samples from the joint distribution of the diffusion teacher prior and inferred generator latents. We develop a reparametrized sampling scheme and a noise cancellation technique that together stabilize the distillation process. We further reveal an interesting connection of our method with existing methods that minimize mode-seeking KL. EMD outperforms existing one-step generative methods in terms of FID scores on ImageNet-64 and ImageNet-128, and compares favorably with prior work on distilling text-to-image diffusion models.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.05646 [pdf, other]

Outlier-robust Kalman Filtering through Generalised Bayes

Authors: Gerardo Duran-Martin, Matias Altamirano, Alexander Y. Shestopaloff, Leandro Sánchez-Betancourt, Jeremias Knoblauch, Matt Jones, François-Xavier Briol, Kevin Murphy

Abstract: We derive a novel, provably robust, and closed-form Bayesian update rule for online filtering in state-space models in the presence of outliers and misspecified measurement models. Our method combines generalised Bayesian inference with filtering methods such as the extended and ensemble Kalman filter. We use the former to show robustness and the latter to ensure computational efficiency in the case of nonlinear models. Our method matches or outperforms other robust filtering methods (such as those based on variational Bayes) at a much lower computational cost. We show this empirically on a range of filtering problems with outlier measurements, such as object tracking, state estimation in high-dimensional chaotic systems, and online learning of neural networks.

Submitted 28 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 41st International Conference on Machine Learning (ICML 2024)

arXiv:2404.02228 [pdf, other]

Seemingly unrelated Bayesian additive regression trees for cost-effectiveness analyses in healthcare

Authors: Jonas Esser, Mateus Maia, Andrew C. Parnell, Judith Bosmans, Hanneke van Dongen, Thomas Klausch, Keefe Murphy

Abstract: In recent years, theoretical results and simulation evidence have shown Bayesian additive regression trees to be a highly effective method for nonparametric regression. Motivated by cost-effectiveness analyses in health economics, where interest lies in jointly modelling the costs of healthcare treatments and the associated health-related quality of life experienced by a patient, we propose a multivariate extension of BART applicable in regression and classification analyses with several correlated outcome variables. Our framework overcomes some key limitations of existing multivariate BART models by allowing each individual response to be associated with different ensembles of trees, while still handling dependencies between the outcomes. In the case of continuous outcomes, our model is essentially a nonparametric version of seemingly unrelated regression. Likewise, our proposal for binary outcomes is a nonparametric generalisation of the multivariate probit model. We give suggestions for easily interpretable prior distributions, which allow specification of both informative and uninformative priors. We provide detailed discussions of MCMC sampling methods to conduct posterior inference. Our methods are implemented in the R package 'suBART'. We showcase their performance through extensive simulations and an application to an empirical case study from health economics. By also accommodating propensity scores in a manner befitting a causal analysis, we find substantial evidence for a novel trauma care intervention's cost-effectiveness.

Submitted 10 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.13124 [pdf, other]

Cooperative Modular Manipulation with Numerous Cable-Driven Robots for Assistive Construction and Gap Crossing

Authors: Kevin Murphy, Joao C. V. Soares, Justin K. Yim, Dustin Nottage, Ahmet Soylemezoglu, Joao Ramos

Abstract: Soldiers in the field often need to cross negative obstacles, such as rivers or canyons, to reach goals or safety. Military gap crossing involves on-site temporary bridge construction. However, this procedure is conducted with dangerous, time- and labor-intensive operations, and specialized machinery. We envision a scalable robotic solution inspired by advancements in force-controlled and Cable Driven Parallel Robots (CDPRs); this solution can address the challenges inherent in this transportation problem, achieving fast, efficient, and safe deployment and field operations. We introduce the embodied vision in Co3MaNDR, a solution to the military gap crossing problem, a distributed robot consisting of several modules simultaneously pulling on a central payload, controlling the cables' tensions to achieve complex objectives, such as precise trajectory tracking or force amplification. Hardware experiments demonstrate teleoperation of a payload, trajectory following, and the sensing and amplification of operators' applied physical forces during slow operations. An operator was shown to manipulate a 27.2 kg (60 lb) payload with an average force utilization of 14.5% of its weight. Results indicate that the system can be scaled up to heavier payloads without compromising performance or introducing superfluous complexity. This research lays a foundation to expand CDPR technology to uncoordinated and unstable mobile platforms in unknown environments.

Submitted 19 March, 2024; originally announced March 2024.

Comments: 8 pages, 9 figures. Submitted to IROS 2024

arXiv:2402.10939 [pdf, other]

Mechanical prions: Self-assembling microstructures

Authors: Mathieu Ouellet, Dani S. Bassett, Lee C. Bassett, Kieran A. Murphy, Shubhankar P. Patankar

Abstract: Prions are misfolded proteins that transmit their structural arrangement to neighboring proteins. In biological systems, prion dynamics can produce a variety of complex functional outcomes. Yet, an understanding of prionic causes has been hampered by the fact that few computational models exist that allow for experimental design, hypothesis testing, and control. Here, we identify essential prionic properties and present a biologically inspired model of prions using simple mechanical structures capable of undergoing complex conformational change. We demonstrate the utility of our approach by designing a prototypical mechanical prion and validating its properties experimentally. Our work provides a design framework for harnessing and manipulating prionic properties in natural and artificial systems.

Submitted 21 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: Added supplements, 25 pages, 11 figures

arXiv:2402.10797 [pdf, other]

BlackJAX: Composable Bayesian inference in JAX

Authors: Alberto Cabezas, Adrien Corenflos, Junpeng Lao, Rémi Louf, Antoine Carnec, Kaustubh Chaudhari, Reuben Cohn-Gordon, Jeremie Coullon, Wei Deng, Sam Duffield, Gerardo Durán-Martín, Marcin Elantkowski, Dan Foreman-Mackey, Michele Gregori, Carlos Iguaran, Ravin Kumar, Martin Lysy, Kevin Murphy, Juan Camilo Orduz, Karm Patel, Xi Wang, Rob Zinkov

Abstract: BlackJAX is a library implementing sampling and variational inference algorithms commonly used in Bayesian computation. It is designed for ease of use, speed, and modularity by taking a functional approach to the algorithms' implementation. BlackJAX is written in Python, using JAX to compile and run NumPy-like samplers and variational methods on CPUs, GPUs, and TPUs. The library integrates well with probabilistic programming languages by working directly with the (un-normalized) target log density function. BlackJAX is intended as a collection of low-level, composable implementations of basic statistical 'atoms' that can be combined to perform well-defined Bayesian inference, but also provides high-level routines for ease of use. It is designed for users who need cutting-edge methods, researchers who want to create complex sampling methods, and people who want to learn how these work.

Submitted 22 February, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: Companion paper for the library https://github.com/blackjax-devs/blackjax. Update: minor changes and updated the list of authors to include technical contributors

arXiv:2401.02192 [pdf]

Nodule detection and generation on chest X-rays: NODE21 Challenge

Authors: Ecem Sogancioglu, Bram van Ginneken, Finn Behrendt, Marcel Bengs, Alexander Schlaefer, Miron Radu, Di Xu, Ke Sheng, Fabien Scalzo, Eric Marcus, Samuele Papa, Jonas Teuwen, Ernst Th. Scholten, Steven Schalekamp, Nils Hendrix, Colin Jacobs, Ward Hendrix, Clara I Sánchez, Keelin Murphy

Abstract: Pulmonary nodules may be an early manifestation of lung cancer, the leading cause of cancer-related deaths among both men and women. Numerous studies have established that deep learning methods can yield high-performance levels in the detection of lung nodules in chest X-rays. However, the lack of gold-standard public datasets slows down the progression of the research and prevents benchmarking of methods for this task. To address this, we organized a public research challenge, NODE21, aimed at the detection and generation of lung nodules in chest X-rays. While the detection track assesses state-of-the-art nodule detection systems, the generation track determines the utility of nodule generation algorithms to augment training data and hence improve the performance of the detection systems. This paper summarizes the results of the NODE21 challenge and performs extensive additional experiments to examine the impact of the synthetically generated nodule training images on the detection algorithm performance.

Submitted 4 January, 2024; originally announced January 2024.

Comments: 15 pages, 5 figures

arXiv:2311.04896 [pdf, other]

Machine-learning optimized measurements of chaotic dynamical systems via the information bottleneck

Authors: Kieran A. Murphy, Dani S. Bassett

Abstract: Deterministic chaos permits a precise notion of a "perfect measurement" as one that, when obtained repeatedly, captures all of the information created by the system's evolution with minimal redundancy. Finding an optimal measurement is challenging, and has generally required intimate knowledge of the dynamics in the few cases where it has been done. We establish an equivalence between a perfect measurement and a variant of the information bottleneck. As a consequence, we can employ machine learning to optimize measurement processes that efficiently extract information from trajectory data. We obtain approximately optimal measurements for multiple chaotic maps and lay the necessary groundwork for efficient information extraction from general time series.

Submitted 19 March, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

Comments: Project page: https://distributed-information-bottleneck.github.io

arXiv:2310.07938 [pdf, other]

Discrete and continuous mathematical models of sharp-fronted collective cell migration and invasion

Authors: Matthew J Simpson, Keeley M Murphy, Scott W McCue, Pascal R Buenzli

Abstract: Mathematical models describing the spatial spreading and invasion of populations of biological cells are often developed in a continuum modelling framework using reaction-diffusion equations. While continuum models based on linear diffusion are routinely employed and known to capture key experimental observations, linear diffusion fails to predict well-defined sharp fronts that are often observed experimentally. This observation has motivated the use of nonlinear degenerate diffusion; however, these nonlinear models and the associated parameters lack a clear biological motivation and interpretation. Here we take a different approach by developing a stochastic discrete lattice-based model incorporating biologically-inspired mechanisms and then deriving the reaction-diffusion continuum limit. Inspired by experimental observations, agents in the simulation deposit extracellular material, that we call a substrate, locally onto the lattice, and the motility of agents is taken to be proportional to the substrate density. Discrete simulations that mimic a two-dimensional circular barrier assay illustrate how the discrete model supports both smooth and sharp-fronted density profiles depending on the rate of substrate deposition. Coarse-graining the discrete model leads to a novel partial differential equation (PDE) model whose solution accurately approximates averaged data from the discrete model. The new discrete model and PDE approximation provide a simple, biologically motivated framework for modelling the spreading, growth and invasion of cell populations with well-defined sharp fronts.

Submitted 19 January, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: 47 Pages, 8 Figures

MSC Class: 92-10

arXiv:2309.08558 [pdf, other]

A modern approach to transition analysis and process mining with Markov models: A tutorial with R

Authors: Jouni Helske, Satu Helske, Mohammed Saqr, Sonsoles López-Pernas, Keefe Murphy

Abstract: This chapter presents an introduction to Markovian modeling for the analysis of sequence data. Contrary to the deterministic approach seen in the previous sequence analysis chapters, Markovian models are probabilistic models, focusing on the transitions between states instead of studying sequences as a whole. The chapter provides an introduction to this method and differentiates between its most common variations: first-order Markov models, hidden Markov models, mixture Markov models, and mixture hidden Markov models. In addition to a thorough explanation and contextualization within the existing literature, the chapter provides a step-by-step tutorial on how to implement each type of Markovian model using the R package seqHMM. The chapter also provides a complete guide to performing stochastic process mining with Markovian models as well as plotting, comparing and clustering different process models.

Submitted 2 September, 2023; originally announced September 2023.

MSC Class: 60J10
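
Illustrative note: the simplest model named in the abstract above, a first-order Markov model, reduces to estimating transition probabilities between consecutive states. The sketch below shows the maximum-likelihood count-and-normalize estimate in plain Python. It is an assumption-laden illustration only: it does not use or reproduce the seqHMM API, and the function name and toy sequences are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(sequences):
    """Maximum-likelihood first-order transition probabilities.

    Counts each (current state -> next state) pair across all sequences,
    then normalizes the counts per current state.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    matrix = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        matrix[state] = {nxt: n / total for nxt, n in next_counts.items()}
    return matrix


# Hypothetical toy data: two short state sequences over the states "A" and "B".
sequences = [
    ["A", "B", "B", "A"],
    ["A", "A", "B"],
]
P = estimate_transition_matrix(sequences)
print(P)
```

Here state "A" is followed by "B" in two of its three observed transitions, so the estimated probability of A to B is 2/3; states that never appear as a starting point of a transition simply have no row.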

arXiv:2307.04962 [pdf, other]

Intrinsically motivated graph exploration using network theories of human curiosity

Authors: Shubhankar P. Patankar, Mathieu Ouellet, Juan Cervino, Alejandro Ribeiro, Kieran A. Murphy, Dani S. Bassett

Abstract: Intrinsically motivated exploration has proven useful for reinforcement learning, even without additional extrinsic rewards. When the environment is naturally represented as a graph, how to guide exploration best remains an open question. In this work, we propose a novel approach for exploring graph-structured data motivated by two theories of human curiosity: the information gap theory and the compression progress theory. The theories view curiosity as an intrinsic motivation to optimize for topological features of subgraphs induced by nodes visited in the environment. We use these proposed features as rewards for graph neural-network-based reinforcement learning. On multiple classes of synthetically generated graphs, we find that trained agents generalize to longer exploratory walks and larger environments than are seen during training. Our method computes more efficiently than the greedy evaluation of the relevant topological properties. The proposed intrinsic motivations bear particular relevance for recommender systems. We demonstrate that next-node recommendations considering curiosity are more predictive of human choices than PageRank centrality in several real-world graph environments.

Submitted 1 December, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

Comments: 15 pages, 5 figures in main text, and 18 pages, 9 figures in supplement

arXiv:2307.04755 [pdf, other]

doi 10.1073/pnas.2312988121

Information decomposition in complex systems via machine learning

Authors: Kieran A. Murphy, Dani S. Bassett

Abstract: One of the fundamental steps toward understanding a complex system is identifying variation at the scale of the system's components that is most relevant to behavior on a macroscopic scale. Mutual information provides a natural means of linking variation across scales of a system due to its independence of functional relationship between observables. However, characterizing the manner in which information is distributed across a set of observables is computationally challenging and generally infeasible beyond a handful of measurements. Here we propose a practical and general methodology that uses machine learning to decompose the information contained in a set of measurements by jointly optimizing a lossy compression of each measurement. Guided by the distributed information bottleneck as a learning objective, the information decomposition identifies the variation in the measurements of the system state most relevant to specified macroscale behavior. We focus our analysis on two paradigmatic complex systems: a Boolean circuit and an amorphous material undergoing plastic deformation. In both examples, the large amount of entropy of the system state is decomposed, bit by bit, in terms of what is most related to macroscale behavior. The identification of meaningful variation in data, with the full generality brought by information theory, is made practical for studying the connection between micro- and macroscale structure in complex systems.

Submitted 18 March, 2024; v1 submitted 10 July, 2023; originally announced July 2023.

Comments: Project page: https://distributed-information-bottleneck.github.io/

Journal ref: PNAS 121 (2024) e2312988121

arXiv:2306.17842 [pdf, other]

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

Authors: Lijun Yu, Yong Cheng, Zhiruo Wang, Vivek Kumar, Wolfgang Macherey, Yanping Huang, David A. Ross, Irfan Essa, Yonatan Bisk, Ming-Hsuan Yang, Kevin Murphy, Alexander G. Hauptmann, Lu Jiang

Abstract: In this work, we introduce Semantic Pyramid AutoEncoder (SPAE) for enabling frozen LLMs to perform both understanding and generation tasks involving non-linguistic modalities such as images or videos. SPAE converts between raw pixels and interpretable lexical tokens (or words) extracted from the LLM's vocabulary. The resulting tokens capture both the semantic meaning and the fine-grained details needed for visual reconstruction, effectively translating the visual content into a language comprehensible to the LLM, and empowering it to perform a wide array of multimodal tasks. Our approach is validated through in-context learning experiments with frozen PaLM 2 and GPT 3.5 on a diverse set of image understanding and generation tasks. Our method marks the first successful attempt to enable a frozen LLM to generate image content while surpassing state-of-the-art performance in image understanding tasks, under the same setting, by over 25%.

Submitted 28 October, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

Comments: NeurIPS 2023 spotlight

arXiv:2306.06219 [pdf, other]

An introduction and tutorial to model-based clustering in education via Gaussian mixture modelling

Authors: Luca Scrucca, Mohammed Saqr, Sonsoles López-Pernas, Keefe Murphy

Abstract: Heterogeneity has been a hot topic in recent educational literature. Several calls have been voiced to adopt methods that capture different patterns or subgroups within students' behavior or functioning. Assuming that there is an average pattern that represents the entirety of student populations requires the measured construct to have the same causal mechanism, same development pattern, and affect students in exactly the same way. Using a person-centered method (Finite Gaussian mixture model or latent profile analysis), the present tutorial shows how to uncover the heterogeneity within engagement data by identifying three latent or unobserved clusters. This chapter offers an introduction to the model-based clustering that includes the principles of the methods, a guide to choice of number of clusters, evaluation of clustering results and a detailed guide with code and a real-life dataset. The discussion elaborates on the interpretation of the results, the advantages of model-based clustering as well as how it compares with other methods.

Submitted 9 June, 2023; originally announced June 2023.

MSC Class: 62H30

arXiv:2305.19535 [pdf, other]

Low-rank extended Kalman filtering for online learning of neural networks from streaming data

Authors: Peter G. Chang, Gerardo Durán-Martín, Alexander Y Shestopaloff, Matt Jones, Kevin Murphy

Abstract: We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream. The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior precision matrix, which gives a cost per step which is linear in the number of model parameters. In contrast to methods based on stochastic variational inference, our method is fully deterministic, and does not require step-size tuning. We show experimentally that this results in much faster (more sample efficient) learning, which results in more rapid adaptation to changing distributions, and faster accumulation of reward when used as part of a contextual bandit algorithm.

Submitted 27 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

Journal ref: COLLAS conference 2023

arXiv:2304.10974 [pdf]

The Effect of Particle Size and Concentration on Low-Frequency Terahertz Scattering in Granular Compacts

Authors: Keir N Murphy, Mira Naftaly, Alison Nordon, Daniel Markl

Abstract: Fundamental knowledge of scattering in granular compacts is essential to ensure accuracy of spectroscopic measurements and determine material characteristics such as size and shape of scattering objects. Terahertz time-domain spectroscopy (THz-TDS) was employed to investigate the effect of particle size and concentration on scattering in specially fabricated compacts consisting of borosilicate microspheres in a polytetrafluoroethylene (PTFE) matrix. As expected, increasing particle size leads to an increase in overall scattering contribution. The scattering contribution increases linearly with concentration at low concentrations and saturates at higher concentrations, with a maximum level that depends on particle size; the onset of saturation is independent of particle size. The effective refractive index becomes sublinear at high particle concentrations and exceeds the linear model at maximum density, which can cause errors in calculations based on it, such as porosity. The observed phenomena are attributed to the change in the fraction of photons propagating ballistically versus being scattered. At low concentrations, photons travel predominately ballistically through the PTFE matrix. At high concentrations, the photons again propagate ballistically through adjacent glass microspheres. In the intermediate regime, photons are predominately scattered.

Submitted 21 April, 2023; originally announced April 2023.

arXiv:2301.00704 [pdf, other]

Muse: Text-To-Image Generation via Masked Generative Transformers

Authors: Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T. Freeman, Michael Rubinstein, Yuanzhen Li, Dilip Krishnan

Abstract: We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space: given the text embedding extracted from a pre-trained large language model (LLM), Muse is trained to predict randomly masked image tokens. Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding. The use of a pre-trained LLM enables fine-grained language understanding, translating to high-fidelity image generation and the understanding of visual concepts such as objects, their spatial relationships, pose, cardinality etc. Our 900M parameter model achieves a new SOTA on CC3M, with an FID score of 6.06. The Muse 3B parameter model achieves an FID of 7.88 on zero-shot COCO evaluation, along with a CLIP score of 0.32. Muse also directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing. More results are available at https://muse-model.github.io

Submitted 2 January, 2023; originally announced January 2023.

arXiv:2211.17264 [pdf, other]

Interpretability with full complexity by constraining feature information

Authors: Kieran A. Murphy, Dani S. Bassett

Abstract: Interpretability is a pressing issue for machine learning. Common approaches to interpretable machine learning constrain interactions between features of the input, rendering the effects of those features on a model's output comprehensible but at the expense of model complexity. We approach interpretability from a new angle: constrain the information about the features without restricting the complexity of the model. Borrowing from information theory, we use the Distributed Information Bottleneck to find optimal compressions of each feature that maximally preserve information about the output. The learned information allocation, by feature and by feature value, provides rich opportunities for interpretation, particularly in problems with many features and complex feature interactions. The central object of analysis is not a single trained model, but rather a spectrum of models serving as approximations that leverage variable amounts of information about the inputs. Information is allocated to features by their relevance to the output, thereby solving the problem of feature selection by constructing a learned continuum of feature inclusion-to-exclusion. The optimal compression of each feature -- at every stage of approximation -- allows fine-grained inspection of the distinctions among feature values that are most impactful for prediction. We develop a framework for extracting insight from the spectrum of approximate models and demonstrate its utility on a range of tabular datasets.

Submitted 30 November, 2022; originally announced November 2022.

Comments: project page: https://distributed-information-bottleneck.github.io

Journal ref: ICLR 2023

arXiv:2211.15646 [pdf, other]

Beyond Invariance: Test-Time Label-Shift Adaptation for Distributions with "Spurious" Correlations

Authors: Qingyao Sun, Kevin Murphy, Sayna Ebrahimi, Alexander D'Amour

Abstract: Changes in the data distribution at test time can have deleterious effects on the performance of predictive models $p(y|x)$. We consider situations where there are additional meta-data labels (such as group labels), denoted by $z$, that can account for such changes in the distribution. In particular, we assume that the prior distribution $p(y, z)$, which models the dependence between the class label $y$ and the "nuisance" factors $z$, may change across domains, either due to a change in the correlation between these terms, or a change in one of their marginals. However, we assume that the generative model for features $p(x|y,z)$ is invariant across domains. We note that this corresponds to an expanded version of the widely used "label shift" assumption, where the labels now also include the nuisance factors $z$. Based on this observation, we propose a test-time label shift correction that adapts to changes in the joint distribution $p(y, z)$ using EM applied to unlabeled samples from the target domain distribution, $p_t(x)$. Importantly, we are able to avoid fitting a generative model $p(x|y, z)$, and merely need to reweight the outputs of a discriminative model $p_s(y, z|x)$ trained on the source distribution. We evaluate our method, which we call "Test-Time Label-Shift Adaptation" (TTLSA), on several standard image and text datasets, as well as the CheXpert chest X-ray dataset, and show that it improves performance over methods that target invariance to changes in the distribution, as well as baseline empirical risk minimization methods. Code for reproducing experiments is available at https://github.com/nalzok/test-time-label-shift.

Submitted 28 November, 2023; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: 24 pages, 7 figures

arXiv:2210.14220 [pdf, other]

Characterizing information loss in a chaotic double pendulum with the Information Bottleneck

Authors: Kieran A. Murphy, Dani S. Bassett

Abstract: A hallmark of chaotic dynamics is the loss of information with time. Although information loss is often expressed through a connection to Lyapunov exponents -- valid in the limit of high information about the system state -- this picture misses the rich spectrum of information decay across different levels of granularity. Here we show how machine learning presents new opportunities for the study of information loss in chaotic dynamics, with a double pendulum serving as a model system. We use the Information Bottleneck as a training objective for a neural network to extract information from the state of the system that is optimally predictive of the future state after a prescribed time horizon. We then decompose the optimally predictive information by distributing a bottleneck to each state variable, recovering the relative importance of the variables in determining future evolution. The framework we develop is broadly applicable to chaotic systems and pragmatic to apply, leveraging data and machine learning to monitor the limits of predictability and map out the loss of information.

Submitted 25 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022 workshop paper (Machine learning and the physical sciences); project page: distributed-information-bottleneck.github.io

arXiv:2210.10964 [pdf, other]

Uncertainty Disentanglement with Non-stationary Heteroscedastic Gaussian Processes for Active Learning

Authors: Zeel B Patel, Nipun Batra, Kevin Murphy

Abstract: Gaussian processes are Bayesian non-parametric models used in many areas. In this work, we propose a Non-stationary Heteroscedastic Gaussian process model which can be learned with gradient-based techniques. We demonstrate the interpretability of the proposed model by separating the overall uncertainty into aleatoric (irreducible) and epistemic (model) uncertainty. We illustrate the usability of derived epistemic uncertainty on active learning problems. We demonstrate the efficacy of our model with various ablations on multiple datasets.

Submitted 19 October, 2022; originally announced October 2022.

Comments: Accepted at NeurIPS Workshop on Gaussian Processes, Spatiotemporal Modeling, and Decision-making Systems, 2023

arXiv:2207.10342 [pdf, ps, other]

Language Model Cascades

Authors: David Dohan, Winnie Xu, Aitor Lewkowycz, Jacob Austin, David Bieber, Raphael Gontijo Lopes, Yuhuai Wu, Henryk Michalewski, Rif A. Saurous, Jascha Sohl-dickstein, Kevin Murphy, Charles Sutton

Abstract: Prompted models have demonstrated impressive few-shot learning abilities. Repeated interactions at test-time with a single model, or the composition of multiple models together, further expand capabilities. These compositions are probabilistic models, and may be expressed in the language of graphical models with random variables whose values are complex data types such as strings. Cases with control flow and dynamic structure require techniques from probabilistic programming, which allow implementing disparate model structures and inference strategies in a unified language. We formalize several existing techniques from this perspective, including scratchpads / chain of thought, verifiers, STaR, selection-inference, and tool use. We refer to the resulting programs as language model cascades.

Submitted 28 July, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

Comments: Presented as spotlight at the Beyond Bayes workshop at ICML 2022 (https://beyond-bayes.github.io)

arXiv:2207.07411 [pdf, other]

Plex: Towards Reliability using Pretrained Large Model Extensions

Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek , et al. (1 additional authors not shown)

Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive performance but also performs well consistently over many decision-making tasks involving uncertainty (e.g., selective prediction, open set recognition), robust generalization (e.g., accuracy and proper scoring rules such as log-likelihood on in- and out-of-distribution datasets), and adaptation (e.g., active learning, few-shot uncertainty). We devise 10 types of tasks over 40 datasets in order to evaluate different aspects of reliability on both vision and language domains. To improve reliability, we developed ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively. Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol as it improves the out-of-the-box performance and does not require designing scores or tuning the model for each task. We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples. We also demonstrate Plex's capabilities on challenging tasks including zero-shot open set recognition, active learning, and uncertainty in conversational language understanding.

Submitted 15 July, 2022; originally announced July 2022.

Comments: Code available at https://goo.gle/plex-code

arXiv:2205.11232 [pdf, other]

Deep Neural Network approaches for Analysing Videos of Music Performances

Authors: Foteini Simistira Liwicki, Richa Upadhyay, Prakash Chandra Chhipa, Killian Murphy, Federico Visi, Stefan Östersjö, Marcus Liwicki

Abstract: This paper presents a framework to automate the labelling process for gestures in musical performance videos with a 3D Convolutional Neural Network (CNN). While this idea was proposed in a previous study, this paper introduces several novelties: (i) Presents a novel method to overcome the class imbalance challenge and make learning possible for co-existent gestures by batch balancing approach and spatial-temporal representations of gestures. (ii) Performs a detailed study on 7 and 18 categories of gestures generated during the performance (guitar play) of musical pieces that have been video-recorded. (iii) Investigates the possibility to use audio features. (iv) Extends the analysis to multiple videos. The novel methods significantly improve the performance of gesture identification by 12 %, when compared to the previous work (51 % in this study over 39 % in previous work). We successfully validate the proposed methods on 7 super classes (72 %), an ensemble of the 18 gestures/classes, and additional videos (75 %).

Submitted 24 May, 2022; v1 submitted 5 May, 2022; originally announced May 2022.

arXiv:2204.07576 [pdf, other]

The Distributed Information Bottleneck reveals the explanatory structure of complex systems

Authors: Kieran A. Murphy, Dani S. Bassett

Abstract: The fruits of science are relationships made comprehensible, often by way of approximation. While deep learning is an extremely powerful way to find relationships in data, its use in science has been hindered by the difficulty of understanding the learned relationships. The Information Bottleneck (IB) is an information theoretic framework for understanding a relationship between an input and an output in terms of a trade-off between the fidelity and complexity of approximations to the relationship. Here we show that a crucial modification -- distributing bottlenecks across multiple components of the input -- opens fundamentally new avenues for interpretable deep learning in science. The Distributed Information Bottleneck throttles the downstream complexity of interactions between the components of the input, deconstructing a relationship into meaningful approximations found through deep learning without requiring custom-made datasets or neural network architectures. Applied to a complex system, the approximations illuminate aspects of the system's nature by restricting -- and monitoring -- the information about different components incorporated into the approximation. We demonstrate the Distributed IB's explanatory utility in systems drawn from applied mathematics and condensed matter physics. In the former, we deconstruct a Boolean circuit into approximations that isolate the most informative subsets of input components without requiring exhaustive search. In the latter, we localize information about future plastic rearrangement in the static structure of a sheared glass, and find the information to be more or less diffuse depending on the system's preparation. By way of a principled scheme of approximations, the Distributed IB brings much-needed interpretability to deep learning and enables unprecedented analysis of information flow through a system.

Submitted 15 April, 2022; originally announced April 2022.

arXiv:2204.02112 [pdf, other]

GP-BART: a novel Bayesian additive regression trees approach using Gaussian processes

Authors: Mateus Maia, Keefe Murphy, Andrew C. Parnell

Abstract: The Bayesian additive regression trees (BART) model is an ensemble method extensively and successfully used in regression tasks due to its consistently strong predictive performance and its ability to quantify uncertainty. BART combines "weak" tree models through a set of shrinkage priors, whereby each tree explains a small portion of the variability in the data. However, the lack of smoothness and the absence of an explicit covariance structure over the observations in standard BART can yield poor performance in cases where such assumptions would be necessary. The Gaussian processes Bayesian additive regression trees (GP-BART) model is an extension of BART which addresses this limitation by assuming Gaussian process (GP) priors for the predictions of each terminal node among all trees. The model's effectiveness is demonstrated through applications to simulated and real-world data, surpassing the performance of traditional modeling approaches in various scenarios.

Submitted 14 September, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

arXiv:2203.03558 [pdf, other]

Hands-free Telelocomotion of a Wheeled Humanoid toward Dynamic Mobile Manipulation via Teleoperation

Authors: Amartya Purushottam, Yeongtae Jung, Kevin Murphy, Donghoon Baek, Joao Ramos

Abstract: Robotic systems that can dynamically combine manipulation and locomotion could facilitate dangerous or physically demanding labor. For instance, firefighter humanoid robots could leverage their body by leaning against collapsed building rubble to push it aside. Here we introduce a teleoperation system that targets the realization of these tasks using human whole-body motor skills. We describe a new wheeled humanoid platform, SATYRR, and a novel hands-free teleoperation architecture using a whole-body Human Machine Interface (HMI). This system enables telelocomotion of the humanoid robot using the operator's body motion, freeing their arms for manipulation tasks. In this study we evaluate the efficacy of the proposed system on hardware, and explore the control of SATYRR using two teleoperation mappings that map the operator's body pitch and twist to the robot velocity or acceleration. Through experiments and user feedback we showcase our preliminary findings of the pilot-system response. Results suggest that the HMI is capable of effectively telelocomoting SATYRR, that pilot preferences should dictate the appropriate motion mapping and gains, and finally that the pilot can better learn to control the system over time. This study represents a fundamental step towards the realization of combined manipulation and locomotion via teleoperation.

Submitted 7 March, 2022; originally announced March 2022.

arXiv:2201.02137 [pdf]

Uncertainty in solar wind forcing explains polar cap potential saturation

Authors: Nithin Sivadas, David Sibeck, Varsha Subramanyan, Maria-Theresia Walach, Kyle Murphy, Alexa Halford

Abstract: Extreme space weather events occur during intervals of strong solar wind electric fields. Curiously during these intervals, their impact on measures of the Earth's response, like the polar cap index, is not as high as expected. Theorists have put forward a host of explanations for this saturation effect, but there is no consensus. Here we show that the saturation is merely a perception created by uncertainty in the solar wind measurements, especially in the measurement times. Correcting for the uncertainty reveals that extreme space weather events elicit a ~300% larger impact than previously thought. Furthermore, they point to a surprisingly general result relevant to any correlation study: uncertainty in the measurement time can cause a system's linear response to be perceived as non-linear.

Submitted 5 January, 2022; originally announced January 2022.

Comments: 22 pages, with "Materials and Methods" starting from page 10, after the Main Text. Supplementary figures starting from page 18. The manuscript is being submitted for peer review

arXiv:2112.04489 [pdf, other]

Learn2Reg: comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning

Authors: Alessa Hering, Lasse Hansen, Tony C. W. Mok, Albert C. S. Chung, Hanna Siebert, Stephanie Häger, Annkristin Lange, Sven Kuckertz, Stefan Heldmann, Wei Shao, Sulaiman Vesal, Mirabela Rusu, Geoffrey Sonn, Théo Estienne, Maria Vakalopoulou, Luyi Han, Yunzhi Huang, Pew-Thian Yap, Mikael Brudfors, Yaël Balbastre, Samuel Joutard, Marc Modat, Gal Lifshitz, Dan Raviv, **xin Lv , et al. (28 additional authors not shown)

Abstract: Image registration is a fundamental medical image analysis task, and a wide variety of approaches have been proposed. However, only a few studies have comprehensively compared medical image registration approaches on a wide range of clinically relevant tasks. This limits the development of registration methods, the adoption of research advances into practice, and a fair benchmark across competing approaches. The Learn2Reg challenge addresses these limitations by providing a multi-task medical image registration data set for comprehensive characterisation of deformable registration algorithms. A continuous evaluation will be possible at https://learn2reg.grand-challenge.org. Learn2Reg covers a wide range of anatomies (brain, abdomen, and thorax), modalities (ultrasound, CT, MR), availability of annotations, as well as intra- and inter-patient registration evaluation. We established an easily accessible framework for training and validation of 3D registration methods, which enabled the compilation of results of over 65 individual method submissions from more than 20 unique teams. We used a complementary set of metrics, including robustness, accuracy, plausibility, and runtime, enabling unique insight into the current state-of-the-art of medical image registration. This paper describes datasets, tasks, evaluation methods and results of the challenge, as well as results of further analysis of transferability to new datasets, the importance of label supervision, and resulting bias. While no single approach worked best across all tasks, many methodological aspects could be identified that push the performance of medical image registration to new state-of-the-art performance. Furthermore, we demystified the common belief that conventional registration methods have to be much slower than deep-learning-based methods.

Submitted 7 October, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

arXiv:2112.00195 [pdf, other]

Efficient Online Bayesian Inference for Neural Bandits

Authors: Gerardo Duran-Martin, Aleyna Kara, Kevin Murphy

Abstract: In this paper we present a new algorithm for online (sequential) inference in Bayesian neural networks, and show its suitability for tackling contextual bandit problems. The key idea is to combine the extended Kalman filter (which locally linearizes the likelihood function at each time step) with a (learned or random) low-dimensional affine subspace for the parameters; the use of a subspace enables us to scale our algorithm to models with $\sim 1M$ parameters. While most other neural bandit methods need to store the entire past dataset in order to avoid the problem of "catastrophic forgetting", our approach uses constant memory. This is possible because we represent uncertainty about all the parameters in the model, not just the final linear layer. We show good results on the "Deep Bayesian Bandit Showdown" benchmark, as well as MNIST and a recommender system.

Submitted 30 November, 2021; originally announced December 2021.

Journal ref: AISTATS 2022
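
The key idea in the abstract above (an extended Kalman filter run over a low-dimensional affine subspace of the parameters) can be illustrated with a minimal sketch. This is not the authors' implementation: the one-unit tanh model, the names `ekf_subspace_step`, `A`, `b`, `z`, `P`, the dimensions, and the synthetic data are all assumptions made for illustration. The full parameters are written as theta = A z + b, and only the small vector z and its covariance P are updated, so memory does not grow with the number of observations.

```python
import numpy as np

def predict(theta, x):
    """Toy stand-in for a neural network: a single tanh unit (an assumption)."""
    return np.tanh(theta @ x)

def grad_theta(theta, x):
    """Gradient of predict() with respect to theta."""
    return (1.0 - np.tanh(theta @ x) ** 2) * x

def ekf_subspace_step(z, P, A, b, x, y, obs_var=0.1):
    """One EKF update of the subspace coordinates z and their covariance P."""
    theta = A @ z + b                 # full parameters from the affine subspace
    H = grad_theta(theta, x) @ A      # linearized observation row w.r.t. z
    S = H @ P @ H + obs_var           # innovation variance (scalar)
    K = P @ H / S                     # Kalman gain
    z_new = z + K * (y - predict(theta, x))
    P_new = P - np.outer(K, H @ P)    # covariance shrinks along the gain direction
    return z_new, P_new

rng = np.random.default_rng(0)
D, d = 20, 3                          # full and subspace dimensions (hypothetical)
A = rng.normal(size=(D, d)) / np.sqrt(D)   # random projection matrix
b = np.zeros(D)                       # subspace offset
z, P = np.zeros(d), np.eye(d)         # prior over the subspace coordinates
for _ in range(50):
    x = rng.normal(size=D)
    y = np.tanh(0.5 * x[0])           # synthetic target, for illustration only
    z, P = ekf_subspace_step(z, P, A, b, x, y)
```

Each step touches only the current observation `(x, y)` plus the `d`-dimensional state, which is the sense in which such a filter can run in constant memory.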

arXiv:2109.00599 [pdf]

doi 10.3847/1538-4357/ac1c73

Compton-Thick AGN in the NuSTAR era VI: The observed Compton-thick fraction in the Local Universe

Authors: N. Torres-Albà, S. Marchesi, X. Zhao, M. Ajello, R. Silver, T. T. Ananna, M. Baloković, P. B. Boorman, A. Comastri, R. Gilli, G. Lanzuisi, K. Murphy, C. M. Urry, C. Vignali

Abstract: We present the analysis of simultaneous NuSTAR and XMM-Newton data of 8 Compton-thick (CT-) active galactic nuclei (AGN) candidates selected in the Swift-Burst Alert Telescope (BAT) 100 month survey. This work is part of an ongoing effort to find and characterize all CT-AGN in the local ($z\leq0.05$) Universe. We used two physically motivated models, MYTorus and borus02, to characterize the sources in the sample, finding 5 of them to be confirmed CT-AGN. These results represent an increase of $\sim19$% over the previous NuSTAR-confirmed, BAT-selected CT-AGN at $z\leq0.05$, bringing the total number to 32. This corresponds to an observed fraction of $\sim 8$% of all AGN within this volume-limited sample, although it increases to $20\pm5$% when limiting the sample to $z\leq0.01$. Out of a sample of 48 CT-AGN candidates, selected using BAT and soft (0.3$-$10 keV) X-ray data, only 24 are confirmed as CT-AGN with the addition of the NuSTAR data. This highlights the importance of NuSTAR when classifying local obscured AGN. We also note that most of the sources in our full sample of 48 Seyfert 2 galaxies with NuSTAR data have significantly different line-of-sight and average torus column densities, favouring a patchy torus scenario.

Submitted 1 September, 2021; originally announced September 2021.

Comments: 31 pages, 13 figures, accepted for publication in ApJ
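
The counts quoted in the abstract above can be cross-checked with a short calculation. This is a worked check rather than a result from the paper; it assumes the 5 newly confirmed sources are all included in the stated total of 32, and the symbol $N_{\text{prev}}$ is introduced here only for illustration.

```latex
\begin{align*}
N_{\text{prev}} &= 32 - 5 = 27, \\
\frac{5}{N_{\text{prev}}} &= \frac{5}{27} \approx 18.5\% \approx 19\%, \\
\frac{24}{48} &= 50\% \quad \text{(candidates confirmed once NuSTAR data are added)}.
\end{align*}
```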

arXiv:2108.07636 [pdf, other]

Accounting for shared covariates in semi-parametric Bayesian additive regression trees

Authors: Estevão B. Prado, Andrew C. Parnell, Keefe Murphy, Nathan McJames, Ann O'Shea, Rafael A. Moral

Abstract: We propose some extensions to semi-parametric models based on Bayesian additive regression trees (BART). In the semi-parametric BART paradigm, the response variable is approximated by a linear predictor and a BART model, where the linear component is responsible for estimating the main effects and BART accounts for non-specified interactions and non-linearities. Previous semi-parametric models based on BART have assumed that the set of covariates in the linear predictor and the BART model are mutually exclusive in an attempt to avoid poor coverage properties and reduce bias in the estimates of the parameters in the linear predictor. The main novelty in our approach lies in the way we change the tree-generation moves in BART to deal with this bias and resolve non-identifiability issues between the parametric and non-parametric components, even when they have covariates in common. This allows us to model complex interactions involving the covariates of primary interest, both among themselves and with those in the BART component. Our novel method is developed with a view to analysing data from an international education assessment, where certain predictors of students' achievements in mathematics are of particular interpretational interest. Through additional simulation studies and another application to a well-known benchmark dataset, we also show competitive performance when compared to regression models, alternative formulations of semi-parametric BART, and other tree-based methods. The implementation of the proposed method is available at https://github.com/ebprado/CSP-BART.

Submitted 3 June, 2022; v1 submitted 17 August, 2021; originally announced August 2021.

arXiv:2107.09128 [pdf, other]

doi 10.1103/PhysRevLett.127.108002

Sculpting liquids with ultrathin shells

Authors: Yousra Timounay, Alexander R. Hartwell, Mengfei He, D. Eric King, Lindsay K. Murphy, Vincent Démery, Joseph D. Paulsen

Abstract: Thin elastic films can spontaneously attach to liquid interfaces, offering a platform for tailoring their physical, chemical, and optical properties. Current understanding of the elastocapillarity of thin films is based primarily on studies of planar sheets. We show that curved shells can be used to manipulate interfaces in qualitatively different ways. We elucidate a regime where an ultrathin shell with vanishing bending rigidity imposes its own rest shape on a liquid surface, using experiment and theory. Conceptually, the pressure across the interface "inflates" the shell into its original shape. The setup is amenable to optical applications as the shell is transparent, free of wrinkles, and may be manufactured over a range of curvatures.

Submitted 9 September, 2021; v1 submitted 19 July, 2021; originally announced July 2021.

Comments: 5 pages, 4 figures

Journal ref: Phys. Rev. Lett. 127, 108002 (2021)

arXiv:2106.05965 [pdf, other]

Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold

Authors: Kieran Murphy, Carlos Esteves, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia

Abstract: Single image pose estimation is a fundamental problem in many vision and robotics tasks, and existing deep learning approaches suffer by not completely modeling and handling: i) uncertainty about the predictions, and ii) symmetric objects with multiple (sometimes infinite) correct poses. To this end, we introduce a method to estimate arbitrary, non-parametric distributions on SO(3). Our key idea is to represent the distributions implicitly, with a neural network that estimates the probability given the input image and a candidate pose. Grid sampling or gradient ascent can be used to find the most likely pose, but it is also possible to evaluate the probability at any pose, enabling reasoning about symmetries and uncertainty. This is the most general way of representing distributions on manifolds, and to showcase the rich expressive power, we introduce a dataset of challenging symmetric and nearly-symmetric objects. We require no supervision on pose uncertainty -- the model trains only with a single pose per example. Nonetheless, our implicit model is highly expressive to handle complex distributions over 3D poses, while still obtaining accurate pose estimation on standard non-ambiguous environments, achieving state-of-the-art performance on Pascal3D+ and ModelNet10-SO(3) benchmarks.

Submitted 1 July, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

Comments: Additional implementation details

arXiv:2106.04015 [pdf, other]

Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal , et al. (1 additional authors not shown)

Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compute availability for extensive tuning, incorporation of sufficiently many baselines, and concrete documentation for reproducibility. In this paper we introduce Uncertainty Baselines: high-quality implementations of standard and state-of-the-art deep learning methods on a variety of tasks. As of this writing, the collection spans 19 methods across 9 tasks, each with at least 5 metrics. Each baseline is a self-contained experiment pipeline with easily reusable and extendable components. Our goal is to provide immediate starting points for experimentation with new methods or applications. Additionally we provide model checkpoints, experiment outputs as Python notebooks, and leaderboards for comparing results. Code available at https://github.com/google/uncertainty-baselines.

Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

arXiv:2105.01181 [pdf, other]

doi 10.1002/mp.15655

Automated Estimation of Total Lung Volume using Chest Radiographs and Deep Learning

Authors: Ecem Sogancioglu, Keelin Murphy, Ernst Th. Scholten, Luuk H. Boulogne, Mathias Prokop, Bram van Ginneken

Abstract: Total lung volume is an important quantitative biomarker and is used for the assessment of restrictive lung diseases. In this study, we investigate the performance of several deep-learning approaches for automated measurement of total lung volume from chest radiographs. 7621 posteroanterior and lateral view chest radiographs (CXR) were collected from patients with chest CT available. Similarly, 928 CXR studies were chosen from patients with pulmonary function test (PFT) results. The reference total lung volume was calculated from lung segmentation on CT or PFT data, respectively. This dataset was used to train deep-learning architectures to predict total lung volume from chest radiographs. The experiments were constructed in a step-wise fashion with increasing complexity to demonstrate the effect of training with CT-derived labels only and the sources of error. The optimal models were tested on 291 CXR studies with reference lung volume obtained from PFT. The optimal deep-learning regression model showed an MAE of 408 ml and a MAPE of 8.1% and Pearson's r = 0.92 using both frontal and lateral chest radiographs as input. CT-derived labels were useful for pre-training but the optimal performance was obtained by fine-tuning the network with PFT-derived labels. We demonstrate, for the first time, that state-of-the-art deep learning solutions can accurately measure total lung volume from plain chest radiographs. The proposed model can be used to obtain total lung volume from routinely acquired chest radiographs at no additional cost and could be a useful tool to identify trends over time in patients referred regularly for chest x-rays.

Submitted 3 May, 2021; originally announced May 2021.

Comments: Under review
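
The abstract above reports MAE, MAPE, and Pearson's r. As a reminder of how these metrics are conventionally computed, here is a minimal sketch using their standard definitions. The function names and the example volumes are hypothetical and are not data or code from the study.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs (e.g. ml)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; y_true must be non-zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_pred - y_true) / y_true)))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and prediction."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

# Hypothetical reference and predicted lung volumes in ml (illustration only).
reference = [5000.0, 6000.0, 4000.0]
predicted = [5500.0, 5700.0, 4000.0]
example_mae = mae(reference, predicted)
example_mape = mape(reference, predicted)
example_r = pearson_r(reference, predicted)
```

MAE is an absolute error in millilitres, while MAPE normalizes each error by the reference volume, so the two can rank models differently when patient lung volumes vary widely.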

arXiv:2104.11791 [pdf, other]

doi 10.1103/PhysRevLett.129.046402

Fermi surface and mass renormalization in the iron-based superconductor YFe$_2$Ge$_2$

Authors: Jordan Baglo, Jiasheng Chen, Keiron Murphy, Roos Leenen, Alix McCollam, Michael L. Sutherland, F. Malte Grosche

Abstract: Quantum oscillation measurements in the new unconventional superconductor YFe$_2$Ge$_2$ resolve all four Fermi surface pockets expected from band structure calculations, which predict an electron pocket in the Brillouin zone corner and three hole pockets enveloping the centers of the top and bottom of the Brillouin zone. The carrier masses are uniformly renormalized by about a factor of five and broadly account for the enhanced heat capacity Sommerfeld coefficient $\simeq 100$ mJ/molK$^2$. Our data highlight the key role of the electron pocket, which despite its small volume accounts for about half the total density of states, and point towards a predominantly local mechanism underlying the mass renormalization in YFe$_2$Ge$_2$.

Submitted 23 April, 2021; originally announced April 2021.

Comments: 5 pages, 5 figures

arXiv:2104.08415 [pdf, other]

Risk score learning for COVID-19 contact tracing apps

Authors: Kevin Murphy, Abhishek Kumar, Stylianos Serghiou

Abstract: Digital contact tracing apps for COVID, such as the one developed by Google and Apple, need to estimate the risk that a user was infected during a particular exposure, in order to decide whether to notify the user to take precautions, such as entering into quarantine, or requesting a test. Such risk score models contain numerous parameters that must be set by the public health authority. In this paper, we show how to automatically learn these parameters from data. Our method needs access to exposure and outcome data. Although this data is already being collected (in an aggregated, privacy-preserving way) by several health authorities, in this paper we limit ourselves to simulated data, so that we can systematically study the different factors that affect the feasibility of the approach. In particular, we show that the parameters become harder to estimate when there is more missing data (e.g., due to infections which were not recorded by the app), and when there is model misspecification. Nevertheless, the learning approach outperforms a strong manually designed baseline. Furthermore, the learning approach can adapt even when the risk factors of the disease change, e.g., due to the evolution of new variants, or the adoption of vaccines.

Submitted 21 July, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

Comments: 13 pages, 7 figures

arXiv:2103.13833 [pdf]

doi 10.1371/journal.pone.0255301

Deep Learning with robustness to missing data: A novel approach to the detection of COVID-19

Authors: Erdi Çallı, Keelin Murphy, Steef Kurstjens, Tijs Samson, Robert Herpers, Henk Smits, Matthieu Rutten, Bram van Ginneken

Abstract: In the context of the current global pandemic and the limitations of the RT-PCR test, we propose a novel deep learning architecture, DFCN (Denoising Fully Connected Network). Since medical facilities around the world differ enormously in what laboratory tests or chest imaging may be available, DFCN is designed to be robust to missing input data. An ablation study extensively evaluates the performance benefits of the DFCN as well as its robustness to missing inputs. Data from 1088 patients with confirmed RT-PCR results are obtained from two independent medical facilities. The data includes results from 27 laboratory tests and a chest x-ray scored by a deep learning model. Training and test datasets are taken from different medical facilities. Data is made publicly available. The performance of DFCN in predicting the RT-PCR result is compared with 3 related architectures as well as a Random Forest baseline. All models are trained with varying levels of masked input data to encourage robustness to missing inputs. Missing data is simulated at test time by masking inputs randomly. DFCN outperforms all other models with statistical significance using random subsets of input data with 2-27 available inputs. When all 28 inputs are available DFCN obtains an AUC of 0.924, higher than any other model. Furthermore, with clinically meaningful subsets of parameters consisting of just 6 and 7 inputs respectively, DFCN achieves higher AUCs than any other model, with values of 0.909 and 0.919.

Submitted 2 August, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

arXiv:2103.08700 [pdf, other]

doi 10.1016/j.media.2021.102125

Deep Learning for Chest X-ray Analysis: A Survey

Authors: Ecem Sogancioglu, Erdi Çallı, Bram van Ginneken, Kicky G. van Leeuwen, Keelin Murphy

Abstract: Recent advances in deep learning have led to promising performance in many medical image analysis tasks. As the most commonly performed radiological exam, chest radiographs are a particularly important modality for which a variety of applications have been researched. The release of multiple, large, publicly available chest X-ray datasets in recent years has encouraged research interest and boosted the number of publications. In this paper, we review all studies using deep learning on chest radiographs, categorizing works by task: image-level prediction (classification and regression), segmentation, localization, image generation and domain adaptation. Commercially available applications are detailed, and a comprehensive discussion of the current state of the art and potential future directions is provided.

Submitted 15 March, 2021; originally announced March 2021.

Comments: Under review in Medical Image Analysis

arXiv:2103.08433 [pdf, other]

HOPPY: An Open-source Kit for Education with Dynamic Legged Robots

Authors: Joao Ramos, Yanran Ding, Young-woo Sim, Kevin Murphy, Daniel Block

Abstract: This paper introduces HOPPY, an open-source, low-cost, robust, and modular kit for robotics education. The robot dynamically hops around a rotating gantry with a fixed base. The kit is intended to lower the entry barrier for studying dynamic robots and legged locomotion with real systems. It bridges the theoretical content of fundamental robotic courses with real dynamic robots by facilitating and guiding the software and hardware integration. This paper describes the topics which can be studied using the kit, lists its components, discusses preferred practices for implementation, presents results from experiments with the simulator and the real system, and suggests further improvements. A simple heuristic-based controller is described to achieve velocities up to 1.7m/s, navigate small objects, and mitigate external disturbances when the robot is aided by a counterweight. HOPPY was utilized as the subject of a semester-long project for the Robot Dynamics and Control course at the University of Illinois at Urbana-Champaign. The positive feedback from the students and instructors about the hands-on activities during the course motivates us to share this kit and continue improving in the future.

Submitted 15 March, 2021; originally announced March 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2010.14580

arXiv:2103.03240 [pdf, other]

Learning ABCs: Approximate Bijective Correspondence for isolating factors of variation with weak supervision

Authors: Kieran A. Murphy, Varun Jampani, Srikumar Ramalingam, Ameesh Makadia

Abstract: Representational learning forms the backbone of most deep learning applications, and the value of a learned representation is intimately tied to its information content regarding different factors of variation. Finding good representations depends on the nature of supervision and the learning algorithm. We propose a novel algorithm that utilizes a weak form of supervision where the data is partitioned into sets according to certain inactive (common) factors of variation which are invariant across elements of each set. Our key insight is that by seeking correspondence between elements of different sets, we learn strong representations that exclude the inactive factors of variation and isolate the active factors that vary within all sets. As a consequence of focusing on the active factors, our method can leverage a mix of set-supervised and wholly unsupervised data, which can even belong to a different domain. We tackle the challenging problem of synthetic-to-real object pose transfer, without pose annotations on anything, by isolating pose information which generalizes to the category level and across the synthetic/real domain gap. The method can also boost performance in supervised settings, by strengthening intermediate representations, as well as operate in practically attainable scenarios with set-supervised natural images, where quantity is limited and nuisance factors of variation are more plentiful.

Submitted 30 March, 2022; v1 submitted 4 March, 2021; originally announced March 2021.

Comments: CVPR 2022. Code: https://github.com/google-research/google-research/tree/master/isolating_factors

arXiv:2102.00323 [pdf, other]

doi 10.37236/10225

Paths of Length Three are $K_{r+1}$-Turán Good

Authors: Kyle Murphy, JD Nir

Abstract: The generalized Turán problem $ext(n,T,F)$ is to determine the maximal number of copies of a graph $T$ that can exist in an $F$-free graph on $n$ vertices. Recently, Gerbner and Palmer noted that the solution to the generalized Turán problem is often the original Turán graph. They gave the name "$F$-Turán-good" to graphs $T$ for which, for large enough $n$, the solution to the generalized Turán problem is realized by a Turán graph. They prove that the path graph on two edges, $P_2$, is $K_{r+1}$-Turán-good for all $r \ge 3$, but they conjecture that the same result should hold for all $P_\ell$. In this paper, using arguments based in flag algebras, we prove that the path on three edges, $P_3$, is also $K_{r+1}$-Turán-good for all $r \ge 3$.

Submitted 30 January, 2021; originally announced February 2021.

Comments: 24 pages

arXiv:2011.02508 [pdf, other]

A Comparison Between Joint Space and Task Space Mappings for Dynamic Teleoperation of an Anthropomorphic Robotic Arm in Reaction Tests

Authors: Sunyu Wang, Kevin Murphy, Dillan Kenney, Joao Ramos

Abstract: Teleoperation (i.e., controlling a robot with human motion) proves promising in enabling a humanoid robot to move as dynamically as a human. But how to map human motion to a humanoid robot matters because a human and a humanoid robot rarely have identical topologies and dimensions. This work presents an experimental study that utilizes reaction tests to compare the proposed joint space mapping and the proposed task space mapping for dynamic teleoperation of an anthropomorphic robotic arm that possesses human-level dynamic motion capabilities. The experimental results suggest that the robot achieved similar and, in some cases, human-level dynamic performances with both mappings for the six participating human subjects. All subjects became proficient at teleoperating the robot with both mappings after practice, even though the subjects and the robot differed in size and link length ratio and the teleoperation required the subjects to move unintuitively. Yet, most subjects developed their teleoperation proficiencies more quickly with the task space mapping than with the joint space mapping after similar amounts of practice. This study also indicates the potential values of a three-dimensional task space mapping, a teleoperation training simulator, and force feedback to the human pilot for intuitive and dynamic teleoperation of a humanoid robot's arms.

Submitted 4 November, 2020; originally announced November 2020.

Showing 1–50 of 148 results for author: Murphy, K