-
Population Graph Cross-Network Node Classification for Autism Detection Across Sample Groups
Authors:
Anna Stephens,
Francisco Santos,
Pang-Ning Tan,
Abdol-Hossein Esfahanian
Abstract:
Graph neural networks (GNN) are a powerful tool for combining imaging and non-imaging medical information for node classification tasks. Cross-network node classification extends GNN techniques to account for domain drift, allowing for node classification on an unlabeled target network. In this paper we present OTGCN, a powerful, novel approach to cross-network node classification. This approach l…
▽ More
Graph neural networks (GNN) are a powerful tool for combining imaging and non-imaging medical information for node classification tasks. Cross-network node classification extends GNN techniques to account for domain drift, allowing for node classification on an unlabeled target network. In this paper we present OTGCN, a powerful, novel approach to cross-network node classification. This approach leans on concepts from graph convolutional networks to harness insights from graph data structures while simultaneously applying strategies rooted in optimal transport to correct for the domain drift that can occur between samples from different data collection sites. This blended approach provides a practical solution for scenarios with many distinct forms of data collected across different locations and equipment. We demonstrate the effectiveness of this approach at classifying Autism Spectrum Disorder subjects using a blend of imaging and non-imaging data.
△ Less
Submitted 10 January, 2024;
originally announced January 2024.
-
Linking Symptom Inventories using Semantic Textual Similarity
Authors:
Eamonn Kennedy,
Shashank Vadlamani,
Hannah M Lindsey,
Kelly S Peterson,
Kristen Dams OConnor,
Kenton Murray,
Ronak Agarwal,
Houshang H Amiri,
Raeda K Andersen,
Talin Babikian,
David A Baron,
Erin D Bigler,
Karen Caeyenberghs,
Lisa Delano-Wood,
Seth G Disner,
Ekaterina Dobryakova,
Blessen C Eapen,
Rachel M Edelstein,
Carrie Esopenko,
Helen M Genova,
Elbert Geuze,
Naomi J Goodrich-Hunsaker,
Jordan Grafman,
Asta K Haberg,
Cooper B Hodges
, et al. (57 additional authors not shown)
Abstract:
An extensive library of symptom inventories has been developed over time to measure clinical symptoms, but this variety has led to several long standing issues. Most notably, results drawn from different settings and studies are not comparable, which limits reproducibility. Here, we present an artificial intelligence (AI) approach using semantic textual similarity (STS) to link symptoms and scores…
▽ More
An extensive library of symptom inventories has been developed over time to measure clinical symptoms, but this variety has led to several long standing issues. Most notably, results drawn from different settings and studies are not comparable, which limits reproducibility. Here, we present an artificial intelligence (AI) approach using semantic textual similarity (STS) to link symptoms and scores across previously incongruous symptom inventories. We tested the ability of four pre-trained STS models to screen thousands of symptom description pairs for related content - a challenging task typically requiring expert panels. Models were tasked to predict symptom severity across four different inventories for 6,607 participants drawn from 16 international data sources. The STS approach achieved 74.8% accuracy across five tasks, outperforming other models tested. This work suggests that incorporating contextual, semantic information can assist expert decision-making processes, yielding gains for both general and disease-specific clinical assessment.
△ Less
Submitted 8 September, 2023;
originally announced September 2023.
-
Accelerating Generalized Random Forests with Fixed-Point Trees
Authors:
David Fleischer,
David A. Stephens,
Archer Yang
Abstract:
Generalized random forests arXiv:1610.01271 build upon the well-established success of conventional forests (Breiman, 2001) to offer a flexible and powerful non-parametric method for estimating local solutions of heterogeneous estimating equations. Estimators are constructed by leveraging random forests as an adaptive kernel weighting algorithm and implemented through a gradient-based tree-growing…
▽ More
Generalized random forests arXiv:1610.01271 build upon the well-established success of conventional forests (Breiman, 2001) to offer a flexible and powerful non-parametric method for estimating local solutions of heterogeneous estimating equations. Estimators are constructed by leveraging random forests as an adaptive kernel weighting algorithm and implemented through a gradient-based tree-growing procedure. By expressing this gradient-based approximation as being induced from a single Newton-Raphson root-finding iteration, and drawing upon the connection between estimating equations and fixed-point problems arXiv:2110.11074, we propose a new tree-growing rule for generalized random forests induced from a fixed-point iteration type of approximation, enabling gradient-free optimization, and yielding substantial time savings for tasks involving even modest dimensionality of the target quantity (e.g. multiple/multi-level treatment effects). We develop an asymptotic theory for estimators obtained from forests whose trees are grown through the fixed-point splitting rule, and provide numerical simulations demonstrating that the estimators obtained from such forests are comparable to those obtained from the more costly gradient-based rule.
△ Less
Submitted 20 June, 2023;
originally announced June 2023.
-
Hardness of braided quantum circuit optimization in the surface code
Authors:
Kunihiro Wasa,
Shin Nishio,
Koki Suetsugu,
Michael Hanks,
Ashley Stephens,
Yu Yokoi,
Kae Nemoto
Abstract:
Large-scale quantum information processing requires the use of quantum error correcting codes to mitigate the effects of noise in quantum devices. Topological error-correcting codes, such as surface codes, are promising candidates as they can be implemented using only local interactions in a two-dimensional array of physical qubits. Procedures such as defect braiding and lattice surgery can then b…
▽ More
Large-scale quantum information processing requires the use of quantum error correcting codes to mitigate the effects of noise in quantum devices. Topological error-correcting codes, such as surface codes, are promising candidates as they can be implemented using only local interactions in a two-dimensional array of physical qubits. Procedures such as defect braiding and lattice surgery can then be used to realize a fault-tolerant universal set of gates on the logical space of such topological codes. However, error correction also introduces a significant overhead in computation time, the number of physical qubits, and the number of physical gates. While optimizing fault-tolerant circuits to minimize this overhead is critical, the computational complexity of such optimization problems remains unknown. This ambiguity leaves room for doubt surrounding the most effective methods for compiling fault-tolerant circuits for a large-scale quantum computer. In this paper, we show that the optimization of a special subset of braided quantum circuits is NP-hard by a polynomial-time reduction of the optimization problem into a specific problem called Planar Rectilinear 3SAT.
△ Less
Submitted 1 February, 2023;
originally announced February 2023.
-
Stochastic Reweighted Gradient Descent
Authors:
Ayoub El Hanchi,
David A. Stephens
Abstract:
Despite the strong theoretical guarantees that variance-reduced finite-sum optimization algorithms enjoy, their applicability remains limited to cases where the memory overhead they introduce (SAG/SAGA), or the periodic full gradient computation they require (SVRG/SARAH) are manageable. A promising approach to achieving variance reduction while avoiding these drawbacks is the use of importance sam…
▽ More
Despite the strong theoretical guarantees that variance-reduced finite-sum optimization algorithms enjoy, their applicability remains limited to cases where the memory overhead they introduce (SAG/SAGA), or the periodic full gradient computation they require (SVRG/SARAH) are manageable. A promising approach to achieving variance reduction while avoiding these drawbacks is the use of importance sampling instead of control variates. While many such methods have been proposed in the literature, directly proving that they improve the convergence of the resulting optimization algorithm has remained elusive. In this work, we propose an importance-sampling-based algorithm we call SRG (stochastic reweighted gradient). We analyze the convergence of SRG in the strongly-convex case and show that, while it does not recover the linear rate of control variates methods, it provably outperforms SGD. We pay particular attention to the time and memory overhead of our proposed method, and design a specialized red-black tree allowing its efficient implementation. Finally, we present empirical results to support our findings.
△ Less
Submitted 23 March, 2021;
originally announced March 2021.
-
Adaptive Importance Sampling for Finite-Sum Optimization and Sampling with Decreasing Step-Sizes
Authors:
Ayoub El Hanchi,
David A. Stephens
Abstract:
Reducing the variance of the gradient estimator is known to improve the convergence rate of stochastic gradient-based optimization and sampling algorithms. One way of achieving variance reduction is to design importance sampling strategies. Recently, the problem of designing such schemes was formulated as an online learning problem with bandit feedback, and algorithms with sub-linear static regret…
▽ More
Reducing the variance of the gradient estimator is known to improve the convergence rate of stochastic gradient-based optimization and sampling algorithms. One way of achieving variance reduction is to design importance sampling strategies. Recently, the problem of designing such schemes was formulated as an online learning problem with bandit feedback, and algorithms with sub-linear static regret were designed. In this work, we build on this framework and propose Avare, a simple and efficient algorithm for adaptive importance sampling for finite-sum optimization and sampling with decreasing step-sizes. Under standard technical conditions, we show that Avare achieves $\mathcal{O}(T^{2/3})$ and $\mathcal{O}(T^{5/6})$ dynamic regret for SGD and SGLD respectively when run with $\mathcal{O}(1/t)$ step sizes. We achieve this dynamic regret bound by leveraging our knowledge of the dynamics defined by the algorithm, and combining ideas from online learning and variance-reduced stochastic optimization. We validate empirically the performance of our algorithm and identify settings in which it leads to significant improvements.
△ Less
Submitted 22 March, 2021;
originally announced March 2021.
-
NeBula: Quest for Robotic Autonomy in Challenging Environments; TEAM CoSTAR at the DARPA Subterranean Challenge
Authors:
Ali Agha,
Kyohei Otsu,
Benjamin Morrell,
David D. Fan,
Rohan Thakker,
Angel Santamaria-Navarro,
Sung-Kyun Kim,
Amanda Bouman,
Xianmei Lei,
Jeffrey Edlund,
Muhammad Fadhil Ginting,
Kamak Ebadi,
Matthew Anderson,
Torkom Pailevanian,
Edward Terry,
Michael Wolf,
Andrea Tagliabue,
Tiago Stegun Vaquero,
Matteo Palieri,
Scott Tepsuporn,
Yun Chang,
Arash Kalantari,
Fernando Chavez,
Brett Lopez,
Nobuhiro Funabiki
, et al. (47 additional authors not shown)
Abstract:
This paper presents and discusses algorithms, hardware, and software architecture developed by the TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), competing in the DARPA Subterranean Challenge. Specifically, it presents the techniques utilized within the Tunnel (2019) and Urban (2020) competitions, where CoSTAR achieved 2nd and 1st place, respectively. We also discuss CoSTAR's demonstr…
▽ More
This paper presents and discusses algorithms, hardware, and software architecture developed by the TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), competing in the DARPA Subterranean Challenge. Specifically, it presents the techniques utilized within the Tunnel (2019) and Urban (2020) competitions, where CoSTAR achieved 2nd and 1st place, respectively. We also discuss CoSTAR's demonstrations in Martian-analog surface and subsurface (lava tubes) exploration. The paper introduces our autonomy solution, referred to as NeBula (Networked Belief-aware Perceptual Autonomy). NeBula is an uncertainty-aware framework that aims at enabling resilient and modular autonomy solutions by performing reasoning and decision making in the belief space (space of probability distributions over the robot and world states). We discuss various components of the NeBula framework, including: (i) geometric and semantic environment map**; (ii) a multi-modal positioning system; (iii) traversability analysis and local planning; (iv) global motion planning and exploration behavior; (i) risk-aware mission planning; (vi) networking and decentralized reasoning; and (vii) learning-enabled adaptation. We discuss the performance of NeBula on several robot types (e.g. wheeled, legged, flying), in various environments. We discuss the specific results and lessons learned from fielding this solution in the challenging courses of the DARPA Subterranean Challenge competition.
△ Less
Submitted 18 October, 2021; v1 submitted 21 March, 2021;
originally announced March 2021.
-
Accelerating Finite-temperature Kohn-Sham Density Functional Theory with Deep Neural Networks
Authors:
J. Austin Ellis,
Lenz Fiedler,
Gabriel A. Popoola,
Normand A. Modine,
J. Adam Stephens,
Aidan P. Thompson,
Attila Cangi,
Sivasankaran Rajamanickam
Abstract:
We present a numerical modeling workflow based on machine learning (ML) which reproduces the the total energies produced by Kohn-Sham density functional theory (DFT) at finite electronic temperature to within chemical accuracy at negligible computational cost. Based on deep neural networks, our workflow yields the local density of states (LDOS) for a given atomic configuration. From the LDOS, spat…
▽ More
We present a numerical modeling workflow based on machine learning (ML) which reproduces the the total energies produced by Kohn-Sham density functional theory (DFT) at finite electronic temperature to within chemical accuracy at negligible computational cost. Based on deep neural networks, our workflow yields the local density of states (LDOS) for a given atomic configuration. From the LDOS, spatially-resolved, energy-resolved, and integrated quantities can be calculated, including the DFT total free energy, which serves as the Born-Oppenheimer potential energy surface for the atoms. We demonstrate the efficacy of this approach for both solid and liquid metals and compare results between independent and unified machine-learning models for solid and liquid aluminum. Our machine-learning density functional theory framework opens up the path towards multiscale materials modeling for matter under ambient and extreme conditions at a computational scale and cost that is unattainable with current algorithms.
△ Less
Submitted 9 July, 2021; v1 submitted 10 October, 2020;
originally announced October 2020.
-
LAMP: Large-Scale Autonomous Map** and Positioning for Exploration of Perceptually-Degraded Subterranean Environments
Authors:
Kamak Ebadi,
Yun Chang,
Matteo Palieri,
Alex Stephens,
Alex Hatteland,
Eric Heiden,
Abhishek Thakur,
Nobuhiro Funabiki,
Benjamin Morrell,
Sally Wood,
Luca Carlone,
Ali-akbar Agha-mohammadi
Abstract:
Simultaneous Localization and Map** (SLAM) in large-scale, unknown, and complex subterranean environments is a challenging problem. Sensors must operate in off-nominal conditions; uneven and slippery terrains make wheel odometry inaccurate, while long corridors without salient features make exteroceptive sensing ambiguous and prone to drift; finally, spurious loop closures that are frequent in e…
▽ More
Simultaneous Localization and Map** (SLAM) in large-scale, unknown, and complex subterranean environments is a challenging problem. Sensors must operate in off-nominal conditions; uneven and slippery terrains make wheel odometry inaccurate, while long corridors without salient features make exteroceptive sensing ambiguous and prone to drift; finally, spurious loop closures that are frequent in environments with repetitive appearance, such as tunnels and mines, could result in a significant distortion of the entire map. These challenges are in stark contrast with the need to build highly-accurate 3D maps to support a wide variety of applications, ranging from disaster response to the exploration of underground extraterrestrial worlds. This paper reports on the implementation and testing of a lidar-based multi-robot SLAM system developed in the context of the DARPA Subterranean Challenge. We present a system architecture to enhance subterranean operation, including an accurate lidar-based front-end, and a flexible and robust back-end that automatically rejects outlying loop closures. We present an extensive evaluation in large-scale, challenging subterranean environments, including the results obtained in the Tunnel Circuit of the DARPA Subterranean Challenge. Finally, we discuss potential improvements, limitations of the state of the art, and future research directions.
△ Less
Submitted 5 March, 2020; v1 submitted 3 March, 2020;
originally announced March 2020.
-
Urban sidewalks: visualization and routing for individuals with limited mobility
Authors:
Nicholas Bolten,
Amirhossein Amini,
Yun Hao,
Vaishnavi Ravichandran,
Andre Stephens,
Anat Caspi
Abstract:
People with limited mobility in the U.S. (defined as having difficulty or inability to walk a quarter of a mile without help and without the use of special equipment) face a growing informational gap: while pedestrian routing algorithms are getting faster and more informative, planning a route with a wheeled device in urban centers is very difficult due to lack of integrated pertinent information…
▽ More
People with limited mobility in the U.S. (defined as having difficulty or inability to walk a quarter of a mile without help and without the use of special equipment) face a growing informational gap: while pedestrian routing algorithms are getting faster and more informative, planning a route with a wheeled device in urban centers is very difficult due to lack of integrated pertinent information regarding accessibility along the route. Moreover, reducing access to street-spaces translates to reduced access to other public information and services that are increasingly made available to the public along urban streets. To adequately plan a commute, a traveler with limited or wheeled mobility must know whether her path may be blocked by construction, whether the sidewalk would be too steep or rendered unusable due to poor conditions, whether the street can be crossed or a highway is blocking the way, or whether there is a sidewalk at all. These details populate different datasets in many modern municipalities, but they are not immediately available in a convenient, integrated format to be useful to people with limited mobility. Our project, AccessMap, in its first phase (v.1) overlayed the information that is most relevant to people with limited mobility on a map, enabling self-planning of routes. Here, we describe the next phase of the project: synthesizing commonly available open data (including streets, sidewalks, curb ramps, elevation data, and construction permit information) to generate a graph of paths to enable variable cost-function accessible routing.
△ Less
Submitted 12 February, 2016;
originally announced February 2016.
-
The JASMIN super-data-cluster
Authors:
B. N. Lawrence,
V. Bennett,
J. Churchill,
M. Juckes,
P. Kershaw,
P. Oliver,
M. Pritchard,
A. Stephens
Abstract:
The JASMIN super-data-cluster is being deployed to support the data analysis requirements of the UK and European climate and earth system modelling community. Physical colocation of the core JASMIN resource with significant components of the facility for Climate and Environmental Monitoring from Space (CEMS) provides additional support for the earth observation community, as well as facilitating f…
▽ More
The JASMIN super-data-cluster is being deployed to support the data analysis requirements of the UK and European climate and earth system modelling community. Physical colocation of the core JASMIN resource with significant components of the facility for Climate and Environmental Monitoring from Space (CEMS) provides additional support for the earth observation community, as well as facilitating further comparison and evaluation of models with data. JASMIN and CEMS together centrally deploy 9.3 PB of storage - 4.6 PB of Panasas fast disk storage alongside the STFC Atlas Tape Store. Over 370 computing cores provide local computation. Remote JASMIN resources at Bristol, Leeds and Reading provide additional distributed storage and compute configured to support local workflow as a step** stone to using the central JASMIN system. Fast network links from JASMIN provide reliable communication between the UK supercomputers MONSooN (at the Met Office) and HECToR (at the University of Edinburgh). JASMIN also supports European users via a light path to KNMI in the Netherlands. The functional components of the JASMIN infrastructure have been designed to support and integrate workflows for three main goals: (1) the efficient operation of data curation and facilitation at the STFC Centre for Environmental Data Archival; (2) efficient data analysis by the UK and European climate and earth system science communities, and; (3) flexible access for the climate impacts and earth observation communities to complex data and concomitant services.
△ Less
Submitted 16 April, 2012;
originally announced April 2012.