Cluster Quilting: Spectral Clustering for Patchwork Learning

Authors: Lili Zheng, Andersen Chang, Genevera I. Allen

Abstract: Patchwork learning arises as a new and challenging data collection paradigm where both samples and features are observed in fragmented subsets. Due to technological limits, measurement expense, or multimodal data integration, such patchwork data structures are frequently seen in neuroscience, healthcare, and genomics, among others. Instead of analyzing each data patch separately, it is highly desirable to extract comprehensive knowledge from the whole data set. In this work, we focus on the clustering problem in patchwork learning, aiming at discovering clusters amongst all samples even when some are never jointly observed for any feature. We propose a novel spectral clustering method called Cluster Quilting, consisting of (i) patch ordering that exploits the overlapping structure amongst all patches, (ii) patchwise SVD, (iii) sequential linear mapping of top singular vectors for patch overlaps, followed by (iv) k-means on the combined and weighted singular vectors. Under a sub-Gaussian mixture model, we establish theoretical guarantees via a non-asymptotic misclustering rate bound that reflects both properties of the patch-wise observation regime as well as the clustering signal and noise dependencies. We also validate our Cluster Quilting algorithm through extensive empirical studies on both simulated and real data sets in neuroscience and genomics, where it discovers more accurate and scientifically more plausible clusters than other approaches.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13651 [pdf, other]

CLAMP: Majorized Plug-and-Play for Coherent 3D LIDAR Imaging

Authors: Tony G. Allen, David J. Rabb, Gregery T. Buzzard, Charles A. Bouman

Abstract: Coherent LIDAR uses a chirped laser pulse for 3D imaging of distant targets. However, existing coherent LIDAR image reconstruction methods do not account for the system's aperture, resulting in sub-optimal resolution. Moreover, these methods use majorization-minimization for computational efficiency, but do so without a theoretical treatment of convergence. In this paper, we present Coherent LIDAR Aperture Modeled Plug-and-Play (CLAMP) for multi-look coherent LIDAR image reconstruction. CLAMP uses multi-agent consensus equilibrium (a form of PnP) to combine a neural network denoiser with an accurate physics-based forward model. CLAMP introduces an FFT-based method to account for the effects of the aperture and uses majorization of the forward model for computational efficiency. We also formalize the use of majorization-minimization in consensus optimization problems and prove convergence to the exact consensus equilibrium solution. Finally, we apply CLAMP to synthetic and measured data to demonstrate its effectiveness in producing high-resolution, speckle-free, 3D imagery.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2404.01521 [pdf, other]

Fair MP-BOOST: Fair and Interpretable Minipatch Boosting

Authors: Camille Olivia Little, Genevera I. Allen

Abstract: Ensemble methods, particularly boosting, have established themselves as highly effective and widely embraced machine learning techniques for tabular data. In this paper, we aim to leverage the robust predictive power of traditional boosting methods while enhancing fairness and interpretability. To achieve this, we develop Fair MP-Boost, a stochastic boosting scheme that balances fairness and accuracy by adaptively learning features and observations during training. Specifically, Fair MP-Boost sequentially samples small subsets of observations and features, termed minipatches (MP), according to adaptively learned feature and observation sampling probabilities. We devise these probabilities by combining loss functions, or by combining feature importance scores to address accuracy and fairness simultaneously. Hence, Fair MP-Boost prioritizes important and fair features along with challenging instances, to select the most relevant minipatches for learning. The learned probability distributions also yield intrinsic interpretations of feature importance and important observations in Fair MP-Boost. Through empirical evaluation of simulated and benchmark datasets, we showcase the interpretability, accuracy, and fairness of Fair MP-Boost.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2312.14416 [pdf, other]

Joint Semi-Symmetric Tensor PCA for Integrating Multi-modal Populations of Networks

Authors: Jiaming Liu, Lili Zheng, Zhengwu Zhang, Genevera I. Allen

Abstract: Multi-modal populations of networks arise in many scenarios including in large-scale multi-modal neuroimaging studies that capture both functional and structural neuroimaging data for thousands of subjects. A major research question in such studies is how functional and structural brain connectivity are related and how they vary across the population. We develop a novel PCA-type framework for integrating multi-modal undirected networks measured on many subjects. Specifically, we arrange these networks as semi-symmetric tensors, where each tensor slice is a symmetric matrix representing a network from an individual subject. We then propose a novel Joint, Integrative Semi-Symmetric Tensor PCA (JisstPCA) model, associated with an efficient iterative algorithm, for jointly finding low-rank representations of two or more networks across the same population of subjects. We establish one-step statistical convergence of our separate low-rank network factors as well as the shared population factors to the true factors, with finite sample statistical error bounds. Through simulation studies and a real data example for integrating multi-subject functional and structural brain connectivity, we illustrate the advantages of our method for finding joint low-rank structures in multi-modal populations of networks.

Submitted 21 December, 2023; originally announced December 2023.

arXiv:2310.04352 [pdf, other]

Fair Feature Importance Scores for Interpreting Tree-Based Methods and Surrogates

Authors: Camille Olivia Little, Debolina Halder Lina, Genevera I. Allen

Abstract: Across various sectors such as healthcare, criminal justice, national security, finance, and technology, large-scale machine learning (ML) and artificial intelligence (AI) systems are being deployed to make critical data-driven decisions. Many have asked if we can and should trust these ML systems to be making these decisions. Two critical components are prerequisites for trust in ML systems: interpretability, or the ability to understand why the ML system makes the decisions it does, and fairness, which ensures that ML systems do not exhibit bias against certain individuals or groups. Both interpretability and fairness are important and have separately received abundant attention in the ML literature, but so far, there have been very few methods developed to directly interpret models with regard to their fairness. In this paper, we focus on arguably the most popular type of ML interpretation: feature importance scores. Inspired by the use of decision trees in knowledge distillation, we propose to leverage trees as interpretable surrogates for complex black-box ML models. Specifically, we develop a novel fair feature importance score for trees that can be used to interpret how each feature contributes to fairness or bias in trees, tree-based ensembles, or tree-based surrogates of any complex ML system. Like the popular mean decrease in impurity for trees, our Fair Feature Importance Score is defined based on the mean decrease (or increase) in group bias. Through simulations as well as real examples on benchmark fairness datasets, we demonstrate that our Fair Feature Importance Score offers valid interpretations for both tree-based ensembles and tree-based surrogates of other ML systems.

Submitted 6 October, 2023; originally announced October 2023.

arXiv:2309.07110 [pdf, other]

Data Augmentation via Subgroup Mixup for Improving Fairness

Authors: Madeline Navarro, Camille Little, Genevera I. Allen, Santiago Segarra

Abstract: In this work, we propose data augmentation via pairwise mixup across subgroups to improve group fairness. Many real-world applications of machine learning systems exhibit biases across certain groups due to under-representation or training data that reflects societal biases. Inspired by the successes of mixup for improving classification performance, we develop a pairwise mixup scheme to augment training data and encourage fair and accurate decision boundaries for all subgroups. Data augmentation for group fairness allows us to add new samples of underrepresented groups to balance subpopulations. Furthermore, our method allows us to use the generalization ability of mixup to improve both fairness and accuracy. We compare our proposed mixup to existing data augmentation and bias mitigation approaches on both synthetic simulations and real-world benchmark fair classification data, demonstrating that we are able to achieve fair outcomes with robust if not improved accuracy.

Submitted 13 September, 2023; originally announced September 2023.

Comments: 5 pages, 2 figures, 1 table
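
Example (not part of the listing): the abstract above describes pairwise mixup across subgroups. The sketch below illustrates only the generic mixup idea, a convex combination of one sample from each of two subgroups with a Beta-distributed weight. It is not the authors' scheme: the function name `pairwise_subgroup_mixup`, the `alpha` parameter, the uniform pair sampling, and the toy data are all assumptions made for illustration.

```python
import random


def pairwise_subgroup_mixup(group_a, group_b, n_new, alpha=1.0, seed=0):
    """Create n_new synthetic samples by mixing one sample from each subgroup.

    group_a, group_b: lists of (features, label) pairs, where features is a
    list of floats and label is a float such as 0.0 or 1.0.
    alpha: shape parameter of the Beta(alpha, alpha) mixing distribution
    (hypothetical default, not taken from the paper).
    """
    rng = random.Random(seed)
    mixed = []
    for _ in range(n_new):
        xa, ya = rng.choice(group_a)  # one sample from the first subgroup
        xb, yb = rng.choice(group_b)  # one sample from the second subgroup
        lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
        # Convex combination of the two feature vectors and of the two labels.
        x_new = [lam * a + (1.0 - lam) * b for a, b in zip(xa, xb)]
        y_new = lam * ya + (1.0 - lam) * yb
        mixed.append((x_new, y_new))
    return mixed


# Toy data: two subgroups with two features each (illustrative values only).
group_a = [([0.0, 1.0], 0.0), ([0.2, 0.8], 0.0)]
group_b = [([1.0, 0.0], 1.0), ([0.9, 0.1], 1.0)]
augmented = pairwise_subgroup_mixup(group_a, group_b, n_new=5)
```

Each synthetic point lies on the line segment between a sample of one subgroup and a sample of the other, which is how interpolation can add data in regions where an underrepresented subgroup has few observations. How pairs are selected and how labels are treated in the actual method may differ from this sketch.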

arXiv:2308.15265 [pdf, other]

A Multi-Perspective Learning to Rank Approach to Support Children's Information Seeking in the Classroom

Authors: Garrett Allen, Katherine Landau Wright, Jerry Alan Fails, Casey Kennington, Maria Soledad Pera

Abstract: We introduce a novel re-ranking model that aims to augment the functionality of standard search engines to support classroom search activities for children (ages 6 to 11). This model extends the known listwise learning-to-rank framework by balancing risk and reward. Doing so enables the model to prioritize Web resources of high educational alignment, appropriateness, and adequate readability by analyzing the URLs, snippets, and page titles of Web resources retrieved by a given mainstream search engine. Experimental results, including an ablation study and comparisons with existing baselines, showcase the correctness of the proposed model. The outcomes of this work demonstrate the value of considering multiple perspectives inherent to the classroom setting, e.g., educational alignment, readability, and objectionability, when applied to the design of algorithms that can better support children's information discovery.

Submitted 29 August, 2023; originally announced August 2023.

Comments: Extended version of the manuscript to appear in proceedings of the 22nd IEEE/WIC International Conference on Web Intelligence and Intelligent Agent Technology

arXiv:2308.01475 [pdf, other]

Interpretable Machine Learning for Discovery: Statistical Challenges & Opportunities

Authors: Genevera I. Allen, Luqin Gan, Lili Zheng

Abstract: New technologies have led to vast troves of large and complex datasets across many scientific domains and industries. People routinely use machine learning techniques to not only process, visualize, and make predictions from this big data, but also to make data-driven discoveries. These discoveries are often made using Interpretable Machine Learning, or machine learning models and techniques that yield human understandable insights. In this paper, we discuss and review the field of interpretable machine learning, focusing especially on the techniques as they are often employed to generate new knowledge or make discoveries from large data sets. We outline the types of discoveries that can be made using Interpretable Machine Learning in both supervised and unsupervised settings. Additionally, we focus on the grand challenge of how to validate these discoveries in a data-driven manner, which promotes trust in machine learning systems and reproducibility in science. We discuss validation from both a practical perspective, reviewing approaches based on data-splitting and stability, as well as from a theoretical perspective, reviewing statistical results on model selection consistency and uncertainty quantification via statistical inference. Finally, we conclude by highlighting open challenges in using interpretable machine learning techniques to make discoveries, including gaps between theory and practice for validating data-driven discoveries.

Submitted 2 August, 2023; originally announced August 2023.

arXiv:2307.02243 [pdf, ps, other]

Power-up! What Can Generative Models Do for Human Computation Workflows?

Authors: Garrett Allen, Gaole He, Ujwal Gadiraju

Abstract: We are amidst an explosion of artificial intelligence research, particularly around large language models (LLMs). These models have a range of applications across domains like medicine, finance, commonsense knowledge graphs, and crowdsourcing. Investigation into LLMs as part of crowdsourcing workflows remains an under-explored space. The crowdsourcing research community has produced a body of work investigating workflows and methods for managing complex tasks using hybrid human-AI methods. Within crowdsourcing, the role of LLMs can be envisioned as akin to a cog in a larger wheel of workflows. From an empirical standpoint, little is currently understood about how LLMs can improve the effectiveness of crowdsourcing workflows and how such workflows can be evaluated. In this work, we present a vision for exploring this gap from the perspectives of various stakeholders involved in the crowdsourcing paradigm -- the task requesters, crowd workers, platforms, and end-users. We identify junctures in typical crowdsourcing workflows at which the introduction of LLMs can play a beneficial role and propose means to augment existing design patterns for crowd work.

Submitted 5 July, 2023; originally announced July 2023.

Comments: Accepted and presented at the Generative AI Workshop as part of CHI 2023

arXiv:2307.01343 [pdf, other]

doi 10.1088/1361-6382/ad13c5

HPC-driven computational reproducibility in numerical relativity codes: A use case study with IllinoisGRMHD

Authors: Yufeng Luo, Qian Zhang, Roland Haas, Zachariah B. Etienne, Gabrielle Allen

Abstract: Reproducibility of results is a cornerstone of the scientific method. Scientific computing encounters two challenges when aiming for this goal. Firstly, reproducibility should not depend on details of the runtime environment, such as the compiler version or computing environment, so results are verifiable by third parties. Secondly, different versions of software code executed in the same runtime environment should produce consistent numerical results for physical quantities. In this manuscript, we test the feasibility of reproducing scientific results obtained using the IllinoisGRMHD code that is part of an open-source community software for simulation in relativistic astrophysics, the Einstein Toolkit. We verify that numerical results of simulating a single isolated neutron star with IllinoisGRMHD can be reproduced, and compare them to results reported by the code authors in 2015. We use two different supercomputers: Expanse at SDSC, and Stampede2 at TACC. By compiling the source code archived along with the paper on both Expanse and Stampede2, we find that IllinoisGRMHD reproduces results published in its announcement paper up to errors comparable to round-off level changes in initial data parameters. We also verify that a current version of IllinoisGRMHD reproduces these results once we account for bug fixes that have occurred since the original publication.

Submitted 8 December, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: 23 pages, 6 figures, accepted to Classical and Quantum Gravity

arXiv:2305.13491 [pdf, other]

Nonparanormal Graph Quilting with Applications to Calcium Imaging

Authors: Andersen Chang, Lili Zheng, Gautam Dasarthy, Genevera I. Allen

Abstract: Probabilistic graphical models have become an important unsupervised learning tool for detecting network structures for a variety of problems, including the estimation of functional neuronal connectivity from two-photon calcium imaging data. However, in the context of calcium imaging, technological limitations only allow for partially overlapping layers of neurons in a brain region of interest to be jointly recorded. In this case, graph estimation for the full data requires inference for edge selection when many pairs of neurons have no simultaneous observations. This leads to the Graph Quilting problem, which seeks to estimate a graph in the presence of block-missingness in the empirical covariance matrix. Solutions for the Graph Quilting problem have previously been studied for Gaussian graphical models; however, neural activity data from calcium imaging are often non-Gaussian, thereby requiring a more flexible modeling approach. Thus, in our work, we study two approaches for nonparanormal Graph Quilting based on the Gaussian copula graphical model, namely a maximum likelihood procedure and a low-rank based framework. We provide theoretical guarantees on edge recovery for the former approach under similar conditions to those previously developed for the Gaussian setting, and we investigate the empirical performance of both methods using simulations as well as real calcium imaging data. Our approaches yield more scientifically meaningful functional connectivity estimates compared to existing Gaussian graph quilting methods for this calcium imaging data set.

Submitted 22 May, 2023; originally announced May 2023.

MSC Class: 62H22

arXiv:2210.11625 [pdf, other]

Graphical Model Inference with Erosely Measured Data

Authors: Lili Zheng, Genevera I. Allen

Abstract: In this paper, we investigate the Gaussian graphical model inference problem in a novel setting that we call erose measurements, referring to irregularly measured or observed data. For graphs, this results in different node pairs having vastly different sample sizes, which frequently arises in data integration, genomics, neuroscience, and sensor networks. Existing works characterize the graph selection performance using the minimum pairwise sample size, which provides little insight for erosely measured data, and no existing inference method is applicable. We aim to fill in this gap by proposing the first inference method that characterizes the different uncertainty levels over the graph caused by the erose measurements, named GI-JOE (Graph Inference when Joint Observations are Erose). Specifically, we develop an edge-wise inference method and an affiliated FDR control procedure, where the variance of each edge depends on the sample sizes associated with corresponding neighbors. We prove statistical validity under erose measurements, thanks to careful localized edge-wise analysis and disentangling the dependencies across the graph. Finally, through simulation studies and a real neuroscience data example, we demonstrate the advantages of our inference methods for graph selection from erosely measured data.

Submitted 14 May, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

arXiv:2209.08273 [pdf, other]

Low-Rank Covariance Completion for Graph Quilting with Applications to Functional Connectivity

Authors: Andersen Chang, Lili Zheng, Genevera I. Allen

Abstract: As a tool for estimating networks in high dimensions, graphical models are commonly applied to calcium imaging data to estimate functional neuronal connectivity, i.e. relationships between the activities of neurons. However, in many calcium imaging data sets, the full population of neurons is not recorded simultaneously, but instead in partially overlapping blocks. This leads to the Graph Quilting problem, as first introduced by (Vinci et al. 2019), in which the goal is to infer the structure of the full graph when only subsets of features are jointly observed. In this paper, we study a novel two-step approach to Graph Quilting, which first imputes the complete covariance matrix using low-rank covariance completion techniques before estimating the graph structure. We introduce three approaches to solve this problem: block singular value decomposition, nuclear norm penalization, and non-convex low-rank factorization. While prior works have studied low-rank matrix completion, we address the challenges brought by the block-wise missingness and are the first to investigate the problem in the context of graph learning. We discuss theoretical properties of the two-step procedure, showing graph selection consistency of one proposed approach by proving novel L infinity-norm error bounds for matrix completion with block-missingness. We then investigate the empirical performance of the proposed methods on simulations and on real-world data examples, through which we show the efficacy of these methods for estimating functional connectivity from calcium imaging data.

Submitted 17 September, 2022; originally announced September 2022.

arXiv:2206.02088 [pdf, other]

Model-Agnostic Confidence Intervals for Feature Importance: A Fast and Powerful Approach Using Minipatch Ensembles

Authors: Luqin Gan, Lili Zheng, Genevera I. Allen

Abstract: To promote new scientific discoveries from complex data sets, feature importance inference has been a long-standing statistical problem. Instead of testing for parameters that are only interpretable for specific models, there has been increasing interest in model-agnostic methods, often in the form of feature occlusion or leave-one-covariate-out (LOCO) inference. Existing approaches often make distributional assumptions, which can be difficult to verify in practice, or require model refitting and data splitting, which are computationally intensive and lead to losses in power. In this work, we develop a novel, mostly model-agnostic and distribution-free inference framework for feature importance that is computationally efficient and statistically powerful. Our approach is fast as we avoid model refitting by leveraging a form of random observation and feature subsampling called minipatch ensembles; this approach also improves statistical power by avoiding data splitting. Our framework can be applied on tabular data and with any machine learning algorithm, together with minipatch ensembles, for regression and classification tasks. Despite the dependencies induced by using minipatch ensembles, we show that our approach provides asymptotic coverage for the feature importance score of any model under mild assumptions. Finally, our same procedure can also be leveraged to provide valid confidence intervals for predictions, hence providing fast, simultaneous quantification of the uncertainty of both predictions and feature importance. We validate our intervals on a series of synthetic and real data examples, including non-linear settings, showing that our approach detects the correct important features and exhibits many computational and statistical advantages over existing methods.

Submitted 24 January, 2023; v1 submitted 4 June, 2022; originally announced June 2022.

arXiv:2206.00074 [pdf, other]

To the Fairness Frontier and Beyond: Identifying, Quantifying, and Optimizing the Fairness-Accuracy Pareto Frontier

Authors: Camille Olivia Little, Michael Weylandt, Genevera I Allen

Abstract: Algorithmic fairness has emerged as an important consideration when using machine learning to make high-stakes societal decisions. Yet, improved fairness often comes at the expense of model accuracy. While aspects of the fairness-accuracy tradeoff have been studied, most work reports the fairness and accuracy of various models separately; this makes model comparisons nearly impossible without a model-agnostic metric that reflects the balance of the two desiderata. We seek to identify, quantify, and optimize the empirical Pareto frontier of the fairness-accuracy tradeoff. Specifically, we identify and outline the empirical Pareto frontier through Tradeoff-between-Fairness-and-Accuracy (TAF) Curves; we then develop a metric to quantify this Pareto frontier through the weighted area under the TAF Curve which we term the Fairness-Area-Under-the-Curve (FAUC). TAF Curves provide the first empirical, model-agnostic characterization of the Pareto frontier, while FAUC provides the first metric to impartially compare model families on both fairness and accuracy. Both TAF Curves and FAUC can be employed with all group fairness definitions and accuracy measures. Next, we ask: Is it possible to expand the empirical Pareto frontier and thus improve the FAUC for a given collection of fitted models? We answer affirmatively by developing a novel fair model stacking framework, FairStacks, that solves a convex program to maximize the accuracy of model ensemble subject to a score-bias constraint. We show that optimizing with FairStacks always expands the empirical Pareto frontier and improves the FAUC; we additionally study other theoretical properties of our proposed approach. Finally, we empirically validate TAF, FAUC, and FairStacks through studies on several real benchmark data sets, showing that FairStacks leads to major improvements in FAUC that outperform existing algorithmic fairness approaches.

Submitted 31 May, 2022; originally announced June 2022.

arXiv:2112.00076 [pdf, ps, other]

Using Conversational Artificial Intelligence to Support Children's Search in the Classroom

Authors: Garrett Allen, Jie Yang, Maria Soledad Pera, Ujwal Gadiraju

Abstract: We present pathways of investigation regarding conversational user interfaces (CUIs) for children in the classroom. We highlight anticipated challenges to be addressed in order to advance knowledge on CUIs for children. Further, we discuss preliminary ideas on strategies for evaluation.

Submitted 30 November, 2021; originally announced December 2021.

Comments: Presented at CUI@CSCW 2021 -- https://www.conversationaluserinterfaces.org/workshops/CSCW2021/pdfs/2-Allen.pdf

ACM Class: H.5.2

arXiv:2111.01273 [pdf, other]

Network Clustering for Latent State and Changepoint Detection

Authors: Madeline Navarro, Genevera I. Allen, Michael Weylandt

Abstract: Network models provide a powerful and flexible framework for analyzing a wide range of structured data sources. In many situations of interest, however, multiple networks can be constructed to capture different aspects of an underlying phenomenon or to capture changing behavior over time. In such settings, it is often useful to cluster together related networks in an attempt to identify patterns of common structure. In this paper, we propose a convex approach for the task of network clustering. Our approach uses a convex fusion penalty to induce a smoothly-varying tree-like cluster structure, eliminating the need to select the number of clusters a priori. We provide an efficient algorithm for convex network clustering and demonstrate its effectiveness on synthetic examples.

Submitted 1 November, 2021; originally announced November 2021.

arXiv:2111.01025 [pdf, other]

doi 10.3847/1538-4357/ac2ebb

Tracing the Ionization Structure of the Shocked Filaments of NGC 6240

Authors: Anne M. Medling, Lisa J. Kewley, Daniela Calzetti, George C. Privon, Kirsten Larson, Jeffrey A. Rich, Lee Armus, Mark G. Allen, Geoffrey V. Bicknell, Tanio Díaz-Santos, Timothy M. Heckman, Claus Leitherer, Claire E. Max, David S. N. Rupke, Ezequiel Treister, Hugo Messias, Alexander Y. Wagner

Abstract: We study the ionization and excitation structure of the interstellar medium in the late-stage gas-rich galaxy merger NGC 6240 using a suite of emission line maps at $\sim$25 pc resolution from the Hubble Space Telescope, Keck NIRC2 with Adaptive Optics, and ALMA. NGC 6240 hosts a superwind driven by intense star formation and/or one or both of two active nuclei; the outflows produce bubbles and filaments seen in shock tracers from warm molecular gas (H$_2$ 2.12$μ$m) to optical ionized gas ([O III], [N II], [S II], [O I]) and hot plasma (Fe XXV). In the most distinct bubble, we see a clear shock front traced by high [O III]/H$β$ and [O III]/[O I]. Cool molecular gas (CO(2-1)) is only present near the base of the bubble, towards the nuclei launching the outflow. We interpret the lack of molecular gas outside the bubble to mean that the shock front is not responsible for dissociating molecular gas, and conclude that the molecular clouds are partly shielded and either entrained briefly in the outflow, or left undisturbed while the hot wind flows around them. Elsewhere in the galaxy, shock-excited H$_2$ extends at least $\sim$4 kpc from the nuclei, tracing molecular gas even warmer than that between the nuclei, where the two galaxies' interstellar media are colliding. A ridgeline of high [O III]/H$β$ emission along the eastern arm aligns with the south nucleus' stellar disk minor axis; optical integral field spectroscopy from WiFeS suggests this highly ionized gas is centered at systemic velocity and likely photoionized by direct line-of-sight to the south AGN.

Submitted 1 November, 2021; originally announced November 2021.

Comments: 27 pages, 18 figures; accepted for publication in ApJ

arXiv:2110.12067 [pdf, other]

Fast and Accurate Graph Learning for Huge Data via Minipatch Ensembles

Authors: Tianyi Yao, Minjie Wang, Genevera I. Allen

Abstract: Gaussian graphical models provide a powerful framework for uncovering conditional dependence relationships between sets of nodes; they have found applications in a wide variety of fields including sensor and communication networks, physics, finance, and computational biology. Often, one observes data on the nodes and the task is to learn the graph structure, or perform graphical model selection. While this is a well-studied problem with many popular techniques, there are typically three major practical challenges: i) many existing algorithms become computationally intractable in huge-data settings with tens of thousands of nodes; ii) the need for separate data-driven hyperparameter tuning considerably adds to the computational burden; iii) the statistical accuracy of selected edges often deteriorates as the dimension and/or the complexity of the underlying graph structures increase. We tackle these problems by developing the novel Minipatch Graph (MPGraph) estimator. Our approach breaks up the huge graph learning problem into many smaller problems by creating an ensemble of tiny random subsets of both the observations and the nodes, termed minipatches. We then leverage recent advances that use hard thresholding to solve the latent variable graphical model problem to consistently learn the graph on each minipatch. Our approach is computationally fast, embarrassingly parallelizable, memory efficient, and has integrated stability-based hyperparameter tuning. Additionally, we prove that under weaker assumptions than that of the Graphical Lasso, our MPGraph estimator achieves graph selection consistency. We compare our approach to state-of-the-art computational approaches for Gaussian graphical model selection including the BigQUIC algorithm, and empirically demonstrate that our approach is not only more statistically accurate but also extensively faster for huge graph learning problems.

Submitted 2 January, 2023; v1 submitted 22 October, 2021; originally announced October 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2110.02388 [pdf, other]

doi 10.1371/journal.pcbi.1010577

Fast and Interpretable Consensus Clustering via Minipatch Learning

Authors: Luqin Gan, Genevera I. Allen

Abstract: Consensus clustering has been widely used in bioinformatics and other applications to improve the accuracy, stability and reliability of clustering results. This approach ensembles cluster co-occurrences from multiple clustering runs on subsampled observations. For application to large-scale bioinformatics data, such as to discover cell types from single-cell sequencing data, for example, consensus clustering has two significant drawbacks: (i) computational inefficiency due to repeatedly applying clustering algorithms, and (ii) lack of interpretability into the important features for differentiating clusters. In this paper, we address these two challenges by developing IMPACC: Interpretable MiniPatch Adaptive Consensus Clustering. Our approach adopts three major innovations. We ensemble cluster co-occurrences from tiny subsets of both observations and features, termed minipatches, thus dramatically reducing computation time. Additionally, we develop adaptive sampling schemes for observations, which result in both improved reliability and computational savings, as well as adaptive sampling schemes of features, which lead to interpretable solutions by quickly learning the most relevant features that differentiate clusters. We study our approach on synthetic data and a variety of real large-scale bioinformatics data sets; results show that our approach not only yields more accurate and interpretable cluster solutions, but it also substantially improves computational efficiency compared to standard consensus clustering approaches.

Submitted 18 October, 2021; v1 submitted 5 October, 2021; originally announced October 2021.

arXiv:2107.00600 [pdf, other]

doi 10.1103/PhysRevD.104.082004

All-sky Search for Continuous Gravitational Waves from Isolated Neutron Stars in the Early O3 LIGO Data

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca , et al. (1566 additional authors not shown)

Abstract: We report on an all-sky search for continuous gravitational waves in the frequency band 20-2000\,Hz and with a frequency time derivative in the range of $[-1.0, +0.1]\times10^{-8}$\,Hz/s. Such a signal could be produced by a nearby, spinning and slightly non-axisymmetric isolated neutron star in our galaxy. This search uses the LIGO data from the first six months of Advanced LIGO's and Advanced Virgo's third observational run, O3. No periodic gravitational wave signals are observed, and 95\%\ confidence-level (CL) frequentist upper limits are placed on their strengths. The lowest upper limits on worst-case (linearly polarized) strain amplitude $h_0$ are $\sim1.7\times10^{-25}$ near 200\,Hz. For a circularly polarized source (most favorable orientation), the lowest upper limits are $\sim6.3\times10^{-26}$. These strict frequentist upper limits refer to all sky locations and the entire range of frequency derivative values. For a population-averaged ensemble of sky locations and stellar orientations, the lowest 95\%\ CL upper limits on the strain amplitude are $\sim1.\times10^{-25}$. These upper limits improve upon our previously published all-sky results, with the greatest improvement (factor of $\sim$2) seen at higher frequencies, in part because quantum squeezing has dramatically improved the detector noise level relative to the second observational run, O2. These limits are the most constraining to date over most of the parameter space searched.

Submitted 8 October, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

Comments: 28 pages, 7 figures

Report number: LIGO-P2000334-v9

Journal ref: Phys. Rev. D 104, 082004 (2021)

arXiv:2106.15163 [pdf, other]

doi 10.3847/2041-8213/ac082e

Observation of gravitational waves from two neutron star-black hole coalescences

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca , et al. (1577 additional authors not shown)

Abstract: We report the observation of gravitational waves from two compact binary coalescences in LIGO's and Virgo's third observing run with properties consistent with neutron star-black hole (NSBH) binaries. The two events are named GW200105_162426 and GW200115_042309, abbreviated as GW200105 and GW200115; the first was observed by LIGO Livingston and Virgo, and the second by all three LIGO-Virgo detectors. The source of GW200105 has component masses $8.9^{+1.2}_{-1.5}\,M_\odot$ and $1.9^{+0.3}_{-0.2}\,M_\odot$, whereas the source of GW200115 has component masses $5.7^{+1.8}_{-2.1}\,M_\odot$ and $1.5^{+0.7}_{-0.3}\,M_\odot$ (all measurements quoted at the 90% credible level). The probability that the secondary's mass is below the maximal mass of a neutron star is 89%-96% and 87%-98%, respectively, for GW200105 and GW200115, with the ranges arising from different astrophysical assumptions. The source luminosity distances are $280^{+110}_{-110}$ Mpc and $300^{+150}_{-100}$ Mpc, respectively. The magnitude of the primary spin of GW200105 is less than 0.23 at the 90% credible level, and its orientation is unconstrained. For GW200115, the primary spin has a negative spin projection onto the orbital angular momentum at 88% probability. We are unable to constrain spin or tidal deformation of the secondary component for either event. We infer a NSBH merger rate density of $45^{+75}_{-33}\,\mathrm{Gpc}^{-3} \mathrm{yr}^{-1}$ when assuming GW200105 and GW200115 are representative of the NSBH population, or $130^{+112}_{-69}\,\mathrm{Gpc}^{-3} \mathrm{yr}^{-1}$ under the assumption of a broader distribution of component masses.

Submitted 29 June, 2021; originally announced June 2021.

Report number: LIGO Document P2000357

Journal ref: ApJL, 915, L5 (2021)

arXiv:2106.11554 [pdf, other]

Subbotin Graphical Models for Extreme Value Dependencies with Applications to Functional Neuronal Connectivity

Authors: Andersen Chang, Genevera I. Allen

Abstract: With modern calcium imaging technology, activities of thousands of neurons can be recorded in vivo. These experiments can potentially provide new insights into intrinsic functional neuronal connectivity, defined as contemporaneous correlations between neuronal activities. As a common tool for estimating conditional dependencies in high-dimensional settings, graphical models are a natural choice for estimating functional connectivity networks. However, raw neuronal activity data presents a unique challenge: the relevant information in the data lies in rare extreme value observations that indicate neuronal firing, rather than in the observations near the mean. Existing graphical modeling techniques for extreme values rely on binning or thresholding observations, which may not be appropriate for calcium imaging data. In this paper, we develop a novel class of graphical models, called the Subbotin graphical model, which finds sparse conditional dependency structures with respect to the extreme value observations without requiring data pre-processing. We first derive the form of the Subbotin graphical model and show the conditions under which it is normalizable. We then study the empirical performance of the Subbotin graphical model and compare it to existing extreme value graphical modeling techniques and functional connectivity models from neuroscience through several simulation studies as well as a real-world calcium imaging data example.

Submitted 25 August, 2022; v1 submitted 22 June, 2021; originally announced June 2021.

MSC Class: 62H22

arXiv:2106.07813 [pdf, other]

To Infinity and Beyond! Accessibility is the Future for Kids' Search Engines

Authors: Ashlee Milton, Garrett Allen, Maria Soledad Pera

Abstract: Research in the area of search engines for children remains in its infancy. Seminal works have studied how children use mainstream search engines, as well as how to design and evaluate custom search engines explicitly for children. These works, however, tend to take a one-size-fits-all view, treating children as a unit. Nevertheless, even at the same age, children are known to possess and exhibit different capabilities. These differences affect how children access and use search engines. To better serve children, in this vision paper, we spotlight accessibility and discuss why current research on children and search engines does not, but should, focus on this significant matter.

Submitted 14 June, 2021; originally announced June 2021.

Comments: In the proceedings of IR for Children 2000-2020: Where Are We Now? (https://www.fab4.science/ir4c/) -- Workshop co-located with the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval

arXiv:2105.11641 [pdf, other]

doi 10.3847/1538-4357/ac17ea

Searches for continuous gravitational waves from young supernova remnants in the early third observing run of Advanced LIGO and Virgo

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca , et al. (1567 additional authors not shown)

Abstract: We present results of three wide-band directed searches for continuous gravitational waves from 15 young supernova remnants in the first half of the third Advanced LIGO and Virgo observing run. We use three search pipelines with distinct signal models and methods of identifying noise artifacts. Without ephemerides of these sources, the searches are conducted over a frequency band spanning from 10~Hz to 2~kHz. We find no evidence of continuous gravitational radiation from these sources. We set upper limits on the intrinsic signal strain at 95\% confidence level in sample sub-bands, estimate the sensitivity in the full band, and derive the corresponding constraints on the fiducial neutron star ellipticity and $r$-mode amplitude. The best 95\% confidence constraints placed on the signal strain are $7.7\times 10^{-26}$ and $7.8\times 10^{-26}$ near 200~Hz for the supernova remnants G39.2--0.3 and G65.7+1.2, respectively. The most stringent constraints on the ellipticity and $r$-mode amplitude reach $\lesssim 10^{-7}$ and $ \lesssim 10^{-5}$, respectively, at frequencies above $\sim 400$~Hz for the closest supernova remnant G266.2--1.2/Vela Jr.

Submitted 14 July, 2021; v1 submitted 24 May, 2021; originally announced May 2021.

Comments: https://dcc.ligo.org/P2000479

arXiv:2105.06384 [pdf, other]

doi 10.3847/1538-4357/ac23db

Search for lensing signatures in the gravitational-wave observations from the first half of LIGO-Virgo's third observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, K. M. Aleman, G. Allen, A. Allocca, P. A. Altin, A. Amato , et al. (1356 additional authors not shown)

Abstract: We search for signatures of gravitational lensing in the gravitational-wave signals from compact binary coalescences detected by Advanced LIGO and Advanced Virgo during O3a, the first half of their third observing run. We study: 1) the expected rate of lensing at current detector sensitivity and the implications of a non-observation of strong lensing or a stochastic gravitational-wave background on the merger-rate density at high redshift; 2) how the interpretation of individual high-mass events would change if they were found to be lensed; 3) the possibility of multiple images due to strong lensing by galaxies or galaxy clusters; and 4) possible wave-optics effects due to point-mass microlenses. Several pairs of signals in the multiple-image analysis show similar parameters and, in this sense, are nominally consistent with the strong lensing hypothesis. However, taking into account population priors, selection effects, and the prior odds against lensing, these events do not provide sufficient evidence for lensing. Overall, we find no compelling evidence for lensing in the observed gravitational-wave signals from any of these analyses.

Submitted 30 November, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

Comments: 31 pages and 6 figures. Accepted by the Astrophysical Journal

Report number: LIGO-P2000400

arXiv:2105.03456 [pdf, other]

CASTing a Net: Supporting Teachers with Search Technology

Authors: Garrett Allen, Katherine Landau Wright, Jerry Alan Fails, Casey Kennington, Maria Soledad Pera

Abstract: Past and current research has typically focused on ensuring that search technology for the classroom serves children. In this paper, we argue for the need to broaden the research focus to include teachers and how search technology can aid them. In particular, we share how furnishing a behind-the-scenes portal for teachers can empower them by providing a window into the spelling, writing, and concept connection skills of their students.

Submitted 7 May, 2021; originally announced May 2021.

Comments: KidRec '21: 5th International and Interdisciplinary Perspectives on Children & Recommender and Information Retrieval Systems (KidRec) Search and Recommendation Technology through the Lens of a Teacher- Co-located with ACM IDC 2021

arXiv:2104.14417 [pdf, other]

doi 10.3847/1538-4357/ac0d52

Constraints from LIGO O3 data on gravitational-wave emission due to r-modes in the glitching pulsar PSR J0537-6910

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca , et al. (1574 additional authors not shown)

Abstract: We present a search for continuous gravitational-wave emission due to r-modes in the pulsar PSR J0537-6910 using data from the LIGO-Virgo Collaboration observing run O3. PSR J0537-6910 is a young energetic X-ray pulsar and is the most frequent glitcher known. The inter-glitch braking index of the pulsar suggests that gravitational-wave emission due to r-mode oscillations may play an important role in the spin evolution of this pulsar. Theoretical models confirm this possibility and predict emission at a level that can be probed by ground-based detectors. In order to explore this scenario, we search for r-mode emission in the epochs between glitches by using a contemporaneous timing ephemeris obtained from NICER data. We do not detect any signals in the theoretically expected band of 86-97 Hz, and report upper limits on the amplitude of the gravitational waves. Our results improve on previous amplitude upper limits from r-modes in J0537-6910 by a factor of up to 3 and place stringent constraints on theoretical models for r-mode driven spin-down in PSR J0537-6910, especially for higher frequencies at which our results reach below the spin-down limit defined by energy conservation.

Submitted 7 January, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

Comments: 28 pages, 19 figures, accepted in ApJ

Report number: LIGO-P2100069

Journal ref: ApJ 922 71 (2021)

arXiv:2104.06389 [pdf, other]

Thresholded Graphical Lasso Adjusts for Latent Variables: Application to Functional Neural Connectivity

Authors: Minjie Wang, Genevera I. Allen

Abstract: In neuroscience, researchers seek to uncover the connectivity of neurons from large-scale neural recordings or imaging; often people employ graphical model selection and estimation techniques for this purpose. But, existing technologies can only record from a small subset of neurons leading to a challenging problem of graph selection in the presence of extensive latent variables. Chandrasekaran et al. (2012) proposed a convex program to address this problem that poses challenges from both a computational and statistical perspective. To solve this problem, we propose an incredibly simple solution: apply a hard thresholding operator to existing graph selection methods. Conceptually simple and computationally attractive, we demonstrate that thresholding the graphical Lasso, neighborhood selection, or CLIME estimators have superior theoretical properties in terms of graph selection consistency as well as stronger empirical results than existing approaches for the latent variable graphical model problem. We also demonstrate the applicability of our approach through a neuroscience case study on calcium-imaging data to estimate functional neural connections.

Submitted 13 April, 2021; originally announced April 2021.

arXiv:2103.08520 [pdf, other]

doi 10.1103/PhysRevD.104.022005

Search for anisotropic gravitational-wave backgrounds using data from Advanced LIGO and Advanced Virgo's first three observing runs

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca , et al. (1568 additional authors not shown)

Abstract: We report results from searches for anisotropic stochastic gravitational-wave backgrounds using data from the first three observing runs of the Advanced LIGO and Advanced Virgo detectors. For the first time, we include Virgo data in our analysis and run our search with a new efficient pipeline called {\tt PyStoch} on data folded over one sidereal day. We use gravitational-wave radiometry (broadband and narrow band) to produce sky maps of stochastic gravitational-wave backgrounds and to search for gravitational waves from point sources. A spherical harmonic decomposition method is employed to look for gravitational-wave emission from spatially-extended sources. Neither technique found evidence of gravitational-wave signals. Hence we derive 95\% confidence-level upper limit sky maps on the gravitational-wave energy flux from broadband point sources, ranging from $F_{α, Θ} < {\rm (0.013 - 7.6)} \times 10^{-8} {\rm erg \, cm^{-2} \, s^{-1} \, Hz^{-1}},$ and on the (normalized) gravitational-wave energy density spectrum from extended sources, ranging from $Ω_{α, Θ} < {\rm (0.57 - 9.3)} \times 10^{-9} \, {\rm sr^{-1}}$, depending on direction ($Θ$) and spectral index ($α$). These limits improve upon previous limits by factors of $2.9 - 3.5$. We also set 95\% confidence level upper limits on the frequency-dependent strain amplitudes of quasimonochromatic gravitational waves coming from three interesting targets, Scorpius X-1, SN 1987A and the Galactic Center, with best upper limits ranging from $h_0 < {\rm (1.7-2.1)} \times 10^{-25},$ a factor of $\geq 2.0$ improvement compared to previous stochastic radiometer searches.

Submitted 2 February, 2022; v1 submitted 15 March, 2021; originally announced March 2021.

Comments: 23 Pages, 9 Figures

Report number: LIGO-P2000500

Journal ref: Phys. Rev. D 104, 022005 (2021)

arXiv:2102.04517 [pdf]

Power Off! Challenges in Planning and Executing Power Isolations on Shared-Use Electrified Railways

Authors: Alex Lu, Aleksandr Lukatskiy, Zhiqi Zhong, John G. Allen

Abstract: Electric railways are fast, clean, and safe, but complex to operate and maintain. Electric traction infrastructure includes signal power and feeder lines that remain live during isolations and complicate maintenance processes. Stakeholders involved in power outage planning include contractors, linemen, groundmen, power directors, dispatchers, conductor-flag, and support personnel. Weekly planning processes for track time require many contingencies due to the large number of moving parts and factors not known in advance, like personnel availability. Electrical and mechanical environments faced by crews working in adjacent areas may be entirely different and require a "bespoke" circuit configuration to de-energize catenary, which must be planned meticulously. Although recent automation improved real-time "plate order" communications between power directors and dispatchers, each outage still requires many manual switching operations. Net impact of this isolation process reduces available construction work windows nightly from a nominal 7 hours to 2 hrs 39 mins. We recommend joint design of electrical and civil infrastructure, cross-training between disciplines, limiting maximum number of concurrent outages, formal study of maintenance outage capacity, and further automation in power switching.

Submitted 8 February, 2021; originally announced February 2021.

Comments: 26 pages, 6 figures

arXiv:2101.12248 [pdf, other]

doi 10.1103/PhysRevLett.126.241102

Constraints on cosmic strings using data from the third Advanced LIGO-Virgo observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca , et al. (1565 additional authors not shown)

Abstract: We search for gravitational-wave signals produced by cosmic strings in the Advanced LIGO and Virgo full O3 data set. Search results are presented for gravitational waves produced by cosmic string loop features such as cusps, kinks and, for the first time, kink-kink collisions. A template-based search for short-duration transient signals does not yield a detection. We also use the stochastic gravitational-wave background energy density upper limits derived from the O3 data to constrain the cosmic string tension, $Gμ$, as a function of the number of kinks, or the number of cusps, for two cosmic string loop distribution models. Additionally, we develop and test a third model which interpolates between these two models. Our results improve upon the previous LIGO-Virgo constraints on $Gμ$ by one to two orders of magnitude depending on the model which is tested. In particular, for one loop distribution model, we set the most competitive constraints to date, $Gμ\lesssim 4\times 10^{-15}$.

Submitted 28 January, 2021; originally announced January 2021.

Comments: 20 pages, 10 figures

Report number: LIGO-P2000506

Journal ref: Phys. Rev. Lett. 126, 241102 (2021)

arXiv:2101.12130 [pdf, other]

doi 10.1103/PhysRevD.104.022004

Upper Limits on the Isotropic Gravitational-Wave Background from Advanced LIGO's and Advanced Virgo's Third Observing Run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca, P. A. Altin , et al. (1566 additional authors not shown)

Abstract: We report results of a search for an isotropic gravitational-wave background (GWB) using data from Advanced LIGO's and Advanced Virgo's third observing run (O3) combined with upper limits from the earlier O1 and O2 runs. Unlike in previous observing runs in the advanced detector era, we include Virgo in the search for the GWB. The results are consistent with uncorrelated noise, and therefore we place upper limits on the strength of the GWB. We find that the dimensionless energy density $Ω_{\rm GW}\leq 5.8\times 10^{-9}$ at the 95% credible level for a flat (frequency-independent) GWB, using a prior which is uniform in the log of the strength of the GWB, with 99% of the sensitivity coming from the band 20-76.6 Hz; $\leq 3.4 \times 10^{-9}$ at 25 Hz for a power-law GWB with a spectral index of 2/3 (consistent with expectations for compact binary coalescences), in the band 20-90.6 Hz; and $\leq 3.9 \times 10^{-10}$ at 25 Hz for a spectral index of 3, in the band 20-291.6 Hz. These upper limits improve over our previous results by a factor of 6.0 for a flat GWB. We also search for a GWB arising from scalar and vector modes, which are predicted by alternative theories of gravity; we place upper limits on the strength of GWBs with these polarizations. We demonstrate that there is no evidence of correlated noise of magnetic origin by performing a Bayesian analysis that allows for the presence of both a GWB and an effective magnetic background arising from geophysical Schumann resonances.
We compare our upper limits to a fiducial model for the GWB from the merger of compact binaries. Finally, we combine our results with observations of individual mergers and show that, at design sensitivity, this joint approach may yield stronger constraints on the merger rate of binary black holes at $z \lesssim 2$ than can be achieved with individually resolved mergers alone. [abridged]

Submitted 28 January, 2021; originally announced January 2021.

Comments: 25 pages, 7 figures, Abstract abridged for arxiv submission

Report number: LIGO-DCC-P2000314

Journal ref: Phys. Rev. D 104, 022004 (2021)

arXiv:2012.12926 [pdf, other]

doi 10.3847/2041-8213/abffcd

Diving below the spin-down limit: Constraints on gravitational waves from the energetic young pulsar PSR J0537-6910

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, K. M. Aleman, G. Allen, A. Allocca , et al. (1568 additional authors not shown)

Abstract: We present a search for continuous gravitational-wave signals from the young, energetic X-ray pulsar PSR J0537-6910 using data from the second and third observing runs of LIGO and Virgo. The search is enabled by a contemporaneous timing ephemeris obtained using NICER data. The NICER ephemeris has also been extended through 2020 October and includes three new glitches. PSR J0537-6910 has the largest spin-down luminosity of any pulsar and is highly active with regards to glitches. Analyses of its long-term and inter-glitch braking indices provided intriguing evidence that its spin-down energy budget may include gravitational-wave emission from a time-varying mass quadrupole moment. Its 62 Hz rotation frequency also puts its possible gravitational-wave emission in the most sensitive band of LIGO/Virgo detectors. Motivated by these considerations, we search for gravitational-wave emission at both once and twice the rotation frequency. We find no signal, however, and report our upper limits. Assuming a rigidly rotating triaxial star, our constraints reach below the gravitational-wave spin-down limit for this star for the first time by more than a factor of two and limit gravitational waves from the $l=m=2$ mode to account for less than 14% of the spin-down energy budget. The fiducial equatorial ellipticity is limited to less than about 3e-5, which is the third best constraint for any young pulsar.

Submitted 10 June, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

Comments: 21 pages, 5 figures, published in ApJL

Report number: LIGO-P2000407

arXiv:2012.12128 [pdf, other]

doi 10.1103/PhysRevD.103.064017

All-sky search in early O3 LIGO data for continuous gravitational-wave signals from unknown neutron stars in binary systems

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, K. M. Aleman, G. Allen, A. Allocca, P. A. Altin, A. Amato , et al. (1347 additional authors not shown)

Abstract: Rapidly spinning neutron stars are promising sources of persistent, continuous gravitational waves. Detecting such a signal would allow probing of the physical properties of matter under extreme conditions. A significant fraction of the known pulsar population belongs to binary systems. Searching for unknown neutron stars in binary systems requires specialized algorithms to address unknown orbital frequency modulations. We present a search for continuous gravitational waves emitted by neutron stars in binary systems in early data from the third observing run of the Advanced LIGO and Advanced Virgo detectors using the semicoherent, GPU-accelerated, BinarySkyHough pipeline. The search analyzes the most sensitive frequency band of the LIGO detectors, 50 - 300 Hz. Binary orbital parameters are split into four regions, comprising orbital periods of 3 - 45 days and projected semimajor axes of 2 - 40 light-seconds. No detections are reported. We estimate the sensitivity of the search using simulated continuous wave signals, achieving the most sensitive results to date across the analyzed parameter space.

Submitted 19 March, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

Comments: 23 pages, 12 figures, 7 tables

Report number: LIGO-P2000298

Journal ref: Phys. Rev. D 103, 064017 (2021)

arXiv:2012.11534 [pdf, ps, other]

ESCAPE -- addressing Open Science challenges

Authors: Mark G. Allen, Giovanni Lamanna, Xavier Espinal, Kay Graf, Michiel van Haarlem, Stephen Serjeant, Ian Bird, Elena Cuoco, Jayesh Wagh

Abstract: ESCAPE (European Science Cluster of Astronomy & Particle physics ESFRI research infrastructures) is an EU H2020 project that addresses the Open Science challenges shared by the astrophysics and accelerator-based physics and nuclear physics ESFRI projects and landmarks. This project is embedded in the context of the European Open Science Cloud (EOSC) and involves activities to develop a prototype Data Lake and Science Platform, as well as support of an Open Source Software Repository, connection of the Virtual Observatory framework to EOSC, and engaging the public in citizen science. In this poster paper we provide a brief overview of the project and the results presented at ADASS.

Submitted 22 December, 2020; v1 submitted 21 December, 2020; originally announced December 2020.

Comments: 4 pages, to appear in the proceedings of Astronomical Data Analysis Software and Systems XXX published by ASP

arXiv:2012.06635 [pdf, other]

doi 10.1088/1361-6382/abf9b5

DataVault: A Data Storage Infrastructure for the Einstein Toolkit

Authors: Yufeng Luo, Roland Haas, Qian Zhang, Gabrielle Allen

Abstract: Data sharing is essential in the numerical simulations research. We introduce a data repository, DataVault, that is designed for data sharing, search and analysis. A comparative study of existing repositories is performed to analyze features that are critical to a data repository. We describe the architecture, workflow, and deployment of DataVault, and provide three use-case scenarios for different communities to facilitate the use and application of DataVault. Potential features are proposed and we outline the future development for these features.

Submitted 15 February, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

Comments: 17 pages, 3 figures, 2 tables

arXiv:2012.04762 [pdf, other]

doi 10.1109/DSLW51110.2021.9523413

Simultaneous Grouping and Denoising via Sparse Convex Wavelet Clustering

Authors: Michael Weylandt, T. Mitchell Roddenberry, Genevera I. Allen

Abstract: Clustering is a ubiquitous problem in data science and signal processing. In many applications where we observe noisy signals, it is common practice to first denoise the data, perhaps using wavelet denoising, and then to apply a clustering algorithm. In this paper, we develop a sparse convex wavelet clustering approach that simultaneously denoises and discovers groups. Our approach utilizes convex fusion penalties to achieve agglomeration and group-sparse penalties to denoise through sparsity in the wavelet domain. In contrast to common practice which denoises then clusters, our method is a unified, convex approach that performs both simultaneously. Our method yields denoised (wavelet-sparse) cluster centroids that both improve interpretability and data compression. We demonstrate our method on synthetic examples and in an application to NMR spectroscopy.

Submitted 3 March, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

Comments: To appear in IEEE DSLW 2021

Journal ref: DSLW 2021: Proceedings of the IEEE Data Science and Learning Workshop 2021, pp.1-8. 2021

arXiv:2011.09447 [pdf, other]

Interpretable Visualization and Higher-Order Dimension Reduction for ECoG Data

Authors: Kelly Geyer, Frederick Campbell, Andersen Chang, John Magnotti, Michael Beauchamp, Genevera I. Allen

Abstract: ElectroCOrticoGraphy (ECoG) technology measures electrical activity in the human brain via electrodes placed directly on the cortical surface during neurosurgery. Through its capability to record activity at a fast temporal resolution, ECoG experiments have allowed scientists to better understand how the human brain processes speech. By its nature, ECoG data is difficult for neuroscientists to directly interpret for two major reasons. Firstly, ECoG data tends to be large in size, as each individual experiment yields data up to several gigabytes. Secondly, ECoG data has a complex, higher-order nature. After signal processing, this type of data may be organized as a 4-way tensor with dimensions representing trials, electrodes, frequency, and time. In this paper, we develop an interpretable dimension reduction approach called Regularized Higher Order Principal Components Analysis, as well as an extension to Regularized Higher Order Partial Least Squares, that allows neuroscientists to explore and visualize ECoG data. Our approach employs a sparse and functional Candecomp-Parafac (CP) decomposition that incorporates sparsity to select relevant electrodes and frequency bands, as well as smoothness over time and frequency, yielding directly interpretable factors. We demonstrate the performance and interpretability of our method with an ECoG case study on audio and visual processing of human speech.

Submitted 12 December, 2020; v1 submitted 15 November, 2020; originally announced November 2020.

arXiv:2011.07218 [pdf, other]

doi 10.1109/BigComp51126.2021.00023

MP-Boost: Minipatch Boosting via Adaptive Feature and Observation Sampling

Authors: Mohammad Taha Toghani, Genevera I. Allen

Abstract: Boosting methods are among the best general-purpose and off-the-shelf machine learning approaches, gaining widespread popularity. In this paper, we seek to develop a boosting method that yields comparable accuracy to popular AdaBoost and gradient boosting methods, yet is faster computationally and whose solution is more interpretable. We achieve this by developing MP-Boost, an algorithm loosely based on AdaBoost that learns by adaptively selecting small subsets of instances and features, or what we term minipatches (MP), at each iteration. By sequentially learning on tiny subsets of the data, our approach is computationally faster than other classic boosting algorithms. Also as it progresses, MP-Boost adaptively learns a probability distribution on the features and instances that upweight the most important features and challenging instances, hence adaptively selecting the most relevant minipatches for learning. These learned probability distributions also aid in interpretation of our method. We empirically demonstrate the interpretability, comparative accuracy, and computational time of our approach on a variety of binary classification tasks.

Submitted 13 November, 2020; originally announced November 2020.

arXiv:2010.14550 [pdf, other]

doi 10.3847/1538-4357/abee15

Search for Gravitational Waves Associated with Gamma-Ray Bursts Detected by Fermi and Swift During the LIGO-Virgo Run O3a

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1228 additional authors not shown)

Abstract: We search for gravitational-wave transients associated with gamma-ray bursts detected by the Fermi and Swift satellites during the first part of the third observing run of Advanced LIGO and Advanced Virgo (1 April 2019 15:00 UTC - 1 October 2019 15:00 UTC). 105 gamma-ray bursts were analyzed using a search for generic gravitational-wave transients; 32 gamma-ray bursts were analyzed with a search that specifically targets neutron star binary mergers as short gamma-ray burst progenitors. We describe a method to calculate the probability that triggers from the binary merger targeted search are astrophysical and apply that method to the most significant gamma-ray bursts in that search. We find no significant evidence for gravitational-wave signals associated with the gamma-ray bursts that we followed up, nor for a population of unidentified subthreshold signals. We consider several source types and signal morphologies, and report for these lower bounds on the distance to each gamma-ray burst.

Submitted 20 August, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: 17 pages, 5 figures, 2 tables

Report number: LIGO-P2000040

Journal ref: Astrophys. J. 915, 86 (2021)

arXiv:2010.14533 [pdf, other]

doi 10.3847/2041-8213/abe949

Population Properties of Compact Objects from the Second LIGO-Virgo Gravitational-Wave Transient Catalog

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1316 additional authors not shown)

Abstract: We report on the population of the 47 compact binary mergers detected with a false-alarm rate below 1/yr in the second LIGO--Virgo Gravitational-Wave Transient Catalog, GWTC-2. We observe several characteristics of the merging binary black hole (BBH) population not discernible until now. First, we find that the primary mass spectrum contains structure beyond a power-law with a sharp high-mass cut-off; it is more consistent with a broken power law with a break at $39.7^{+20.3}_{-9.1}\,M_\odot$, or a power law with a Gaussian feature peaking at $33.1^{+4.0}_{-5.6}\,M_\odot$ (90\% credible interval). While the primary mass distribution must extend to $\sim65\,M_\odot$ or beyond, only $2.9^{+3.5}_{-1.7}\%$ of systems have primary masses greater than $45\,M_\odot$. Second, we find that a fraction of BBH systems have component spins misaligned with the orbital angular momentum, giving rise to precession of the orbital plane. Moreover, 12% to 44% of BBH systems have spins tilted by more than $90^\circ$, giving rise to a negative effective inspiral spin parameter $χ_\mathrm{eff}$. Under the assumption that such systems can only be formed by dynamical interactions, we infer that between 25% and 93% of BBH with non-vanishing $|χ_\mathrm{eff}| > 0.01$ are dynamically assembled. Third, we estimate merger rates, finding $\mathcal{R}_\text{BBH} = 23.9^{+14.3}_{-8.6}$ Gpc$^{-3}$ yr$^{-1}$ for BBH and $\mathcal{R}_\text{BNS}= 320^{+490}_{-240}$ Gpc$^{-3}$ yr$^{-1}$ for binary neutron stars.
We find that the BBH rate likely increases with redshift ($85\%$ credibility), but not faster than the star-formation rate ($86\%$ credibility). Additionally, we examine recent exceptional events in the context of our population models, finding that the asymmetric masses of GW190412 and the high component masses of GW190521 are consistent with our models, but the low secondary mass of GW190814 makes it an outlier.

Submitted 25 February, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: 53 pages, including 24 pages main text, 18 pages appendix, 30 figures

Report number: LIGO-P2000077

arXiv:2010.14529 [pdf, other]

doi 10.1103/PhysRevD.103.122002

Tests of General Relativity with Binary Black Holes from the second LIGO-Virgo Gravitational-Wave Transient Catalog

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1322 additional authors not shown)

Abstract: Gravitational waves enable tests of general relativity in the highly dynamical and strong-field regime. Using events detected by LIGO-Virgo up to 1 October 2019, we evaluate the consistency of the data with predictions from the theory. We first establish that residuals from the best-fit waveform are consistent with detector noise, and that the low- and high-frequency parts of the signals are in agreement. We then consider parametrized modifications to the waveform by varying post-Newtonian and phenomenological coefficients, improving past constraints by factors of ${\sim}2$; we also find consistency with Kerr black holes when we specifically target signatures of the spin-induced quadrupole moment. Looking for gravitational-wave dispersion, we tighten constraints on Lorentz-violating coefficients by a factor of ${\sim}2.6$ and bound the mass of the graviton to $m_g \leq 1.76 \times 10^{-23} \mathrm{eV}/c^2$ with 90% credibility. We also analyze the properties of the merger remnants by measuring ringdown frequencies and damping times, constraining fractional deviations away from the Kerr frequency to $δ\hat{f}_{220} = 0.03^{+0.38}_{-0.35}$ for the fundamental quadrupolar mode, and $δ\hat{f}_{221} = 0.04^{+0.27}_{-0.32}$ for the first overtone; additionally, we find no evidence for postmerger echoes. Finally, we determine that our data are consistent with tensorial polarizations through a template-independent method. When possible, we assess the validity of general relativity based on collections of events analyzed jointly.
We find no evidence for new physics beyond general relativity, for black hole mimickers, or for any unaccounted systematics.

Submitted 16 June, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: 24 pages + appendices, 19 figures; journal version

Report number: LIGO-P2000091

Journal ref: Phys. Rev. D 103, 122002 (2021)

arXiv:2010.14527 [pdf, other]

doi 10.1103/PhysRevX.11.021053

GWTC-2: Compact Binary Coalescences Observed by LIGO and Virgo During the First Half of the Third Observing Run

Authors: R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, S. Akcay, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand, A. Ananyeva, S. B. Anderson , et al. (1327 additional authors not shown)

Abstract: We report on gravitational wave discoveries from compact binary coalescences detected by Advanced LIGO and Advanced Virgo in the first half of the third observing run (O3a) between 1 April 2019 15:00 UTC and 1 October 2019 15:00. By imposing a false-alarm-rate threshold of two per year in each of the four search pipelines that constitute our search, we present 39 candidate gravitational wave events. At this threshold, we expect a contamination fraction of less than 10%. Of these, 26 candidate events were reported previously in near real-time through GCN Notices and Circulars; 13 are reported here for the first time. The catalog contains events whose sources are black hole binary mergers up to a redshift of ~0.8, as well as events whose components could not be unambiguously identified as black holes or neutron stars. For the latter group, we are unable to determine the nature based on estimates of the component masses and spins from gravitational wave data alone. The range of candidate events which are unambiguously identified as binary black holes (both objects $\geq 3~M_\odot$) is increased compared to GWTC-1, with total masses from $\sim 14~M_\odot$ for GW190924_021846 to $\sim 150~M_\odot$ for GW190521. For the first time, this catalog includes binary systems with significantly asymmetric mass ratios, which had not been observed in data taken before April 2019.
We also find that 11 of the 39 events detected since April 2019 have positive effective inspiral spins under our default prior (at 90% credibility), while none exhibit negative effective inspiral spin. Given the increased sensitivity of Advanced LIGO and Advanced Virgo, the detection of 39 candidate events in ~26 weeks of data (~1.5 per week) is consistent with GWTC-1.

Submitted 8 March, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

Comments: This version updates with minor revisions to typographical errors. We would also like to call attention to the updated parameter estimation samples data release here: https://dcc.ligo.org/LIGO-P2000223/public

Report number: P2000061

Journal ref: Phys. Rev. X 11, 021053 (2021)

arXiv:2010.08529 [pdf, other]

Feature Selection for Huge Data via Minipatch Learning

Authors: Tianyi Yao, Genevera I. Allen

Abstract: Feature selection often leads to increased model interpretability, faster computation, and improved model performance by discarding irrelevant or redundant features. While feature selection is a well-studied problem with many widely-used techniques, there are typically two key challenges: i) many existing approaches become computationally intractable in huge-data settings with millions of observations and features; and ii) the statistical accuracy of selected features degrades in high-noise, high-correlation settings, thus hindering reliable model interpretation. We tackle these problems by proposing Stable Minipatch Selection (STAMPS) and Adaptive STAMPS (AdaSTAMPS). These are meta-algorithms that build ensembles of selection events of base feature selectors trained on many tiny, (adaptively-chosen) random subsets of both the observations and features of the data, which we call minipatches. Our approaches are general and can be employed with a variety of existing feature selection strategies and machine learning techniques. In addition, we provide theoretical insights on STAMPS and empirically demonstrate that our approaches, especially AdaSTAMPS, dominate competing methods in terms of feature selection accuracy and computational time.

Submitted 10 February, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

Comments: Updated theoretical statements

arXiv:2009.01190 [pdf, other]

doi 10.3847/2041-8213/aba493

Properties and astrophysical implications of the 150 Msun binary black hole merger GW190521

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, S. Akcay, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand , et al. (1233 additional authors not shown)

Abstract: The gravitational-wave signal GW190521 is consistent with a binary black hole merger source at redshift 0.8 with unusually high component masses, $85^{+21}_{-14}\,M_{\odot}$ and $66^{+17}_{-18}\,M_{\odot}$, compared to previously reported events, and shows mild evidence for spin-induced orbital precession. The primary falls in the mass gap predicted by (pulsational) pair-instability supernova theory, in the approximate range $65 - 120\,M_{\odot}$. The probability that at least one of the black holes in GW190521 is in that range is 99.0%. The final mass of the merger $(142^{+28}_{-16}\,M_{\odot})$ classifies it as an intermediate-mass black hole. Under the assumption of a quasi-circular binary black hole coalescence, we detail the physical properties of GW190521's source binary and its post-merger remnant, including component masses and spin vectors. Three different waveform models, as well as direct comparison to numerical solutions of general relativity, yield consistent estimates of these properties. Tests of strong-field general relativity targeting the merger-ringdown stages of coalescence indicate consistency of the observed signal with theoretical predictions. We estimate the merger rate of similar systems to be $0.13^{+0.30}_{-0.11}\,{\rm Gpc}^{-3}\,\rm{yr}^{-1}$.
We discuss the astrophysical implications of GW190521 for stellar collapse, and for the possible formation of black holes in the pair-instability mass gap through various channels: via (multiple) stellar coalescence, or via hierarchical merger of lower-mass black holes in star clusters or in active galactic nuclei. We find it to be unlikely that GW190521 is a strongly lensed signal of a lower-mass black hole binary merger. We also discuss more exotic possible sources for GW190521, including a highly eccentric black hole binary, or a primordial black hole binary.

Submitted 2 September, 2020; originally announced September 2020.

Comments: 39 pages, 13 figures; data available at https://dcc.ligo.org/P2000158-v4/public

Report number: LIGO-P2000021

Journal ref: Astrophys. J. Lett. 900, L13 (2020)

arXiv:2009.01075 [pdf]

doi 10.1103/PhysRevLett.125.101102

GW190521: A Binary Black Hole Merger with a Total Mass of $150 ~ M_{\odot}$

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, S. Akcay, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand , et al. (1232 additional authors not shown)

Abstract: On May 21, 2019 at 03:02:29 UTC Advanced LIGO and Advanced Virgo observed a short duration gravitational-wave signal, GW190521, with a three-detector network signal-to-noise ratio of 14.7, and an estimated false-alarm rate of 1 in 4900 yr using a search sensitive to generic transients. If GW190521 is from a quasicircular binary inspiral, then the detected signal is consistent with the merger of two black holes with masses of $85^{+21}_{-14} M_{\odot}$ and $66^{+17}_{-18} M_{\odot}$ (90 % credible intervals). We infer that the primary black hole mass lies within the gap produced by (pulsational) pair-instability supernova processes, and has only a 0.32 % probability of being below $65 M_{\odot}$. We calculate the mass of the remnant to be $142^{+28}_{-16} M_{\odot}$, which can be considered an intermediate mass black hole (IMBH). The luminosity distance of the source is $5.3^{+2.4}_{-2.6}$ Gpc, corresponding to a redshift of $0.82^{+0.28}_{-0.34}$. The inferred rate of mergers similar to GW190521 is $0.13^{+0.30}_{-0.11}\,\mathrm{Gpc}^{-3}\,\mathrm{yr}^{-1}$.

Submitted 2 September, 2020; originally announced September 2020.

Comments: Supplementary Material at https://dcc.ligo.org/LIGO-P2000020/Public

Journal ref: Phys. Rev. Lett. 125, 101102 (2020)

arXiv:2008.05909 [pdf]

Population stratification enables modeling effects of reopening policies on mortality and hospitalization rates

Authors: Tongtong Huang, Yan Chu, Shayan Shams, Yejin Kim, Genevera Allen, Ananth V Annapragada, Devika Subramanian, Ioannis Kakadiaris, Assaf Gottlieb, Xiaoqian Jiang

Abstract: Objective: We study the influence of local reopening policies on the composition of the infectious population and their impact on future hospitalization and mortality rates. Materials and Methods: We collected datasets of daily reported hospitalization and cumulative mortality of COVID-19 in Houston, Texas, from May 1, 2020 until June 29, 2020. These datasets are from multiple sources (USA FACTS, Southeast Texas Regional Advisory Council COVID-19 report, TMC daily news, and New York Times county level mortality reporting). Our model, risk stratified SIR HCD, uses separate variables to model the dynamics of local contact (e.g., work from home) and high contact (e.g., work on site) subpopulations while sharing parameters to control their respective $R_0(t)$ over time. Results: We evaluated our model's forecasting performance in Harris County, TX (the most populated county in the Greater Houston area) during the Phase I and Phase II reopening. Not only did our model outperform other competing models, it also supports counterfactual analysis to simulate the impact of future policies in a local setting, which is unique among existing approaches. Discussion: Local mortality and hospitalization are significantly impacted by quarantine and reopening policies. No existing model has directly accounted for the effect of these policies on local trends in infections, hospitalizations, and deaths in an explicit and explainable manner. Our work is an attempt to close this important technical gap to support decision making.
Conclusion: Despite several limitations, we think it is a timely effort to rethink how best to model the dynamics of pandemics under the influence of reopening policies.

Submitted 10 August, 2020; originally announced August 2020.

arXiv:2007.14251 [pdf, other]

doi 10.3847/2041-8213/abb655

Gravitational-wave constraints on the equatorial ellipticity of millisecond pulsars

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1311 additional authors not shown)

Abstract: We present a search for continuous gravitational waves from five radio pulsars, comprising three recycled pulsars (PSR J0437-4715, PSR J0711-6830, and PSR J0737-3039A) and two young pulsars: the Crab pulsar (J0534+2200) and the Vela pulsar (J0835-4510). We use data from the third observing run of Advanced LIGO and Virgo combined with data from their first and second observing runs. For the first time we are able to match (for PSR J0437-4715) or surpass (for PSR J0711-6830) the indirect limits on gravitational-wave emission from recycled pulsars inferred from their observed spin-downs, and constrain their equatorial ellipticities to be less than $10^{-8}$. For each of the five pulsars, we perform targeted searches that assume a tight coupling between the gravitational-wave and electromagnetic signal phase evolution. We also present constraints on PSR J0711-6830, the Crab pulsar and the Vela pulsar from a search that relaxes this assumption, allowing the gravitational-wave signal to vary from the electromagnetic expectation within a narrow band of frequencies and frequency derivatives.

Submitted 13 October, 2020; v1 submitted 28 July, 2020; originally announced July 2020.

Comments: 23 pages, 6 figures, 3 tables

Report number: LIGO-P2000029

Journal ref: 2020 ApJL 902 L21

arXiv:2007.04441 [pdf, other]

Sparse Regression for Extreme Values

Authors: Andersen Chang, Minjie Wang, Genevera Allen

Abstract: We study the problem of selecting features associated with extreme values in high dimensional linear regression. Normally, in linear modeling problems, the presence of abnormal extreme values or outliers is considered an anomaly which should either be removed from the data or remedied using robust regression methods. In many situations, however, the extreme values in regression modeling are not outliers but rather the signals of interest; consider traces from spiking neurons, volatility in finance, or extreme events in climate science, for example. In this paper, we propose a new method for sparse high-dimensional linear regression for extreme values which is motivated by the Subbotin, or generalized normal distribution, which we call the extreme value linear regression model. For our method, we utilize an $\ell_p$ norm loss where $p$ is an even integer greater than two; we demonstrate that this loss increases the weight on extreme values. We prove consistency and variable selection consistency for the extreme value linear regression with a Lasso penalty, which we term the Extreme Lasso, and we also analyze the theoretical impact of extreme value observations on the model parameter estimates using the concept of influence functions. Through simulation studies and a real-world data example, we show that the Extreme Lasso outperforms other methods currently used in the literature for selecting features of interest associated with extreme values in high-dimensional regression.

Submitted 14 June, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

Comments: 4 figures

MSC Class: 62J07

Showing 1–50 of 243 results for author: Allen, G