Showing 1–11 of 11 results for author: Sim, A

Searching in archive stat.
  1. arXiv:2212.03720  [pdf, other]

    cs.SI cs.LG stat.ML

    Pseudo-Riemannian Embedding Models for Multi-Relational Graph Representations

    Authors: Saee Paliwal, Angus Brayne, Benedek Fabian, Maciej Wiatrak, Aaron Sim

    Abstract: In this paper we generalize single-relation pseudo-Riemannian graph embedding models to multi-relational networks, and show that the typical approach of encoding relations as manifold transformations translates from the Riemannian to the pseudo-Riemannian case. In addition we construct a view of relations as separate spacetime submanifolds of multi-time manifolds, and consider an interpolation bet…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 11 pages, 3 figures, AKBC 2022 conference

    Journal ref: 4th Conference on Automated Knowledge Base Construction 2022

  2. arXiv:2205.09703  [pdf, other]

    cs.LG cs.DC cs.PF eess.SY stat.AP

    Extract Dynamic Information To Improve Time Series Modeling: a Case Study with Scientific Workflow

    Authors: Jeeyung Kim, Mengtian Jin, Youkow Homma, Alex Sim, Wilko Kroeger, Kesheng Wu

    Abstract: In modeling time series data, we often need to augment the existing data records to increase the modeling accuracy. In this work, we describe a number of techniques to extract dynamic information about the current state of a large scientific workflow, which could be generalized to other types of applications. The specific task to be modeled is the time needed for transferring a file from an experi…

    Submitted 19 May, 2022; originally announced May 2022.

  3. arXiv:2205.06622  [pdf]

    stat.AP

    What Makes You Hold on to That Old Car? Joint Insights from Machine Learning and Multinomial Logit on Vehicle-level Transaction Decisions

    Authors: Ling Jin, Alina Lazar, Caitlin Brown, Bingrong Sun, Venu Garikapati, Srinath Ravulaparthy, Qianmiao Chen, Alexander Sim, Kesheng Wu, Tin Ho, Thomas Wenzel, C. Anna Spurlock

    Abstract: What makes you hold on to that old car? While the vast majority of household vehicles are still powered by conventional internal combustion engines, the progress of adopting emerging vehicle technologies will critically depend on how soon the existing vehicles are transacted out of the household fleet. Leveraging a nationally representative longitudinal data set, the Panel Study of Income Dynamic…

    Submitted 13 May, 2022; originally announced May 2022.

  4. arXiv:2202.07451  [pdf, other]

    stat.AP cs.LG

    Phenotyping with Positive Unlabelled Learning for Genome-Wide Association Studies

    Authors: Andre Vauvelle, Hamish Tomlinson, Aaron Sim, Spiros Denaxas

    Abstract: Identifying phenotypes plays an important role in furthering our understanding of disease biology through practical applications within healthcare and the life sciences. The challenge of dealing with the complexities and noise within electronic health records (EHRs) has motivated applications of machine learning in phenotypic discovery. While recent research has focused on finding predictive subty…

    Submitted 15 February, 2022; originally announced February 2022.

  5. arXiv:2106.08678  [pdf, other]

    stat.ML cs.AI cs.LG

    Directed Graph Embeddings in Pseudo-Riemannian Manifolds

    Authors: Aaron Sim, Maciej Wiatrak, Angus Brayne, Páidí Creed, Saee Paliwal

    Abstract: The inductive biases of graph representation learning algorithms are often encoded in the background geometry of their embedding space. In this paper, we show that general directed graphs can be effectively represented by an embedding model that combines three components: a pseudo-Riemannian metric structure, a non-trivial global topology, and a unique likelihood function that explicitly incorpora…

    Submitted 16 June, 2021; originally announced June 2021.

    Comments: Accepted at ICML 2021

  6. arXiv:2106.08161  [pdf, other]

    stat.ML cs.LG q-bio.GN

    Contrastive Mixture of Posteriors for Counterfactual Inference, Data Integration and Fairness

    Authors: Adam Foster, Árpi Vezér, Craig A Glastonbury, Páidí Creed, Sam Abujudeh, Aaron Sim

    Abstract: Learning meaningful representations of data that can address challenges such as batch effect correction and counterfactual inference is a central problem in many domains including computational biology. Adopting a Conditional VAE framework, we show that marginal independence between the representation and a condition variable plays a key role in both of these challenges. We propose the Contrastive…

    Submitted 26 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: Published as a conference paper (long presentation) at ICML 2022

  7. arXiv:2102.11027  [pdf, other]

    stat.AP cs.CY cs.LG

    Investigating Underlying Drivers of Variability in Residential Energy Usage Patterns with Daily Load Shape Clustering of Smart Meter Data

    Authors: Ling Jin, C. Anna Spurlock, Sam Borgeson, Alina Lazar, Daniel Fredman, Annika Todd, Alexander Sim, Kesheng Wu

    Abstract: Residential customers have traditionally not been treated as individual entities due to the high volatility in residential consumption patterns as well as a historic focus on aggregated loads from the utility and system feeder perspective. Large-scale deployment of smart meters has motivated increasing studies to explore disaggregated daily load patterns, which can reveal important heterogeneity a…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: 11 pages, 11 figures

  8. arXiv:1910.12806  [pdf, other]

    cs.LG cs.NI stat.ML

    An Ensemble Approach toward Automated Variable Selection for Network Anomaly Detection

    Authors: Makiya Nakashima, Alex Sim, Youngsoo Kim, Jonghyun Kim, Jinoh Kim

    Abstract: While variable selection is essential to optimize the learning complexity by prioritizing features, automating the selection process is preferred since it otherwise requires laborious effort and intensive analysis. However, enabling this automation is not easy, for several reasons. First, selection techniques often need a condition to terminate the reduction process, for example, by…

    Submitted 28 October, 2019; originally announced October 2019.

  9. arXiv:1812.00279  [pdf, other]

    cs.LG stat.ML

    Interpretable Graph Convolutional Neural Networks for Inference on Noisy Knowledge Graphs

    Authors: Daniel Neil, Joss Briody, Alix Lacoste, Aaron Sim, Paidi Creed, Amir Saffari

    Abstract: In this work, we provide a new formulation for Graph Convolutional Neural Networks (GCNNs) for link prediction on graph data that addresses common challenges for biomedical knowledge graphs (KGs). We introduce a regularized attention mechanism to GCNNs that not only improves performance on clean datasets, but also favorably accommodates noise in KGs, a pervasive issue in real-world applications. F…

    Submitted 1 December, 2018; originally announced December 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

  10. arXiv:1309.6158  [pdf, other]

    stat.ML stat.AP

    Random Forests on Distance Matrices for Imaging Genetics Studies

    Authors: Aaron Sim, Dimosthenis Tsagkrasoulis, Giovanni Montana

    Abstract: We propose a non-parametric regression methodology, Random Forests on Distance Matrices (RFDM), for detecting genetic variants associated with quantitative phenotypes that represent the human brain's structure or function and are obtained using neuroimaging techniques. RFDM, which is an extension of decision forests, requires a distance matrix as response that encodes all pair-wise phenotypic distances i…

    Submitted 24 September, 2013; originally announced September 2013.

  11. arXiv:1212.0764  [pdf, other]

    stat.ME physics.comp-ph

    Information Geometry and Sequential Monte Carlo

    Authors: Aaron Sim, Sarah Filippi, Michael P. H. Stumpf

    Abstract: This paper explores the application of methods from information geometry to the sequential Monte Carlo (SMC) sampler. In particular the Riemannian manifold Metropolis-adjusted Langevin algorithm (mMALA) is adapted for the transition kernels in SMC. Similar to its function in Markov chain Monte Carlo methods, the mMALA is a fully adaptable kernel which allows for efficient sampling of high-dimensio…

    Submitted 4 December, 2012; originally announced December 2012.

    Comments: 23 pages, 10 figures