-
Auto-Encoder Neural Network Incorporating X-Ray Fluorescence Fundamental Parameters with Machine Learning
Authors:
Matthew Dirks,
David Poole
Abstract:
We consider energy-dispersive X-ray Fluorescence (EDXRF) applications where the fundamental parameters method is impractical such as when instrument parameters are unavailable. For example, on a mining shovel or conveyor belt, rocks are constantly moving (leading to varying angles of incidence and distances) and there may be other factors not accounted for (like dust). Neural networks do not require instrument and fundamental parameters but training neural networks requires XRF spectra labelled with elemental composition, which is often limited because of its expense. We develop a neural network model that learns from limited labelled data and also benefits from domain knowledge by learning to invert a forward model. The forward model uses transition energies and probabilities of all elements and parameterized distributions to approximate other fundamental and instrument parameters. We evaluate the model and baseline models on a rock dataset from a lithium mineral exploration project. Our model works particularly well for some low-Z elements (Li, Mg, Al, and K) as well as some high-Z elements (Sn and Pb) despite these elements being outside the suitable range for common spectrometers to directly measure, likely owing to the ability of neural networks to learn correlations and non-linear relationships.
Submitted 21 March, 2023; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Automatic Neural Network Hyperparameter Optimization for Extrapolation: Lessons Learned from Visible and Near-Infrared Spectroscopy of Mango Fruit
Authors:
Matthew Dirks,
David Poole
Abstract:
Neural networks are configured by choosing an architecture and hyperparameter values; doing so often involves expert intuition and hand-tuning to find a configuration that extrapolates well without overfitting. This paper considers automatic methods for configuring a neural network that extrapolates in time for the domain of visible and near-infrared (VNIR) spectroscopy. In particular, we study the effect of (a) selecting samples for validating configurations and (b) using ensembles.
Most of the time, models are built from past data to predict the future. To encourage the neural network model to extrapolate, we consider validating model configurations on samples that are shifted in time similar to the test set. We experiment with three validation set choices: (1) a random sample of 1/3 of non-test data (the technique used in previous work), (2) using the latest 1/3 (sorted by time), and (3) using a semantically meaningful subset of the data. Hyperparameter optimization relies on the validation set to estimate test-set error, but neural network variance obfuscates the true error value. Ensemble averaging - computing the average across many neural networks - can reduce the variance of prediction errors.
To test these methods, we do a comprehensive study of a held-out 2018 harvest season of mango fruit given VNIR spectra from 3 prior years. We find that ensembling improves the state-of-the-art model's variance and accuracy. Furthermore, hyperparameter optimization experiments - with and without ensemble averaging and with each validation set choice - show that when ensembling is combined with using the latest 1/3 of samples as the validation set, a neural network configuration is found automatically that is on par with the state-of-the-art.
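The two ideas in this abstract, validating on the latest third of the data and averaging an ensemble, can be illustrated with a minimal sketch. Everything in it is a toy assumption rather than the authors' setup: the `(year, value)` samples, the function names `latest_third_split` and `ensemble_predict`, and the "models", which are just a fixed linear function plus a random per-model offset.

```python
import random
from statistics import mean

def latest_third_split(samples, time_key):
    """Sort by time and hold out the most recent third for validation."""
    ordered = sorted(samples, key=time_key)
    cut = len(ordered) - len(ordered) // 3
    return ordered[:cut], ordered[cut:]

def ensemble_predict(models, x):
    """Average the predictions of several independently trained models."""
    return mean(m(x) for m in models)

# Toy samples: (year, value) pairs standing in for spectra from three seasons.
samples = [(year, i) for year in (2015, 2016, 2017) for i in range(4)]
train, valid = latest_third_split(samples, time_key=lambda s: s[0])

# Toy "models": the same target function plus a model-specific random offset,
# mimicking run-to-run variance between trained networks.
rng = random.Random(0)
offsets = [rng.gauss(0.0, 1.0) for _ in range(50)]
models = [lambda x, b=b: 2.0 * x + b for b in offsets]

true_value = 2.0  # target at x = 1.0
ensemble_error = abs(ensemble_predict(models, 1.0) - true_value)
mean_single_error = mean(abs(m(1.0) - true_value) for m in models)
```

In this toy setting the ensemble's absolute error cannot exceed the average absolute error of its members (triangle inequality), which is the variance-reduction effect the abstract refers to.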
Submitted 2 October, 2022;
originally announced October 2022.
-
Knowledge Hypergraph Embedding Meets Relational Algebra
Authors:
Bahare Fatemi,
Perouz Taslakian,
David Vazquez,
David Poole
Abstract:
Embedding-based methods for reasoning in knowledge hypergraphs learn a representation for each entity and relation. Current methods do not capture the procedural rules underlying the relations in the graph. We propose a simple embedding-based model called ReAlE that performs link prediction in knowledge hypergraphs (generalized knowledge graphs) and can represent high-level abstractions in terms of relational algebra operations. We show theoretically that ReAlE is fully expressive and provide proofs and empirical evidence that it can represent a large subset of the primitive relational algebra operations, namely renaming, projection, set union, selection, and set difference. We also verify experimentally that ReAlE outperforms state-of-the-art models in knowledge hypergraph completion, and in representing each of these primitive relational algebra operations. For the latter experiment, we generate a synthetic knowledge hypergraph, for which we design an algorithm based on the Erdős-Rényi model for generating random graphs.
Submitted 18 February, 2021;
originally announced February 2021.
-
Binarised Regression with Instance-Varying Costs: Evaluation using Impact Curves
Authors:
Matthew Dirks,
David Poole
Abstract:
Many evaluation methods exist, each for a particular prediction task, and there are a number of prediction tasks commonly performed including classification and regression. In binarised regression, binary decisions are generated from a learned regression model (or real-valued dependent variable), which is useful when the division between instances that should be predicted positive or negative depends on the utility. For example, in mining, the boundary between a valuable rock and a waste rock depends on the market price of various metals, which varies with time. This paper proposes impact curves to evaluate binarised regression with instance-varying costs, where some instances are much worse to be classified as positive (or negative) than other instances; e.g., it is much worse to throw away a high-grade gold rock than a medium-grade copper-ore rock, even if the mine wishes to keep both because both are profitable. We show how to construct an impact curve for a variety of domains, including examples from healthcare, mining, and entertainment. Impact curves optimize binary decisions across all utilities of the chosen utility function, identify the conditions where one model may be favoured over another, and quantitatively assess improvement between competing models.
Submitted 14 August, 2020;
originally announced August 2020.
-
GPU-Accelerated Drug Discovery with Docking on the Summit Supercomputer: Porting, Optimization, and Application to COVID-19 Research
Authors:
Scott LeGrand,
Aaron Scheinberg,
Andreas F. Tillack,
Mathialakan Thavappiragasam,
Josh V. Vermaas,
Rupesh Agarwal,
Jeff Larkin,
Duncan Poole,
Diogo Santos-Martins,
Leonardo Solis-Vasquez,
Andreas Koch,
Stefano Forli,
Oscar Hernandez,
Jeremy C. Smith,
Ada Sedova
Abstract:
Protein-ligand docking is an in silico tool used to screen potential drug compounds for their ability to bind to a given protein receptor within a drug-discovery campaign. Experimental drug screening is expensive and time consuming, and it is desirable to carry out large scale docking calculations in a high-throughput manner to narrow the experimental search space. Few of the existing computational docking tools were designed with high performance computing in mind. Therefore, optimizations that maximize use of the high-performance computational resources available at leadership-class computing facilities enable these facilities to be leveraged for drug discovery. Here we present the porting, optimization, and validation of the AutoDock-GPU program for the Summit supercomputer, and its application to initial compound screening efforts to target proteins of the SARS-CoV-2 virus responsible for the current COVID-19 pandemic.
Submitted 6 July, 2020;
originally announced July 2020.
-
Predicting Landslides Using Locally Aligned Convolutional Neural Networks
Authors:
Ainaz Hajimoradlou,
Gioachino Roberti,
David Poole
Abstract:
Landslides, movement of soil and rock under the influence of gravity, are common phenomena that cause significant human and economic losses every year. Experts use heterogeneous features such as slope, elevation, land cover, lithology, rock age, and rock family to predict landslides. To work with such features, we adapted convolutional neural networks to consider relative spatial information for the prediction task. Traditional filters in these networks either have a fixed orientation or are rotationally invariant. Intuitively, the filters should orient uphill, but there is not enough data to learn the concept of uphill; instead, it can be provided as prior knowledge. We propose a model called Locally Aligned Convolutional Neural Network, LACNN, that follows the ground surface at multiple scales to predict possible landslide occurrence for a single point. To validate our method, we created a standardized dataset of georeferenced images consisting of the heterogeneous features as inputs, and compared our method to several baselines, including linear regression, a neural network, and a convolutional network, using log-likelihood error and Receiver Operating Characteristic curves on the test set. Our model achieves 2-7% improvement in terms of accuracy and 2-15% boost in terms of log likelihood compared to the other proposed baselines.
Submitted 17 July, 2020; v1 submitted 11 November, 2019;
originally announced November 2019.
-
Knowledge Hypergraphs: Prediction Beyond Binary Relations
Authors:
Bahare Fatemi,
Perouz Taslakian,
David Vazquez,
David Poole
Abstract:
Knowledge graphs store facts using relations between two entities. In this work, we address the question of link prediction in knowledge hypergraphs where relations are defined on any number of entities. While techniques exist (such as reification) that convert non-binary relations into binary ones, we show that current embedding-based methods for knowledge graph completion do not work well out of the box for knowledge graphs obtained through these techniques. To overcome this, we introduce HSimplE and HypE, two embedding-based methods that work directly with knowledge hypergraphs. In both models, the prediction is a function of the relation embedding, the entity embeddings and their corresponding positions in the relation. We also develop public datasets, benchmarks and baselines for hypergraph prediction and show experimentally that the proposed models are more effective than the baselines.
Submitted 15 July, 2020; v1 submitted 31 May, 2019;
originally announced June 2019.
-
Improved Knowledge Graph Embedding using Background Taxonomic Information
Authors:
Bahare Fatemi,
Siamak Ravanbakhsh,
David Poole
Abstract:
Knowledge graphs are used to represent relational information in terms of triples. To enable learning about domains, embedding models, such as tensor factorization models, can be used to make predictions of new triples. Often there is background taxonomic information (in terms of subclasses and subproperties) that should also be taken into account. We show that existing fully expressive (a.k.a. universal) models cannot provably respect subclass and subproperty information. We show that minimal modifications to an existing knowledge graph completion method enable injection of taxonomic information. Moreover, we prove that our model is fully expressive, assuming a lower-bound on the size of the embeddings. Experimental results on public knowledge graphs show that despite its simplicity our approach is surprisingly effective.
Submitted 7 December, 2018;
originally announced December 2018.
-
Structure Learning for Relational Logistic Regression: An Ensemble Approach
Authors:
Nandini Ramanan,
Gautam Kunapuli,
Tushar Khot,
Bahare Fatemi,
Seyed Mehran Kazemi,
David Poole,
Kristian Kersting,
Sriraam Natarajan
Abstract:
We consider the problem of learning Relational Logistic Regression (RLR). Unlike standard logistic regression, the features of RLRs are first-order formulae with associated weight vectors instead of scalar weights. We turn the problem of learning RLR to learning these vector-weighted formulae and develop a learning algorithm based on the recently successful functional-gradient boosting methods for probabilistic logic models. We derive the functional gradients and show how weights can be learned simultaneously in an efficient manner. Our empirical evaluation on standard and novel data sets demonstrates the superiority of our approach over other methods for learning RLR.
Submitted 6 August, 2018;
originally announced August 2018.
-
Record Linkage to Match Customer Names: A Probabilistic Approach
Authors:
Bahare Fatemi,
Seyed Mehran Kazemi,
David Poole
Abstract:
Consider the following problem: given a database of records indexed by names (e.g., name of companies, restaurants, businesses, or universities) and a new name, determine whether the new name is in the database, and if so, which record it refers to. This problem is an instance of the record linkage problem and is challenging because people do not consistently use the official name, but use abbreviations, synonyms, different order of terms, different spelling of terms, short form of terms, and the name can contain typos or spacing issues. We provide a probabilistic model using relational logistic regression to find the probability of each record in the database being the desired record for a given query and find the best record(s) with respect to the probabilities. Building on term-matching and translational approaches for search, our model addresses many of the aforementioned challenges and provides good results when existing baselines fail. Using the probabilities output by the model, we can automate the search process for a portion of queries whose desired documents get a probability higher than a trust threshold. We evaluate our model on a large real-world dataset from a telecommunications company and compare it to several state-of-the-art baselines. The obtained results show that our model is a promising probabilistic model for record linkage for names. We also test if the knowledge learned by our model on one domain can be effectively transferred to a new domain. For this purpose, we test our model on an unseen test set from the business names of the secondString dataset. Promising results show that our model can be effectively applied to unseen datasets. Finally, we study the sensitivity of our model to the statistics of datasets.
Submitted 26 June, 2018;
originally announced June 2018.
-
SimplE Embedding for Link Prediction in Knowledge Graphs
Authors:
Seyed Mehran Kazemi,
David Poole
Abstract:
Knowledge graphs contain knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs contain only a small subset of what is true in the world. Link prediction approaches aim at predicting new links for a knowledge graph given the existing links among the entities. Tensor factorization approaches have proved promising for such link prediction problems. Proposed in 1927, Canonical Polyadic (CP) decomposition is among the first tensor factorization approaches. CP generally performs poorly for link prediction as it learns two independent embedding vectors for each entity, whereas they are really tied. We present a simple enhancement of CP (which we call SimplE) to allow the two embeddings of each entity to be learned dependently. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of background knowledge can be incorporated into these embeddings through weight tying. We prove SimplE is fully expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques. SimplE's code is available on GitHub at https://github.com/Mehran-k/SimplE.
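The contrast between CP and SimplE drawn in this abstract can be sketched in a few lines. The abstract itself does not state the scoring function; the sketch assumes the commonly cited formulation in which each entity has a head vector and a tail vector, each relation has a vector for itself and one for its inverse, and the score of a triple averages the forward and inverse three-way products. The function names and the tiny two-dimensional embeddings are illustrative only.

```python
def triple_product(a, b, c):
    """Sum over dimensions of the element-wise product of three vectors."""
    return sum(x * y * z for x, y, z in zip(a, b, c))

def cp_score(head_vec_i, rel_vec, tail_vec_j):
    """CP-style score: uses only the head vector of e_i and the tail vector
    of e_j, so an entity's two vectors are learned independently."""
    return triple_product(head_vec_i, rel_vec, tail_vec_j)

def simple_score(entity_i, entity_j, relation):
    """Assumed SimplE-style score for (e_i, r, e_j): the average of the
    forward product and the product for the inverse relation, which ties
    each entity's head and tail vectors together during learning."""
    forward = triple_product(entity_i["head"], relation["fwd"], entity_j["tail"])
    backward = triple_product(entity_j["head"], relation["inv"], entity_i["tail"])
    return 0.5 * (forward + backward)

# Hypothetical 2-dimensional embeddings for illustration.
e_i = {"head": [1.0, 2.0], "tail": [2.0, 2.0]}
e_j = {"head": [0.0, 1.0], "tail": [3.0, 4.0]}
rel = {"fwd": [1.0, 1.0], "inv": [1.0, 0.0]}

score = simple_score(e_i, e_j, rel)
```

Because both terms involve vectors of both entities, a gradient step on one triple updates the head and tail vectors of each entity, which is the dependence the abstract describes.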
Submitted 25 October, 2018; v1 submitted 13 February, 2018;
originally announced February 2018.
-
RelNN: A Deep Neural Model for Relational Learning
Authors:
Seyed Mehran Kazemi,
David Poole
Abstract:
Statistical relational AI (StarAI) aims at reasoning and learning in noisy domains described in terms of objects and relationships by combining probability with first-order logic. With huge advances in deep learning in recent years, combining deep networks with first-order logic has been the focus of several recent studies. Many of the existing attempts, however, only focus on relations and ignore object properties. The attempts that do consider object properties are limited in terms of modelling power or scalability. In this paper, we develop relational neural networks (RelNNs) by adding hidden layers to relational logistic regression (the relational counterpart of logistic regression). We learn latent properties for objects both directly and through general rules. Back-propagation is used for training these models. A modular, layer-wise architecture facilitates applying techniques developed within the deep learning community to our architecture. Initial experiments on eight tasks over three real-world datasets show that RelNNs are promising models for relational learning.
Submitted 7 December, 2017;
originally announced December 2017.
-
Comparing Aggregators for Relational Probabilistic Models
Authors:
Seyed Mehran Kazemi,
Bahare Fatemi,
Alexandra Kim,
Zilun Peng,
Moumita Roy Tora,
Xing Zeng,
Matthew Dirks,
David Poole
Abstract:
Relational probabilistic models have the challenge of aggregation, where one variable depends on a population of other variables. Consider the problem of predicting gender from movie ratings; this is challenging because the number of movies per user and users per movie can vary greatly. Surprisingly, aggregation is not well understood. In this paper, we show that existing relational models (implicitly or explicitly) either use simple numerical aggregators that lose great amounts of information, or correspond to naive Bayes, logistic regression, or noisy-OR that suffer from overconfidence. We propose new simple aggregators and simple modifications of existing models that empirically outperform the existing ones. The intuition we provide on different (existing or new) models and their shortcomings plus our empirical findings promise to form the foundation for future representations.
Submitted 24 July, 2017;
originally announced July 2017.
-
Domain Recursion for Lifted Inference with Existential Quantifiers
Authors:
Seyed Mehran Kazemi,
Angelika Kimmig,
Guy Van den Broeck,
David Poole
Abstract:
In recent work, we proved that the domain recursion inference rule makes domain-lifted inference possible on several relational probability models (RPMs) for which the best known time complexity used to be exponential. We also identified two classes of RPMs for which inference becomes domain lifted when using domain recursion. These two classes subsume the largest lifted classes that were previously known. In this paper, we show that domain recursion can also be applied to models with existential quantifiers. Currently, all lifted inference algorithms assume that existential quantifiers have been removed in pre-processing by Skolemization. We show that besides introducing potentially inconvenient negative weights, Skolemization may increase the time complexity of inference. We give two example models where domain recursion can replace Skolemization, avoids the need for dealing with negative numbers, and reduces the time complexity of inference. These two examples may be interesting from three theoretical aspects: 1- they provide a better and deeper understanding of domain recursion and, in general, (lifted) inference, 2- they may serve as evidence that there are larger classes of models for which domain recursion can satisfyingly replace Skolemization, and 3- they may serve as evidence that better Skolemization techniques exist.
Submitted 27 July, 2017; v1 submitted 24 July, 2017;
originally announced July 2017.
-
New Liftable Classes for First-Order Probabilistic Inference
Authors:
Seyed Mehran Kazemi,
Angelika Kimmig,
Guy Van den Broeck,
David Poole
Abstract:
Statistical relational models provide compact encodings of probabilistic dependencies in relational domains, but result in highly intractable graphical models. The goal of lifted inference is to carry out probabilistic inference without needing to reason about each individual separately, by instead treating exchangeable, undistinguished objects as a whole. In this paper, we study the domain recursion inference rule, which, despite its central role in early theoretical results on domain-lifted inference, has later been believed redundant. We show that this rule is more powerful than expected, and in fact significantly extends the range of models for which lifted inference runs in time polynomial in the number of individuals in the domain. This includes an open problem called S4, the symmetric transitivity model, and a first-order logic encoding of the birthday paradox. We further identify new classes S2FO2 and S2RU of domain-liftable theories, which respectively subsume FO2 and recursively unary theories, the largest classes of domain-liftable theories known so far, and show that using domain recursion can achieve exponential speedup even in theories that cannot fully be lifted with the existing set of inference rules.
Submitted 26 October, 2016;
originally announced October 2016.
-
Birth of a giant $(k_1,k_2)$-core in the random digraph
Authors:
Boris Pittel,
Dan Poole
Abstract:
The $(k_1,k_2)$-core of a digraph is the largest sub-digraph with minimum in-degree and minimum out-degree at least $k_1$ and $k_2$ respectively. For $\max\{k_1, k_2\} \geq 2$, we establish existence of the threshold edge-density $c^*=c^*(k_1,k_2)$, such that the random digraph $D(n,m)$, on the vertex set $[n]$ with $m$ edges, asymptotically almost surely has a giant $(k_1,k_2)$-core if $m/n> c^*$, and has no $(k_1,k_2)$-core if $m/n<c^*$. Specifically, denoting $\text{P}(\text{Poisson}(z)\ge k)$ by $p_k(z)$, we prove that $c^*=\min\limits_{z_1,z_2}\max\left\{\tfrac{z_1}{p_{k_1}(z_1)p_{k_2-1}(z_2)}; \tfrac{z_2}{p_{k_1-1}(z_1)p_{k_2}(z_2)}\right\}$.
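The closed-form threshold above can be evaluated numerically. The following is a rough sketch, not the authors' procedure: it approximates the minimum over $z_1, z_2$ by a plain grid search, and the grid bounds `z_max` and resolution `steps` are arbitrary assumptions, so the returned value is only an upper approximation of $c^*$.

```python
import math

def poisson_tail(k, z):
    """p_k(z) = P(Poisson(z) >= k), computed as 1 - P(Poisson(z) <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-z)  # P(Poisson(z) = 0)
    below = 0.0
    for j in range(k):
        below += term
        term *= z / (j + 1)  # advance to P(Poisson(z) = j + 1)
    return 1.0 - below

def core_threshold(k1, k2, z_max=10.0, steps=200):
    """Grid-search approximation of
    c* = min_{z1,z2} max{ z1 / (p_{k1}(z1) p_{k2-1}(z2)),
                          z2 / (p_{k1-1}(z1) p_{k2}(z2)) }."""
    grid = [z_max * (i + 1) / steps for i in range(steps)]
    best = math.inf
    for z1 in grid:
        p_k1 = poisson_tail(k1, z1)
        p_k1m = poisson_tail(k1 - 1, z1)
        for z2 in grid:
            d1 = p_k1 * poisson_tail(k2 - 1, z2)
            d2 = p_k1m * poisson_tail(k2, z2)
            if d1 <= 0.0 or d2 <= 0.0:
                continue  # skip points where rounding makes a tail vanish
            best = min(best, max(z1 / d1, z2 / d2))
    return best

c_star_22 = core_threshold(2, 2)
```

A finer grid or a local optimizer would tighten the estimate; the formula is symmetric under swapping $(k_1, z_1)$ with $(k_2, z_2)$, which gives a quick sanity check.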
Submitted 17 August, 2016;
originally announced August 2016.
-
A Learning Algorithm for Relational Logistic Regression: Preliminary Results
Authors:
Bahare Fatemi,
Seyed Mehran Kazemi,
David Poole
Abstract:
Relational logistic regression (RLR) is a representation of conditional probability in terms of weighted formulae for modelling multi-relational data. In this paper, we develop a learning algorithm for RLR models. Learning an RLR model from data consists of two steps: 1- learning the set of formulae to be used in the model (a.k.a. structure learning) and 2- learning the weight of each formula (a.k.a. parameter learning). For structure learning, we deploy Schmidt and Murphy's hierarchical assumption: first we learn a model with simple formulae, then more complex formulae are added iteratively only if all their sub-formulae have proven effective in previous learned models. For parameter learning, we convert the problem into a non-relational learning problem and use an off-the-shelf logistic regression learning algorithm from Weka, an open-source machine learning tool, to learn the weights. We also indicate how hidden features about the individuals can be incorporated into RLR to boost the learning performance. We compare our learning algorithm to other structure and parameter learning algorithms in the literature, and compare the performance of RLR models to standard logistic regression and RDN-Boost on a modified version of the MovieLens data-set.
Submitted 27 June, 2016;
originally announced June 2016.
-
Why is Compiling Lifted Inference into a Low-Level Language so Effective?
Authors:
Seyed Mehran Kazemi,
David Poole
Abstract:
First-order knowledge compilation techniques have proven efficient for lifted inference. They compile a relational probability model into a target circuit on which many inference queries can be answered efficiently. Early methods used data structures as their target circuit. In our KR-2016 paper, we showed that compiling to a low-level program instead of a data structure offers orders of magnitude speedup, resulting in the state-of-the-art lifted inference technique. In this paper, we conduct experiments to address two questions regarding our KR-2016 results: 1- does the speedup come from more efficient compilation or more efficient reasoning with the target circuit?, and 2- why are low-level programs more efficient target circuits than data structures?
Submitted 14 June, 2016;
originally announced June 2016.
-
On Weak Hamiltonicity of a Random Hypergraph
Authors:
Daniel Poole
Abstract:
A {\it weak (Berge) cycle} is an alternating sequence of vertices and (hyper)edges $C=(v_0, e_1, v_1, ..., v_{\ell-1}, e_\ell, v_{\ell}=v_0)$ such that the vertices $v_0, ..., v_{\ell-1}$ are distinct with $v_k, v_{k+1} \in e_{k}$ for each $k$, but the edges $e_1, ..., e_\ell$ are not necessarily distinct. We prove that the main barrier to the random $d$-uniform hypergraph $H_d(n,p),$ where each of the potential edges of cardinality $d$ is present with probability $p$, developing a weak Hamilton cycle is the presence of isolated vertices. In particular, for $d \geq 3$ fixed and $p=(d-1)! \frac{\ln n + c}{n^{d-1}}$, the probability that $H_d(n, p)$ has a weak Hamilton cycle tends to $e^{-e^{-c}}$, which is also the limiting probability that $H_d(n,p)$ has no isolated vertices. As a consequence, the probability that the random hypergraph $H_d(n, m=\frac{n(\ln n + c)}{d}),$ where $m$ potential edges are chosen uniformly at random to be present, is weak Hamiltonian also tends to $e^{-e^{-c}}$.
Submitted 27 October, 2014;
originally announced October 2014.
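The limit $e^{-e^{-c}}$ in the abstract above can be sanity-checked with a first-moment calculation. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical. It computes the expected number of isolated vertices in $H_d(n,p)$ at the stated $p$, which approaches $e^{-c}$ as $n$ grows. If the count of isolated vertices is assumed to be asymptotically Poisson with that mean (an assumption here, not something the code verifies), the probability of having none is $e^{-e^{-c}}$.

```python
import math


def expected_isolated_vertices(n: int, d: int, c: float) -> float:
    """Expected number of isolated vertices in H_d(n, p) at the threshold p."""
    # Edge probability from the abstract: p = (d-1)! (ln n + c) / n^(d-1).
    p = math.factorial(d - 1) * (math.log(n) + c) / n ** (d - 1)
    # A fixed vertex belongs to C(n-1, d-1) potential edges and is isolated
    # exactly when none of them is present.
    edges_per_vertex = math.comb(n - 1, d - 1)
    # log1p keeps precision because p is tiny for large n.
    prob_isolated = math.exp(edges_per_vertex * math.log1p(-p))
    return n * prob_isolated


def limiting_probability(c: float) -> float:
    """The limit e^{-e^{-c}} quoted in the abstract."""
    return math.exp(-math.exp(-c))


if __name__ == "__main__":
    for c in (-1.0, 0.0, 1.0):
        mean = expected_isolated_vertices(10**6, 3, c)
        print(c, mean, math.exp(-c), limiting_probability(c))
```

For $n = 10^6$ and $d = 3$ the computed mean already agrees closely with $e^{-c}$. This is consistent with isolated vertices being the main barrier, but it is not a proof of it.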
-
On the strength of connectedness of a random hypergraph
Authors:
Daniel Poole
Abstract:
Bollobás and Thomason (1985) proved that for each $k=k(n) \in [1, n-1]$, with high probability, the random graph process, where edges are added to vertex set $V=[n]$ uniformly at random one after another, is such that the stopping time of having minimal degree $k$ is equal to the stopping time of becoming $k$-(vertex-)connected. We extend this result to the $d$-uniform random hypergraph process, where $k$ and $d$ are fixed. Consequently, for $m=\frac{n}{d}(\ln n +(k-1)\ln \ln n +c)$ and $p=(d-1)! \frac{\ln n + (k-1) \ln \ln n +c}{n^{d-1}}$, the probability that the random hypergraph models $H_d(n, m)$ and $H_d(n, p)$ are $k$-connected tends to $e^{-e^{-c}/(k-1)!}.$
Submitted 9 March, 2015; v1 submitted 4 September, 2014;
originally announced September 2014.
-
Asymptotic distribution of the numbers of vertices and arcs of the giant strong component in sparse random digraphs
Authors:
Boris Pittel,
Daniel Poole
Abstract:
Two models of a random digraph on $n$ vertices, $D(n,\text{Prob}(\text{arc})=p)$ and $D(n,\text{number of arcs}=m)$ are studied. In 1990, Karp for $D(n,p)$ and independently T. Łuczak for $D(n,m=cn)$ proved that for $c>1$, with probability tending to 1, there is a unique strong component of size of order $n$. Karp showed, in fact, that the giant component has likely size asymptotic to $n\theta^2$, where $\theta=\theta(c)$ is the unique positive root of $1-\theta=e^{-c\theta}$. In this paper we prove that, for both random digraphs, the joint distribution of the number of vertices and number of arcs in the giant strong component is asymptotically Gaussian with the same mean vector $n\boldsymbol{\mu}(c)$, $\boldsymbol{\mu}(c):=(\theta^2, c\theta^2)$ and two distinct $2\times 2$ covariance matrices, $n\mathbf{B}(c)$ and $n[\mathbf{B}(c)+c (\boldsymbol{\mu}'(c))^T (\boldsymbol{\mu}'(c))]$. To this end, we introduce and analyze a randomized deletion process which determines the directed $(1,1)$-core, the maximal digraph with minimum in-degree and out-degree at least 1. This $(1,1)$-core contains all non-trivial strong components. However, we show that the likely numbers of peripheral vertices and arcs in the $(1,1)$-core, those outside the largest strong component, are of log-polynomial order, thus dwarfed by anticipated fluctuations, on the scale of $n^{1/2}$, of the giant component parameters. By approximating the likely realization of the deletion algorithm with a deterministic trajectory, we obtain our main result via exponential supermartingales and Fourier-based techniques.
Submitted 20 May, 2015; v1 submitted 15 May, 2014;
originally announced May 2014.
-
Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence (1994)
Authors:
Ramon Lopez de Mantaras,
David Poole
Abstract:
This is the Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence, which was held in Seattle, WA, July 29-31, 1994
Submitted 13 April, 2013;
originally announced April 2013.
-
Towards Solving the Multiple Extension Problem: Combining Defaults and Probabilities
Authors:
Eric Neufeld,
David L Poole
Abstract:
The multiple extension problem arises frequently in diagnostic and default inference. That is, we can often use any of a number of sets of defaults or possible hypotheses to explain observations or make predictions. In default inference, some extensions seem to be simply wrong and we use qualitative techniques to weed out the unwanted ones. In the area of diagnosis, however, the multiple explanations may all seem reasonable, however improbable. Choosing among them is a matter of quantitative preference. Quantitative preference works well in diagnosis when knowledge is modelled causally. Here we suggest a framework that combines probabilities and defaults in a single unified framework that retains the semantics of diagnosis as construction of explanations from a fixed set of possible hypotheses. We can then compute probabilities incrementally as we construct explanations. Here we describe a branch and bound algorithm that maintains a set of all partial explanations while exploring a most promising one first. A most probable explanation is found first if explanations are partially ordered.
Submitted 27 March, 2013;
originally announced April 2013.
-
Probabilistic Semantics and Defaults
Authors:
Eric Neufeld,
David L Poole
Abstract:
There is much interest in providing probabilistic semantics for defaults but most approaches seem to suffer from one of two problems: either they require numbers, a problem defaults were intended to avoid, or they generate peculiar side effects. Rather than provide semantics for defaults, we address the problem defaults were intended to solve: that of reasoning under uncertainty where numeric probability distributions are not available. We describe a non-numeric formalism called an inference graph based on standard probability theory, conditional independence and sentences of favouring where a favours b - favours(a, b) - p(a|b) > p(a). The formalism seems to handle the examples from the nonmonotonic literature. Most importantly, the sentences of our system can be verified by performing an appropriate experiment in the semantic domain.
Submitted 27 March, 2013;
originally announced April 2013.
-
Can Uncertainty Management be Realized in a Finite Totally Ordered Probability Algebra?
Authors:
Yang Xiang,
Michael P. Beddoes,
David L Poole
Abstract:
In this paper, the feasibility of using finite totally ordered probability models under Aleliunas's Theory of Probabilistic Logic [Aleliunas, 1988] is investigated. The general form of the probability algebra of these models is derived and the number of possible algebras with given size is deduced. Based on this analysis, we discuss problems of denominator-indifference and ambiguity-generation that arise in reasoning by cases and abductive reasoning. An example is given that illustrates how these problems arise. The investigation shows that a finite probability model may be of very limited usage.
Submitted 27 March, 2013;
originally announced April 2013.
-
A Dynamic Approach to Probabilistic Inference
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
In this paper we present a framework for dynamically constructing Bayesian networks. We introduce the notion of a background knowledge base of schemata, which is a collection of parameterized conditional probability statements. These schemata explicitly separate the general knowledge of properties an individual may have from the specific knowledge of particular individuals that may have these properties. Knowledge of individuals can be combined with this background knowledge to create Bayesian networks, which can then be used in any propagation scheme. We discuss the theory and assumptions necessary for the implementation of dynamic Bayesian networks, and indicate where our approach may be useful.
Submitted 27 March, 2013;
originally announced April 2013.
-
What is an Optimal Diagnosis?
Authors:
David L. Poole,
Gregory M. Provan
Abstract:
Within diagnostic reasoning there have been a number of proposed definitions of a diagnosis, and thus of the most likely diagnosis, including most probable posterior hypothesis, most probable interpretation, most probable covering hypothesis, etc. Most of these approaches assume that the most likely diagnosis must be computed, and that a definition of what should be computed can be made a priori, independent of what the diagnosis is used for. We argue that the diagnostic problem, as currently posed, is incomplete: it does not consider how the diagnosis is to be used, or the utility associated with the treatment of the abnormalities. In this paper we analyze several well-known definitions of diagnosis, showing that the different definitions of the most likely diagnosis have different qualitative meanings, even given the same input data. We argue that the most appropriate definition of (optimal) diagnosis needs to take into account the utility of outcomes and what the diagnosis is used for.
Submitted 27 March, 2013;
originally announced April 2013.
-
High Level Path Planning with Uncertainty
Authors:
Runping Qi,
David L. Poole
Abstract:
For high level path planning, environments are usually modeled as distance graphs, and path planning problems are reduced to computing the shortest path in distance graphs. One major drawback of this modeling is the inability to model uncertainties, which are often encountered in practice. In this paper, a new tool, called U-graph, is proposed for environment modeling. A U-graph is an extension of distance graphs with the ability to handle a kind of uncertainty. By modeling an uncertain environment as a U-graph, and a navigation problem as a Markovian decision process, we can precisely define a new optimality criterion for navigation plans, and more importantly, we can come up with a general algorithm for computing optimal plans for navigation tasks.
Submitted 20 March, 2013;
originally announced March 2013.
-
Representing Bayesian Networks within Probabilistic Horn Abduction
Authors:
David L. Poole
Abstract:
This paper presents a simple framework for Horn clause abduction, with probabilities associated with hypotheses. It is shown how this representation can represent any probabilistic knowledge representable in a Bayesian belief network. The main contributions are in finding a relationship between logical and probabilistic notions of evidential reasoning. This can be used as a basis for a new way to implement Bayesian Networks that allows for approximations to the value of the posterior probabilities, and also points to a way that Bayesian networks can be extended beyond a propositional language.
Submitted 20 March, 2013;
originally announced March 2013.
-
Sidestepping the Triangulation Problem in Bayesian Net Computations
Authors:
Nevin Lianwen Zhang,
David L. Poole
Abstract:
This paper presents a new approach for computing posterior probabilities in Bayesian nets, which sidesteps the triangulation problem. The current state of the art is the clique tree propagation approach. When the underlying graph of a Bayesian net is triangulated, this approach arranges its cliques into a tree and computes posterior probabilities by appropriately passing around messages in that tree. The computation in each clique is simply direct marginalization. When the underlying graph is not triangulated, one has to first triangulate it by adding edges. Referred to as the triangulation problem, the problem of finding an optimal or even a "good" triangulation proves to be difficult. In this paper, we propose to first decompose a Bayesian net into smaller components by making use of Tarjan's algorithm for decomposing an undirected graph at all its minimal complete separators. Then, the components are arranged into a tree and posterior probabilities are computed by appropriately passing around messages in that tree. The computation in each component is carried out by repeating the whole procedure from the beginning. Thus the triangulation problem is sidestepped.
Submitted 13 March, 2013;
originally announced March 2013.
-
Exploring Localization in Bayesian Networks for Large Expert Systems
Authors:
Yang Xiang,
David L. Poole,
Michael P. Beddoes
Abstract:
Current Bayesian net representations do not consider structure in the domain and include all variables in a homogeneous network. At any time, a human reasoner in a large domain may direct his attention to only one of a number of natural subdomains, i.e., there is 'localization' of queries and evidence. In such a case, propagating evidence through a homogeneous network is inefficient since the entire network has to be updated each time. This paper presents multiply sectioned Bayesian networks that enable a (localization preserving) representation of natural subdomains by separate Bayesian subnets. The subnets are transformed into a set of permanent junction trees such that evidential reasoning takes place at only one of them at a time. Probabilities obtained are identical to those that would be obtained from the homogeneous network. We discuss attention shift to a different junction tree and propagation of previously acquired evidence. Although the overall system can be large, computational requirements are governed by the size of only one junction tree.
Submitted 13 March, 2013;
originally announced March 2013.
-
Incremental computation of the value of perfect information in stepwise-decomposable influence diagrams
Authors:
Nevin Lianwen Zhang,
Runping Qi,
David L. Poole
Abstract:
To determine the value of perfect information in an influence diagram, one needs first to modify the diagram to reflect the change in information availability, and then to compute the optimal expected values of both the original diagram and the modified diagram. The value of perfect information is the difference between the two optimal expected values. This paper is about how to speed up the computation of the optimal expected value of the modified diagram by making use of the intermediate computation results obtained when computing the optimal expected value of the original diagram.
Submitted 6 March, 2013;
originally announced March 2013.
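The definition in the abstract above, the value of perfect information as the difference between two optimal expected values, can be illustrated on a toy one-shot decision. The sketch below is a minimal illustration under stated assumptions: it uses a flat decision table rather than an influence diagram, it does not implement the paper's incremental speed-up, and all names and numbers (the umbrella example, the function names) are hypothetical.

```python
from typing import Dict

Prior = Dict[str, float]               # state -> probability
Utility = Dict[str, Dict[str, float]]  # action -> state -> utility


def optimal_value_without_information(prior: Prior, utility: Utility) -> float:
    """Best expected utility when one action must be chosen blind."""
    return max(
        sum(prior[state] * payoff[state] for state in prior)
        for payoff in utility.values()
    )


def optimal_value_with_information(prior: Prior, utility: Utility) -> float:
    """Best expected utility when the state is observed before acting."""
    return sum(
        prior[state] * max(payoff[state] for payoff in utility.values())
        for state in prior
    )


def value_of_perfect_information(prior: Prior, utility: Utility) -> float:
    """Difference between the two optimal expected values."""
    return (optimal_value_with_information(prior, utility)
            - optimal_value_without_information(prior, utility))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    prior = {"rain": 0.3, "sun": 0.7}
    utility = {
        "take_umbrella": {"rain": 70.0, "sun": 20.0},
        "leave_umbrella": {"rain": 0.0, "sun": 100.0},
    }
    print(value_of_perfect_information(prior, utility))  # roughly 21
```

In this toy setting both optimal values are computed from scratch. The paper's point is that, for influence diagrams, intermediate results from evaluating the original diagram can be reused when evaluating the modified one.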
-
The use of conflicts in searching Bayesian networks
Authors:
David L. Poole
Abstract:
This paper discusses how conflicts (as used by the consistency-based diagnosis community) can be adapted to be used in a search-based algorithm for computing prior and posterior probabilities in discrete Bayesian Networks. This is an "anytime" algorithm, that at any stage can estimate the probabilities and give an error bound. Whereas the most popular Bayesian net algorithms exploit the structure of the network for efficiency, we exploit probability distributions for efficiency; this algorithm is most suited to the case with extreme probabilities. This paper presents a solution to the inefficiencies found in naive algorithms, and shows how the tools of the consistency-based diagnosis community (namely conflicts) can be used effectively to improve the efficiency. Empirical results with networks having tens of thousands of nodes are presented.
Submitted 6 March, 2013;
originally announced March 2013.
-
Inter-causal Independence and Heterogeneous Factorization
Authors:
Nevin Lianwen Zhang,
David L Poole
Abstract:
It is well known that conditional independence can be used to factorize a joint probability into a multiplication of conditional probabilities. This paper proposes a constructive definition of inter-causal independence, which can be used to further factorize a conditional probability. An inference algorithm is developed, which makes use of both conditional independence and inter-causal independence to reduce inference complexity in Bayesian networks.
Submitted 27 February, 2013;
originally announced February 2013.
-
Solving Asymmetric Decision Problems with Influence Diagrams
Authors:
Runping Qi,
Nevin Lianwen Zhang,
David L. Poole
Abstract:
While influence diagrams have many advantages as a representation framework for Bayesian decision problems, they have a serious drawback in handling asymmetric decision problems. To be represented in an influence diagram, an asymmetric decision problem must be symmetrized. A considerable amount of unnecessary computation may be involved when a symmetrized influence diagram is evaluated by conventional algorithms. In this paper we present an approach for avoiding such unnecessary computation in influence diagram evaluation.
Submitted 27 February, 2013;
originally announced February 2013.
-
Exploiting the Rule Structure for Decision Making within the Independent Choice Logic
Authors:
David L. Poole
Abstract:
This paper introduces the independent choice logic, and in particular the "single agent with nature" instance of the independent choice logic, namely ICLdt. This is a logical framework for decision making under uncertainty that extends both logic programming and stochastic models such as influence diagrams. This paper shows how the representation of a decision problem within the independent choice logic can be exploited to cut down the combinatorics of dynamic programming. One of the main problems with influence diagram evaluation techniques is the need to optimise a decision for all values of the 'parents' of a decision variable. In this paper we show how the rule-based nature of the ICLdt can be exploited so that we only make distinctions in the values of the information available for a decision that will make a difference to utility.
Submitted 20 February, 2013;
originally announced February 2013.
-
A Framework for Decision-Theoretic Planning I: Combining the Situation Calculus, Conditional Plans, Probability and Utility
Authors:
David L. Poole
Abstract:
This paper shows how we can combine logical representations of actions and decision theory in such a manner that seems natural for both. In particular we assume an axiomatization of the domain in terms of situation calculus, using what is essentially Reiter's solution to the frame problem, in terms of the completion of the axioms defining the state change. Uncertainty is handled in terms of the independent choice logic, which allows for independent choices and a logic program that gives the consequences of the choices. Part of the consequences is a specification of the utility of (final) states. The robot adopts robot plans, similar to the GOLOG programming language. Within this logic, we can define the expected utility of a conditional plan, based on the axiomatization of the actions, the uncertainty and the utility. The 'planning' problem is to find the plan with the highest expected utility. This is related to recent structured representations for POMDPs; here we use stochastic situation calculus rules to specify the state transition function and the reward/value function. Finally we show that with stochastic frame axioms, action representations in probabilistic STRIPS are exponentially larger than using the representation proposed here.
Submitted 13 February, 2013;
originally announced February 2013.
-
Flexible Policy Construction by Information Refinement
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
We report on work towards flexible algorithms for solving decision problems represented as influence diagrams. An algorithm is given to construct a tree structure for each decision node in an influence diagram. Each tree represents a decision function and is constructed incrementally. The improvements to the tree converge to the optimal decision function (neglecting computational costs) and the asymptotic behaviour is only a constant factor worse than dynamic programming techniques, counting the number of Bayesian network queries. Empirical results show how expected utility increases with the size of the tree and the number of Bayesian net calculations.
Submitted 13 February, 2013;
originally announced February 2013.
-
Context-Specific Approximation in Probabilistic Inference
Authors:
David L. Poole
Abstract:
There is evidence that the numbers in probabilistic inference don't really matter. This paper considers the idea that we can make a probabilistic model simpler by making fewer distinctions. Unfortunately, the level of a Bayesian network seems too coarse; it is unlikely that a parent will make little difference for all values of the other parents. In this paper we consider an approximation scheme where distinctions can be ignored in some contexts, but not in other contexts. We elaborate on a notion of a parent context that allows a structured context-specific decomposition of a probability distribution and the associated probabilistic inference scheme called probabilistic partial evaluation (Poole 1997). This paper shows a way to simplify a probabilistic model by ignoring distinctions which have similar probabilities, a method to exploit the simpler model, a bound on the resulting errors, and some preliminary empirical results on simple networks.
Submitted 30 January, 2013;
originally announced January 2013.
-
An Anytime Algorithm for Decision Making under Uncertainty
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
We present an anytime algorithm which computes policies for decision problems represented as multi-stage influence diagrams. Our algorithm constructs policies incrementally, starting from a policy which makes no use of the available information. The incremental process constructs policies which include more of the information available to the decision maker at each step. While the process converges to the optimal policy, our approach is designed for situations in which computing the optimal policy is infeasible. We provide examples of the process on several large decision problems, showing that, for these examples, the process constructs valuable (but sub-optimal) policies before the optimal policy would be available by traditional methods.
Submitted 30 January, 2013;
originally announced January 2013.
-
Estimating the Value of Computation in Flexible Information Refinement
Authors:
Michael C. Horsch,
David L. Poole
Abstract:
We outline a method to estimate the value of computation for a flexible algorithm using empirical data. To determine a reasonable trade-off between cost and value, we build an empirical model of the value obtained through computation, and apply this model to estimate the value of computation for quite different problems. In particular, we investigate this trade-off for the problem of constructing policies for decision problems represented as influence diagrams. We show how two features of our anytime algorithm provide reasonable estimates of the value of computation in this domain.
Submitted 23 January, 2013;
originally announced January 2013.
-
Reasoning With Conditional Ceteris Paribus Preference Statements
Authors:
Craig Boutilier,
Ronen I. Brafman,
Holger H. Hoos,
David L. Poole
Abstract:
In many domains it is desirable to assess the preferences of users in a qualitative rather than quantitative way. Such representations of qualitative preference orderings form an important component of automated decision tools. We propose a graphical representation of preferences that reflects conditional dependence and independence of preference statements under a ceteris paribus (all else being equal) interpretation. Such a representation is often compact and arguably natural. We describe several search algorithms for dominance testing based on this representation; these algorithms are quite effective, especially in specific network topologies, such as chain- and tree-structured networks, as well as polytrees.
Submitted 23 January, 2013;
originally announced January 2013.
-
Building a Stochastic Dynamic Model of Application Use
Authors:
Peter J. Gorniak,
David L. Poole
Abstract:
Many intelligent user interfaces employ application and user models to determine the user's preferences, goals and likely future actions. Such models require application analysis, adaptation and expansion. Building and maintaining such models adds a substantial amount of time and labour to the application development cycle. We present a system that observes the interface of an unmodified application and records users' interactions with the application. From a history of such observations we build a coarse state space of observed interface states and actions between them. To refine the space, we hypothesize sub-states based upon the histories that led users to a given state. We evaluate the information gain of possible state splits, varying the length of the histories considered in such splits. In this way, we automatically produce a stochastic dynamic model of the application and of how it is used. To evaluate our approach, we present models derived from real-world application usage data.
Submitted 16 January, 2013;
originally announced January 2013.
-
Symmetric Collaborative Filtering Using the Noisy Sensor Model
Authors:
Rita Sharma,
David L Poole
Abstract:
Collaborative filtering is the process of making recommendations regarding the potential preference of a user, for example shopping on the Internet, based on the preference ratings of the user and a number of other users for various items. This paper considers collaborative filtering based on explicit multi-valued ratings. To evaluate the algorithms, we consider only pure collaborative filtering, using ratings exclusively, and no other information about the people or items. Our approach is to predict a user's preferences regarding a particular item by using other people who rated that item and other items rated by the user as noisy sensors. The noisy sensor model uses Bayes' theorem to compute the probability distribution for the user's rating of a new item. We give two variant models: in one, we learn a classical normal linear regression model of how users rate items; in another, we assume different users rate items the same, but the accuracy of the sensors needs to be learned. We compare these variant models with state-of-the-art techniques and show how they are significantly better, whether a user has rated only two items or many. We report empirical results using the EachMovie database (http://research.compaq.com/SRC/eachmovie/) of movie ratings. We also show that by considering item similarity along with user similarity, the accuracy of the prediction increases.
Submitted 10 January, 2013;
originally announced January 2013.
-
Efficient Inference in Large Discrete Domains
Authors:
Rita Sharma,
David L Poole
Abstract:
In this paper we examine the problem of inference in Bayesian Networks with discrete random variables that have very large or even unbounded domains. For example, in a domain where we are trying to identify a person, we may have variables that have as domains, the set of all names, the set of all postal codes, or the set of all credit card numbers. We cannot just have big tables of the conditional probabilities, but need compact representations. We provide an inference algorithm, based on variable elimination, for belief networks containing both large domain and normal discrete random variables. We use intensional (i.e., in terms of procedures) and extensional (in terms of listing the elements) representations of conditional probabilities and of the intermediate factors.
Submitted 19 October, 2012;
originally announced December 2012.
-
Nonparametric Bayesian Logic
Authors:
Peter Carbonetto,
Jacek Kisynski,
Nando de Freitas,
David L Poole
Abstract:
The Bayesian Logic (BLOG) language was recently developed for defining first-order probability models over worlds with unknown numbers of objects. It handles important problems in AI, including data association and population estimation. This paper extends BLOG by adopting generative processes over function spaces - known as nonparametrics in the Bayesian literature. We introduce syntax for reasoning about arbitrary collections of objects, and their properties, in an intuitive manner. By exploiting exchangeability, distributions over unknown objects and their attributes are cast as Dirichlet processes, which resolve difficulties in model selection and inference caused by varying numbers of objects. We demonstrate these concepts with application to citation matching.
Submitted 4 July, 2012;
originally announced July 2012.
-
Seeing the Forest Despite the Trees: Large Scale Spatial-Temporal Decision Making
Authors:
Mark Crowley,
John Nelson,
David L Poole
Abstract:
We introduce a challenging real-world planning problem where actions must be taken at each location in a spatial area at each point in time. We use forestry planning as the motivating application. In Large Scale Spatial-Temporal (LSST) planning problems, the state and action spaces are defined as the cross-products of many local state and action spaces spread over a large spatial area such as a city or forest. These problems possess state uncertainty, have complex utility functions involving spatial constraints and we generally must rely on simulations rather than an explicit transition model. We define LSST problems as reinforcement learning problems and present a solution using policy gradients. We compare two different policy formulations: an explicit policy that identifies each location in space and the action to take there; and an abstract policy that defines the proportion of actions to take across all locations in space. We show that the abstract policy is more robust and achieves higher rewards with far fewer parameters than the elementary policy. This abstract policy is also a better fit to the properties that practitioners in LSST problem domains require for such methods to be widely useful.
Submitted 9 May, 2012;
originally announced May 2012.
-
Constraint Processing in Lifted Probabilistic Inference
Authors:
Jacek Kisynski,
David L Poole
Abstract:
First-order probabilistic models combine representational power of first-order logic with graphical models. There is an ongoing effort to design lifted inference algorithms for first-order probabilistic models. We analyze lifted inference from the perspective of constraint processing and, through this viewpoint, we analyze and compare existing approaches and expose their advantages and limitations. Our theoretical results show that the wrong choice of constraint processing method can lead to exponential increase in computational complexity. Our empirical tests confirm the importance of constraint processing in lifted inference. This is the first theoretical and empirical study of constraint processing in lifted inference.
Submitted 9 May, 2012;
originally announced May 2012.
-
Towards Completely Lifted Search-based Probabilistic Inference
Authors:
David Poole,
Fahiem Bacchus,
Jacek Kisynski
Abstract:
The promise of lifted probabilistic inference is to carry out probabilistic inference in a relational probabilistic model without needing to reason about each individual separately (grounding out the representation) by treating the undistinguished individuals as a block. Current exact methods still need to ground out in some cases, typically because the representation of the intermediate results is not closed under the lifted operations. We set out to answer the question as to whether there is some fundamental reason why lifted algorithms would need to ground out undifferentiated individuals. We have two main results: (1) We completely characterize the cases where grounding is polynomial in a population size, and show how we can do lifted inference in time polynomial in the logarithm of the population size for these cases. (2) For the case of no-argument and single-argument parametrized random variables where the grounding is not polynomial in a population size, we present lifted inference which is polynomial in the population size whereas grounding is exponential. Neither of these cases requires reasoning separately about the individuals that are not explicitly mentioned.
Submitted 21 July, 2011; v1 submitted 20 July, 2011;
originally announced July 2011.
-
CP-nets: A Tool for Representing and Reasoning with Conditional Ceteris Paribus Preference Statements
Authors:
C. Boutilier,
R. I. Brafman,
C. Domshlak,
H. H. Hoos,
D. Poole
Abstract:
Information about user preferences plays a key role in automated decision making. In many domains it is desirable to assess such preferences in a qualitative rather than quantitative way. In this paper, we propose a qualitative graphical representation of preferences that reflects conditional dependence and independence of preference statements under a ceteris paribus (all else being equal) interpretation. Such a representation is often compact and arguably quite natural in many circumstances. We provide a formal semantics for this model, and describe how the structure of the network can be exploited in several inference tasks, such as determining whether one outcome dominates (is preferred to) another, ordering a set of outcomes according to the preference relation, and constructing the best outcome subject to available evidence.
Submitted 30 June, 2011;
originally announced July 2011.