-
Conformal Off-Policy Prediction for Multi-Agent Systems
Authors:
Tom Kuipers,
Renukanandan Tumu,
Shuo Yang,
Milad Kazemi,
Rahul Mangharam,
Nicola Paoletti
Abstract:
Off-Policy Prediction (OPP), i.e., predicting the outcomes of a target policy using only data collected under a nominal (behavioural) policy, is a paramount problem in data-driven analysis of safety-critical systems where the deployment of a new policy may be unsafe. To achieve dependable off-policy predictions, recent work on Conformal Off-Policy Prediction (COPP) leverages the conformal prediction framework to derive prediction regions with probabilistic guarantees under the target process. Existing COPP methods can account for the distribution shifts induced by policy switching, but are limited to single-agent systems and scalar outcomes (e.g., rewards). In this work, we introduce MA-COPP, the first conformal prediction method to solve OPP problems involving multi-agent systems, deriving joint prediction regions (JPRs) for all agents' trajectories when one or more "ego" agents change their policies. Unlike the single-agent scenario, this setting introduces higher complexity, as the distribution shifts affect predictions for all agents, not just the ego agents, and the prediction task involves full multi-dimensional trajectories, not just reward values. A key contribution of MA-COPP is to avoid enumeration or exhaustive search of the output space of agent trajectories, which existing COPP methods instead require to construct the prediction region. We achieve this by showing that an over-approximation of the true JPR can be constructed, without enumeration, from the maximum density ratio of the JPR trajectories. We evaluate the effectiveness of MA-COPP in multi-agent systems from the PettingZoo library and the F1TENTH autonomous racing environment, achieving nominal coverage in higher dimensions and various shift settings.
Submitted 25 March, 2024;
originally announced March 2024.
-
Let Your Graph Do the Talking: Encoding Structured Data for LLMs
Authors:
Bryan Perozzi,
Bahare Fatemi,
Dustin Zelle,
Anton Tsitsulin,
Mehran Kazemi,
Rami Al-Rfou,
Jonathan Halcrow
Abstract:
How can we best encode structured data into sequential form for use in large language models (LLMs)? In this work, we introduce a parameter-efficient method to explicitly represent structured data for LLMs. Our method, GraphToken, learns an encoding function to extend prompts with explicit structured information. Unlike other work, which focuses on limited domains (e.g., knowledge graph representation), our work is the first effort focused on the general encoding of structured data to be used for various reasoning tasks. We show that explicitly representing the graph structure allows significant improvements on graph reasoning tasks. Specifically, we see across-the-board improvements, up to 73 percentage points, on node-, edge-, and graph-level tasks from the GraphQA benchmark.
Submitted 8 February, 2024;
originally announced February 2024.
-
Application of Artificial Neural Networks for Investigation of Pressure Filtration Performance, a Zinc Leaching Filter Cake Moisture Modeling
Authors:
Masoume Kazemi,
Davood Moradkhani,
Alireza A. Alipour
Abstract:
Machine Learning (ML) is a powerful tool for material science applications. An Artificial Neural Network (ANN) is a machine learning technique that can provide high prediction accuracy. This study aimed to develop an ANN model to predict the cake moisture of the pressure filtration process of zinc production. The cake moisture was influenced by seven parameters: temperature (35 and 65 °C), solid concentration (0.2 and 0.38 g/L), pH (2, 3.5, and 5), air-blow time (2, 10, and 15 min), cake thickness (14, 20, 26, and 34 mm), pressure, and filtration time. The study conducted 288 tests using two types of fabrics: polypropylene (S1) and polyester (S2). The ANN model was evaluated by the coefficient of determination (R²), the Mean Square Error (MSE), and the Mean Absolute Error (MAE) metrics for both datasets. The results showed R² values of 0.88 and 0.83, MSE values of 6.243 × 10⁻⁷ and 1.086 × 10⁻⁶, and MAE values of 0.00056 and 0.00088 for S1 and S2, respectively. These results indicated that the ANN model could predict the cake moisture of pressure filtration in the zinc leaching process with high accuracy.
Submitted 11 August, 2023;
originally announced August 2023.
-
Diachronic Embedding for Temporal Knowledge Graph Completion
Authors:
Rishab Goel,
Seyed Mehran Kazemi,
Marcus Brubaker,
Pascal Poupart
Abstract:
Knowledge graphs (KGs) typically contain temporal facts indicating relationships among entities at different times. Due to their incompleteness, several approaches have been proposed to infer new facts for a KG based on the existing ones, a problem known as KG completion. KG embedding approaches have proved effective for KG completion; however, they have been developed mostly for static KGs. Developing temporal KG embedding models is an increasingly important problem. In this paper, we build novel models for temporal KG completion by equipping static models with a diachronic entity embedding function, which provides the characteristics of entities at any point in time. This is in contrast to the existing temporal KG embedding approaches, where only static entity features are provided. The proposed embedding function is model-agnostic and can potentially be combined with any static model. We prove that combining it with SimplE, a recent model for static KG embedding, results in a fully expressive model for temporal KG completion. Our experiments indicate the superiority of our proposal compared to existing baselines.
Submitted 6 July, 2019;
originally announced July 2019.
-
Representation Learning for Dynamic Graphs: A Survey
Authors:
Seyed Mehran Kazemi,
Rishab Goel,
Kshitij Jain,
Ivan Kobyzev,
Akshay Sethi,
Peter Forsyth,
Pascal Poupart
Abstract:
Graphs arise naturally in many real-world applications including social networks, recommender systems, ontologies, biology, and computational finance. Traditionally, machine learning models for graphs have been mostly designed for static graphs. However, many applications involve evolving graphs. This introduces important challenges for learning and inference since nodes, attributes, and edges change over time. In this survey, we review the recent advances in representation learning for dynamic graphs, including dynamic knowledge graphs. We describe existing models from an encoder-decoder perspective, categorize these encoders and decoders based on the techniques they employ, and analyze the approaches in each category. We also review several prominent applications and widely used datasets and highlight directions for future research.
Submitted 27 April, 2020; v1 submitted 27 May, 2019;
originally announced May 2019.
-
Structure Learning for Relational Logistic Regression: An Ensemble Approach
Authors:
Nandini Ramanan,
Gautam Kunapuli,
Tushar Khot,
Bahare Fatemi,
Seyed Mehran Kazemi,
David Poole,
Kristian Kersting,
Sriraam Natarajan
Abstract:
We consider the problem of learning Relational Logistic Regression (RLR). Unlike standard logistic regression, the features of RLRs are first-order formulae with associated weight vectors instead of scalar weights. We turn the problem of learning RLR into learning these vector-weighted formulae and develop a learning algorithm based on the recently successful functional-gradient boosting methods for probabilistic logic models. We derive the functional gradients and show how weights can be learned simultaneously in an efficient manner. Our empirical evaluation on standard and novel data sets demonstrates the superiority of our approach over other methods for learning RLR.
Submitted 6 August, 2018;
originally announced August 2018.
-
SimplE Embedding for Link Prediction in Knowledge Graphs
Authors:
Seyed Mehran Kazemi,
David Poole
Abstract:
Knowledge graphs contain knowledge about the world and provide a structured representation of this knowledge. Current knowledge graphs contain only a small subset of what is true in the world. Link prediction approaches aim at predicting new links for a knowledge graph given the existing links among the entities. Tensor factorization approaches have proved promising for such link prediction problems. Proposed in 1927, Canonical Polyadic (CP) decomposition is among the first tensor factorization approaches. CP generally performs poorly for link prediction as it learns two independent embedding vectors for each entity, whereas they are really tied. We present a simple enhancement of CP (which we call SimplE) to allow the two embeddings of each entity to be learned dependently. The complexity of SimplE grows linearly with the size of embeddings. The embeddings learned through SimplE are interpretable, and certain types of background knowledge can be incorporated into these embeddings through weight tying. We prove SimplE is fully expressive and derive a bound on the size of its embeddings for full expressivity. We show empirically that, despite its simplicity, SimplE outperforms several state-of-the-art tensor factorization techniques. SimplE's code is available on GitHub at https://github.com/Mehran-k/SimplE.
Submitted 25 October, 2018; v1 submitted 13 February, 2018;
originally announced February 2018.
-
RelNN: A Deep Neural Model for Relational Learning
Authors:
Seyed Mehran Kazemi,
David Poole
Abstract:
Statistical relational AI (StarAI) aims at reasoning and learning in noisy domains described in terms of objects and relationships by combining probability with first-order logic. With huge advances in deep learning in recent years, combining deep networks with first-order logic has been the focus of several recent studies. Many of the existing attempts, however, only focus on relations and ignore object properties. The attempts that do consider object properties are limited in terms of modelling power or scalability. In this paper, we develop relational neural networks (RelNNs) by adding hidden layers to relational logistic regression (the relational counterpart of logistic regression). We learn latent properties for objects both directly and through general rules. Back-propagation is used for training these models. A modular, layer-wise architecture facilitates applying techniques developed within the deep learning community to our architecture. Initial experiments on eight tasks over three real-world datasets show that RelNNs are promising models for relational learning.
Submitted 7 December, 2017;
originally announced December 2017.
-
Comparing Aggregators for Relational Probabilistic Models
Authors:
Seyed Mehran Kazemi,
Bahare Fatemi,
Alexandra Kim,
Zilun Peng,
Moumita Roy Tora,
Xing Zeng,
Matthew Dirks,
David Poole
Abstract:
Relational probabilistic models have the challenge of aggregation, where one variable depends on a population of other variables. Consider the problem of predicting gender from movie ratings; this is challenging because the number of movies per user and users per movie can vary greatly. Surprisingly, aggregation is not well understood. In this paper, we show that existing relational models (implicitly or explicitly) either use simple numerical aggregators that lose great amounts of information, or correspond to naive Bayes, logistic regression, or noisy-OR that suffer from overconfidence. We propose new simple aggregators and simple modifications of existing models that empirically outperform the existing ones. The intuition we provide on different (existing or new) models and their shortcomings plus our empirical findings promise to form the foundation for future representations.
Submitted 24 July, 2017;
originally announced July 2017.
-
Estimation of P(X > Y) for Weibull distribution based on hybrid censored samples
Authors:
Akbar Asgharzadeh,
Mohammad Kazemi,
Debasis Kundu
Abstract:
A hybrid censoring scheme is a mixture of Type-I and Type-II censoring schemes. Based on hybrid censored samples, this paper deals with inference on R = P(X > Y), when X and Y follow two independent Weibull distributions with different scale parameters but the same shape parameter. The maximum likelihood estimator (MLE) and the approximate MLE (AMLE) of R are obtained. The asymptotic distribution of the maximum likelihood estimator of R is obtained. Based on the asymptotic distribution, the confidence interval of R can be derived. Two bootstrap confidence intervals are also proposed. We consider the Bayesian estimate of R and propose the corresponding credible interval for R. Monte Carlo simulations are performed to compare the different proposed methods. Analysis of a real data set is also presented for illustrative purposes.
Submitted 18 July, 2017;
originally announced July 2017.
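As a rough illustration of the quantity R = P(X > Y) studied in the entry above, the sketch below evaluates it in closed form and checks it by simulation. It covers complete (uncensored) samples only and is not the paper's estimator. It assumes the parametrization used by Python's `random.weibullvariate(scale, shape)`, i.e. survival function exp(-(t / scale) ** shape); the paper may parametrize the scale differently. The function names are hypothetical.

```python
import random


def reliability(scale_x: float, scale_y: float, shape: float) -> float:
    """Closed-form R = P(X > Y) for independent Weibull X and Y with a common shape.

    Assumption: survival function S(t) = exp(-(t / scale) ** shape).
    Under it, X**shape and Y**shape are exponential with means
    scale_x**shape and scale_y**shape, which gives the ratio below.
    """
    a = scale_x ** shape
    b = scale_y ** shape
    return a / (a + b)


def monte_carlo_reliability(scale_x: float, scale_y: float, shape: float,
                            n: int = 100_000, seed: int = 0) -> float:
    """Estimate P(X > Y) by drawing n independent (X, Y) pairs."""
    rng = random.Random(seed)
    hits = sum(
        rng.weibullvariate(scale_x, shape) > rng.weibullvariate(scale_y, shape)
        for _ in range(n)
    )
    return hits / n


if __name__ == "__main__":
    print(reliability(2.0, 1.0, 1.5))
    print(monte_carlo_reliability(2.0, 1.0, 1.5))
```

The two printed values should agree to within Monte Carlo error; with censored data, as in the paper, the scale and shape parameters would first have to be estimated from the censored likelihood.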
-
Estimation of Inverse Weibull Distribution Under Type-I Hybrid Censoring
Authors:
Mohammad Kazemi,
Mina Azizpour
Abstract:
Hybrid censoring is a mixture of Type-I and Type-II censoring schemes. This paper presents statistical inference for the Inverse Weibull distribution when the data are Type-I hybrid censored. First, we consider the maximum likelihood estimators of the unknown parameters. It is observed that the maximum likelihood estimators cannot be obtained in closed form. We further obtain the Bayes estimators and the corresponding highest posterior density credible intervals of the unknown parameters under the assumption of independent gamma priors using the importance sampling procedure. We also compute the approximate Bayes estimators using Lindley's approximation technique. We perform a simulation study and a real data analysis in order to compare the proposed Bayes estimators with the maximum likelihood estimators.
Submitted 4 April, 2020; v1 submitted 18 July, 2017;
originally announced July 2017.
-
Small Sample Inference for the Common Coefficient of Variation
Authors:
Mohammad Reza Kazemi,
Ali Akbar Jafari
Abstract:
This paper utilizes the modified signed log-likelihood ratio method for the problem of inference about the common coefficient of variation in several independent normal populations. This method is applicable both to hypothesis testing and to constructing a confidence interval for this parameter. Simulation studies show that the coverage probability of the proposed approach is close to the confidence coefficient. Also, its expected length is smaller than the expected lengths of other competing approaches. In fact, the proposed approach is very satisfactory regardless of the number of populations and the value of the common coefficient of variation, even for very small sample sizes. Finally, we illustrate the proposed method using two real data sets.
Submitted 13 July, 2017;
originally announced July 2017.
-
New Liftable Classes for First-Order Probabilistic Inference
Authors:
Seyed Mehran Kazemi,
Angelika Kimmig,
Guy Van den Broeck,
David Poole
Abstract:
Statistical relational models provide compact encodings of probabilistic dependencies in relational domains, but result in highly intractable graphical models. The goal of lifted inference is to carry out probabilistic inference without needing to reason about each individual separately, by instead treating exchangeable, undistinguished objects as a whole. In this paper, we study the domain recursion inference rule, which, despite its central role in early theoretical results on domain-lifted inference, was later believed to be redundant. We show that this rule is more powerful than expected, and in fact significantly extends the range of models for which lifted inference runs in time polynomial in the number of individuals in the domain. This includes an open problem called S4, the symmetric transitivity model, and a first-order logic encoding of the birthday paradox. We further identify new classes S²FO² and S²RU of domain-liftable theories, which respectively subsume FO² and recursively unary theories, the largest classes of domain-liftable theories known so far, and show that using domain recursion can achieve exponential speedup even in theories that cannot fully be lifted with the existing set of inference rules.
Submitted 26 October, 2016;
originally announced October 2016.
-
A Learning Algorithm for Relational Logistic Regression: Preliminary Results
Authors:
Bahare Fatemi,
Seyed Mehran Kazemi,
David Poole
Abstract:
Relational logistic regression (RLR) is a representation of conditional probability in terms of weighted formulae for modelling multi-relational data. In this paper, we develop a learning algorithm for RLR models. Learning an RLR model from data consists of two steps: (1) learning the set of formulae to be used in the model (a.k.a. structure learning) and (2) learning the weight of each formula (a.k.a. parameter learning). For structure learning, we deploy Schmidt and Murphy's hierarchical assumption: first we learn a model with simple formulae, then more complex formulae are added iteratively only if all their sub-formulae have proven effective in previously learned models. For parameter learning, we convert the problem into a non-relational learning problem and use an off-the-shelf logistic regression learning algorithm from Weka, an open-source machine learning tool, to learn the weights. We also indicate how hidden features about the individuals can be incorporated into RLR to boost the learning performance. We compare our learning algorithm to other structure and parameter learning algorithms in the literature, and compare the performance of RLR models to standard logistic regression and RDN-Boost on a modified version of the MovieLens dataset.
Submitted 27 June, 2016;
originally announced June 2016.
-
Modified Signed Log-Likelihood Ratio Test for Comparing the Correlation Coefficients of Two Independent Bivariate Normal Distributions
Authors:
M. R. Kazemi,
A. A. Jafari
Abstract:
In this paper, we use the modified signed log-likelihood ratio test for the problem of testing the equality of correlation coefficients in two independent bivariate normal distributions. We compare this method with two other competing approaches, Fisher's Z-transform and the generalized test variable, using a Monte Carlo simulation. The simulation indicates that the proposed method is better than the other approaches in terms of actual sizes and powers, especially when the sample sizes are unequal. We illustrate the performance of the proposed approach using a real data set.
Submitted 31 May, 2016;
originally announced May 2016.
-
A New Approach in Persian Handwritten Letters Recognition Using Error Correcting Output Coding
Authors:
Maziar Kazemi,
Muhammad Yousefnezhad,
Saber Nourian
Abstract:
Classification Ensemble, which uses the weighted polling of outputs, is the art of combining a set of basic classifiers to generate high-performance, robust and more stable results. This study aims to improve the results of identifying Persian handwritten letters using the Error Correcting Output Coding (ECOC) ensemble method. Furthermore, feature selection is used to reduce the costs of errors in our proposed method. ECOC is a method for decomposing a multi-way classification problem into many binary classification tasks and then combining the results of the subtasks into a hypothesized solution to the original problem. First, the image features are extracted by Principal Components Analysis (PCA). After that, ECOC is used to identify the Persian handwritten letters, with a Support Vector Machine (SVM) as the base classifier. The empirical results of applying this ensemble method to 10 real-world data sets of Persian handwritten letters indicate that this method identifies Persian handwritten letters better than other ensemble methods and single classifiers. Moreover, by testing a number of different features, we found that this method can reduce the additional cost of the feature selection stage.
Submitted 26 April, 2016;
originally announced April 2016.
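The ECOC decomposition described in the entry above can be sketched as follows. This is a minimal illustration of the decoding step only, not the paper's pipeline: the code matrix, the class names, and the function names are all hypothetical, and the seven binary outputs are passed in directly rather than produced by trained SVMs on PCA features.

```python
# Hypothetical code matrix for four classes: one codeword (row) per class,
# one column per binary subtask. Any two rows differ in 4 positions, so a
# single wrong binary output can still be corrected.
CODE_MATRIX = {
    "class_0": (1, 1, 1, 1, 1, 1, 1),
    "class_1": (0, 0, 0, 0, 1, 1, 1),
    "class_2": (0, 0, 1, 1, 0, 0, 1),
    "class_3": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(binary_outputs):
    """Return the class whose codeword is nearest to the binary outputs."""
    return min(CODE_MATRIX, key=lambda c: hamming(CODE_MATRIX[c], binary_outputs))


if __name__ == "__main__":
    clean = (0, 0, 1, 1, 0, 0, 1)   # exact codeword of class_2
    noisy = (0, 0, 1, 1, 0, 1, 1)   # same codeword with the sixth bit flipped
    print(decode(clean), decode(noisy))
```

In a full system each column would correspond to one trained binary classifier, and the redundancy in the codewords is what lets the ensemble tolerate individual classifier errors.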
-
Characterization of Order Statistics in Two Runs Using Conditional Expectation
Authors:
Mohammad Reza Kazemi,
Ali Akbar Jafari
Abstract:
The runs test is a well-known test used for checking independence between the elements of a sample data sequence. Some runs tests are based on the longest run and others on the total number of runs. In this paper, we consider order statistics of two runs statistics and obtain their probability mass functions. In addition, the means and variances of the order statistics are derived using traditional conditional expectation.
Submitted 29 October, 2014;
originally announced October 2014.
-
Comparing Seventeen Interval Estimates for a Bivariate Normal Correlation Coefficient
Authors:
Mohammad Reza Kazemi,
Ali Akbar Jafari
Abstract:
In this paper, we consider the problem of constructing a confidence interval for the correlation coefficient in a bivariate normal distribution. For this problem, we found fifteen approaches in the literature. We also propose a generalized confidence interval and a parametric bootstrap confidence interval. The coverage probabilities and expected lengths of these seventeen approaches are evaluated and compared via a simulation study. In addition, the robustness of the methods is assessed in the comparisons under non-normal distributions. Two real examples are given to illustrate the approaches.
Submitted 29 October, 2014;
originally announced October 2014.