Search | arXiv e-print repository

SlotGNN: Unsupervised Discovery of Multi-Object Representations and Visual Dynamics

Authors: Alireza Rezazadeh, Athreyi Badithela, Karthik Desingh, Changhyun Choi

Abstract: Learning multi-object dynamics from visual data using unsupervised techniques is challenging due to the need for robust object representations that can be learned through robot interactions. This paper presents a novel framework with two new architectures: SlotTransport for discovering object representations from RGB images and SlotGNN for predicting their collective dynamics from RGB images and robot interactions. Our SlotTransport architecture is based on slot attention for unsupervised object discovery and uses a feature transport mechanism to maintain temporal alignment in object-centric representations. This enables the discovery of slots that consistently reflect the composition of multi-object scenes. These slots robustly bind to distinct objects, even under heavy occlusion or absence. Our SlotGNN, a novel unsupervised graph-based dynamics model, predicts the future state of multi-object scenes. SlotGNN learns a graph representation of the scene using the discovered slots from SlotTransport and performs relational and spatial reasoning to predict the future appearance of each slot conditioned on robot actions. We demonstrate the effectiveness of SlotTransport in learning object-centric features that accurately encode both visual and positional information. Further, we highlight the accuracy of SlotGNN in downstream robotic tasks, including challenging multi-object rearrangement and long-horizon prediction. Finally, our unsupervised approach proves effective in the real world. With only minimal additional data, our framework robustly predicts slots and their corresponding dynamics in real-world control tasks.

Submitted 6 October, 2023; originally announced October 2023.

arXiv:2306.15858 [pdf, other]

Hierarchical Graph Neural Networks for Proprioceptive 6D Pose Estimation of In-hand Objects

Authors: Alireza Rezazadeh, Snehal Dikhale, Soshi Iba, Nawid Jamali

Abstract: Robotic manipulation, in particular in-hand object manipulation, often requires an accurate estimate of the object's 6D pose. To improve the accuracy of the estimated pose, state-of-the-art approaches in 6D object pose estimation use observational data from one or more modalities, e.g., RGB images, depth, and tactile readings. However, existing approaches make limited use of the underlying geometric structure of the object captured by these modalities, thereby increasing their reliance on visual features. This results in poor performance when presented with objects that lack such visual features or when visual features are simply occluded. Furthermore, current approaches do not take advantage of the proprioceptive information embedded in the position of the fingers. To address these limitations, in this paper: (1) we introduce a hierarchical graph neural network architecture for combining multimodal (vision and touch) data that allows for a geometrically informed 6D object pose estimation, (2) we introduce a hierarchical message passing operation that flows the information within and across modalities to learn a graph-based object representation, and (3) we introduce a method that accounts for the proprioceptive information for in-hand object representation. We evaluate our model on a diverse subset of objects from the YCB Object and Model Set, and show that our method substantially outperforms existing state-of-the-art work in accuracy and robustness to occlusion. We also deploy our proposed framework on a real robot and qualitatively demonstrate successful transfer to real settings.

Submitted 27 June, 2023; originally announced June 2023.

arXiv:2305.10977 [pdf]

Multi-microservice migration modelling, comparison, and potential in 5G/6G mobile edge computing: A non-average parameter values approach

Authors: Arshin Rezazadeh, Hanan Lutfiyya

Abstract: Cloud, fog, and edge computing integration with future mobile Internet-of-Things (IoT) devices and related applications in 5G/6G networks will become more practical in the coming years. Containers became the de facto virtualization technique that replaced Virtual Machines (VMs). Mobile IoT applications, e.g., intelligent transportation and augmented reality, incorporating fog-edge, have increased the demand for a millisecond-scale response and processing time. Edge Computing reduces remote network traffic and latency. These services must run on edge nodes that are physically close to devices. However, classical migration techniques may not meet the requirements of future mission-critical IoT applications. IoT mobile devices have limited resources for running multiple services, and client-server latency worsens when fog-edge services must migrate to maintain proximity in light of device mobility. This study analyzes the performance of the MiGrror migration method and the pre-copy live migration method when the migration of multiple VMs/containers is considered. This paper presents mathematical models for the stated methods and provides migration guidelines and comparisons for services to be implemented as multiple containers, as in microservice-based environments. Experiments demonstrate that MiGrror outperforms the pre-copy technique and, unlike conventional live migrations, can maintain less than 10 milliseconds of downtime and reduce migration time with a minimal bandwidth overhead. The results show that MiGrror can improve service continuity and availability for users. Most significantly, the model can use average and non-average values for different parameters during migration to achieve improved and more accurate results, while other research typically only uses average values. This paper shows that using only average parameter values in migration can lead to inaccurate results.

Submitted 18 May, 2023; originally announced May 2023.

arXiv:2209.09168 [pdf]

Application of Neural Network in the Prediction of NOx Emissions from Degrading Gas Turbine

Authors: Zhenkun Zheng, Alan Rezazadeh

Abstract: This paper aims to apply a neural network algorithm to predict the process response (NOx emissions) from degrading natural gas turbines. Nine different process variables, or predictors, are considered in the predictive modelling. It is found that the model trained by the neural network algorithm should use part of the recent data in the training and validation sets to account for the impact of the system degradation. R-Square values of the training and validation sets demonstrate the validity of the model. The residual plot, without any clear pattern, shows the model is appropriate. The ranking of the importance of the process variables is demonstrated and the prediction profile confirms the significance of the process variables. The model trained using the neural network algorithm identifies the optimal settings of the process variables to reach the minimum value of NOx emissions from the degrading gas turbine system.

Submitted 19 September, 2022; originally announced September 2022.

arXiv:2206.05454 [pdf, ps, other]

A General framework for PAC-Bayes Bounds for Meta-Learning

Authors: Arezou Rezazadeh

Abstract: Meta-learning automatically infers an inductive bias that includes the hyperparameter of the base-learning algorithm, by observing data from a finite number of related tasks. This paper studies PAC-Bayes bounds on the meta-generalization gap. The meta-generalization gap comprises two sources of generalization gaps: the environment-level and task-level gaps resulting from observation of a finite number of tasks and data samples per task, respectively. In this paper, by upper bounding arbitrary convex functions, which link the expected and empirical losses at the environment and also per-task levels, we obtain new PAC-Bayes bounds. Using these bounds, we develop new PAC-Bayes meta-learning algorithms. Numerical examples demonstrate the merits of the proposed novel bounds and algorithms in comparison to prior PAC-Bayes bounds for meta-learning.

Submitted 11 June, 2022; originally announced June 2022.

arXiv:2202.09006 [pdf, other]

KINet: Unsupervised Forward Models for Robotic Pushing Manipulation

Authors: Alireza Rezazadeh, Changhyun Choi

Abstract: Object-centric representation is an essential abstraction for forward prediction. Most existing forward models learn this representation through extensive supervision (e.g., object class and bounding box) although such ground-truth information is not readily accessible in reality. To address this, we introduce KINet (Keypoint Interaction Network) -- an end-to-end unsupervised framework to reason about object interactions based on a keypoint representation. Using visual observations, our model learns to associate objects with keypoint coordinates and discovers a graph representation of the system as a set of keypoint embeddings and their relations. It then learns an action-conditioned forward model using contrastive estimation to predict future keypoint states. By learning to perform physical reasoning in the keypoint space, our model automatically generalizes to scenarios with a different number of objects, novel backgrounds, and unseen object geometries. Experiments demonstrate the effectiveness of our model in accurately performing forward prediction and learning plannable object-centric representations for downstream robotic pushing manipulation tasks.

Submitted 5 August, 2023; v1 submitted 17 February, 2022; originally announced February 2022.

arXiv:2201.07227 [pdf, other]

Explainable Ensemble Machine Learning for Breast Cancer Diagnosis based on Ultrasound Image Texture Features

Authors: Alireza Rezazadeh, Yasamin Jafarian, Ali Kord

Abstract: Image classification is widely used to build predictive models for breast cancer diagnosis. Most existing approaches overwhelmingly rely on deep convolutional networks to build such diagnosis pipelines. These model architectures, although remarkable in performance, are black-box systems that provide minimal insight into the inner logic behind their predictions. This is a major drawback as the explainability of prediction is vital for applications such as cancer diagnosis. In this paper, we address this issue by proposing an explainable machine learning pipeline for breast cancer diagnosis based on ultrasound images. We extract first- and second-order texture features of the ultrasound images and use them to build a probabilistic ensemble of decision tree classifiers. Each decision tree learns to classify the input ultrasound image by learning a set of robust decision thresholds for texture features of the image. The decision path of the model predictions can then be interpreted by decomposing the learned decision trees. Our results show that our proposed framework achieves high predictive performance while being explainable.

Submitted 17 January, 2022; originally announced January 2022.

arXiv:2011.08978 [pdf]

Environmental Pollution Prediction of NOx by Process Analysis and Predictive Modelling in Natural Gas Turbine Power Plants

Authors: Alan Rezazadeh

Abstract: The main objective of this paper is to propose the K-Nearest-Neighbor (KNN) algorithm for predicting NOx emissions from natural gas electrical generation turbines. The process of producing electricity is dynamic and rapidly changing due to many factors such as weather and electrical grid requirements. Gas turbine equipment is also a dynamic part of electricity generation, since the equipment characteristics and thermodynamic behavior change as the turbines age. Regular maintenance of turbines is another dynamic part of the electrical generation process, affecting the performance of equipment. This analysis found that KNN trained on a relatively small dataset produces the most accurate predictions. This can be logically explained, as KNN finds the K nearest neighbors to the current input parameters and estimates a rated average of historically similar observations as the prediction. This paper incorporates ambient weather conditions, electrical output, and turbine performance factors to build a machine learning model to predict NOx emissions. The model can be used to optimize the operational processes to reduce harmful emissions and increase overall operational efficiency. Latent algorithms such as Principal Component Analysis (PCA) have been used for monitoring changes in equipment performance behavior, which deeply influence process parameters and consequently determine NOx emissions. Typical statistical methods of machine learning performance evaluation, such as multivariate analysis, clustering, and residual analysis, have been used throughout the paper.

Submitted 18 January, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

Comments: 12 pages, 9 tables, 7 figures
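
The KNN prediction step described in the abstract above can be sketched as follows. This is only an illustration, not the paper's implementation: it assumes that "rated average" means an inverse-distance-weighted mean, uses Euclidean distance, and the function name `knn_predict` and the toy feature layout are hypothetical.

```python
import math


def knn_predict(history, query, k=3):
    """Predict a target as a weighted mean of the k nearest past observations.

    history: list of (features, target) pairs, features being equal-length tuples.
    query:   feature tuple for the current operating point.
    Assumption: weights are inverse Euclidean distances (not confirmed by the paper).
    """
    # Rank past observations by distance to the query and keep the k closest.
    nearest = sorted(history, key=lambda row: math.dist(row[0], query))[:k]
    # A small constant avoids division by zero when the query matches a past point.
    weights = [1.0 / (math.dist(features, query) + 1e-9) for features, _ in nearest]
    weighted_sum = sum(w * target for w, (_, target) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Toy data: one hypothetical feature (e.g. ambient temperature) and a NOx-like target.
past = [((0.0,), 10.0), ((1.0,), 20.0), ((10.0,), 100.0)]
estimate = knn_predict(past, (0.5,), k=2)
```

With the toy data, the two nearest observations are equally distant from the query, so the estimate is simply their mean.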

arXiv:2010.10886 [pdf, ps, other]

Conditional Mutual Information-Based Generalization Bound for Meta Learning

Authors: Arezou Rezazadeh, Sharu Theresa Jose, Giuseppe Durisi, Osvaldo Simeone

Abstract: Meta-learning optimizes an inductive bias---typically in the form of the hyperparameters of a base-learning algorithm---by observing data from a finite number of related tasks. This paper presents an information-theoretic bound on the generalization performance of any given meta-learner, which builds on the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020). In the proposed extension to meta-learning, the CMI bound involves a training \textit{meta-supersample} obtained by first sampling $2N$ independent tasks from the task environment, and then drawing $2M$ independent training samples for each sampled task. The meta-training data fed to the meta-learner is modelled as being obtained by randomly selecting $N$ tasks from the available $2N$ tasks and $M$ training samples per task from the available $2M$ training samples per task. The resulting bound is explicit in two CMI terms, which measure the information that the meta-learner output and the base-learner output provide about which training data are selected, given the entire meta-supersample. Finally, we present a numerical example that illustrates the merits of the proposed bound in comparison to prior information-theoretic bounds for meta-learning.

Submitted 8 February, 2021; v1 submitted 21 October, 2020; originally announced October 2020.

Comments: Submitted for conference publication
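
The meta-supersample construction in the abstract above can be illustrated with a short sketch. It takes the abstract's wording literally (pick $N$ of the $2N$ tasks, then $M$ of each chosen task's $2M$ samples, uniformly at random); the paper's exact selection scheme may differ, and the function name `draw_meta_training_set` and the data layout are hypothetical.

```python
import random


def draw_meta_training_set(supersample, n_tasks, m_samples, rng):
    """Select the meta-training data from a meta-supersample.

    supersample: list of 2N tasks, each a list of 2M training samples.
    Returns a dict mapping each selected task index to its M selected samples.
    Assumption: uniform selection without replacement at both levels.
    """
    # Pick N task indices out of the 2N available tasks.
    chosen_tasks = rng.sample(range(len(supersample)), n_tasks)
    # For each chosen task, pick M of its 2M samples.
    return {t: rng.sample(supersample[t], m_samples) for t in chosen_tasks}


# Toy meta-supersample with N = 3 and M = 2: 6 tasks, 4 samples per task.
N, M = 3, 2
supersample = [[(task, i) for i in range(2 * M)] for task in range(2 * N)]
meta_train = draw_meta_training_set(supersample, N, M, random.Random(0))
```

The selection indices (which tasks and which samples were picked) play the role of the random variables whose information content the two CMI terms measure.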

arXiv:2002.01441 [pdf, other]

doi 10.3390/forecast2030015

A Generalized Flow for B2B Sales Predictive Modeling: An Azure Machine Learning Approach

Authors: Alireza Rezazadeh

Abstract: Predicting the outcome of sales opportunities is a core part of successful business management. Conventionally, making this prediction has relied mostly on subjective human evaluations in the process of sales decision making. In this paper, we addressed the problem of forecasting the outcome of business-to-business (B2B) sales by proposing a thorough data-driven Machine Learning (ML) workflow on a cloud-based computing platform: Microsoft Azure Machine Learning Service (Azure ML). This workflow consists of two pipelines: (1) An ML pipeline to train probabilistic predictive models on the historical sales opportunities data. In this pipeline, data is enriched with an extensive feature enhancement step and then used to train an ensemble of ML classification models in parallel. (2) A prediction pipeline to utilize the trained ML model and infer the likelihood of winning new sales opportunities along with calculating optimal decision boundaries. The effectiveness of the proposed workflow was evaluated on a real sales dataset of a major global B2B consulting firm. Our results implied that decision-making based on the ML predictions is more accurate and brings a higher monetary value.

Submitted 2 July, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

arXiv:1910.03087 [pdf]

doi 10.1371/journal.pone.0225002

Force Field Generalization and the Internal Representation of Motor Learning

Authors: Alireza Rezazadeh, Max Berniker

Abstract: When learning a new motor behavior, e.g. reaching in a force field, the nervous system builds an internal representation. Examining how subsequent reaches in unpracticed directions generalize reveals this representation. Though it is the subject of frequent studies, it is not known how this representation changes across training directions, or how changes in reach direction and the corresponding changes in limb impedance influence measurements of it. We ran a force field adaptation experiment using eight groups of subjects, each trained on one of eight standard directions and then tested for generalization in the remaining seven directions. Generalization in all directions was local and asymmetric, providing limited and unequal transfer to the left and right side of the trained target. These asymmetries were not consistent in either magnitude or direction even after correcting for changes in limb impedance, at odds with previous explanations. Relying on a standard model for generalization, the inferred representations inconsistently shifted to one side or the other of their respective training direction. A second model that accounted for limb impedance and variations in baseline trajectories explained more data, and the inferred representations were centered on their respective training directions. Our results highlight the influence of limb mechanics and impedance on psychophysical measurements and their interpretations for motor learning.

Submitted 7 October, 2019; originally announced October 2019.

Comments: Accepted for Publication in PLoS One Journal

arXiv:1901.05778 [pdf, ps, other]

Joint Source-Channel Coding for the Multiple-Access Channel with Correlated Sources

Authors: Arezou Rezazadeh, Josep Font-Segura, Alfonso Martinez, Albert Guillén i Fàbregas

Abstract: This paper studies the random-coding exponent of joint source-channel coding for the multiple-access channel with correlated sources. For each user, by defining a threshold, the messages of each source are partitioned into two classes. The achievable exponent for correlated sources with two message-dependent input distributions for each user is determined and shown to be larger than that achieved using only one input distribution for each user. A system of equations is presented to determine the optimal thresholds maximizing the achievable exponent. The obtained exponent is compared with the one derived for the MAC with independent sources.

Submitted 17 April, 2019; v1 submitted 17 January, 2019; originally announced January 2019.

arXiv:1805.11315 [pdf, ps, other]

Multiple-Access Channel with Independent Sources: Error Exponent Analysis

Authors: Arezou Rezazadeh, Josep Font-Segura, Alfonso Martinez, Albert Guillén i Fàbregas

Abstract: In this paper, an achievable error exponent for the multiple-access channel with two independent sources is derived. For each user, the source messages are partitioned into two classes and codebooks are generated by drawing codewords from an input distribution depending on the class index of the source message. The partitioning thresholds that maximize the achievable exponent are given by the solution of a system of equations. We also derive both lower and upper bounds for the achievable exponent in terms of Gallager's source and channel functions. Finally, a numerical example shows that using the proposed ensemble gives a noticeable gain in terms of exponent with respect to independent identically distributed codebooks.

Submitted 5 September, 2018; v1 submitted 29 May, 2018; originally announced May 2018.

arXiv:1805.05514 [pdf, other]

doi 10.4204/EPTCS.271.3

Incremental Database Design using UML-B and Event-B

Authors: Ahmed Al-Brashdi, Michael Butler, Abdolbaghi Rezazadeh

Abstract: Correct operation of many critical systems is dependent on the data consistency and integrity properties of underlying databases. Therefore, a verifiable and rigorous database design process is highly desirable. This research aims to investigate and deliver a comprehensive and practical approach for modelling databases in formal methods through layered refinements. The methodology is being guided by a number of case studies, using abstraction and refinement in UML-B and verification with the Rodin tool. UML-B is a graphical representation of the Event-B formalism and the Rodin tool supports verification for Event-B and UML-B. Our method guides developers to model relational databases in UML-B through layered refinement and to specify the necessary constraints and operations on the database.

Submitted 14 May, 2018; originally announced May 2018.

Comments: In Proceedings IMPEX 2017 and FM&MDD 2017, arXiv:1805.04636

ACM Class: D.2 SOFTWARE ENGINEERING; F.3 LOGICS AND MEANINGS OF PROGRAMS

Journal ref: EPTCS 271, 2018, pp. 34-47

Showing 1–14 of 14 results for author: Rezazadeh, A