Search | arXiv e-print repository

Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization

Authors: Mohammad Mehdi Nasiri, Mansoor Rezghi

Abstract: This paper presents an extension of the Mirror Descent method to overcome challenges in cooperative Multi-Agent Reinforcement Learning (MARL) settings, where agents have varying abilities and individual policies. The proposed Heterogeneous-Agent Mirror Descent Policy Optimization (HAMDPO) algorithm utilizes the multi-agent advantage decomposition lemma to enable efficient policy updates for each agent while ensuring overall performance improvements. By iteratively updating agent policies through an approximate solution of the trust-region problem, HAMDPO guarantees stability and improves performance. Moreover, the HAMDPO algorithm is capable of handling both continuous and discrete action spaces for heterogeneous agents in various MARL problems. We evaluate HAMDPO on Multi-Agent MuJoCo and StarCraftII tasks, demonstrating its superiority over state-of-the-art algorithms such as HATRPO and HAPPO. These results suggest that HAMDPO is a promising approach for solving cooperative MARL problems and could potentially be extended to address other challenging problems in the field of MARL.

Submitted 13 August, 2023; originally announced August 2023.

MSC Class: 68T ACM Class: I.2

arXiv:2303.05558

doi 10.1209/0295-5075/acc270

Optimal active particle navigation meets machine learning

Authors: Mahdi Nasiri, Hartmut Löwen, Benno Liebchen

Abstract: The question of how "smart" active agents, like insects, microorganisms, or future colloidal robots, need to steer to optimally reach or discover a target, such as an odor source, food, or a cancer cell in a complex environment, has recently attracted great interest. Here, we provide an overview of recent developments regarding such optimal navigation problems, from the micro- to the macroscale, and give a perspective by discussing some of the challenges that lie ahead. Besides exemplifying an elementary approach to optimal navigation problems, the article focuses on works utilizing machine learning-based methods. Such learning-based approaches can uncover highly efficient navigation strategies even for problems that involve, e.g., chaotic, high-dimensional, or unknown environments and are hardly solvable with conventional analytical or simulation methods.

Submitted 9 March, 2023; originally announced March 2023.

Comments: 7 pages, 3 figures

arXiv:2209.14690

Prompt-guided Scene Generation for 3D Zero-Shot Learning

Authors: Majid Nasiri, Ali Cheraghian, Townim Faisal Chowdhury, Sahar Ahmadi, Morteza Saberi, Shafin Rahman

Abstract: Zero-shot learning (ZSL) on 3D point cloud data is a relatively underexplored problem compared to its 2D image counterpart. 3D data brings new challenges for ZSL due to the unavailability of robust pre-trained feature extraction models. To address this problem, we propose a prompt-guided 3D scene generation and supervision method that augments 3D data to train the network better, exploring the complex interplay of seen and unseen objects. First, we merge point clouds of two 3D models in certain ways described by a prompt. The prompt acts like the annotation describing each 3D scene. Later, we perform contrastive learning to train our proposed architecture in an end-to-end manner. We argue that 3D scenes can relate objects more efficiently than single objects because popular language models (like BERT) can achieve high performance when objects appear in a context. Our proposed prompt-guided scene generation method encapsulates data augmentation and prompt-based annotation/captioning to improve 3D ZSL performance. We have achieved state-of-the-art ZSL and generalized ZSL performance on synthetic (ModelNet40, ModelNet10) and real-scanned (ScanObjectNN) 3D object datasets.

Submitted 29 September, 2022; originally announced September 2022.

arXiv:2207.12305

Error-Aware Spatial Ensembles for Video Frame Interpolation

Authors: Zhixiang Chi, Rasoul Mohammadi Nasiri, Zheng Liu, Yuanhao Yu, Juwei Lu, Jin Tang, Konstantinos N Plataniotis

Abstract: Video frame interpolation (VFI) algorithms have improved considerably in recent years due to unprecedented progress in both data-driven algorithms and their implementations. Recent research has introduced advanced motion estimation or novel warping methods as the means to address challenging VFI scenarios. However, none of the published VFI works considers the spatially non-uniform characteristics of the interpolation error (IE). This work introduces such a solution. By closely examining the correlation between optical flow and IE, the paper proposes novel error prediction metrics that partition the middle frame into distinct regions corresponding to different IE levels. Building upon this IE-driven segmentation, and through the use of novel error-controlled loss functions, it introduces an ensemble of spatially adaptive interpolation units that progressively processes and integrates the segmented regions. This spatial ensemble results in an effective and computationally attractive VFI solution. Extensive experimentation on popular video interpolation benchmarks indicates that the proposed solution outperforms the current state-of-the-art (SOTA) in applications of current interest.

Submitted 25 July, 2022; originally announced July 2022.

Comments: 10 pages, 8 figures, demo video: https://www.youtube.com/watch?v=_32GNANSr5U

arXiv:2202.00812

doi 10.1088/1367-2630/ac8013

Reinforcement learning of optimal active particle navigation

Authors: Mahdi Nasiri, Benno Liebchen

Abstract: The development of self-propelled particles at the micro- and the nanoscale has sparked a huge potential for future applications in active matter physics, microsurgery, and targeted drug delivery. However, while the latter applications raise the question of how to optimally navigate towards a target, such as a cancer cell, there is still no simple way known to determine the optimal route in sufficiently complex environments. Here we develop a machine learning-based approach that allows us, for the first time, to determine the asymptotically optimal path of a self-propelled agent which can freely steer in complex environments. Our method hinges on policy gradient-based deep reinforcement learning techniques and, crucially, does not require any reward shaping or heuristics. The presented method provides a powerful alternative to current analytical methods to calculate optimal trajectories and opens a route towards a universal path planner for future intelligent active particles.

Submitted 1 February, 2022; originally announced February 2022.

arXiv:2008.05683

Blockchain applications in Healthcare: A model for research

Authors: Amir Hussain Zolfaghari, Herbert Daly, Mahdi Nasiri, Roxana Sharifian

Abstract: Blockchain technology has rapidly evolved from an enabling technology for cryptocurrencies to a potential solution to a wider range of problems found in data-centric and distributed systems. Interest in this area has encouraged many recent innovations to address challenges that traditional approaches of design have been unable to meet. Healthcare Information Systems with issues around privacy, interoperability, data integrity, and access control is potentially an area where blockchain technology may have a significant impact. Blockchain, however, is a meta-technology, combining multiple techniques, as it is often important to determine how best to separate concerns in the design and implementation of such systems. This paper proposes a layered approach for the organization of blockchain in healthcare applications. Key issues driving the adoption of this technology are explored. A model presenting the points in each layer is explored. Finally, we present an example of how the perspective we describe can improve the development of Health Information Systems.

Submitted 13 August, 2020; originally announced August 2020.

arXiv:2007.11762

All at Once: Temporally Adaptive Multi-Frame Interpolation with Advanced Motion Modeling

Authors: Zhixiang Chi, Rasoul Mohammadi Nasiri, Zheng Liu, Juwei Lu, Jin Tang, Konstantinos N Plataniotis

Abstract: Recent advances in high refresh rate displays, as well as the increased interest in high-rate slow motion and frame up-conversion, fuel the demand for efficient and cost-effective multi-frame video interpolation solutions. In that regard, inserting multiple frames between consecutive video frames is of paramount importance for the consumer electronics industry. State-of-the-art methods are iterative solutions interpolating one frame at a time. They introduce temporal inconsistencies and clearly noticeable visual artifacts. Departing from the state-of-the-art, this work introduces a true multi-frame interpolator. It utilizes a pyramidal style network in the temporal domain to complete the multi-frame interpolation task in one shot. A novel flow estimation procedure using a relaxed loss function and an advanced, cubic-based motion model are also used to further boost interpolation accuracy when complex motion segments are encountered. Results on the Adobe240 dataset show that the proposed method generates visually pleasing, temporally consistent frames and outperforms the current best off-the-shelf method by 1.57 dB in PSNR with a model that is 8 times smaller and 7.7 times faster. The proposed method can be easily extended to interpolate a large number of new frames while remaining efficient because of the one-shot mechanism.

Submitted 8 January, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

Comments: Accepted at ECCV2020 (poster), project: https://chi-chi-zx.github.io/all-at-once/

arXiv:1908.08489

Time series model selection with a meta-learning approach; evidence from a pool of forecasting algorithms

Authors: Sasan Barak, Mahdi Nasiri, Mehrdad Rostamzadeh

Abstract: One of the challenging questions in time series forecasting is how to find the best algorithm. In recent years, a recommender system scheme has been developed for time series analysis using a meta-learning approach. This system selects the best forecasting method with consideration of the time series characteristics. In this paper, we propose a novel approach focusing on some of the unanswered questions resulting from the use of meta-learning in time series forecasting. Three main gaps in previous works are addressed: analyzing various subsets of top forecasters as inputs for meta-learners; evaluating the effect of forecasting error measures; and assessing the role of the dimensionality of the feature space on the forecasting errors of meta-learners. All of these objectives are achieved with the help of a diverse state-of-the-art pool of forecasters and meta-learners. For this purpose, first, a pool of forecasting algorithms is implemented on the NN5 competition dataset and ranked based on the two error measures. Then, six machine-learning classifiers, known as meta-learners, are trained on the extracted features of the time series in order to assign the most suitable forecasting method for the various subsets of the pool of forecasters. Furthermore, two dimensionality reduction methods are implemented in order to investigate the role of feature space dimension on the performance of meta-learners. In general, it was found that meta-learners were able to defeat all of the individual benchmark forecasters; this performance was improved even after applying the feature selection method.

Submitted 22 August, 2019; originally announced August 2019.

Comments: 30 pages, 10 tables, and 7 figures

arXiv:1806.00281

A Recursive Least Square Method for 3D Pose Graph Optimization Problem

Authors: S. M. Nasiri, Reshad Hosseini, Hadi Moradi

Abstract: Pose Graph Optimization (PGO) is an important non-convex optimization problem and is the state-of-the-art formulation for SLAM in robotics. It also has applications like camera motion estimation, structure from motion, and 3D reconstruction in machine vision. Recent research has shown the importance of good initialization to bootstrap well-known iterative PGO solvers to converge to good solutions. The state-of-the-art initialization methods, however, work in low-noise or possibly moderate-noise problems, and they fail in challenging problems with high measurement noise. Consequently, iterative methods may get trapped in local minima in high-noise scenarios. In this paper we present an initialization method which uses orientation measurements and then present a convergence analysis of our iterative algorithm. We show how the algorithm converges to global optima in noise-free cases and also obtain a bound for the difference between our result and the optimum solution in scenarios with noisy measurements. We then present our second algorithm, which uses both relative orientation and position measurements to obtain a more accurate least squares approximation of the problem that is again solved iteratively. In the convergence proof, a structural coefficient arises that has an important influence on the basin of convergence. Interestingly, simulation results show that this coefficient also affects the performance of other solvers, so it can indicate the complexity of the problem. Experimental results show the excellent performance of the proposed initialization algorithm, especially in high-noise scenarios.

Submitted 1 June, 2018; originally announced June 2018.

arXiv:1409.0517

Experiments on Data Preprocessing of Persian Blog Networks

Authors: Zeinab Borhani-fard, Leila Esmaeili, Behrouz Minaei-Bidgoli, Mehdi Nasiri

Abstract: Analyzing and exploring social networks is important for researchers, sociologists, academics, and various businesses because of their information potential. The large volume, diversity, and growth rate of data in Web 2.0 create challenges for analyzing these data. By definition, weblogs are a form of social networking. So far, the majority of studies on analyzing weblog networks and exploring their stored data have been based on international data sets. In this paper, a framework for preprocessing and data analysis in weblog networks is presented, and the results of applying it to a Persian weblog network, as a case study, are reported.

Submitted 1 September, 2014; originally announced September 2014.

Comments: International Journal of Advanced Studies in Computer Science & Engineering (IJASCSE)- 2014

Showing 1–10 of 10 results for author: Nasiri, M