-
Enhancing Language Learning through Technology: Introducing a New English-Azerbaijani (Arabic Script) Parallel Corpus
Authors:
Jalil Nourmohammadi Khiarak,
Ammar Ahmadi,
Taher Akbari Saeed,
Meysam Asgari-Chenaghlu,
Toğrul Atabay,
Mohammad Reza Baghban Karimi,
Ismail Ceferli,
Farzad Hasanvand,
Seyed Mahboub Mousavi,
Morteza Noshad
Abstract:
This paper introduces a pioneering English-Azerbaijani (Arabic Script) parallel corpus, designed to bridge the technological gap in language learning and machine translation (MT) for under-resourced languages. Consisting of 548,000 parallel sentences and approximately 9 million words per language, this dataset is derived from diverse sources such as news articles and holy texts, aiming to enhance natural language processing (NLP) applications and language education technology. This corpus marks a significant step forward in the realm of linguistic resources, particularly for Turkic languages, which have lagged in the neural machine translation (NMT) revolution. By presenting the first comprehensive case study for the English-Azerbaijani (Arabic Script) language pair, this work underscores the transformative potential of NMT in low-resource contexts. The development and utilization of this corpus not only facilitate the advancement of machine translation systems tailored for specific linguistic needs but also promote inclusive language learning through technology. The findings demonstrate the corpus's effectiveness in training deep learning MT systems and underscore its role as an essential asset for researchers and educators aiming to foster bilingual education and multilingual communication. This research paves the way for future explorations into NMT applications for languages lacking substantial digital resources, thereby enhancing global language education frameworks. The Python package of our code is available at https://pypi.org/project/chevir-kartalol/, and we also have a website accessible at https://translate.kartalol.com/.
Submitted 6 July, 2024;
originally announced July 2024.
-
Employee Turnover Analysis Using Machine Learning Algorithms
Authors:
Mahyar Karimi,
Kamyar Seyedkazem Viliyani
Abstract:
Employees' knowledge is an organizational asset. Turnover may impose apparent and hidden costs and irreparable damage. To mitigate this risk, employees' condition should be monitored. Due to the high complexity of analyzing well-being features, predicting employee turnover can be delegated to machine learning techniques. In this paper, we discuss the employee attrition rate. Three supervised learning algorithms, AdaBoost, SVM, and RandomForest, are used to benchmark the accuracy of employee attrition prediction. The resulting models can help establish predictive analytics.
Submitted 6 February, 2024;
originally announced February 2024.
-
Profiling and Modeling of Power Characteristics of Leadership-Scale HPC System Workloads
Authors:
Ahmad Maroof Karimi,
Naw Safrin Sattar,
Woong Shin,
Feiyi Wang
Abstract:
In the exascale era in which application behavior has large power & energy footprints, per-application job-level awareness of such impression is crucial in taking steps towards achieving efficiency goals beyond performance, such as energy efficiency, and sustainability.
To achieve these goals, we have developed a novel low-latency job power profiling machine learning pipeline that can group job-level power profiles based on their shapes as they complete. This pipeline leverages a comprehensive feature extraction and clustering pipeline powered by a generative adversarial network (GAN) model to handle the feature-rich time series of job-level power measurements. The output is then used to train a classification model that can predict whether an incoming job power profile is similar to a known group of profiles or is completely new. With extensive evaluations, we demonstrate the effectiveness of each component in our pipeline. Also, we provide a preliminary analysis of the resulting clusters that depict the power profile landscape of the Summit supercomputer from more than 60K jobs sampled from the year 2021.
Submitted 1 February, 2024;
originally announced February 2024.
-
Accelerated Computational Micromechanics for Reactive Flow in Porous Media
Authors:
Mina Karimi,
Kaushik Bhattacharya
Abstract:
Reactive transport in permeable porous media is relevant for a variety of applications, but poses a significant challenge due to the range of length and time scales. Multiscale methods that aim to link microstructure with the macroscopic response of geo-materials have been developed, but require the repeated solution of the small-scale problem and provide the motivation for this work. We present an efficient computational method to study fluid flow and solute transport problems in periodic porous media. Fluid flow is governed by the Stokes equation, and the solute transport is governed by the advection-diffusion equation. We follow the accelerated computational micromechanics approach that leads to an iterative computational method where each step is either local or the solution of a Poisson's equation. This enables us to implement these methods on accelerators like graphics processing units (GPUs) and exploit their massively parallel architecture. We verify the approach by comparing the results against established computational methods and then demonstrate the accuracy, efficacy, and performance by studying various examples. This method efficiently calculates the effective transport properties for complex pore geometries.
Submitted 24 December, 2023;
originally announced December 2023.
-
Efficient Implementation of Interior-Point Methods for Quantum Relative Entropy
Authors:
Mehdi Karimi,
Levent Tuncel
Abstract:
Quantum Relative Entropy (QRE) programming is a recently popular and challenging class of convex optimization problems with significant applications in quantum computing and quantum information theory. We are interested in modern interior point (IP) methods based on optimal self-concordant barriers for the QRE cone. A range of theoretical and numerical challenges associated with such barrier functions and the QRE cones have hindered the scalability of IP methods. To address these challenges, we propose a series of numerical and linear algebraic techniques and heuristics aimed at enhancing the efficiency of gradient and Hessian computations for the self-concordant barrier function, solving linear systems, and performing matrix-vector products. We also introduce and deliberate about some interesting concepts related to QRE such as symmetric quantum relative entropy (SQRE). We also introduce a two-phase method for performing facial reduction that can significantly improve the performance of QRE programming. Our new techniques have been implemented in the latest version (DDS 2.2) of the software package DDS. In addition to handling QRE constraints, DDS accepts any combination of several other conic and non-conic convex constraints. Our comprehensive numerical experiments encompass several parts including 1) a comparison of DDS 2.2 with Hypatia for the nearest correlation matrix problem, 2) using DDS for combining QRE constraints with various other constraint types, and 3) calculating the key rate for quantum key distribution (QKD) channels and presenting results for several QKD protocols.
Submitted 9 March, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
-
Sinkhorn Flow: A Continuous-Time Framework for Understanding and Generalizing the Sinkhorn Algorithm
Authors:
Mohammad Reza Karimi,
Ya-Ping Hsieh,
Andreas Krause
Abstract:
Many problems in machine learning can be formulated as solving entropy-regularized optimal transport on the space of probability measures. The canonical approach involves the Sinkhorn iterates, renowned for their rich mathematical properties. Recently, the Sinkhorn algorithm has been recast within the mirror descent framework, thus benefiting from classical optimization theory insights. Here, we build upon this result by introducing a continuous-time analogue of the Sinkhorn algorithm. This perspective allows us to derive novel variants of Sinkhorn schemes that are robust to noise and bias. Moreover, our continuous-time dynamics not only generalize but also offer a unified perspective on several recently discovered dynamics in machine learning and mathematics, such as the "Wasserstein mirror flow" of (Deb et al. 2023) or the "mean-field Schrödinger equation" of (Claisse et al. 2023).
Submitted 28 November, 2023;
originally announced November 2023.
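For readers unfamiliar with the discrete Sinkhorn iterates that the Sinkhorn Flow abstract takes as its starting point, a minimal sketch follows. It shows the textbook alternating-scaling scheme for entropy-regularized optimal transport between two histograms, not the continuous-time flow the paper introduces. The function name `sinkhorn`, the regularization strength `eps`, and the toy inputs are assumptions chosen for illustration.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, iters=200):
    """Approximate the entropy-regularized transport plan between histograms a and b.

    Illustrative only: plain (non-log-domain) scaling updates, so a very small
    eps may underflow.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows so the plan's row sums match a ...
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # ... then rescale columns so the column sums match b.
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy problem (assumed values): move mass between two 2-bin histograms.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

After enough iterations, `row_sums` and `col_sums` should agree with `a` and `b` up to floating-point error, which is the fixed-point condition the iterates converge to.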
-
CARTOS: A Charging-Aware Real-Time Operating System for Intermittent Batteryless Devices
Authors:
Mohsen Karimi,
Yidi Wang,
Youngbin Kim,
Yoojin Lim,
Hyoseung Kim
Abstract:
This paper presents CARTOS, a charging-aware real-time operating system designed to enhance the functionality of intermittently-powered batteryless devices (IPDs) for various Internet of Things (IoT) applications. While IPDs offer significant advantages such as extended lifespan and operability in extreme environments, they pose unique challenges, including the need to ensure forward progress of program execution amidst variable energy availability and to maintain reliable real-time behavior during power disruptions. To address these challenges, CARTOS introduces a mixed-preemption scheduling model that classifies tasks into computational and peripheral tasks, and ensures their efficient and timely execution by adopting just-in-time checkpointing for divisible computation tasks and uninterrupted execution for indivisible peripheral tasks. CARTOS also supports processing chains of tasks with precedence constraints and adapts its scheduling in response to environmental changes to offer continuous execution under diverse conditions. CARTOS is implemented with new APIs and components added to FreeRTOS but is designed for portability to other embedded RTOSs. Through real hardware experiments and simulations, CARTOS exhibits superior performance over state-of-the-art methods, demonstrating that it can serve as a practical platform for developing resilient, real-time sensing applications on IPDs.
Submitted 13 November, 2023;
originally announced November 2023.
-
Riemannian stochastic optimization methods avoid strict saddle points
Authors:
Ya-Ping Hsieh,
Mohammad Reza Karimi,
Andreas Krause,
Panayotis Mertikopoulos
Abstract:
Many modern machine learning applications - from online principal component analysis to covariance matrix identification and dictionary learning - can be formulated as minimization problems on Riemannian manifolds, and are typically solved with a Riemannian stochastic gradient method (or some variant thereof). However, in many cases of interest, the resulting minimization problem is not geodesically convex, so the convergence of the chosen solver to a desirable solution - i.e., a local minimizer - is by no means guaranteed. In this paper, we study precisely this question, that is, whether stochastic Riemannian optimization algorithms are guaranteed to avoid saddle points with probability 1. For generality, we study a family of retraction-based methods which, in addition to having a potentially much lower per-iteration cost relative to Riemannian gradient descent, include other widely used algorithms, such as natural policy gradient methods and mirror descent in ordinary convex spaces. In this general setting, we show that, under mild assumptions for the ambient manifold and the oracle providing gradient information, the policies under study avoid strict saddle points / submanifolds with probability 1, from any initial condition. This result provides an important sanity check for the use of gradient methods on manifolds as it shows that, almost always, the limit state of a stochastic Riemannian algorithm can only be a local minimizer.
Submitted 4 November, 2023;
originally announced November 2023.
-
An Intelligent Approach to Detecting Novel Fault Classes for Centrifugal Pumps Based on Deep CNNs and Unsupervised Methods
Authors:
Mahdi Abdollah Chalaki,
Daniyal Maroufi,
Mahdi Robati,
Mohammad Javad Karimi,
Ali Sadighi
Abstract:
Despite the recent success in data-driven fault diagnosis of rotating machines, challenges remain in this field. One issue to be addressed is the lack of information about the variety of faults the system may encounter in the field. In this paper, we assume partial knowledge of the system faults and use the corresponding data to train a convolutional neural network. A combination of the t-SNE method and clustering techniques is then employed to detect novel faults. Upon detection, the network is augmented using the new data. Finally, a test setup is used to validate this two-stage methodology on a centrifugal pump, and experimental results show high accuracy in detecting novel faults.
Submitted 22 September, 2023;
originally announced September 2023.
-
A learning-based multiscale model for reactive flow in porous media
Authors:
Mina Karimi,
Kaushik Bhattacharya
Abstract:
We study solute-laden flow through permeable geological formations with a focus on advection-dominated transport and volume reactions. As the fluid flows through the permeable medium, it reacts with the medium, thereby changing the morphology and properties of the medium; this, in turn, affects the flow conditions and chemistry. These phenomena occur at various length and time scales, which makes the problem extremely complex. Multiscale modeling addresses this complexity by dividing the problem into those at individual scales, and systematically passing information from one scale to another. However, accurate implementation of these multiscale methods is still prohibitively expensive. We present a methodology to overcome this challenge that is computationally efficient and quantitatively accurate. We introduce a surrogate for the solution operator of the lower scale problem in the form of a recurrent neural operator, train it using one-time off-line data generated by repeated solutions of the lower scale problem, and then use this surrogate in application-scale calculations. The result is the accuracy of concurrent multiscale methods, at a cost comparable to that of classical models. We study various examples, and show the efficacy of this method in understanding the evolution of the morphology, properties and flow conditions over time in geological formations.
Submitted 19 September, 2023;
originally announced September 2023.
-
Deformation Decomposition versus Energy Decomposition for Chemo- and Poro- Mechanics
Authors:
Janel Chua,
Mina Karimi,
Patrick Kozlowski,
Mehrdad Massoudi,
Santosh Narasimhachary,
Kai Kadau,
George Gazonas,
Kaushik Dayal
Abstract:
We briefly compare the structure of two classes of popular models used to describe poro- and chemo- mechanics wherein a fluid phase is transported within a solid phase. The multiplicative deformation decomposition has been successfully used to model permanent inelastic shape change in plasticity, solid-solid phase transformation, and thermal expansion, which has motivated its application to poro- and chemo- mechanics. However, the energetic decomposition provides a more transparent structure and offers advantages, such as the ability to couple to phase-field fracture, for models of poro- and chemo- mechanics.
Submitted 25 August, 2023;
originally announced August 2023.
-
Proposing a Dynamic Executive Microservices Architecture Model for AI Systems
Authors:
Mahyar Karimi,
Ahmad Abdollahzadeh Barfroush
Abstract:
Microservices architecture is one of the new architectural styles that has improved in recent years. It has become a popular architectural style among system architects and developers. This popularity increased with the advent of new technologies and technological advancements in cloud computing. These advancements caused the emergence of new design and development challenges for service-based software systems. The increasing use of microservices architecture in large organizations and teams has increased the need to find appropriate solutions for architecture challenges. Orchestration of the components in the microservices architecture is one of the main challenges in distributed systems and affects software quality in factors such as efficiency, compatibility, stability, and reusability. In such systems, software architecture consists of fine-grained components. Due to the increasing number of microservices in a large-scale system, proper management and communication orchestration of microservice components can become a point of failure. In this article, the challenges of microservices architecture are identified. To resolve the component orchestration challenges, an appropriate model to maintain and improve quality is proposed. The presented model, as a pattern, can be used at both the design and development levels of the system. Runtime dynamicity of the software is the main achievement of this pattern. In this model, microservice component orchestration tasks are performed using a BPMN-based workflow engine as the orchestrator component. The orchestrator design gives the ability to create, track and modify new composite microservices without the need to change platform infrastructure.
Submitted 10 August, 2023;
originally announced August 2023.
-
Backscattering of topologically protected helical edge states by line defects
Authors:
Mohadese Karimi,
Mohsen Amini,
Morteza Soltani,
Mozhgan Sadeghizadeh
Abstract:
The quantization of conductance in the presence of non-magnetic point defects is a consequence of topological protection and the spin-momentum locking of helical edge states in two-dimensional topological insulators. This protection ensures the absence of backscattering of helical edge modes in the quantum Hall phase of the system. However, our study focuses on exploring a novel approach to disrupt this protection. We propose that a linear arrangement of on-site impurities can effectively lift the topological protection of edge states in the Kane-Mele model. To investigate this phenomenon, we consider an armchair ribbon containing a line defect spanning its width. Utilizing the tight-binding model and non-equilibrium Green's function method, we calculate the transmission coefficient of the system. Our results reveal a suppression of conductance at energies near the lower edge of the bulk gap for positive on-site potentials. To further comprehend this behavior, we perform analytical calculations and discuss the formation of an impurity channel. This channel arises due to the overlap of in-gap bound states, linking the bottom edge of the ribbon to its top edge, consequently facilitating backscattering. Our explanation is supported by the analysis of the local density of states at sites near the position of impurities.
Submitted 23 July, 2023;
originally announced July 2023.
-
Monitoring Algorithmic Fairness
Authors:
Thomas A. Henzinger,
Mahyar Karimi,
Konstantin Kueffner,
Kaushik Mallik
Abstract:
Machine-learned systems are in widespread use for making decisions about humans, and it is important that they are fair, i.e., not biased against individuals based on sensitive attributes. We present runtime verification of algorithmic fairness for systems whose models are unknown, but are assumed to have a Markov chain structure. We introduce a specification language that can model many common algorithmic fairness properties, such as demographic parity, equal opportunity, and social burden. We build monitors that observe a long sequence of events as generated by a given system, and output, after each observation, a quantitative estimate of how fair or biased the system was on that run until that point in time. The estimate is proven to be correct modulo a variable error bound and a given confidence level, where the error bound gets tighter as the observed sequence gets longer. Our monitors are of two types, and use, respectively, frequentist and Bayesian statistical inference techniques. While the frequentist monitors compute estimates that are objectively correct with respect to the ground truth, the Bayesian monitors compute estimates that are correct subject to a given prior belief about the system's model. Using a prototype implementation, we show how we can monitor if a bank is fair in giving loans to applicants from different social backgrounds, and if a college is fair in admitting students while maintaining a reasonable financial burden on the society. Although they exhibit different theoretical complexities in certain cases, in our experiments, both frequentist and Bayesian monitors took less than a millisecond to update their verdicts after each observation.
Submitted 25 May, 2023;
originally announced May 2023.
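The idea of a monitor that emits a fairness estimate with a shrinking error bound after each observation can be sketched as follows. This is a simplified frequentist illustration for demographic parity only, and it assumes independent, identically distributed decisions rather than the Markov chain models treated in the paper. The class name `ParityMonitor`, the group labels, and the Hoeffding-based bound are assumptions made for the example, not the authors' construction.

```python
import math


class ParityMonitor:
    """Running estimate of P(accept | A) - P(accept | B) with an error bound.

    Assumption: decisions are i.i.d. within each group, so Hoeffding's
    inequality applies pointwise at each observation.
    """

    def __init__(self, delta=0.05, groups=("A", "B")):
        self.delta = delta          # allowed failure probability
        self.groups = groups
        self.seen = {g: 0 for g in groups}
        self.accepted = {g: 0 for g in groups}

    def observe(self, group, accepted):
        """Record one decision; return (estimate, error) or None if undefined."""
        self.seen[group] += 1
        self.accepted[group] += int(accepted)
        if min(self.seen.values()) == 0:
            return None  # no estimate until both groups have been observed
        g1, g2 = self.groups
        estimate = (self.accepted[g1] / self.seen[g1]
                    - self.accepted[g2] / self.seen[g2])
        # Hoeffding radius per group at confidence 1 - delta/2; a union bound
        # over the two groups gives overall confidence 1 - delta.
        error = sum(math.sqrt(math.log(4 / self.delta) / (2 * n))
                    for n in self.seen.values())
        return estimate, error


# Toy event stream (assumed data): (group, decision) pairs.
events = [("A", True), ("B", False), ("A", True), ("B", True),
          ("A", False), ("B", False), ("A", True), ("B", False)]
monitor = ParityMonitor(delta=0.05)
verdicts = [monitor.observe(g, d) for g, d in events]
```

Each entry of `verdicts` is the monitor's output after one event; the error term shrinks as more events from both groups arrive, mirroring the abstract's statement that the bound tightens with longer observed sequences.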
-
Enhancing data security against cyberattacks in artificial intelligence based smartgrid systems with crypto agility
Authors:
Marcelo Simoes,
Mohammed Elmusrati,
Tero Vartiainen,
Mike Mekkanen,
Mazaher Karimi,
Sayawu Diaba,
Emmanuel Anti,
Wilson Lopes
Abstract:
A new paradigm of electricity generation at the distribution level, with renewable and alternative sources, is possible with microgrids. The main idea is to have microgrids deployed on low- or medium-voltage active distribution networks. They can be advantageous in many different ways, such as improving the energy efficiency and reliability of the system and reducing transmission losses and network congestion. There are challenges in implementing microgrids (MGs) with distributed energy resource (DER) units; these relate to power quality and stability issues, voltage and fault level changes, energy management, low inertia, more complex protection schemes, load and generation forecasting, cyber-attacks, and cyber security. This paper shows the deep utilization of advanced, accurate, and fast methodologies such as artificial intelligence-based techniques. They guarantee efficient, optimal, safe, and reliable operation of smart grids against cyberattacks. AI refers to a computer-based system's ability to perform tasks with intelligence typically associated with human decision-making; such systems can learn from past experiences and solve problems.
Submitted 19 May, 2023;
originally announced May 2023.
-
Runtime Monitoring of Dynamic Fairness Properties
Authors:
Thomas A. Henzinger,
Mahyar Karimi,
Konstantin Kueffner,
Kaushik Mallik
Abstract:
A machine-learned system that is fair in static decision-making tasks may have biased societal impacts in the long-run. This may happen when the system interacts with humans and feedback patterns emerge, reinforcing old biases in the system and creating new biases. While existing works try to identify and mitigate long-run biases through smart system design, we introduce techniques for monitoring fairness in real time. Our goal is to build and deploy a monitor that will continuously observe a long sequence of events generated by the system in the wild, and will output, with each event, a verdict on how fair the system is at the current point in time. The advantages of monitoring are two-fold. Firstly, fairness is evaluated at run-time, which is important because unfair behaviors may not be eliminated a priori, at design-time, due to partial knowledge about the system and the environment, as well as uncertainties and dynamic changes in the system and the environment, such as the unpredictability of human behavior. Secondly, monitors are by design oblivious to how the monitored system is constructed, which makes them suitable to be used as trusted third-party fairness watchdogs. They function as computationally lightweight statistical estimators, and their correctness proofs rely on the rigorous analysis of the stochastic process that models the assumptions about the underlying dynamics of the system. We show, both in theory and experiments, how monitors can warn us (1) if a bank's credit policy over time has created an unfair distribution of credit scores among the population, and (2) if a resource allocator's allocation policy over time has made unfair allocations. Our experiments demonstrate that the monitors introduce very low overhead. We believe that runtime monitoring is an important and mathematically rigorous new addition to the fairness toolbox.
Submitted 8 May, 2023;
originally announced May 2023.
-
Hessian-informed Hamiltonian Monte Carlo for high-dimensional problems
Authors:
Mina Karimi,
Kaushik Dayal,
Matteo Pozzi
Abstract:
We investigate the effect of using local and non-local second derivative information on the performance of Hamiltonian Monte Carlo (HMC) sampling methods, for high-dimension non-Gaussian distributions, with application to Bayesian inference and nonlinear inverse problems. The Riemannian Manifold Hamiltonian Monte Carlo (RMHMC) method uses second and third derivative information to improve the performance of the HMC approach. We propose using the local Hessian information at the start of each iteration, instead of re-calculating the higher order derivatives in all sub-steps of the leapfrog updating algorithm. We compare the result of Hessian-informed HMC method using the local and nonlocal Hessian information, in a test bed of a high-dimensional log-normal distribution, related to a problem of inferring soil properties.
Submitted 28 March, 2023;
originally announced May 2023.
-
Distributed Optimization for Power Systems with Radial Partitioning
Authors:
Mehdi Karimi
Abstract:
This paper proposes group-based distributed optimization algorithms on top of intelligent partitioning for the optimal power flow (OPF) problem. Radial partitioning of the graph of a network is introduced as a systematic way to split a large-scale problem into more tractable sub-problems, which can potentially be solved efficiently with methods such as convex relaxations. The simple implementation of a Distributed Consensus Algorithm (DiCA) with very few parameters makes it viable for different parameter selection methods, which are crucial for the fast convergence of the distributed algorithms. The DiCA algorithm returns more accurate solutions to the tested problems with fewer iterations than component-based algorithms. Our numerical results show the performance of the algorithms for different power network instances and the effect of parameter selection. A software package DiCARP is created, which is implemented in Python using the Pyomo optimization package.
Submitted 1 May, 2023;
originally announced May 2023.
-
High-dimensional Nonlinear Bayesian Inference of Poroelastic Fields from Pressure Data
Authors:
Mina Karimi,
Mehrdad Massoudi,
Kaushik Dayal,
Matteo Pozzi
Abstract:
We investigate solution methods for large-scale inverse problems governed by partial differential equations (PDEs) via Bayesian inference. The Bayesian framework provides a statistical setting to infer uncertain parameters from noisy measurements. To quantify posterior uncertainty, we adopt Markov Chain Monte Carlo (MCMC) approaches for generating samples. To increase the efficiency of these approaches in high dimensions, we make use of local information about the gradient and Hessian of the target potential, also via Hamiltonian Monte Carlo (HMC). Our target application is inferring the field of soil permeability from observations of pore pressure, using a nonlinear PDE poromechanics model for predicting pressure from permeability. We compare the performance of different sampling approaches in this and other settings. We also investigate the effect of dimensionality and non-Gaussianity of distributions on the performance of different sampling methods.
Submitted 6 February, 2023;
originally announced February 2023.
-
Isotropic Gaussian Processes on Finite Spaces of Graphs
Authors:
Viacheslav Borovitskiy,
Mohammad Reza Karimi,
Vignesh Ram Somnath,
Andreas Krause
Abstract:
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs: directed or undirected, with or without loops. We endow each of these sets with a geometric structure, inducing the notions of closeness and symmetries, by turning them into a vertex set of an appropriate metagraph. Building on this, we describe the class of priors that respect this structure and are analogous to the Euclidean isotropic processes, like squared exponential or Matérn. We propose an efficient computational technique for the ostensibly intractable problem of evaluating these priors' kernels, making such Gaussian processes usable within the usual toolboxes and downstream applications. We go further to consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon. We prove a hardness result, showing that in this case, exact kernel computation cannot be performed efficiently. However, we propose a simple Monte Carlo approximation for handling moderately sized cases. Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small data regime.
Submitted 25 February, 2023; v1 submitted 3 November, 2022;
originally announced November 2022.
-
A Dynamical System View of Langevin-Based Non-Convex Sampling
Authors:
Mohammad Reza Karimi,
Ya-Ping Hsieh,
Andreas Krause
Abstract:
Non-convex sampling is a key challenge in machine learning, central to non-convex optimization in deep learning as well as to approximate probabilistic inference. Despite its significance, theoretically there remain many important challenges: Existing guarantees (1) typically only hold for the averaged iterates rather than the more desirable last iterates, (2) lack convergence metrics that capture the scales of the variables such as Wasserstein distances, and (3) mainly apply to elementary schemes such as stochastic gradient Langevin dynamics. In this paper, we develop a new framework that lifts the above issues by harnessing several tools from the theory of dynamical systems. Our key result is that, for a large class of state-of-the-art sampling schemes, their last-iterate convergence in Wasserstein distances can be reduced to the study of their continuous-time counterparts, which is much better understood. Coupled with standard assumptions of MCMC sampling, our theory immediately yields the last-iterate Wasserstein convergence of many advanced sampling schemes such as proximal, randomized mid-point, and Runge-Kutta integrators. Beyond existing methods, our framework also motivates more efficient schemes that enjoy the same rigorous guarantees.
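For orientation, the elementary Langevin scheme that such analyses start from can be sketched as an Euler-Maruyama discretization of the Langevin diffusion. This is a minimal illustration, not one of the advanced integrators the paper covers: the standard Gaussian target, the step size, and the function names are assumptions made for the example, and the full gradient is used instead of a stochastic one.

```python
import math
import random


def grad_potential(x):
    # Hypothetical target: standard Gaussian, U(x) = x^2 / 2, so U'(x) = x.
    return x


def langevin_chain(x0, step, n_steps, rng):
    """Euler-Maruyama discretization of dX = -U'(X) dt + sqrt(2) dW."""
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x - step * grad_potential(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


# Run one chain with a fixed seed and discard the first 5000 iterates as burn-in.
rng = random.Random(0)
samples = langevin_chain(x0=3.0, step=0.1, n_steps=25000, rng=rng)[5000:]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With a finite step size the chain is slightly biased: for this Gaussian target its stationary variance is 1 / (1 - step / 2) rather than 1, which is the kind of discretization effect that convergence guarantees in Wasserstein distance have to account for.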
Submitted 13 March, 2023; v1 submitted 25 October, 2022;
originally announced October 2022.
-
The Power of One Clean Qubit in Supervised Machine Learning
Authors:
Mahsa Karimi,
Ali Javadi-Abhari,
Christoph Simon,
Roohollah Ghobadi
Abstract:
This paper explores the potential benefits of quantum coherence and quantum discord in the non-universal quantum computing model called deterministic quantum computing with one qubit (DQC1) in supervised machine learning. We show that the DQC1 model can be leveraged to develop an efficient method for estimating complex kernel functions. We demonstrate a simple relationship between coherence consumption and the kernel function, a crucial element in machine learning. The paper presents an implementation of a binary classification problem on IBM hardware using the DQC1 model and analyzes the impact of quantum coherence and hardware noise. The advantage of our proposal lies in its utilization of quantum discord, which is more resilient to noise than entanglement.
Submitted 7 November, 2023; v1 submitted 17 October, 2022;
originally announced October 2022.
-
Providing Error Detection for Deep Learning Image Classifiers Using Self-Explainability
Authors:
Mohammad Mahdi Karimi,
Azin Heidarshenas,
William W. Edmonson
Abstract:
This paper proposes a self-explainable Deep Learning (SE-DL) system for an image classification problem that performs self-error detection. The self-error detection is key to improving the DL system's safe operation, especially in safety-critical applications such as automotive systems. An SE-DL system outputs both the class prediction and an explanation for that prediction, which gives humans insight into how the system makes its predictions. Additionally, we leverage the explanation of the proposed SE-DL system to detect potential class prediction errors of the system. The proposed SE-DL system uses a set of concepts to generate the explanation. The concepts are human-understandable lower-level image features in each input image relevant to the higher-level class of that image. We present a concept selection methodology for scoring all concepts and selecting a subset of them based on their contribution to the error detection performance of the proposed SE-DL system. Finally, we present different error detection schemes using the proposed SE-DL system to compare them against an error detection scheme without any SE-DL system.
Submitted 31 October, 2022; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Riemannian stochastic approximation algorithms
Authors:
Mohammad Reza Karimi,
Ya-Ping Hsieh,
Panayotis Mertikopoulos,
Andreas Krause
Abstract:
We examine a wide class of stochastic approximation algorithms for solving (stochastic) nonlinear problems on Riemannian manifolds. Such algorithms arise naturally in the study of Riemannian optimization, game theory and optimal transport, but their behavior is much less understood compared to the Euclidean case because of the lack of a global linear structure on the manifold. We overcome this difficulty by introducing a suitable Fermi coordinate frame which allows us to map the asymptotic behavior of the Riemannian Robbins-Monro (RRM) algorithms under study to that of an associated deterministic dynamical system. In so doing, we provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes, despite the significant complications that arise due to the curvature and topology of the underlying manifold. We showcase the flexibility of the proposed framework by applying it to a range of retraction-based variants of the popular optimistic / extra-gradient methods for solving minimization problems and games, and we provide a unified treatment for their convergence.
Submitted 27 December, 2022; v1 submitted 14 June, 2022;
originally announced June 2022.
-
Deploying self-supervised learning in the wild for hybrid automatic speech recognition
Authors:
Mostafa Karimi,
Changliang Liu,
Kenichi Kumatani,
Yao Qian,
Tianyu Wu,
Jian Wu
Abstract:
Self-supervised learning (SSL) methods have proven to be very successful in automatic speech recognition (ASR). These great improvements have been reported mostly based on highly curated datasets such as LibriSpeech for non-streaming End-to-End ASR models. However, the pivotal characteristic of SSL is that it can be utilized for any untranscribed audio data. In this paper, we provide a full exploration of how to utilize uncurated audio data in SSL, from data pre-processing to deploying a streaming hybrid ASR model. More specifically, we present (1) the effect of an Audio Event Detection (AED) model in the data pre-processing pipeline, (2) an analysis of the choice of optimizer and learning rate scheduling, (3) a comparison of recently developed contrastive losses, and (4) a comparison of various pre-training strategies such as utilization of in-domain versus out-domain pre-training data, monolingual versus multilingual pre-training data, multi-head multilingual SSL versus single-head multilingual SSL and supervised pre-training versus SSL. The experimental results show that SSL pre-training with in-domain uncurated data can achieve better performance in comparison to all the alternative out-domain pre-training strategies.
Submitted 17 May, 2022;
originally announced May 2022.
-
Energetic Formulation of Large-Deformation Poroelasticity
Authors:
Mina Karimi,
Mehrdad Massoudi,
Noel Walkington,
Matteo Pozzi,
Kaushik Dayal
Abstract:
The modeling of coupled fluid transport and deformation in a porous medium is essential to predict various geomechanical processes such as CO2 sequestration and hydraulic fracturing. Current applications of interest, for instance those that include fracturing or damage of the solid phase, require a nonlinear description of the large deformations that can occur. This paper presents a variational energy-based continuum mechanics framework to model large-deformation poroelasticity. The approach begins from the total free energy density that is additively composed of the free energy of the components. A variational procedure then provides the balance of momentum, fluid transport balance, and pressure relations. A numerical approach based on finite elements is applied to analyze the behavior of saturated and unsaturated porous media using a nonlinear constitutive model for the solid skeleton. Examples studied include the Terzaghi and Mandel problems; a gas-liquid phase-changing fluid; multiple immiscible gases; and unsaturated systems where we model injection of fluid into soil. The proposed variational approach can potentially have advantages for numerical methods as well as for combining with data-driven models in a Bayesian framework.
Submitted 30 December, 2021;
originally announced December 2021.
-
Stochastic Maximum-Likelihood DOA Estimation and Source Enumeration in the Presence of Nonuniform Noise
Authors:
Mahmood Karimi
Abstract:
In this paper, the problem of determining the number of signal sources impinging on an array of sensors and estimating their directions-of-arrival (DOAs) in the presence of spatially white nonuniform noise is considered. It is known that, in the case of nonuniform noise, the stochastic likelihood function cannot be concentrated with respect to the diagonal elements of the noise covariance matrix. Therefore, stochastic maximum-likelihood (SML) DOA estimation and source enumeration in the presence of nonuniform noise require a multidimensional search with very high computational complexity. Recently, two algorithms for estimating the noise covariance matrix in the presence of nonuniform noise have been proposed in the literature. Using these new estimates of the noise covariance matrix, an approach for obtaining the SML estimate of signal DOAs is proposed. In addition, new approaches are proposed for SML source enumeration with information criteria in the presence of nonuniform noise. The important feature of the proposed SML approaches for DOA estimation and source enumeration is that they have admissible computational complexity. In addition, some of them are robust against correlation between source signals. The performance of the proposed DOA estimation and source enumeration approaches is investigated using computer simulations.
Submitted 9 September, 2021;
originally announced September 2021.
-
Simulating of X-states and the two-qubit XYZ Heisenberg system on IBM quantum computer
Authors:
Fereshte Shahbeigi,
Mahsa Karimi,
Vahid Karimipour
Abstract:
Two qubit density matrices, which are of X-shape, are a natural generalization of Bell Diagonal States (BDSs) recently simulated on the IBM quantum device. We generalize the previous results and propose a quantum circuit for simulation of a general two qubit X-state, implement it on the same quantum device, and study its entanglement for several values of the extended parameter space. We also show that their X-shape is approximately robust against noisy quantum gates. To further physically motivate this study, we invoke the two-spin Heisenberg XYZ system and show that for a wide class of initial states, it leads to dynamical density matrices which are X-states. Due to the symmetries of this Hamiltonian, we show that by only two qubits, one can simulate the dynamics of this system on the IBM quantum computer.
Submitted 20 January, 2022; v1 submitted 30 May, 2021;
originally announced May 2021.
-
Lasing condition for trapped modes in subwavelength--wired PT--symmetric resonators
Authors:
Mauro Cuevas,
Mojtaba Karimi,
Carlos J Zapata-Rodríguez
Abstract:
The ability to control the laser modes within a subwavelength resonator is of key relevance in modern optoelectronics. This work deals with theoretical research on the optical properties of a PT--symmetric nano--scaled dimer formed by two dielectric wires, one with loss and the other with gain, wrapped with graphene sheets. We show the existence of two non--radiating trapped modes which transform into radiating modes as the gain--loss parameter increases. Moreover, these modes reach the lasing condition for suitable values of this parameter, which gives them an ultra-high quality factor that is manifested in the response of the structure when it is excited by a plane wave. Unlike other mechanisms that transform trapped modes into radiating modes, we show that varying the gain--loss parameter in the balanced loss--gain structure studied here leads to a variation in the phase difference between the induced dipole moments on each wire, without appreciable variation in the modulus of these dipole moments. We provide an approximate method that reproduces the main results of the rigorous calculation. Our theoretical findings reveal the possibility of developing unconventional optical devices and structures with enhanced functionality.
Submitted 19 December, 2020;
originally announced December 2020.
-
Investigation of Warrior Robots Behavior by Using Evolutionary Algorithms
Authors:
Shahriar Sharifi Borojerdi,
Mehdi Karimi,
Ehsan Amiri
Abstract:
In this study, we review robot behavior, especially that of warrior robots, using evolutionary algorithms. Algorithms of this kind are inspired by nature and cause robot behavior to resemble collective behavior. The collective behavior of creatures such as bees shows that functions depending on interaction and cooperation need a well-organized system in which every creature carries out its duty well. For robots that have no intelligence of their own, we can define an algorithm and show the results with a simple simulation.
Submitted 18 November, 2020;
originally announced November 2020.
-
Online Active Model Selection for Pre-trained Classifiers
Authors:
Mohammad Reza Karimi,
Nezihe Merve Gürel,
Bojan Karlaš,
Johannes Rausch,
Ce Zhang,
Andreas Krause
Abstract:
Given $k$ pre-trained classifiers and a stream of unlabeled data examples, how can we actively decide when to query a label so that we can distinguish the best model from the rest while making a small number of queries? Answering this question has a profound impact on a range of practical scenarios. In this work, we design an online selective sampling approach that actively selects informative examples to label and outputs the best model with high probability at any round. Our algorithm can be used for online prediction tasks for both adversarial and stochastic streams. We establish several theoretical guarantees for our algorithm and extensively demonstrate its effectiveness in our experimental studies.
Submitted 17 April, 2021; v1 submitted 19 October, 2020;
originally announced October 2020.
-
Superscattering, Superabsorption, and Nonreciprocity in Nonlinear Antennas
Authors:
Lin Cheng,
Rasoul Alaee,
Akbar Safari,
Mohammad Karimi,
Lei Zhang,
Robert W. Boyd
Abstract:
We propose tunable nonlinear antennas based on an epsilon-near-zero material with a large optical nonlinearity. We show that the absorption and scattering cross sections of the antennas can be controlled dynamically from a nearly superscatterer to a nearly superabsorber by changing the intensity of the laser beam. Moreover, we demonstrate that a hybrid nonlinear antenna, composed of epsilon-near-zero and high-index dielectric materials, exhibits nonreciprocal radiation patterns because of broken spatial inversion symmetry and large optical nonlinearity of the epsilon-near-zero material. By changing the intensity of the laser, the radiation pattern of the antenna can be tuned between a bidirectional and a unidirectional emission known as a Huygens source. Our study provides a novel approach toward ultrafast dynamical control of metamaterials, for applications such as beam steering and optical limiting.
Submitted 4 October, 2020;
originally announced October 2020.
-
Optimal Scheduling of Anticipated COVID-19 Vaccination: A Case Study of New York State
Authors:
Syed Irfan Ali Meerza,
Seyed M. Karimi,
Bert B. Little,
Jacek M. Zurada,
Tamer Inanc
Abstract:
This study aims to determine an optimal control strategy for vaccine scheduling in COVID-19 pandemic treatment by converting the widely used SEIR infectious disease model into an optimal control problem. The problem is augmented by adding medication and vaccine limitations to match real-world situations. Two versions of the problem are formulated to minimize the number of infected individuals and, at the same time, provide the best possible vaccination schedule to reduce the susceptible population to a considerably lower level. The optimal control problems are solved using the RBF-Galerkin method. These problems are tested with a benchmarking dataset to determine the required parameters. After this step, the problems are tested with recent data for New York State, USA. The results for the proposed optimal control problem provide evidence from which an optimal strategy for vaccine scheduling can be chosen once a vaccine for COVID-19 becomes available.
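To make the underlying dynamics concrete, the sketch below integrates a basic SEIR model with a constant vaccination rate using forward Euler steps. It is an assumption-laden illustration, not the optimal control formulation or the RBF-Galerkin solver from the paper: the function name, the parameter values, and the choice to move vaccinated individuals straight into the removed compartment are all hypothetical.

```python
def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, vaccination_rate, days, dt=0.1):
    """Forward-Euler integration of an SEIR model with a constant vaccination rate.

    beta: transmission rate, sigma: incubation rate, gamma: recovery rate,
    vaccination_rate: fraction of susceptibles vaccinated per day (assumed constant).
    Returns a list of (time, S, E, I, R) tuples.
    """
    s, e, i, r = s0, e0, i0, r0
    n = s + e + i + r                      # total population, conserved by the model
    history = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_exposed = beta * s * i / n     # S -> E
        new_infectious = sigma * e         # E -> I
        new_recovered = gamma * i          # I -> R
        vaccinated = vaccination_rate * s  # S -> R (assumed fully protective)
        s += dt * (-new_exposed - vaccinated)
        e += dt * (new_exposed - new_infectious)
        i += dt * (new_infectious - new_recovered)
        r += dt * (new_recovered + vaccinated)
        history.append((k * dt, s, e, i, r))
    return history


# Hypothetical parameters: compare the infection peak with and without vaccination.
baseline = simulate_seir(999.0, 0.0, 1.0, 0.0, 0.5, 0.2, 0.1, 0.0, days=200)
vaccinated = simulate_seir(999.0, 0.0, 1.0, 0.0, 0.5, 0.2, 0.1, 0.02, days=200)
peak_baseline = max(row[3] for row in baseline)
peak_vaccinated = max(row[3] for row in vaccinated)
```

An optimal control version would replace the constant `vaccination_rate` with a time-varying control subject to supply limits and would minimize a cost such as the number of infected individuals over the horizon.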
Submitted 24 August, 2020;
originally announced August 2020.
-
Machine Learning Based Framework for Estimation of Data Center Power Using Acoustic Side Channel
Authors:
Mohsen Karimi,
Fahimeh Arab
Abstract:
Data centers are high power consumers, and their energy consumption keeps rising in spite of all the efforts to increase energy efficiency. Despite the need for energy-awareness in data centers, power modeling and estimation remain a big challenge due to the huge amount of uncertainty in this area. In this paper, a machine learning based method is proposed to approximately estimate power consumption by using the acoustic side channel caused by the fans in the fan-based cooling system of the server room. To do so, frequency components of the acoustic signal, recorded by a microphone in the server room, are extracted, pre-processed, and fed to a Multi-Layer Neural-Network as an estimator. The proposed method performed well in estimating the power consumption, with more than 85 percent accuracy.
Submitted 6 August, 2020;
originally announced August 2020.
-
Enhanced Nonlinear Optical Responses of Layered Epsilon-Near-Zero Metamaterials at Visible Frequencies
Authors:
Sisira Suresh,
Orad Reshef,
M. Zahirul Alam,
Jeremy Upham,
Mohammad Karimi,
Robert W. Boyd
Abstract:
Optical materials with vanishing dielectric permittivity, known as epsilon-near-zero (ENZ) materials, have been shown to possess enhanced nonlinear optical responses in their ENZ region. These strong nonlinear optical properties have been firmly established in homogeneous materials; however, it is as yet unclear whether metamaterials with effective optical parameters can exhibit a similar enhancement. Here, we probe an optical ENZ metamaterial composed of a subwavelength periodic stack of alternating Ag and SiO$_2$ layers and measure a nonlinear refractive index $n_2 = (1.2 \pm 0.1) \times 10^{-12}$ m$^2$/W and nonlinear absorption coefficient $β= (-1.5 \pm 0.2) \times 10^{-5}$ m/W at its effective zero-permittivity wavelength. The measured $n_2$ is $10^7$ times larger than the $n_2$ of fused silica and four times larger than the $n_2$ of silver. We observe that the nonlinear enhancement in $n_2$ scales as $1/(n_0 \mathrm{Re}[n_0])$, where $n_0$ is the linear effective refractive index. As opposed to homogeneous ENZ materials, whose optical properties are dictated by their intrinsic material properties and hence are not widely tunable, the zero-permittivity wavelength of the demonstrated metamaterials may be chosen to lie anywhere within the visible spectrum by selecting the right thicknesses of the sub-wavelength layers. Consequently, our results offer the promise of a means to design metamaterials with large nonlinearities for applications in nanophotonics at any specified optical wavelength.
Submitted 24 February, 2021; v1 submitted 25 May, 2020;
originally announced May 2020.
-
Network-principled deep generative models for designing drug combinations as graph sets
Authors:
Mostafa Karimi,
Arman Hasanzadeh,
Yang Shen
Abstract:
Combination therapy has been shown to improve therapeutic efficacy while reducing side effects. Importantly, it has become an indispensable strategy to overcome resistance in antibiotics, anti-microbials, and anti-cancer drugs. Facing enormous chemical space and unclear design principles for small-molecule combinations, computational drug-combination design has not seen generative models meet their potential to accelerate resistance-overcoming drug combination discovery. We have developed the first deep generative model for drug combination design, by jointly embedding graph-structured domain knowledge and iteratively training a reinforcement learning-based chemical graph-set designer. First, we have developed Hierarchical Variational Graph Auto-Encoders (HVGAE) trained end-to-end to jointly embed gene-gene, gene-disease, and disease-disease networks. Novel attentional pooling is introduced here for learning disease representations from associated genes' representations. Second, targeting diseases in learned representations, we have recast the drug-combination design problem as graph-set generation and developed a deep learning-based model with novel rewards. Specifically, besides chemical validity rewards, we have introduced a novel generative adversarial reward, being generalized sliced Wasserstein, for chemically diverse molecules with distributions similar to known drugs. We have also designed a network principle-based reward for drug combinations. Numerical results indicate that, compared to graph embedding methods, HVGAE learns more informative and generalizable disease representations. Case studies on four diseases show that network-principled drug combinations tend to have low toxicity. The generated drug combinations collectively cover the disease module similarly to FDA-approved drug combinations and could potentially suggest novel systems-pharmacology strategies.
Submitted 22 April, 2020; v1 submitted 16 April, 2020;
originally announced April 2020.
-
Directionally Dependent Multi-View Clustering Using Copula Model
Authors:
Kahkashan Afrin,
Ashif S. Iquebal,
Mostafa Karimi,
Allyson Souris,
Se Yoon Lee,
Bani K. Mallick
Abstract:
In recent biomedical scientific problems, it is a fundamental issue to integratively cluster a set of objects from multiple data sources. Such problems are mostly encountered in genomics, where data are collected from various sources and typically represent distinct yet complementary information. Integrating these data sources for multi-source clustering is challenging due to their complex dependence structure, including directional dependency. Particularly in genomics studies, it is known that there is a certain directional dependence between DNA expression, DNA methylation, and RNA expression, widely called The Central Dogma.
Most of the existing multi-view clustering methods assume either an independent structure or pair-wise (non-directional) dependency, thereby ignoring the directional relationship. Motivated by this, we propose a copula-based multi-view clustering model in which a copula enables the model to accommodate the directional dependence existing in the datasets. We conduct a simulation experiment in which the simulated datasets exhibit inherent directional dependence; it turns out that ignoring the directional dependence negatively affects the clustering performance. As a real application, we apply our model to breast cancer tumor samples collected from The Cancer Genome Atlas (TCGA).
Submitted 22 August, 2020; v1 submitted 16 March, 2020;
originally announced March 2020.
-
Influence of electron-vibration interactions on electronic current noise of atomic and molecular junctions
Authors:
S. G. Bahoosh,
M. A. Karimi,
W. Belzig,
E. Scheer,
F. Pauly
Abstract:
We present an ab-initio method to simulate the current noise in the presence of electron-vibration interactions in atomic and molecular junctions at finite temperature. Using a combination of nonequilibrium Keldysh Green's function techniques and density functional theory, we study the elastic and inelastic contributions to electron current and shot noise within a wide range of transmission values in systems exhibiting multiple electronic levels and vibrational modes. Within our model we find the upper threshold, at which the inelastic noise contribution changes sign, at a total transmission between $τ\approx 0.90$ and $0.95$ for gold contacts. This is higher than predicted by the single-level Holstein model but in agreement with earlier experimental observations. We support our theoretical studies by noise measurements on single-atom gold contacts which confirm previous experiments but make use of a new setup with strongly reduced complexity of electronic circuitry. Furthermore, we identify 1,4-benzenedithiol connected to gold electrodes as a system to observe the lower sign change, which we predict at around $τ\approx 0.2$. Finally, we discuss the influence of vibrational heating on the current noise.
Submitted 31 December, 2019;
originally announced December 2019.
-
Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts
Authors:
Mostafa Karimi,
Di Wu,
Zhangyang Wang,
Yang Shen
Abstract:
Predicting compound-protein affinity is critical for accelerating drug discovery. Recent progress made by machine learning focuses on accuracy but leaves much to be desired for interpretability. Through molecular contacts underlying affinities, our large-scale interpretability assessment finds commonly-used attention mechanisms inadequate. We thus formulate a hierarchical multi-objective learning problem whose predicted contacts form the basis for predicted affinities. We further design a physics-inspired deep relational network, DeepRelations, with intrinsically explainable architecture. Specifically, various atomic-level contacts or "relations" lead to molecular-level affinity prediction. And the embedded attentions are regularized with predicted structural contexts and supervised with partially available training contacts. DeepRelations shows superior interpretability to the state-of-the-art: without compromising affinity prediction, it boosts the AUPRC of contact prediction 9.5, 16.9, 19.3 and 5.7-fold for the test, compound-unique, protein-unique, and both-unique sets, respectively. Our study represents the first dedicated model development and systematic model assessment for interpretable machine learning of compound-protein affinity.
Submitted 28 December, 2019;
originally announced December 2019.
-
The isomorphism problem of trees from the viewpoint of Terwilliger algebras
Authors:
Shuang-Dong Li,
Yi-Zheng Fan,
Tatsuro Ito,
Masoud Karimi,
Jing Xu
Abstract:
Let $Γ^{(x_0)}$ be a finite rooted tree, for which $Γ$ is the underlying tree and $x_0$ the root. Let $T$ be the Terwilliger algebra of $Γ$ with respect to $x_0$. We study the structure of the principal $T$-module. As a result, it is shown that $T$ recognizes the isomorphism class of $Γ^{(x_0)}$.
Submitted 22 October, 2019;
originally announced October 2019.
-
Tunable THz absorption in photonic crystal including graphene and metamaterial
Authors:
M. Montaseri,
M. Hosseini,
M. J. Karimi
Abstract:
In this paper, a photonic crystal containing graphene and metamaterial layers is investigated. The absorption spectrum of the structure in the terahertz range is obtained using the transfer matrix method. The results show that adding a Si, SiO2 or metamaterial layer between two graphene layers increases the terahertz absorption significantly. The results also reveal that nearly complete absorption occurs over a wide range of physical parameters. Furthermore, the results indicate that the structure with a metamaterial layer has the highest absorption performance.
Submitted 14 October, 2019;
originally announced October 2019.
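The transfer matrix method named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: it assumes normal incidence, homogeneous isotropic layers, and the thin-film convention in which a lossy layer has complex index N = n - ik. Graphene sheets, which are usually modeled through a surface conductivity, are left out. The function name `stack_rta` and all layer values are hypothetical.

```python
import numpy as np

def stack_rta(n_layers, d_layers, wavelength, n_in=1.0, n_out=1.0):
    """Reflectance, transmittance and absorptance of a planar multilayer
    at normal incidence. Thicknesses and wavelength share one unit.
    Lossy layers use the convention N = n - 1j*k (an assumption here)."""
    m = np.eye(2, dtype=complex)
    for n, d in zip(n_layers, d_layers):
        delta = 2 * np.pi * n * d / wavelength  # (complex) phase thickness
        layer = np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                          [1j * n * np.sin(delta), np.cos(delta)]])
        m = m @ layer  # multiply characteristic matrices in stacking order
    b, c = m @ np.array([1.0, n_out])
    denom = n_in * b + c
    reflectance = float(abs((n_in * b - c) / denom) ** 2)
    transmittance = float(4 * n_in * np.real(n_out) / abs(denom) ** 2)
    return reflectance, transmittance, 1.0 - reflectance - transmittance

# Hypothetical single lossy slab, 40 um thick, at a 300 um wavelength (1 THz).
print(stack_rta([2.0 - 0.1j], [40.0], 300.0))
```

For lossless layers the absorptance returned by this sketch is zero up to rounding, which gives a quick sanity check before lossy or dispersive materials are added.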
-
Illegible Text to Readable Text: An Image-to-Image Transformation using Conditional Sliced Wasserstein Adversarial Networks
Authors:
Mostafa Karimi,
Gopalkrishna Veni,
Yen-Yun Yu
Abstract:
Automatic text recognition from ancient handwritten record images is an important problem in the genealogy domain. However, critical challenges such as varying noise conditions, vanishing texts, and variations in handwriting make the recognition task difficult. We tackle this problem by developing a handwritten-to-machine-print conditional Generative Adversarial network (HW2MP-GAN) model that formulates handwritten recognition as a text-Image-to-text-Image translation problem where a given image, typically in an illegible form, is converted into another image, close to its machine-print form. The proposed model consists of three components: a generator and word-level and character-level discriminators. The model incorporates Sliced Wasserstein distance (SWD) and U-Net architectures in HW2MP-GAN for better quality image-to-image transformation. Our experiments reveal that HW2MP-GAN outperforms state-of-the-art baseline cGAN models by almost 30 in Frechet Handwritten Distance (FHD), 0.6 on average Levenshtein distance and 39% in word accuracy for image-to-image translation on the IAM database. Further, HW2MP-GAN improves handwritten recognition word accuracy by 1.3% compared to baseline handwritten recognition models on the IAM database.
Submitted 11 October, 2019;
originally announced October 2019.
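The Sliced Wasserstein distance mentioned in the abstract above can be estimated by projecting two sample sets onto random directions and comparing the sorted projections. The sketch below is a generic Monte Carlo estimator, not the HW2MP-GAN loss. It assumes two equally sized point clouds, and the function name `sliced_wasserstein` and its parameters are hypothetical.

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=128, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance between
    two equally sized point clouds x and y, each of shape (n, d)."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Random directions, normalized onto the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both clouds onto every direction: shape (n, n_projections).
    x_proj = x @ theta.T
    y_proj = y @ theta.T
    # In 1-D, optimal transport between equal-size samples pairs sorted values.
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)
    return float(np.mean(np.abs(x_sorted - y_sorted) ** p) ** (1.0 / p))

rng = np.random.default_rng(1)
a = rng.normal(size=(500, 16))
b = rng.normal(loc=1.0, size=(500, 16))
print(sliced_wasserstein(a, a))  # identical clouds: 0.0
print(sliced_wasserstein(a, b))  # shifted cloud: a positive value
```

Each projection reduces the problem to one dimension, where the transport cost only needs a sort, which is what makes the sliced variant cheap enough to use inside a training loop.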
-
Optimization of terahertz absorption in periodic quantum well structures
Authors:
Zahra Javidi,
Mehdi Hosseini,
Mohammad Javad Karimi
Abstract:
In this work, the linear absorption spectra of periodic arrays of GaAs-GaAsAl quantum wells with different thicknesses in an electromagnetic wave are studied. The eigenenergies and eigenfunctions are calculated by solving the Schrödinger equation numerically. The absorption spectra are obtained using the density matrix approach, and the effects of the quantum well parameters are studied. Results show that for a wide range of parameters the absorption peaks lie in the terahertz region. Furthermore, it is possible to adjust the frequency of the absorption peak in the terahertz range by changing the width and height of the wells or the array number, which could be used in terahertz devices.
Submitted 25 September, 2019;
originally announced September 2019.
-
Domain-Driven Solver (DDS) Version 2.0: a MATLAB-based Software Package for Convex Optimization Problems in Domain-Driven Form
Authors:
Mehdi Karimi,
Levent Tunçel
Abstract:
Domain-Driven Solver (DDS) is a MATLAB-based software package for convex optimization problems in Domain-Driven form [Karimi and Tunçel, arXiv:1804.06925]. The current version of DDS accepts every combination of the following function/set constraints: (1) symmetric cones (LP, SOCP, and SDP); (2) quadratic constraints that are SOCP representable; (3) direct sums of an arbitrary collection of 2-dimensional convex sets defined as the epigraphs of univariate convex functions (including as special cases geometric programming and entropy programming); (4) generalized power cone; (5) epigraphs of matrix norms (including as a special case minimization of nuclear norm over a linear subspace); (6) vector relative entropy; (7) epigraphs of quantum entropy and quantum relative entropy; and (8) constraints involving hyperbolic polynomials. DDS is a practical implementation of the infeasible-start primal-dual algorithm designed and analyzed in [Karimi and Tunçel, arXiv:1804.06925]. This manuscript contains the users' guide, as well as theoretical results needed for the implementation of the algorithms. To help the users, we included many examples. We also discussed some implementation details and techniques we used to improve the efficiency and further expansion of the software to cover the emerging classes of convex optimization problems.
Submitted 10 November, 2020; v1 submitted 7 August, 2019;
originally announced August 2019.
-
Non-Iterative Subspace-Based DOA Estimation in the Presence of Nonuniform Noise
Authors:
M. Esfandiari,
S. A. Vorobyov,
S. Aliban,
M. Karimi
Abstract:
The uniform white noise assumption is one of the basic assumptions in most of the existing direction-of-arrival (DOA) estimation methods. In many applications, however, the non-uniform white noise model is more adequate. Then the noise variances at different sensors also have to be estimated as nuisance parameters while estimating DOAs. In this letter, different from the existing iterative methods that address the problem of non-uniform noise, a non-iterative two-phase subspace-based DOA estimation method is proposed. The first phase of the method is based on estimating the noise subspace via eigendecomposition (ED) of some properly designed matrix and it avoids estimating the noise covariance matrix. In the second phase, the results achieved in the first phase are used to estimate the noise covariance matrix, followed by estimating the noise subspace via generalized ED. Since the proposed method estimates DOAs in a non-iterative manner, it is computationally more efficient and has no convergence issues as compared to the existing methods. Simulation results demonstrate better performance of the proposed method as compared to other existing state-of-the-art methods.
Submitted 27 March, 2019;
originally announced March 2019.
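For context on the subspace-based estimation described in the abstract above, the sketch below shows classical MUSIC, a standard subspace method that relies on the uniform white noise assumption. It is not the two-phase method proposed in the letter. It assumes a uniform linear array with half-wavelength spacing and synthetic data, and the helper names `steering` and `music_doa` are hypothetical.

```python
import numpy as np

def steering(angles_deg, n_sensors, spacing=0.5):
    """Steering matrix of a uniform linear array; spacing in wavelengths."""
    angles = np.deg2rad(np.atleast_1d(angles_deg))
    k = np.arange(n_sensors)[:, None]
    return np.exp(2j * np.pi * spacing * k * np.sin(angles)[None, :])

def music_doa(snapshots, n_sources, grid_deg):
    """Classical MUSIC: grid angles of the n_sources strongest peaks."""
    n_sensors, n_snap = snapshots.shape
    cov = snapshots @ snapshots.conj().T / n_snap  # sample covariance
    # eigh sorts eigenvalues in ascending order, so the first
    # n_sensors - n_sources eigenvectors span the noise subspace.
    _, vecs = np.linalg.eigh(cov)
    noise_sub = vecs[:, : n_sensors - n_sources]
    a = steering(grid_deg, n_sensors)
    spectrum = 1.0 / np.sum(np.abs(noise_sub.conj().T @ a) ** 2, axis=0)
    # Keep local maxima only, then take the strongest n_sources of them.
    is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] > spectrum[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    best = peak_idx[np.argsort(spectrum[peak_idx])[-n_sources:]]
    return np.sort(grid_deg[best])

# Synthetic scenario: two uncorrelated sources, uniform sensor noise.
rng = np.random.default_rng(0)
true_doas = np.array([-20.0, 15.0])
n_sensors, n_snap = 8, 500
signals = (rng.normal(size=(2, n_snap)) + 1j * rng.normal(size=(2, n_snap))) / np.sqrt(2)
noise_samples = 0.1 * (rng.normal(size=(n_sensors, n_snap))
                       + 1j * rng.normal(size=(n_sensors, n_snap)))
snapshots = steering(true_doas, n_sensors) @ signals + noise_samples
grid = np.arange(-90.0, 90.5, 0.5)
estimates = music_doa(snapshots, 2, grid)
print(estimates)
```

When the noise variances differ from sensor to sensor, the plain eigendecomposition of the sample covariance no longer separates the signal and noise subspaces cleanly, which is the situation the letter addresses.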
-
Status Determination by Interior-Point Methods for Convex Optimization Problems in Domain-Driven Form
Authors:
Mehdi Karimi,
Levent Tunçel
Abstract:
We study the geometry of convex optimization problems given in a Domain-Driven form and categorize possible statuses of these problems using duality theory. Our duality theory for the Domain-Driven form, which accepts both conic and non-conic constraints, lets us determine and certify statuses of a problem as rigorously as the best approaches for conic formulations (which have been demonstrably very efficient in this context). We analyze the performance of an infeasible-start primal-dual algorithm for the Domain-Driven form in returning the certificates for the defined statuses. Our iteration complexity bounds for this more practical Domain-Driven form match the best ones available for conic formulations. At the end, we propose some stopping criteria for practical algorithms based on insights gained from our analyses.
Submitted 21 January, 2019;
originally announced January 2019.
-
DeepAffinity: Interpretable Deep Learning of Compound-Protein Affinity through Unified Recurrent and Convolutional Neural Networks
Authors:
Mostafa Karimi,
Di Wu,
Zhangyang Wang,
Yang Shen
Abstract:
Motivation: Drug discovery demands rapid quantification of compound-protein interaction (CPI). However, there is a lack of methods that can predict compound-protein affinity from sequences alone with high applicability, accuracy, and interpretability.
Results: We present a seamless integration of domain knowledge and learning-based approaches. Under novel representations of structurally-annotated protein sequences, a semi-supervised deep learning model that unifies recurrent and convolutional neural networks has been proposed to exploit both unlabeled and labeled data, for jointly encoding molecular representations and predicting affinities. Our representations and models outperform conventional options in achieving relative error in IC$_{50}$ within 5-fold for test cases and 20-fold for protein classes not included for training. Performances for new protein classes with few labeled data are further improved by transfer learning. Furthermore, separate and joint attention mechanisms are developed and embedded in our model to add to its interpretability, as illustrated in case studies for predicting and explaining selective drug-target interactions. Lastly, alternative representations using protein sequences or compound graphs and a unified RNN/GCNN-CNN model using graph CNN (GCNN) are also explored to reveal algorithmic challenges ahead.
Availability: Data and source codes are available at https://github.com/Shen-Lab/DeepAffinity
Supplementary Information: Supplementary data are available at http://shen-lab.github.io/deep-affinity-bioinf18-supp-rev.pdf
Submitted 8 December, 2018; v1 submitted 19 June, 2018;
originally announced June 2018.
-
MAVI: A Research Platform for Telepresence and Teleoperation
Authors:
Mojtaba Karimi,
Tamay Aykut,
Eckehard Steinbach
Abstract:
One of the goals in telepresence is to be able to perform daily tasks remotely. A key requirement for this is a robust and reliable mobile robotic platform. Ideally, such a platform should support 360-degree stereoscopic vision and semi-autonomous telemanipulation ability. In this technical report, we present our latest work on designing the telepresence mobile robot platform called MAVI. MAVI is a low-cost and robust but extendable platform for research and educational purposes, especially for machine vision and human interaction in telepresence setups. The MAVI platform offers a balance between modularity, capabilities, accessibility, cost and an open source software framework. With a range of different sensors such as an Inertial Measurement Unit (IMU), a 360-degree laser rangefinder, ultrasonic proximity sensors, and force sensors, along with smart actuation in omnidirectional holonomic locomotion, a high-load cylindrical manipulator, and an actuated stereoscopic Pan-Tilt-Roll Unit (PTRU), MAVI can not only provide basic feedback from its surroundings but also interact with the remote environment in multiple ways. The software architecture of MAVI is based on the Robot Operating System (ROS), which allows for the easy integration of state-of-the-art software packages.
Submitted 23 May, 2018;
originally announced May 2018.
-
Investigating Power Outage Effects on Reliability of Solid-State Drives
Authors:
Saba Ahmadian,
Farhad Taheri,
Mehrshad Lotfi,
Maryam Karimi,
Hossein Asad
Abstract:
Solid-State Drives (SSDs) have recently been employed in enterprise servers and high-end storage systems in order to enhance the performance of the storage subsystem. Although employing high-speed SSDs in storage subsystems can significantly improve system performance, it comes with a significant reliability threat for write operations upon power failures. In this paper, we present a comprehensive analysis investigating the impact of workload-dependent parameters on the reliability of SSDs under power failure for a variety of SSDs (from top manufacturers). To this end, we first develop a platform that provides two important features required for the study: a) realistic fault injection into the SSD in a computing system and b) a data loss detection mechanism on the SSD upon power failure. In the proposed physical fault injection platform, SSDs experience a real discharge phase of the Power Supply Unit (PSU) that occurs during power failure in data centers, which was neglected in previous studies. The impact of workload-dependent parameters such as workload Working Set Size (WSS), request size, request type, access pattern, and sequence of accesses on the failure of SSDs is carefully studied in the presence of realistic power failures. Experimental results over thousands of fault injections show that data loss occurs even after completion of the request (up to 700ms), where the failure rate is influenced by the type, size, access pattern, and sequence of IO accesses, while other parameters such as workload WSS have no impact on the failure of SSDs.
Submitted 29 April, 2018;
originally announced May 2018.
-
Primal-Dual Interior-Point Methods for Domain-Driven Formulations
Authors:
Mehdi Karimi,
Levent Tunçel
Abstract:
We study infeasible-start primal-dual interior-point methods for convex optimization problems given in a typically natural form we denote as Domain-Driven formulation. Our algorithms extend many advantages of primal-dual interior-point techniques available for conic formulations, such as the current best complexity bounds, and more robust certificates of approximate optimality, unboundedness, and infeasibility, to Domain-Driven formulations. The complexity results are new for the infeasible-start setup used, even in the case of linear programming. In addition to complexity results, our algorithms aim to expand the applications of, and software for, interior-point methods to wider classes of problems beyond optimization over symmetric cones.
Submitted 13 March, 2019; v1 submitted 18 April, 2018;
originally announced April 2018.