Search | arXiv e-print repository

Embed-Search-Align: DNA Sequence Alignment using Transformer Models

Authors: Pavan Holur, K. C. Enevoldsen, Shreyas Rajesh, Lajoyce Mboning, Thalia Georgiou, Louis-S. Bouchard, Matteo Pellegrini, Vwani Roychowdhury

Abstract: DNA sequence alignment involves assigning short DNA reads to the most probable locations on an extensive reference genome. This process is crucial for various genomic analyses, including variant calling, transcriptomics, and epigenomics. Conventional methods, refined over decades, tackle this challenge in two steps: genome indexing followed by efficient search to locate likely positions for given reads. Building on the success of Large Language Models (LLMs) in encoding text into embeddings, where the distance metric captures semantic similarity, recent efforts have explored whether the same Transformer architecture can produce numerical representations for DNA sequences. Such models have shown early promise in tasks involving classification of short DNA sequences, such as the detection of coding vs non-coding regions, as well as the identification of enhancer and promoter sequences. Performance at sequence classification tasks does not, however, translate to sequence alignment, where it is necessary to conduct a genome-wide search to successfully align every read. We address this open problem by framing it as an Embed-Search-Align task. In this framework, a novel encoder model DNA-ESA generates representations of reads and fragments of the reference, which are projected into a shared vector space where the read-fragment distance is used as a surrogate for alignment. In particular, DNA-ESA introduces: (1) Contrastive loss for self-supervised training of DNA sequence representations, facilitating rich sequence-level embeddings, and (2) a DNA vector store to enable search across fragments on a global scale. DNA-ESA is >97% accurate when aligning 250-length reads onto a human reference genome of 3 gigabases (single-haploid), far exceeds the performance of 6 recent DNA-Transformer model baselines, and shows task transfer across chromosomes and species.

Submitted 23 April, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

Comments: 13 pages, Tables 7, Figures 6
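
Illustrative sketch (not from the paper): the embed, search, align steps described in the abstract above can be pictured as a nearest-neighbour lookup in a vector store of reference fragments. The code below is a toy under loud assumptions: the 3-mer count embedding stands in for the trained DNA-ESA encoder, and the names `embed`, `build_store`, and `align` are hypothetical.

```python
import numpy as np
from itertools import product

# Index of every 3-mer over the DNA alphabet (64 entries).
KMERS = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=3))}

def embed(seq):
    """Toy stand-in for a learned encoder: unit-norm 3-mer count vector."""
    v = np.zeros(len(KMERS))
    for i in range(len(seq) - 2):
        v[KMERS[seq[i:i + 3]]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def build_store(reference, frag_len, stride):
    """Embed fixed-length reference fragments; return start offsets and vectors."""
    starts = list(range(0, len(reference) - frag_len + 1, stride))
    vecs = np.stack([embed(reference[s:s + frag_len]) for s in starts])
    return starts, vecs

def align(read, starts, vecs):
    """Return the start offset of the fragment most similar to the read (cosine)."""
    scores = vecs @ embed(read)
    return starts[int(np.argmax(scores))]

# Hypothetical toy reference made of four homopolymer blocks.
reference = "A" * 8 + "C" * 8 + "G" * 8 + "T" * 8
starts, vecs = build_store(reference, frag_len=8, stride=8)
position = align("GGGGTGGG", starts, vecs)  # read with one mismatch
```

A real system would replace the brute-force dot product with an approximate nearest-neighbour index and refine the candidate location with a conventional aligner; neither is shown here.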

arXiv:2307.07738 [pdf, other]

Promotion/Inhibition Effects in Networks: A Model with Negative Probabilities

Authors: Anqi Dong, Tryphon T. Georgiou, Allen Tannenbaum

Abstract: Biological networks often encapsulate promotion/inhibition as signed edge-weights of a graph. Nodes may correspond to genes assigned expression levels (mass) of respective proteins. The promotion/inhibition nature of co-expression between nodes is encoded in the sign of the corresponding entry of a sign-indefinite adjacency matrix, though the strength of such co-expression (i.e., the precise value of edge weights) cannot typically be directly measured. Herein we address the inverse problem to determine network edge-weights based on a sign-indefinite adjacency and expression levels at the nodes. While our motivation originates in gene networks, the framework applies to networks where promotion/inhibition dictates a stationary mass distribution at the nodes. In order to identify suitable edge-weights we adopt a framework of "negative probabilities," advocated by P. Dirac and R. Feynman, and we set up a likelihood formalism to obtain values for the sought edge-weights. The proposed optimization problem can be solved via a generalization of the well-known Sinkhorn algorithm; in our setting the Sinkhorn-type "diagonal scalings" are multiplicative or inverse-multiplicative, depending on the sign of the respective entries in the adjacency matrix, with value computed as the positive root of a quadratic polynomial.

Submitted 16 August, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

Comments: 6 pages

MSC Class: 92F99; 49M29; 90C30; 93-08; 90C25
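
Illustrative sketch (not from the paper): for orientation, this is the classical Sinkhorn iteration that the abstract above generalizes. It handles only entrywise-positive matrices and purely multiplicative diagonal scalings; the signed case with inverse-multiplicative scalings and quadratic roots is not implemented here. The function name `sinkhorn` and the example data are assumptions.

```python
import numpy as np

def sinkhorn(K, r, c, iters=500):
    """Scale a positive matrix K to row sums r and column sums c.

    Returns diag(u) @ K @ diag(v), with u and v found by alternately
    matching the row and column marginals.
    """
    u = np.ones_like(r)
    v = np.ones_like(c)
    for _ in range(iters):
        u = r / (K @ v)      # fix the row sums
        v = c / (K.T @ u)    # fix the column sums
    return u[:, None] * K * v[None, :]

# Hypothetical 2x2 example with uniform marginals.
K = np.array([[1.0, 2.0], [3.0, 4.0]])
r = np.array([0.5, 0.5])
c = np.array([0.5, 0.5])
P = sinkhorn(K, r, c)
```

Both marginals of `P` should match `r` and `c` to numerical precision once the iteration has converged.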

arXiv:2307.05103 [pdf, other]

Control and estimation of multi-commodity network flow under aggregation

Authors: Yongxin Chen, Tryphon T. Georgiou, Michele Pavon

Abstract: A paradigm put forth by E. Schrödinger in 1931/32, known as Schrödinger bridges, represents a formalism to pose and solve control and estimation problems seeking a perturbation from an initial control schedule (in the case of control), or from a prior probability law (in the case of estimation), sufficient to reconcile data in the form of marginal distributions and minimal in the sense of relative entropy to the prior. In the same spirit, we consider traffic-flow and apply a Schrödinger-type dictum, to perturb minimally with respect to a suitable relative entropy functional a prior schedule/law so as to reconcile the traffic flow with scarce aggregate distributions on families of indistinguishable individuals. Specifically, we consider the problem to regulate/estimate multi-commodity network flow rates based only on empirical distributions of commodities being transported (e.g., types of vehicles through a network, in motion) at two given times. Thus, building on Schrödinger's large deviation rationale, we develop a method to identify the most likely flow rates (traffic flow), given prior information and aggregate observations. Our method further extends the Schrödinger bridge formalism to the multi-commodity setting, allowing commodities to exit or enter the flow field as well (e.g., vehicles to enter and stop and park) at any time. The behavior of entering or exiting the flow field, by commodities or vehicles, is modeled by a Markov chain with killing and creation states. Our method is illustrated with a numerical experiment.

Submitted 11 July, 2023; originally announced July 2023.

Comments: 12 pages, 5 figures

MSC Class: 93E20; 90B10; 90C35; 90B06; 15B48; 97M40; 05C81; 82C41

arXiv:2304.01617 [pdf]

Investigating Concerns of Security and Privacy Among Rohingya Refugees in Malaysia

Authors: Theodoros Georgiou, Lynne Baillie, Ryan Shah

Abstract: The security and privacy of refugee communities have emerged as pressing concerns in the context of increasing global migration. The Rohingya refugees are a stateless Muslim minority group in Myanmar who were forced to flee their homes after conflict broke out, with many fleeing to neighbouring countries and ending up in refugee camps, such as in Bangladesh. Others migrated to Malaysia, and those who arrive there live within the community as urban refugees. However, the Rohingya in Malaysia are not legally recognized and have limited and restricted access to public resources such as healthcare and education. This means they face security and privacy challenges, different to other refugee groups, which are often compounded by this lack of recognition, social isolation and lack of access to vital resources. This paper discusses the implications of security and privacy of the Rohingya refugees, focusing on available and accessible technological assistance, uncovering the heightened need for a human-centered approach to design and implementation of solutions that factor in these requirements. Overall, the discussions and findings presented in this paper on the security and privacy of the Rohingya provide a valuable resource for researchers, practitioners and policymakers in the wider HCI community.

Submitted 4 April, 2023; originally announced April 2023.

Comments: 5 pages, 3 figures, CHI'23 Workshop on Migration, Security and Privacy (see https://migrationsecurityprivacy.uk)

MSC Class: 68U01

ACM Class: K.4.0; J.0; H.5.0

arXiv:2212.14509 [pdf, other]

Monge-Kantorovich Optimal Transport Through Constrictions and Flow-rate Constraints

Authors: Anqi Dong, Arthur Stephanovitch, Tryphon T. Georgiou

Abstract: We consider the problem to transport resources/mass while abiding by constraints on the flow through constrictions along their path between specified terminal distributions. Constrictions, conceptualized as toll stations at specified points, limit the flow rate across. We quantify flow-rate constraints via a bound on a sought probability density of the times that mass-elements cross toll stations and cast the transportation scheduling in a Kantorovich-type of formalism. Recent work by our team focused on the existence of Monge maps for similarly constrained transport minimizing average kinetic energy. The formulation in this paper, besides being substantially more general, is cast as a (generalized) multi-marginal transport problem, a problem of considerable interest in the modern-day machine learning literature that has motivated extensive computational analyses. An enabling feature of our formalism is the representation of an average quadratic cost on the speed of transport as a convex constraint that involves crossing times.

Submitted 1 May, 2023; v1 submitted 29 December, 2022; originally announced December 2022.

Comments: 8 pages, 6 figures

MSC Class: 49345; 90C08; 26B25

arXiv:2103.06583 [pdf, other]

Preprint: Norm Loss: An efficient yet effective regularization method for deep neural networks

Authors: Theodoros Georgiou, Sebastian Schmitt, Thomas Bäck, Wei Chen, Michael Lew

Abstract: Convolutional neural network training can suffer from diverse issues like exploding or vanishing gradients, scaling-based weight space symmetry and covariate shift. In order to address these issues, researchers develop weight regularization methods and activation normalization methods. In this work we propose a weight soft-regularization method based on the Oblique manifold. The proposed method uses a loss function which pushes each weight vector to have a norm close to one, i.e. the weight matrix is smoothly steered toward the so-called Oblique manifold. We evaluate our method on the very popular CIFAR-10, CIFAR-100 and ImageNet 2012 datasets using two state-of-the-art architectures, namely the ResNet and wide-ResNet. Our method introduces negligible computational overhead and the results show that it is competitive with the state-of-the-art and in some cases superior to it. Additionally, the results are less sensitive to hyperparameter settings such as batch size and regularization factor.

Submitted 11 March, 2021; originally announced March 2021.

Journal ref: Proceedings of the International Conference on Pattern Recognition (ICPR) 2020
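
Illustrative sketch (not from the paper): the abstract above describes a loss that pushes each weight vector toward unit norm. One plausible form, assumed here rather than taken from the paper, is the sum of squared deviations of each row norm from one. The name `norm_loss` and the row-per-vector layout are hypothetical.

```python
import numpy as np

def norm_loss(weights):
    """Penalty that is zero when every row of `weights` has unit L2 norm.

    weights: 2-D array with one weight vector per row (assumed layout).
    """
    norms = np.linalg.norm(weights, axis=1)
    return float(np.sum((1.0 - norms) ** 2))

# Rows of the identity already have unit norm, so the penalty vanishes.
zero_penalty = norm_loss(np.eye(3))
# A single row (3, 4) has norm 5, so the penalty is (1 - 5)^2.
large_penalty = norm_loss(np.array([[3.0, 4.0]]))
```

In training, such a term would be added to the task loss with a regularization factor; the factor and the exact functional form used by the authors are not stated in the abstract.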

arXiv:2103.06552 [pdf, other]

PREPRINT: Comparison of deep learning and hand crafted features for mining simulation data

Authors: Theodoros Georgiou, Sebastian Schmitt, Thomas Bäck, Nan Pu, Wei Chen, Michael Lew

Abstract: Computational Fluid Dynamics (CFD) simulations are a very important tool for many industrial applications, such as aerodynamic optimization of engineering designs like car shapes, airplane parts etc. The output of such simulations, in particular the calculated flow fields, is usually very complex and hard to interpret for realistic three-dimensional real-world applications, especially if time-dependent simulations are investigated. Automated data analysis methods are warranted but a non-trivial obstacle is given by the very large dimensionality of the data. A flow field typically consists of six measurement values for each point of the computational grid in 3D space and time (velocity vector values, turbulent kinetic energy, pressure and viscosity). In this paper we address the task of extracting meaningful results in an automated manner from such high dimensional data sets. We propose deep learning methods which are capable of processing such data and which can be trained to solve relevant tasks on simulation data, i.e. predicting drag and lift forces applied on an airfoil. We also propose an adaptation of the classical hand crafted features known from computer vision to address the same problem and compare a large variety of descriptors and detectors. Finally, we compile a large dataset of 2D simulations of the flow field around airfoils which contains 16000 flow fields with which we tested and compared approaches. Our results show that the deep learning-based methods, as well as hand crafted feature based approaches, are capable of accurately describing the content of the CFD simulation output on the proposed dataset.

Submitted 11 March, 2021; originally announced March 2021.

Journal ref: Proceedings of the International Conference on Pattern Recognition (ICPR) 2020

arXiv:2101.11282 [pdf, other]

Deep Learning for Instance Retrieval: A Survey

Authors: Wei Chen, Yu Liu, Weiping Wang, Erwin Bakker, Theodoros Georgiou, Paul Fieguth, Li Liu, Michael S. Lew

Abstract: In recent years a vast amount of visual content has been generated and shared from many fields, such as social media platforms, medical imaging, and robotics. This abundance of content creation and sharing has introduced new challenges, particularly that of searching databases for similar content, known as Content Based Image Retrieval (CBIR), a long-established research area in which improved efficiency and accuracy are needed for real-time retrieval. Artificial intelligence has made progress in CBIR and has significantly facilitated the process of instance search. In this survey we review recent instance retrieval works that are developed based on deep learning algorithms and techniques, with the survey organized by deep network architecture types, deep features, feature embedding and aggregation methods, and network fine-tuning strategies. Our survey considers a wide variety of recent methods, whereby we identify milestone work, reveal connections among various methods, present the commonly used benchmarks, evaluation results, and common challenges, and propose promising future directions.

Submitted 30 October, 2022; v1 submitted 27 January, 2021; originally announced January 2021.

Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence

arXiv:2005.09152 [pdf, other]

Lasso formulation of the shortest path problem

Authors: Anqi Dong, Amirhossein Taghvaei, Tryphon T. Georgiou

Abstract: The shortest path problem is formulated as an $l_1$-regularized regression problem, known as lasso. Based on this formulation, a connection is established between Dijkstra's shortest path algorithm and the least angle regression (LARS) for the lasso problem. Specifically, the solution path of the lasso problem, obtained by varying the regularization parameter from infinity to zero (the regularization path), corresponds to shortest path trees that appear in the bi-directional Dijkstra algorithm. Although Dijkstra's algorithm and the LARS formulation provide exact solutions, they become impractical when the size of the graph is exceedingly large. To overcome this issue, the alternating direction method of multipliers (ADMM) is proposed to solve the lasso formulation. The resulting algorithm produces good and fast approximations of the shortest path by sacrificing exactness that may not be absolutely essential in many applications. Numerical experiments are provided to illustrate the performance of the proposed approach.

Submitted 22 May, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: 17 pages

MSC Class: 05C38 (Primary) 62J07; 68R10; 90C25; 90C06 (Secondary)

arXiv:2004.02053 [pdf, other]

Macroscopic network circulation for planar graphs

Authors: Fariba Ariaei, Zahra Askarzadeh, Yongxin Chen, Tryphon T. Georgiou

Abstract: The analysis of networks, aimed at suitably defined functionality, often focuses on partitions into subnetworks that capture desired features. Chief among the relevant concepts is a 2-partition, that underlies the classical Cheeger inequality, and highlights a constriction (bottleneck) that limits accessibility between the respective parts of the network. In a similar spirit, the purpose of the present work is to introduce a new concept of maximal global circulation and to explore 3-partitions that expose this type of macroscopic feature of networks. Herein, graph circulation is motivated by transportation networks and probabilistic flows (Markov chains) on graphs. Our goal is to quantify the large-scale imbalance of network flows and delineate key parts that mediate such global features. While we introduce and propose these notions in a general setting, in this paper, we only work out the case of planar graphs. We explain that a scalar potential can be identified to encapsulate the concept of circulation, quite similarly as in the case of the curl of planar vector fields. Beyond planar graphs, in the general case, the problem to determine global circulation remains at present a combinatorial problem.

Submitted 28 September, 2020; v1 submitted 4 April, 2020; originally announced April 2020.

Comments: 20 pages, 11 figures

MSC Class: 05C10; 57M15; 05C81

arXiv:1910.00095 [pdf, other]

Fitting IVIM with Variable Projection and Simplicial Optimization

Authors: Shreyas Fadnavis, Hamza Farooq, Maryam Afzali, Christoph Lenglet, Tryphon Georgiou, Hu Cheng, Sharlene Newman, Shahnawaz Ahmed, Rafael Neto Henriques, Eric Peterson, Serge Koudoro, Ariel Rokem, Eleftherios Garyfallidis

Abstract: Fitting multi-exponential models to Diffusion MRI (dMRI) data has always been challenging due to various underlying complexities. In this work, we introduce a novel and robust fitting framework for the standard two-compartment IVIM microstructural model. This framework provides a significant improvement over the existing methods and helps estimate the associated diffusion and perfusion parameters of IVIM in an automatic manner. As a part of this work we provide capabilities to switch between more advanced global optimization methods such as simplicial homology (SH) and differential evolution (DE). Our experiments show that the results obtained from this simultaneous fitting procedure disentangle the model parameters in a reduced subspace. The proposed framework extends the seminal work originated in the MIX framework, with improved procedures for multi-stage fitting. This framework has been made available as an open-source Python implementation and disseminated to the community through the DIPY project.

Submitted 15 February, 2020; v1 submitted 27 September, 2019; originally announced October 2019.
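
Illustrative sketch (not from the paper): the two-compartment IVIM model named in the abstract above is usually written as a bi-exponential signal decay, S(b) = S0 [f exp(-b D*) + (1 - f) exp(-b D)], with perfusion fraction f, pseudo-diffusion coefficient D*, and diffusion coefficient D. The helper below only evaluates this forward model; it does not implement the fitting procedure of the paper or the DIPY API, and the name `ivim_signal` and the parameter values are assumptions.

```python
import numpy as np

def ivim_signal(b, s0, f, d_star, d):
    """Bi-exponential IVIM forward model evaluated at b-values `b`."""
    b = np.asarray(b, dtype=float)
    return s0 * (f * np.exp(-b * d_star) + (1.0 - f) * np.exp(-b * d))

# Hypothetical parameter values, chosen only for illustration.
b_values = np.array([0.0, 50.0, 200.0, 800.0])
signal = ivim_signal(b_values, s0=1.0, f=0.1, d_star=0.01, d=0.001)
```

Because the model is linear in S0 and the compartment weights once D* and D are fixed, it lends itself to the variable projection idea mentioned in the title: the linear parameters can be solved for in closed form while an outer optimizer searches over the nonlinear ones.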

arXiv:1908.09487 [pdf, other]

doi 10.1146/annurev-control-053018-023843

Stochastic dynamical modeling of turbulent flows

Authors: Armin Zare, Tryphon T. Georgiou, Mihailo R. Jovanović

Abstract: Advanced measurement techniques and high performance computing have made large data sets available for a wide range of turbulent flows that arise in engineering applications. Drawing on this abundance of data, dynamical models can be constructed to reproduce structural and statistical features of turbulent flows, opening the way to the design of effective model-based flow control strategies. This review describes a framework for completing second-order statistics of turbulent flows by models that are based on the Navier-Stokes equations linearized around the turbulent mean velocity. Systems theory and convex optimization are combined to address the inherent uncertainty in the dynamics and the statistics of the flow by seeking a suitable parsimonious correction to the prior linearized model. Specifically, dynamical couplings between states of the linearized model dictate structural constraints on the statistics of flow fluctuations. Thence, colored-in-time stochastic forcing that drives the linearized model is sought to account for and reconcile dynamics with available data (i.e., partially known second order statistics). The number of dynamical degrees of freedom that are directly affected by stochastic excitation is minimized as a measure of model parsimony. The spectral content of the resulting colored-in-time stochastic contribution can alternatively be seen to arise from a low-rank structural perturbation of the linearized dynamical generator, pointing to suitable dynamical corrections that may account for the absence of the nonlinear interactions in the linearized model.

Submitted 26 August, 2019; originally announced August 2019.

Comments: To appear in the Annual Review of Control, Robotics, and Autonomous Systems

Journal ref: Annu. Rev. Control Robot. Auton. Syst., vol. 3, pp. 195-219, May 2020

arXiv:1908.09300 [pdf, other]

doi 10.1109/CBMI.2019.8877470

A Comparison of CNN and Classic Features for Image Retrieval

Authors: Umut Özaydın, Theodoros Georgiou, Michael Lew

Abstract: Feature detectors and descriptors have been successfully used for various computer vision tasks, such as video object tracking and content-based image retrieval. Many methods use image gradients in different stages of the detection-description pipeline to describe local image structures. Recently, some, or all, of these stages have been replaced by convolutional neural networks (CNNs), in order to increase their performance. A detector is defined as a selection problem, which makes it more challenging to implement as a CNN. Detectors are therefore generally defined as regressors that convert input images to score maps, from which keypoints can be selected with non-maximum suppression. This paper discusses and compares several recent methods that use CNNs for keypoint detection. Experiments are performed both on the CNN based approaches, as well as a selection of conventional methods. In addition to qualitative measures defined on keypoints and descriptors, the bag-of-words (BoW) model is used to implement an image retrieval application, in order to determine how the methods perform in practice. The results show that each type of feature is best in different contexts.

Submitted 25 August, 2019; originally announced August 2019.

Comments: 5 pages, 3 figures, 3 tables, CBMI 2019

arXiv:1904.06762 [pdf, other]

Probabilistic Kernel Support Vector Machines

Authors: Yongxin Chen, Tryphon T. Georgiou, Allen R. Tannenbaum

Abstract: We propose a probabilistic enhancement of standard kernel Support Vector Machines for binary classification, in order to address the case when, along with given data sets, a description of uncertainty (e.g., error bounds) may be available on each datum. In the present paper, we specifically consider Gaussian distributions to model uncertainty. Thereby, our data consist of pairs $(x_i,\Sigma_i)$, $i\in\{1,\ldots,N\}$, along with an indicator $y_i\in\{-1,1\}$ to declare membership in one of two categories for each pair. These pairs may be viewed to represent the mean and covariance, respectively, of random vectors $\xi_i$ taking values in a suitable linear space (typically $\mathbb R^n$). Thus, our setting may also be viewed as a modification of Support Vector Machines to classify distributions, albeit, at present, only Gaussian ones. We outline the formalism that allows computing suitable classifiers via a natural modification of the standard "kernel trick." The main contribution of this work is to point out a suitable kernel function for applying Support Vector techniques to the setting of uncertain data for which a detailed uncertainty description is also available (herein, "Gaussian points").

Submitted 18 March, 2020; v1 submitted 14 April, 2019; originally announced April 2019.

Comments: 6 pages, 6 figures

MSC Class: 62G05; 93A30

arXiv:1807.01739 [pdf, other]

doi 10.1109/TAC.2019.2948268

Proximal algorithms for large-scale statistical modeling and sensor/actuator selection

Authors: Armin Zare, Hesameddin Mohammadi, Neil K. Dhingra, Tryphon T. Georgiou, Mihailo R. Jovanović

Abstract: Several problems in modeling and control of stochastically-driven dynamical systems can be cast as regularized semi-definite programs. We examine two such representative problems and show that they can be formulated in a similar manner. The first, in statistical modeling, seeks to reconcile observed statistics by suitably and minimally perturbing prior dynamics. The second seeks to optimally select a subset of available sensors and actuators for control purposes. To address modeling and control of large-scale systems we develop a unified algorithmic framework using proximal methods. Our customized algorithms exploit problem structure and allow handling statistical modeling, as well as sensor and actuator selection, for substantially larger scales than what is amenable to current general-purpose solvers. We establish linear convergence of the proximal gradient algorithm, draw contrast between the proposed proximal algorithms and alternating direction method of multipliers, and provide examples that illustrate the merits and effectiveness of our framework.

Submitted 26 December, 2019; v1 submitted 4 July, 2018; originally announced July 2018.

Comments: To appear in IEEE Trans. Automat. Control

arXiv:1706.08841 [pdf, other]

An Efficient Algorithm for Matrix-Valued and Vector-Valued Optimal Mass Transport

Authors: Yongxin Chen, Eldad Haber, Kaoru Yamamoto, Tryphon T. Georgiou, Allen Tannenbaum

Abstract: We present an efficient algorithm for recent generalizations of optimal mass transport theory to matrix-valued and vector-valued densities. These generalizations lead to several applications including diffusion tensor imaging, color image processing, and multi-modality imaging. The algorithm is based on sequential quadratic programming (SQP). By approximating the Hessian of the cost and solving each iteration in an inexact manner, we are able to solve each iteration with relatively low cost while still maintaining a fast convergence rate. The core of the algorithm is solving a weighted Poisson equation, where different efficient preconditioners may be employed. We utilize incomplete Cholesky factorization, which yields an efficient and straightforward solver for our problem. Several illustrative examples are presented for both the matrix and vector-valued cases.

Submitted 26 June, 2017; originally announced June 2017.

Comments: 18 pages, 4 figures

arXiv:1706.03158 [pdf, other]

Stability Theory of Stochastic Models in Opinion Dynamics

Authors: Zahra Askarzadeh, Rui Fu, Abhishek Halder, Yongxin Chen, Tryphon T. Georgiou

Abstract: We consider a certain class of nonlinear maps that preserve the probability simplex, i.e., stochastic maps, that are inspired by the DeGroot-Friedkin model of belief/opinion propagation over influence networks. The corresponding dynamical models describe the evolution of the probability distribution of interacting species. Such models where the probability transition mechanism depends nonlinearly on the current state are often referred to as nonlinear Markov chains. In this paper we develop stability results and study the behavior of representative opinion models. The stability certificates are based on the contractivity of the nonlinear evolution in the $\ell_1$-metric. We apply the theory to two types of opinion models where the adaptation of the transition probabilities to the current state is exponential and linear, respectively; both of these can display a wide range of behaviors. We discuss continuous-time and other generalizations.

Submitted 10 October, 2018; v1 submitted 9 June, 2017; originally announced June 2017.

Comments: 11 pages, 6 figures

MSC Class: 93E03; 60J10; 60J20; 65C40; 68Q87

Showing 1–17 of 17 results for author: Georgiou, T