Efficient Leverage Score Sampling for Tensor Train Decomposition

Authors: Vivek Bharadwaj, Beheshteh T. Rakhshan, Osman Asif Malik, Guillaume Rabusseau

Abstract: Tensor Train (TT) decomposition is widely used in the machine learning and quantum physics communities to efficiently compress high-dimensional tensor data. In this paper, we propose an efficient algorithm to accelerate computing the TT decomposition with the Alternating Least Squares (ALS) algorithm relying on exact leverage score sampling. For this purpose, we propose a data structure that allows us to efficiently sample from the tensor with time complexity logarithmic in the tensor size. Our contribution specifically leverages the canonical form of the TT decomposition. By maintaining the canonical form through each iteration of ALS, we can efficiently compute (and sample from) the leverage scores, thus achieving significant speed-up in solving each sketched least-squares problem. Experiments on synthetic and real data on dense and sparse tensors demonstrate that our method outperforms SVD-based and ALS-based algorithms.

Submitted 5 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2402.06022 [pdf, other]

Fluctuations and Persistence in Quantum Diffusion on Regular Lattices

Authors: Cheng Ma, Omar Malik, G. Korniss

Abstract: We investigate quantum persistence by analyzing amplitude and phase fluctuations of the wave function governed by the time-dependent free-particle Schrödinger equation. The quantum system is initialized with local random uncorrelated Gaussian amplitude and phase fluctuations. In analogy with classical diffusion, the persistence probability is defined as the probability that the local (amplitude or phase) fluctuations have not changed sign up to time $t$. Our results show that the persistence probability in quantum diffusion exhibits exponential-like tails. More specifically, in $d=1$ the persistence probability decays in a stretched exponential fashion, while in $d=2$ and $d=3$ it decays exponentially. We also provide some insights by analyzing the two-point spatial and temporal correlation functions in the limit of small fluctuations. In particular, in the long-time limit, the temporal correlation functions for both local amplitude and phase fluctuations become time-homogeneous, i.e., the zero-crossing events correspond to those of a stationary Gaussian process with a sufficiently fast-decaying power-law tail of its autocorrelation function, implying an exponential-like tail of the persistence probabilities.

Submitted 8 February, 2024; originally announced February 2024.

Comments: 41 pages, 16 figures

arXiv:2312.05454 [pdf, other]

Model Evaluation for Domain Identification of Unknown Classes in Open-World Recognition: A Proposal

Authors: Gusti Ahmad Fanshuri Alfarisy, Owais Ahmed Malik, Ong Wee Hong

Abstract: Open-World Recognition (OWR) is an emerging field that makes a machine learning model competent in rejecting the unknowns, managing them, and incrementally adding novel samples to the base knowledge. However, this broad objective is not practical for an agent that works on a specific task. Not all rejected samples will be used for learning continually in the future. Some novel images in the open environment may not belong to the domain of interest. Hence, identifying the unknown in the domain of interest is essential for a machine learning model to learn only the important samples. In this study, we propose an evaluation protocol for estimating a model's capability in separating unknown in-domain (ID) and unknown out-of-domain (OOD) samples. We evaluated using three approaches with an unknown domain and demonstrated the possibility of identifying the domain of interest using the pre-trained parameters through traditional transfer learning, Automated Machine Learning (AutoML), and Nearest Class Mean (NCM) classifier with First Integer Neighbor Clustering Hierarchy (FINCH). We experimented with five different domains: garbage, food, dogs, plants, and birds. The results show that all approaches can be used as an initial baseline yielding good accuracy. In addition, a Balanced Accuracy (BACCU) score from a pre-trained model indicates a tendency to excel in one or more domains of interest. We observed that MobileNetV3 yielded the highest BACCU score for the garbage domain and surpassed complex models such as the transformer network. Meanwhile, our results also suggest that a strong representation in the pre-trained model is important for identifying unknown classes in the same domain. This study could pave the way toward open-world recognition in domain-specific tasks where the relevance of the unknown classes is vital.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2310.03663 [pdf, other]

doi 10.1109/TPWRD.2023.3321844

Autoregressive Coefficients based Intelligent Protection of Transmission Lines Connected to Type-3 Wind Farms

Authors: Pallav Kumar Bera, Vajendra Kumar, Samita Rani Pani, Om P. Malik

Abstract: Protective relays can mal-operate for transmission lines connected to doubly fed induction generator (DFIG) based large capacity wind farms (WFs). The performance of distance relays protecting such lines is investigated and a statistical model based intelligent protection of the area between the grid and the WF is proposed in this article. The suggested method employs an adaptive fuzzy inference system to detect faults based on autoregressive (AR) coefficients of the 3-phase currents selected using the minimum redundancy maximum relevance algorithm. Deep learning networks are used to supervise the detection of faults, their subsequent localization, and classification. The effectiveness of the scheme is evaluated on IEEE 9-bus and IEEE 39-bus systems with varying fault resistances, fault inception times, locations, fault types, wind speeds, and transformer connections. Further, the impact of factors like the presence of type-4 WFs, double circuit lines, WF capacity, grid strength, FACTS devices, reclosing on permanent faults, power swings, fault during power swings, voltage instability, load encroachment, high impedance faults, evolving and cross-country faults, close-in and remote-end faults, CT saturation, sampling rate, data window size, synchronization error, noise, and semi-supervised learning is considered while validating the proposed scheme. The results show the efficacy of the suggested method in dealing with various system conditions and configurations while protecting the transmission lines that are connected to WFs.

Submitted 5 October, 2023; originally announced October 2023.

Journal ref: IEEE Transactions on Power Delivery, 2023

arXiv:2305.16530 [pdf, other]

doi 10.1016/j.cma.2024.116793

Bi-fidelity Variational Auto-encoder for Uncertainty Quantification

Authors: Nuojin Cheng, Osman Asif Malik, Subhayan De, Stephen Becker, Alireza Doostan

Abstract: Quantifying the uncertainty of quantities of interest (QoIs) from physical systems is a primary objective in model validation. However, achieving this goal entails balancing the need for computational efficiency with the requirement for numerical accuracy. To address this trade-off, we propose a novel bi-fidelity formulation of variational auto-encoders (BF-VAE) designed to estimate the uncertainty associated with a QoI from low-fidelity (LF) and high-fidelity (HF) samples of the QoI. This model allows for the approximation of the statistics of the HF QoI by leveraging information derived from its LF counterpart. Specifically, we design a bi-fidelity auto-regressive model in the latent space that is integrated within the VAE's probabilistic encoder-decoder structure. An effective algorithm is proposed to maximize the variational lower bound of the HF log-likelihood in the presence of limited HF data, resulting in the synthesis of HF realizations with a reduced computational cost. Additionally, we introduce the concept of the bi-fidelity information bottleneck (BF-IB) to provide an information-theoretic interpretation of the proposed BF-VAE model. Our numerical results demonstrate that BF-VAE leads to considerably improved accuracy, as compared to a VAE trained using only HF data, when limited HF data is available.

Submitted 17 October, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

Journal ref: Computer Methods in Applied Mechanics and Engineering (CMAME), Volume 421, 1 March 2024, 116793

arXiv:2304.03593 [pdf, other]

Deep Reinforcement Learning-Based Mapless Crowd Navigation with Perceived Risk of the Moving Crowd for Mobile Robots

Authors: Hafiq Anas, Ong Wee Hong, Owais Ahmed Malik

Abstract: Current state-of-the-art crowd navigation approaches are mainly deep reinforcement learning (DRL)-based. However, DRL-based methods suffer from the issues of generalization and scalability. To overcome these challenges, we propose a method that includes a Collision Probability (CP) in the observation space to give the robot a sense of the level of danger of the moving crowd to help the robot navigate safely through crowds with unseen behaviors. We studied the effects of changing the number of moving obstacles to pay attention to during navigation. During training, we generated local waypoints to increase the reward density and improve the learning efficiency of the system. Our approach was developed using DRL and trained using the Gazebo simulator in a non-cooperative crowd environment with obstacles moving at randomized speeds and directions. We then evaluated our model on four different crowd-behavior scenarios. The results show that our method achieved a 100% success rate in all test settings. We compared our approach with a current state-of-the-art DRL-based approach, and our approach performed significantly better, especially in terms of social safety. Importantly, our method can navigate in different crowd behaviors and requires no fine-tuning after being trained once. We further demonstrated the crowd navigation capability of our model in real-world tests.

Submitted 23 September, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

Comments: 6 pages, 7 figures

arXiv:2302.11474 [pdf, other]

Randomized Numerical Linear Algebra: A Perspective on the Field With an Eye to Software

Authors: Riley Murray, James Demmel, Michael W. Mahoney, N. Benjamin Erichson, Maksim Melnichenko, Osman Asif Malik, Laura Grigori, Piotr Luszczek, Michał Dereziński, Miles E. Lopes, Tianyu Liang, Hengrui Luo, Jack Dongarra

Abstract: Randomized numerical linear algebra - RandNLA, for short - concerns the use of randomization as a resource to develop improved algorithms for large-scale linear algebra computations. The origins of contemporary RandNLA lie in theoretical computer science, where it blossomed from a simple idea: randomization provides an avenue for computing approximate solutions to linear algebra problems more efficiently than deterministic algorithms. This idea proved fruitful in the development of scalable algorithms for machine learning and statistical data analysis applications. However, RandNLA's true potential only came into focus upon integration with the fields of numerical analysis and "classical" numerical linear algebra. Through the efforts of many individuals, randomized algorithms have been developed that provide full control over the accuracy of their solutions and that can be every bit as reliable as algorithms that might be found in libraries such as LAPACK. Recent years have even seen the incorporation of certain RandNLA methods into MATLAB, the NAG Library, NVIDIA's cuSOLVER, and SciKit-Learn. For all its success, we believe that RandNLA has yet to realize its full potential. In particular, we believe the scientific community stands to benefit significantly from suitably defined "RandBLAS" and "RandLAPACK" libraries, to serve as standards conceptually analogous to BLAS and LAPACK. This 200-page monograph represents a step toward defining such standards. In it, we cover topics spanning basic sketching, least squares and optimization, low-rank approximation, full matrix decompositions, leverage score sampling, and sketching data with tensor product structures (among others). Much of the provided pseudo-code has been tested via publicly available MATLAB and Python implementations.

Submitted 12 April, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: v1: this is the first arXiv release of LAPACK Working Note 299. v2: complete rewrite of the subsection on trace estimation, among other changes. See frontmatter page ii (pdf page 5) for revision history

arXiv:2302.01977 [pdf, other]

Construction of Hierarchically Semi-Separable matrix Representation using Adaptive Johnson-Lindenstrauss Sketching

Authors: Yotam Yaniv, Osman Asif Malik, Pieter Ghysels, Xiaoye S. Li

Abstract: We extend an adaptive partially matrix-free Hierarchically Semi-Separable (HSS) matrix construction algorithm by Gorman et al. [SIAM J. Sci. Comput. 41(5), 2019] which uses Gaussian sketching operators to a broader class of Johnson-Lindenstrauss (JL) sketching operators. We present theoretical work which justifies this extension. In particular, we extend the earlier concentration bounds to all JL sketching operators and examine this bound for specific classes of such operators including the original Gaussian sketching operators, subsampled randomized Hadamard transform (SRHT) and the sparse Johnson-Lindenstrauss transform (SJLT). We discuss the implementation details of applying SJLT efficiently and demonstrate experimentally that using SJLT instead of Gaussian sketching operators leads to 1.5-2.5x speedups of the HSS construction implementation in the STRUMPACK C++ library. The generalized algorithm allows users to select their own JL sketching operators with theoretical lower bounds on the size of the operators which may lead to faster run time with similar HSS construction accuracy.

Submitted 3 February, 2023; originally announced February 2023.

arXiv:2301.12584 [pdf, other]

Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition

Authors: Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Laura Grigori, Aydin Buluc, James Demmel

Abstract: We present a data structure to randomly sample rows from the Khatri-Rao product of several matrices according to the exact distribution of its leverage scores. Our proposed sampler draws each row in time logarithmic in the height of the Khatri-Rao product and quadratic in its column count, with persistent space overhead at most the size of the input matrices. As a result, it tractably draws samples even when the matrices forming the Khatri-Rao product have tens of millions of rows each. When used to sketch the linear least squares problems arising in CANDECOMP / PARAFAC tensor decomposition, our method achieves lower asymptotic complexity per solve than recent state-of-the-art methods. Experiments on billion-scale sparse tensors validate our claims, with our algorithm achieving higher accuracy than competing methods as the decomposition rank grows.

Submitted 28 February, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

Comments: The 37th Conference on Neural Information Processing Systems (NeurIPS'23). 28 pages, 10 figures, 6 tables
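
Illustrative note (not from the paper): the abstract above concerns the leverage scores of a Khatri-Rao product. The sketch below only checks a standard identity behind such methods, namely that the Gram matrix of a Khatri-Rao product equals the elementwise product of the factor Gram matrices, so a single row's leverage score can be evaluated without forming the tall product. It is not the logarithmic-time sampler the authors describe. The helper names `khatri_rao` and `kr_leverage_score`, the matrix sizes, and the row ordering (index `i * J + j`) are assumptions made for this example.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product; row (i, j) is A[i] * B[j], stored at index i * J + j."""
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def kr_leverage_score(A, B, i, j, G_pinv):
    """Leverage score of row (i, j) using only the small R x R pseudoinverse."""
    row = A[i] * B[j]
    return row @ G_pinv @ row

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
B = rng.standard_normal((7, 3))

# Gram matrix of the Khatri-Rao product, computed from the factors alone.
G = (A.T @ A) * (B.T @ B)
G_pinv = np.linalg.pinv(G)

# Reference values: form the full product and use a thin QR factorization.
K = khatri_rao(A, B)
Q, _ = np.linalg.qr(K)
naive = np.sum(Q**2, axis=1)

structured = np.array(
    [[kr_leverage_score(A, B, i, j, G_pinv) for j in range(B.shape[0])]
     for i in range(A.shape[0])]
).reshape(-1)

print(np.allclose(naive, structured))
```

The structured route touches only the factor matrices and an R x R matrix, which is what makes sampling from very tall Khatri-Rao products plausible in the first place.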

arXiv:2211.04803 [pdf]

DSCOT: An NFT-Based Blockchain Architecture for the Authentication of IoT-Enabled Smart Devices in Smart Cities

Authors: Usman Khalil, Owais Ahmed Malik, Ong Wee Hong, Mueen Uddin

Abstract: Smart city architecture brings all the underlying architectures, i.e., Internet of Things (IoT), Cyber-Physical Systems (CPSs), Internet of Cyber-Physical Things (IoCPT), and Internet of Everything (IoE), together to work as a system under its umbrella. The goal of smart city architecture is to come up with a solution that may integrate all the real-time response applications. However, the cyber-physical space poses threats that can jeopardize the working of a smart city where all the data belonging to people, systems, and processes will be at risk. Various architectures based on centralized and distributed mechanisms support smart cities; however, the security concerns regarding traceability, scalability, security services, platform assistance, and resource management persist. In this paper, a private blockchain-based architecture, Decentralized Smart City of Things (DSCoT), is proposed. It actively utilizes fog computing for all the users and smart devices connected to a fog node in a particular management system in a smart city, i.e., a smart house or hospital, etc. Non-fungible tokens (NFTs) have been utilized for representation to define smart device attributes. NFTs in the proposed DSCoT architecture provide device (IoT) and user authentication functionality. DSCoT has been designed to provide a smart city solution that ensures robust security features such as Confidentiality, Integrity, Availability (CIA), and authorization by defining new attributes and functions for Owner, User, Fog, and IoT device authentication. The evaluation of the proposed functions and components in terms of Gas consumption and time complexity has shown promising results. Comparatively, minting a DSCoT NFT was approximately 27% more efficient in Gas consumption, and a DSCoT approve() call approximately 11% more efficient, than the PUF-based NFT solution.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 18 pages, 15 figures, 5 tables, journal

arXiv:2211.04781 [pdf]

Profiling Obese Subgroups in National Health and Nutritional Status Survey Data using Machine Learning Techniques: A Case Study from Brunei Darussalam

Authors: Usman Khalil, Owais Ahmed Malik, Daphne Teck Ching Lai, Ong Sok King

Abstract: The National Health and Nutritional Status Survey (NHANSS) is conducted annually by the Ministry of Health in Negara Brunei Darussalam to assess the population health and nutritional patterns and characteristics. The main aim of this study was to discover meaningful patterns (groups) from the obese sample of NHANSS data by applying data reduction and interpretation techniques. The mixed nature of the variables (qualitative and quantitative) in the data set added novelty to the study. Accordingly, the Categorical Principal Component (CATPCA) technique was chosen to interpret the meaningful results. The relationships between obesity and lifestyle factors such as demography, socioeconomic status, physical activity, dietary behavior, history of blood pressure, diabetes, etc., were determined based on the principal components generated by CATPCA. The results were validated with the help of the split method technique to cross-verify the authenticity of the generated groups. Based on the analysis and results, two subgroups were found in the data set, and the salient features of these subgroups have been reported. These results can be proposed for the betterment of the healthcare industry.

Submitted 9 November, 2022; originally announced November 2022.

Comments: A Case study of Obese Subgroups from Brunei Darussalam: 15 Pages, 4 figures, journal

arXiv:2210.05105 [pdf, other]

doi 10.1145/3626183.3659980

Distributed-Memory Randomized Algorithms for Sparse Tensor CP Decomposition

Authors: Vivek Bharadwaj, Osman Asif Malik, Riley Murray, Aydin Buluç, James Demmel

Abstract: Candecomp / PARAFAC (CP) decomposition, a generalization of the matrix singular value decomposition to higher-dimensional tensors, is a popular tool for analyzing multidimensional sparse data. On tensors with billions of nonzero entries, computing a CP decomposition is a computationally intensive task. We propose the first distributed-memory implementations of two randomized CP decomposition algorithms, CP-ARLS-LEV and STS-CP, that offer nearly an order-of-magnitude speedup at high decomposition ranks over well-tuned non-randomized decomposition packages. Both algorithms rely on leverage score sampling and enjoy strong theoretical guarantees, each with varying time and accuracy tradeoffs. We tailor the communication schedule for our random sampling algorithms, eliminating expensive reduction collectives and forcing communication costs to scale with the random sample count. Finally, we optimize the local storage format for our methods, switching between analogues of compressed sparse column and compressed sparse row formats. Experiments show that our methods are fast and scalable, producing 11x speedup over SPLATT by decomposing the billion-scale Reddit tensor on 512 CPU cores in under two minutes.

Submitted 27 April, 2024; v1 submitted 10 October, 2022; originally announced October 2022.

Comments: To appear in the Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'24). 14 pages, 13 figures, 5 tables

arXiv:2210.03828 [pdf, other]

Sampling-Based Decomposition Algorithms for Arbitrary Tensor Networks

Authors: Osman Asif Malik, Vivek Bharadwaj, Riley Murray

Abstract: We show how to develop sampling-based alternating least squares (ALS) algorithms for decomposition of tensors into any tensor network (TN) format. Provided the TN format satisfies certain mild assumptions, the resulting algorithms will have input sublinear per-iteration cost. Unlike most previous works on sampling-based ALS methods for tensor decomposition, the sampling in our framework is done according to the exact leverage score distribution of the design matrices in the ALS subproblems. We implement and test two tensor decomposition algorithms that use our sampling framework in a feature extraction experiment where we compare them against a number of other decomposition algorithms.

Submitted 7 October, 2022; originally announced October 2022.

Comments: 20 pages, 8 figures

arXiv:2209.05705 [pdf, other]

doi 10.1137/22M1524989

Quadrature Sampling of Parametric Models with Bi-fidelity Boosting

Authors: Nuojin Cheng, Osman Asif Malik, Yiming Xu, Stephen Becker, Alireza Doostan, Akil Narayan

Abstract: Least squares regression is a ubiquitous tool for building emulators (a.k.a. surrogate models) of problems across science and engineering for purposes such as design space exploration and uncertainty quantification. When the regression data are generated using an experimental design process (e.g., a quadrature grid) involving computationally expensive models, or when the data size is large, sketching techniques have shown promise to reduce the cost of the construction of the regression model while ensuring accuracy comparable to that of the full data. However, random sketching strategies, such as those based on leverage scores, lead to regression errors that are random and may exhibit large variability. To mitigate this issue, we present a novel boosting approach that leverages cheaper, lower-fidelity data of the problem at hand to identify the best sketch among a set of candidate sketches. This in turn specifies the sketch of the intended high-fidelity model and the associated data. We provide theoretical analyses of this bi-fidelity boosting (BFB) approach and discuss the conditions the low- and high-fidelity data must satisfy for successful boosting. In doing so, we derive a bound on the residual norm of the BFB sketched solution relating it to its ideal, but computationally expensive, high-fidelity boosted counterpart. Empirical results on both manufactured and PDE data corroborate the theoretical analyses and illustrate the efficacy of the BFB solution in reducing the regression error, as compared to the non-boosted solution.

Submitted 12 September, 2022; originally announced September 2022.

Comments: 36 pages, 9 figures

Journal ref: SIAM/ASA Journal on Uncertainty Quantification, Vol. 12, Iss. 2 (2024)

arXiv:2209.05662 [pdf, other]

Fast Algorithms for Monotone Lower Subsets of Kronecker Least Squares Problems

Authors: Osman Asif Malik, Yiming Xu, Nuojin Cheng, Stephen Becker, Alireza Doostan, Akil Narayan

Abstract: Approximate solutions to large least squares problems can be computed efficiently using leverage score-based row-sketches, but directly computing the leverage scores, or sampling according to them with naive methods, still requires an expensive manipulation and processing of the design matrix. In this paper we develop efficient leverage score-based sampling methods for matrices with certain Kronecker product-type structure; in particular we consider matrices that are monotone lower column subsets of Kronecker product matrices. Our discussion is general, encompassing least squares problems on infinite domains, in which case matrices formally have infinitely many rows. We briefly survey leverage score-based sampling guarantees from the numerical linear algebra and approximation theory communities, and follow this with efficient algorithms for sampling when the design matrix has Kronecker-type structure. Our numerical examples confirm that sketches based on exact leverage score sampling for our class of structured matrices achieve superior residual compared to approximate leverage score sampling methods.

Submitted 12 September, 2022; originally announced September 2022.

Comments: 33 pages, 5 figures
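
Illustrative note (not from the paper): the "naive" baseline mentioned in the abstract above, computing exact leverage scores from a factorization of the whole design matrix and then solving a row-sampled, rescaled least squares problem, can be sketched as follows. This assumes a dense design matrix with full column rank; the function names `leverage_scores` and `sketched_lstsq`, the problem sizes, and the sample count are hypothetical choices for the example, and the structured Kronecker-type algorithms of the paper are not reproduced here.

```python
import numpy as np

def leverage_scores(A):
    """Exact leverage scores: squared row norms of Q from a thin QR of A (full column rank assumed)."""
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q**2, axis=1)

def sketched_lstsq(A, b, num_samples, rng):
    """Sample rows with probability proportional to leverage, rescale, and solve the small problem."""
    scores = leverage_scores(A)
    probs = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=num_samples, replace=True, p=probs)
    # Rescaling by 1 / sqrt(m * p_i) keeps the sketched normal equations unbiased.
    scale = 1.0 / np.sqrt(num_samples * probs[idx])
    SA = A[idx] * scale[:, None]
    Sb = b[idx] * scale
    x, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 5))
x_true = np.arange(1.0, 6.0)
b = A @ x_true + 0.01 * rng.standard_normal(2000)

x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_lstsq(A, b, num_samples=200, rng=rng)
print(np.linalg.norm(x_sketch - x_full))
```

The QR step costs as much as solving the full problem, which is the expense the abstract refers to; structured samplers aim to avoid ever forming or factorizing the full matrix.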

arXiv:2207.05878 [pdf, other]

doi 10.1103/PhysRevE.109.024113

Diffusive Persistence on Disordered Lattices and Random Networks

Authors: Omar Malik, Melinda Varga, Alaa Moussawi, David Hunt, Boleslaw Szymanski, Zoltan Toroczkai, Gyorgy Korniss

Abstract: To better understand the temporal characteristics and the lifetime of fluctuations in stochastic processes in networks, we investigated diffusive persistence in various graphs. Global diffusive persistence is defined as the fraction of nodes for which the diffusive field at a site (or node) has not changed sign up to time $t$ (or in general, that the node remained active/inactive in discrete models). Here we investigate disordered and random networks and show that the behavior of the persistence depends on the topology of the network. In two-dimensional (2D) disordered networks, we find that above the percolation threshold diffusive persistence scales similarly to the original 2D regular lattice, according to a power law $P(t,L)\sim t^{-\theta}$ with an exponent $\theta\simeq 0.186$, in the limit of large linear system size $L$. At the percolation threshold, however, the scaling exponent changes to $\theta\simeq 0.141$, as the result of the interplay of diffusive persistence and the underlying structural transition in the disordered lattice at the percolation threshold. Moreover, studying finite-size effects for 2D lattices at and above the percolation threshold, we find that at the percolation threshold, the long-time asymptotic value obeys a power-law $P(t,L)\sim L^{-z\theta}$ with $z\simeq 2.86$ instead of the value of $z=2$ normally associated with finite-size effects on 2D regular lattices. In contrast, we observe that in random networks without a local regular structure, such as Erdős-Rényi networks, no simple power-law scaling behavior exists above the percolation threshold.

Submitted 12 July, 2022; originally announced July 2022.

Journal ref: Phys. Rev. E 109, 024113 (2024)

arXiv:2206.07099 [pdf, other]

doi 10.1145/3518997.3534959

Resource-Mediated Consensus Formation

Authors: Omar Malik, James Flamino, Boleslaw K. Szymanski

Abstract: In social sciences, simulating opinion dynamics to study the interplay between homophily and influence, and the subsequent formation of echo chambers, is of great importance. As such, in this paper we investigate echo chambers by implementing a unique social game in which we spawn in a large number of agents, each assigned one of the two opinions on an issue and a finite amount of influence in the form of a game currency. Agents attempt to have an opinion that is a majority at the end of the game, to obtain a reward also paid in the game currency. At the beginning of each round, a randomly selected agent is selected, referred to as a speaker. The second agent is selected in the radius of speaker influence (which is a set subset of the speaker's neighbors) to interact with the speaker as a listener. In this interaction, the speaker proposes a payoff in the game currency from their personal influence budget to persuade the listener to hold the speaker's opinion in future rounds until chosen listener again. The listener can either choose to accept or reject this payoff to hold the speaker's opinion for future rounds. The listener's choice is informed only by their estimate of global majority opinion through a limited view of the opinions of their neighboring agents. We show that the influence game leads to the formation of "echo chambers," or homogeneous clusters of opinions. We also investigate various scenarios to disrupt the creation of such echo chambers, including the introduction of resource disparity between agents with different opinions, initially preferentially assigning opinions to agents, and the introduction of committed agents, who never change their initial opinion.

Submitted 14 June, 2022; originally announced June 2022.

Comments: 8 pages, 9 figures

Journal ref: Proc. SIGSIM-PADS'22: SIGSIM Conference on Principles of Advanced Discrete Simulation, Atlanta, GA, USA, June 8-10, 2022, pp. 105-112

arXiv:2110.07631 [pdf, other]

More Efficient Sampling for Tensor Decomposition With Worst-Case Guarantees

Authors: Osman Asif Malik

Abstract: Recent papers have developed alternating least squares (ALS) methods for CP and tensor ring decomposition with a per-iteration cost which is sublinear in the number of input tensor entries for low-rank decomposition. However, the per-iteration cost of these methods still has an exponential dependence on the number of tensor modes when parameters are chosen to achieve certain worst-case guarantees. In this paper, we propose sampling-based ALS methods for the CP and tensor ring decompositions whose cost does not have this exponential dependence, thereby significantly improving on the previous state-of-the-art. We provide a detailed theoretical analysis and also apply the methods in a feature extraction experiment.

Submitted 17 June, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: Accepted to ICML 2022

arXiv:2105.03809 [pdf, other]

doi 10.1109/WASPAA52581.2021.9632758

Superresolution photoacoustic tomography using random speckle illumination and second order moments

Authors: Osman Asif Malik, Venkatalakshmi Vyjayanthi Narumanchi, Stephen Becker, Todd W. Murray

Abstract: Idier et al. [IEEE Trans. Comput. Imaging 4(1), 2018] propose a method which achieves superresolution in the microscopy setting by leveraging random speckle illumination and knowledge about statistical second order moments for the illumination patterns and model noise. This is achieved without any assumptions on the sparsity of the imaged object. In this paper, we show that their technique can be extended to photoacoustic tomography. We propose a simple algorithm for doing the reconstruction which only requires a small number of linear algebra steps. It is therefore much faster than the iterative method used by Idier et al. We also propose a new representation of the imaged object based on Dirac delta expansion functions.

Submitted 31 January, 2022; v1 submitted 8 May, 2021; originally announced May 2021.

Comments: 5 pages, 5 figures; accepted to WASPAA 2021

Journal ref: 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2021, pp. 141-145

arXiv:2104.08732 [pdf]

Application of Computer Vision and Machine Learning for Digitized Herbarium Specimens: A Systematic Literature Review

Authors: Burhan Rashid Hussein, Owais Ahmed Malik, Wee-Hong Ong, Johan Willem Frederik Slik

Abstract: Herbarium contains treasures of millions of specimens which have been preserved for several years for scientific studies. To speed up more scientific discoveries, a digitization of these specimens is currently on going to facilitate easy access and sharing of its data to a wider scientific community. Online digital repositories such as IDigBio and GBIF have already accumulated millions of specimen images yet to be explored. This presents a perfect time to automate and speed up more novel discoveries using machine learning and computer vision. In this study, a thorough analysis and comparison of more than 50 peer-reviewed studies which focus on application of computer vision and machine learning techniques to digitized herbarium specimen have been examined. The study categorizes different techniques and applications which have been commonly used and it also highlights existing challenges together with their possible solutions. It is our hope that the outcome of this study will serve as a strong foundation for beginners of the relevant field and will also shed more light for both computer science and ecology experts.

Submitted 18 April, 2021; originally announced April 2021.

Comments: 42 pages, 9 figures, journal

arXiv:2010.08693 [pdf, other]

doi 10.1371/journal.pone.0261250

Binary matrix factorization on special purpose hardware

Authors: Osman Asif Malik, Hayato Ushijima-Mwesigwa, Arnab Roy, Avradip Mandal, Indradeep Ghosh

Abstract: Many fundamental problems in data mining can be reduced to one or more NP-hard combinatorial optimization problems. Recent advances in novel technologies such as quantum and quantum-inspired hardware promise a substantial speedup for solving these problems compared to when using general purpose computers but often require the problem to be modeled in a special form, such as an Ising or quadratic unconstrained binary optimization (QUBO) model, in order to take advantage of these devices. In this work, we focus on the important binary matrix factorization (BMF) problem which has many applications in data mining. We propose two QUBO formulations for BMF. We show how clustering constraints can easily be incorporated into these formulations. The special purpose hardware we consider is limited in the number of variables it can handle which presents a challenge when factorizing large matrices. We propose a sampling based approach to overcome this challenge, allowing us to factorize large rectangular matrices. In addition to these methods, we also propose a simple baseline algorithm which outperforms our more sophisticated methods in a few situations. We run experiments on the Fujitsu Digital Annealer, a quantum-inspired complementary metal-oxide-semiconductor (CMOS) annealer, on both synthetic and real data, including gene expression data. These experiments show that our approach is able to produce more accurate BMFs than competing methods.

Submitted 7 January, 2022; v1 submitted 16 October, 2020; originally announced October 2020.

Comments: Accepted to PLOS ONE. 22 pages, 2 figures

Journal ref: PLOS ONE 16(12): e0261250, 2021

arXiv:2010.08581 [pdf, other]

A Sampling-Based Method for Tensor Ring Decomposition

Authors: Osman Asif Malik, Stephen Becker

Abstract: We propose a sampling-based method for computing the tensor ring (TR) decomposition of a data tensor. The method uses leverage score sampled alternating least squares to fit the TR cores in an iterative fashion. By taking advantage of the special structure of TR tensors, we can efficiently estimate the leverage scores and attain a method which has complexity sublinear in the number of input tensor entries. We provide high-probability relative-error guarantees for the sampled least squares problems. We compare our proposal to existing methods in experiments on both synthetic and real data. Our method achieves substantial speedup -- sometimes two or three orders of magnitude -- over competing methods, while maintaining good accuracy. We also provide an example of how our method can be used for rapid feature extraction.

Submitted 12 June, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

Comments: Accepted to ICML 2021

Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139:7400-7411, 2021

arXiv:2005.10720 [pdf]

doi 10.1109/OJEMB.2021.3053215

Face Coverings, Aerosol Dispersion and Mitigation of Virus Transmission Risk

Authors: I. M. Viola, B. Peterson, G. Pisetta, G. Pavar, H. Akhtar, F. Menoloascina, E. Mangano, K. E. Dunn, R. Gabl, A. Nila, E. Molinari, C. Cummins, G. Thompson, C. M. McDougall, T. Y. M. Lo, F. C. Denison, P. Digard, O. Malik, M. J. G. Dunn, F. Mehendale

Abstract: The SARS-CoV-2 virus is primarily transmitted through virus-laden fluid particles ejected from the mouth of infected people. Face covers can mitigate the risk of virus transmission but their outward effectiveness is not fully ascertained. Objective: by using a background oriented schlieren technique, we aim to investigate the air flow ejected by a person while quietly and heavily breathing, while coughing, and with different face covers. Results: we found that all face covers without an outlet valve reduce the front flow through by at least 63% and perhaps as high as 86% if the unfiltered cough jet distance was resolved to the anticipated maximum distance of 2-3 m. However, surgical and handmade masks, and face shields, generate significant leakage jets that may present major hazards. Conclusions: the effectiveness of the masks should mostly be considered based on the generation of secondary jets rather than on the ability to mitigate the front throughflow.

Submitted 30 January, 2021; v1 submitted 19 May, 2020; originally announced May 2020.

Journal ref: IEEE Open Journal of Engineering in Medicine and Biology, 2021

arXiv:1911.08424 [pdf, other]

doi 10.1016/j.laa.2020.05.004

Guarantees for the Kronecker Fast Johnson-Lindenstrauss Transform Using a Coherence and Sampling Argument

Authors: Osman Asif Malik, Stephen Becker

Abstract: In the recent paper [Jin, Kolda & Ward, arXiv:1909.04801], it is proved that the Kronecker fast Johnson-Lindenstrauss transform (KFJLT) is, in fact, a Johnson-Lindenstrauss transform, which had previously only been conjectured. In this paper, we provide an alternative proof of this, for when the KFJLT is applied to Kronecker vectors, using a coherence and sampling argument. Our proof yields a diff… As a stepping stone to proving our result, we also show that the KFJLT is a subspace embedding for matrices with columns that have Kronecker product structure. Lastly, we compare the KFJLT to four other sketch techniques in numerical experiments on both synthetic and real-world data.

Submitted 16 May, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

Comments: Accepted to Linear Algebra and its Applications

MSC Class: 15-02; 65F30

Journal ref: Linear Algebra and its Applications 602, 2020, pp. 120-137

arXiv:1910.07643 [pdf, other]

doi 10.1137/1.9781611976700.82

Dynamic Graph Convolutional Networks Using the Tensor M-Product

Authors: Osman Asif Malik, Shashanka Ubaru, Lior Horesh, Misha E. Kilmer, Haim Avron

Abstract: Many irregular domains such as social networks, financial transactions, neuron connections, and natural language constructs are represented using graph structures. In recent years, a variety of graph neural networks (GNNs) have been successfully applied for representation learning and prediction on such graphs. In many of the real-world applications, the underlying graph changes over time, however, most of the existing GNNs are inadequate for handling such dynamic graphs. In this paper we propose a novel technique for learning embeddings of dynamic graphs using a tensor algebra framework. Our method extends the popular graph convolutional network (GCN) for learning representations of dynamic graphs using the recently proposed tensor M-product technique. Theoretical results presented establish a connection between the proposed tensor approach and spectral convolution of tensors. The proposed method TM-GCN is consistent with the Message Passing Neural Network (MPNN) framework, accounting for both spatial and temporal message passing. Numerical experiments on real-world datasets demonstrate the performance of the proposed method for edge classification and link prediction tasks on dynamic graphs. We also consider an application related to the COVID-19 pandemic, and show how our method can be used for early detection of infected individuals from contact tracing data.

Submitted 22 January, 2021; v1 submitted 16 October, 2019; originally announced October 2019.

Comments: Accepted to SIAM International Conference on Data Mining (SDM) 2021

arXiv:1905.07439 [pdf, other]

doi 10.1080/23799927.2020.1861104

Randomization of Approximate Bilinear Computation for Matrix Multiplication

Authors: Osman Asif Malik, Stephen Becker

Abstract: We present a method for randomizing formulas for bilinear computation of matrix products. We consider the implications of such randomization when there are two sources of error: One due to the formula itself only being approximately correct, and one due to using floating point arithmetic. Our theoretical results and numerical experiments indicate that our method can improve performance when each of these error sources are present individually, as well as when they are present at the same time.

Submitted 10 January, 2022; v1 submitted 17 May, 2019; originally announced May 2019.

Comments: 36 pages, 29 figures; accepted to Int J Comput Math: Comput Syst Theory

Journal ref: International Journal of Computer Mathematics: Computer Systems Theory, 6:1, 54-93, 2021

arXiv:1901.10559 [pdf, other]

doi 10.1007/s10444-020-09816-9

Fast Randomized Matrix and Tensor Interpolative Decomposition Using CountSketch

Authors: Osman Asif Malik, Stephen Becker

Abstract: We propose a new fast randomized algorithm for interpolative decomposition of matrices which utilizes CountSketch. We then extend this approach to the tensor interpolative decomposition problem introduced by Biagioni et al. (J. Comput. Phys. 281, pp. 116-134, 2015). Theoretical performance guarantees are provided for both the matrix and tensor settings. Numerical experiments on both synthetic and real data demonstrate that our algorithms maintain the accuracy of competing methods, while running in less time, achieving at least an order of magnitude speed-up on large matrices and tensors.

Submitted 22 November, 2021; v1 submitted 29 January, 2019; originally announced January 2019.

Comments: 29 pages, 2 figures; accepted to Adv Comput Math

MSC Class: 15-02

Journal ref: Advances in Computational Mathematics 46, article number: 76, 2020

arXiv:1507.08820 [pdf, other]

doi 10.1103/PhysRevA.92.063829

Spectral method for efficient computation of time-dependent phenomena in complex lasers

Authors: O. Malik, K. G. Makris, H. E. Türeci

Abstract: Studying time-dependent behavior in lasers is analytically difficult due to the saturating non-linearity inherent in the Maxwell-Bloch equations and numerically demanding because of the computational resources needed to discretize both time and space in conventional FDTD approaches. We describe here an efficient spectral method to overcome these shortcomings in complex lasers of arbitrary shape, gain medium distribution, and pumping profile. We apply this approach to a quasi-degenerate two-mode laser in different dynamical regimes and compare the results in the long-time limit to the Steady State Ab Initio Laser Theory (SALT), which is also built on a spectral method but makes a more specific ansatz about the long-time dynamical evolution of the semiclassical laser equations. Analyzing a parameter regime outside the known domain of validity of the stationary inversion approximation, we find that for only a narrow regime of pump powers the inversion is not stationary, and that this, as pump power is further increased, triggers a synchronization transition upon which the inversion becomes stationary again. We provide a detailed analysis of mode synchronization (aka cooperative frequency locking), revealing interesting dynamical features of such a laser system in the vicinity of the synchronization threshold.

Submitted 31 July, 2015; originally announced July 2015.

Journal ref: Phys. Rev. A 92, 063829 (2015)

arXiv:1410.4630 [pdf, other]

doi 10.1038/nphoton.2014.244

Enhancement of Laser Power Efficiency by Control of Spatial Hole Burning Interactions

Authors: Li Ge, Omer Malik, Hakan E. Tureci

Abstract: The laser is an out-of-equilibrium nonlinear wave system where the interplay of the cavity geometry and nonlinear wave interactions, mediated by the gain medium, determines the self-organized oscillation frequencies and the associated spatial field patterns. In the steady state, a constant energy flux flows through the laser from the pump to the far field, with the ratio of the total output power to the input power determining the power-efficiency. While nonlinear wave interactions have been modeled and well understood since the early days of laser theory, their impact on the power-efficiency of a laser system is poorly understood. Here, we show that spatial hole burning interactions generally decrease the power efficiency. We then demonstrate how spatial hole burning interactions can be controlled by a spatially tailored pump profile, thereby boosting the power-efficiency, in some cases by orders of magnitude.

Submitted 17 October, 2014; originally announced October 2014.

Comments: 5 pages, 3 figures, in press. appears in Nature Photonics (2014)

Journal ref: Nat. Photon. 8, 871-875 (2014)

Showing 1–29 of 29 results for author: Malik, O