Search | arXiv e-print repository

Strategies to enhance THz harmonic generation combining multilayered, gated, and metamaterial-based architectures

Authors: Ali Maleki, Moritz B. Heindl, Yongbao Xin, Robert W. Boyd, Georg Herink, Jean-Michel Ménard

Abstract: Graphene has unique properties paving the way for groundbreaking future applications. Its large optical nonlinearity and ease of integration in devices notably make it an ideal candidate to become a key component for all-optical switching and frequency conversion applications. In the terahertz (THz) region, various approaches have been independently demonstrated to optimize the nonlinear effects in graphene, addressing a critical limitation arising from the atomically thin interaction length. Here, we demonstrate sample architectures that combine strategies to enhance THz nonlinearities in graphene-based structures. We achieve this by increasing the interaction length through a multilayered design, controlling carrier density with an electrical gate, and modulating the THz field spatial distribution with a metallic metasurface substrate. Our study specifically investigates third harmonic generation (THG) using a table-top high-field THz source. We measure THG enhancement factors exceeding thirty and propose architectures capable of achieving a two-order-of-magnitude increase. These findings highlight the potential of engineered graphene-based samples in advancing THz frequency conversion technologies for signal processing and wireless communication applications.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 13 pages (4 Figures) + 5 pages Supplementary Information (4 Figures)

arXiv:2405.05344 [pdf, other]

A note on the minimax risk of sparse linear regression

Authors: Yilin Guo, Shubhangi Ghosh, Haolei Weng, Arian Maleki

Abstract: Sparse linear regression is one of the classical and extensively studied problems in high-dimensional statistics and compressed sensing. Despite the substantial body of literature dedicated to this problem, the precise determination of its minimax risk remains elusive. This paper aims to fill this gap by deriving an asymptotically constant-sharp characterization of the minimax risk of sparse linear regression. More specifically, the paper focuses on scenarios where the sparsity level, denoted as $k$, satisfies the condition $(k \log p)/n \to 0$, with $p$ and $n$ representing the number of features and observations respectively. We establish that the minimax risk under isotropic Gaussian random design is asymptotically equal to $2\sigma^2 (k/n) \log(p/k)$, where $\sigma$ denotes the standard deviation of the noise. In addition to this result, we summarize the existing results in the literature and mention some of the fundamental problems that remain open.

Submitted 8 May, 2024; originally announced May 2024.
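
The risk expression quoted in the abstract above can be evaluated numerically. The following is only a minimal sketch, assuming the expression is read as $2\sigma^2 (k/n) \log(p/k)$ with a natural logarithm; the function name `asymptotic_minimax_risk` and the example values are hypothetical and do not come from the paper.

```python
import math


def asymptotic_minimax_risk(sigma, k, n, p):
    """Evaluate 2 * sigma^2 * (k / n) * log(p / k).

    Assumption: this is the leading-order expression quoted in the
    abstract, meaningful only in the regime (k log p) / n -> 0.
    """
    if n <= 0 or not (0 < k <= p):
        raise ValueError("need n > 0 and 0 < k <= p")
    return 2.0 * sigma ** 2 * (k / n) * math.log(p / k)


# Hypothetical example values: 10 nonzero coefficients out of 10,000
# features, 1,000 observations, unit noise standard deviation.
risk = asymptotic_minimax_risk(sigma=1.0, k=10, n=1000, p=10000)
print(f"{risk:.4f}")
```

The value is an asymptotic approximation, so it should not be expected to match a finite-sample risk exactly.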

arXiv:2404.19331 [pdf, other]

Fusing Depthwise and Pointwise Convolutions for Efficient Inference on GPUs

Authors: Fareed Qararyah, Muhammad Waqar Azhar, Mohammad Ali Maleki, Pedro Trancoso

Abstract: Depthwise and pointwise convolutions have fewer parameters and perform fewer operations than standard convolutions. As a result, they have become increasingly used in various compact DNNs, including convolutional neural networks (CNNs) and vision transformers (ViTs). However, they have a lower compute-to-memory-access ratio than standard convolutions, making their memory accesses often the performance bottleneck. This paper explores fusing depthwise and pointwise convolutions to overcome the memory access bottleneck. The focus is on fusing these operators on GPUs. The prior art on GPU-based fusion suffers from one or more of the following: (1) fusing either a convolution with an element-wise or multiple non-convolutional operators, (2) not explicitly optimizing for memory accesses, (3) not supporting depthwise convolutions. This paper proposes Fused Convolutional Modules (FCMs), a set of novel fused depthwise and pointwise GPU kernels. FCMs significantly reduce the memory accesses of pointwise and depthwise convolutions, improving execution time and energy efficiency. To evaluate the trade-offs associated with fusion and determine which convolutions are beneficial to fuse and the optimal FCM parameters, we propose FusePlanner. FusePlanner consists of cost models to estimate the memory accesses of depthwise, pointwise, and FCM kernels given GPU characteristics. Our experiments on three GPUs using representative CNNs and ViTs demonstrate that FCMs save up to 83% of the memory accesses and achieve speedups of up to 3.7x compared to cuDNN. Complete model implementations of various CNNs using our modules outperform TVM's, achieving speedups of up to 1.8x and saving up to two-thirds of the energy.

Submitted 30 April, 2024; originally announced April 2024.

arXiv:2402.15635 [pdf, other]

Bagged Deep Image Prior for Recovering Images in the Presence of Speckle Noise

Authors: Xi Chen, Zhewen Hou, Christopher A. Metzler, Arian Maleki, Shirin Jalali

Abstract: We investigate both the theoretical and algorithmic aspects of likelihood-based methods for recovering a complex-valued signal from multiple sets of measurements, referred to as looks, affected by speckle (multiplicative) noise. Our theoretical contributions include establishing the first theoretical upper bound on the Mean Squared Error (MSE) of the maximum likelihood estimator under the deep image prior hypothesis. Our theoretical results capture the dependence of MSE upon the number of parameters in the deep image prior, the number of looks, the signal dimension, and the number of measurements per look. On the algorithmic side, we introduce the concept of bagged Deep Image Priors (Bagged-DIP) and integrate them with projected gradient descent (PGD). Furthermore, we show how employing the Newton-Schulz algorithm for calculating matrix inverses within the iterations of PGD reduces the computational complexity of the algorithm. We show that this method achieves state-of-the-art performance.

Submitted 23 February, 2024; originally announced February 2024.
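
The abstract above mentions the Newton-Schulz algorithm for computing matrix inverses. As a rough illustration only, here is the textbook Newton-Schulz iteration $X_{j+1} = X_j (2I - A X_j)$ in NumPy. It is not the authors' implementation: the function name, the starting guess $X_0 = A^\top / (\|A\|_1 \|A\|_\infty)$, the iteration cap, and the tolerance are all assumptions made for this sketch.

```python
import numpy as np


def newton_schulz_inverse(A, max_iters=100, tol=1e-12):
    """Approximate the inverse of a nonsingular square matrix A.

    Uses the classical iteration X <- X (2I - A X), which needs only
    matrix products. The starting guess below is a standard choice
    that makes the iteration converge for nonsingular A.
    """
    A = np.asarray(A, dtype=float)
    identity = np.eye(A.shape[0])
    # Hypothetical initialisation: scaled transpose of A.
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    for _ in range(max_iters):
        residual = identity - A @ X
        if np.linalg.norm(residual) < tol:
            break
        # X (2I - A X) written as X + X (I - A X).
        X = X + X @ residual
    return X


A = np.array([[4.0, 1.0], [2.0, 3.0]])
X = newton_schulz_inverse(A)
print(np.allclose(A @ X, np.eye(2)))
```

Because each step uses only matrix multiplications, the iteration can be warm-started from a previous inverse when the matrix changes slowly between outer iterations, which is one plausible reason for using it inside an iterative solver.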

arXiv:2402.08543 [pdf, ps, other]

Theoretical Analysis of Leave-one-out Cross Validation for Non-differentiable Penalties under High-dimensional Settings

Authors: Haolin Zou, Arnab Auddy, Kamiar Rahnama Rad, Arian Maleki

Abstract: Despite a large and significant body of recent work focused on estimating the out-of-sample risk of regularized models in the high dimensional regime, a theoretical understanding of this problem for non-differentiable penalties such as generalized LASSO and nuclear norm is missing. In this paper we resolve this challenge. We study this problem in the proportional high dimensional regime where both the sample size n and number of features p are large, and n/p and the signal-to-noise ratio (per observation) remain finite. We provide finite sample upper bounds on the expected squared error of leave-one-out cross-validation (LO) in estimating the out-of-sample risk. The theoretical framework presented here provides a solid foundation for elucidating empirical findings that show the accuracy of LO.

Submitted 14 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Comments: 30 pages

arXiv:2401.16503 [pdf]

High detectivity terahertz radiation sensing using frequency-noise-optimized nanomechanical resonators

Authors: Chang Zhang, Eeswar K. Yalavarthi, Mathieu Giroux, Wei Cui, Michel Stephan, Ali Maleki, Arnaud Weck, Jean-Michel Ménard, Raphael St-Gelais

Abstract: We achieve high detectivity terahertz sensing using a silicon nitride nanomechanical resonator functionalized with a metasurface absorber. High performances are achieved by striking a fine balance between the frequency stability of the resonator, and its responsivity to absorbed radiation. Using this approach, we demonstrate a detectivity $D^*=3.4\times10^9~\mathrm{cm\cdot\sqrt{Hz}/W}$ and a noise equivalent power $\mathrm{NEP}=36~\mathrm{pW/\sqrt{Hz}}$ that outperform the best room-temperature on-chip THz detectors (i.e., pyroelectrics). Our optical absorber consists of a 1-mm diameter metasurface, which currently enables a 0.5-3 THz detection range but can easily be scaled to other frequencies in the THz and infrared ranges. In addition to demonstrating high-performance terahertz sensing, our work unveils an important fundamental trade-off between high frequency stability and high responsivity in thermal-based nanomechanical radiation sensors.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.00631 [pdf, other]

Coordinated Deep Neural Networks: A Versatile Edge Offloading Algorithm

Authors: Alireza Maleki, Hamed Shah-Mansouri, Babak H. Khalaj

Abstract: As artificial intelligence (AI) applications continue to expand, there is a growing need for deep neural network (DNN) models. Although DNN models deployed at the edge are promising to provide AI as a service with low latency, their cooperation is yet to be explored. In this paper, we consider the case where DNN service providers share their computing resources as well as their models' parameters and allow other DNNs to offload their computations without mirroring. We propose a novel algorithm called coordinated DNNs on edge (CoDE) that facilitates coordination among DNN services by creating multi-task DNNs out of individual models. CoDE aims to find the optimal path that results in the lowest possible cost, where the cost reflects the inference delay, model accuracy, and local computation workload. With CoDE, DNN models can make new paths for inference by using their own or other models' parameters. We then evaluate the performance of CoDE through numerical experiments. The results demonstrate a $75\%$ reduction in the local service computation workload while degrading the accuracy by only $2\%$ and having the same inference time in a balanced load condition. Under heavy load, CoDE can further decrease the inference time by $30\%$ while the accuracy is reduced by only $4\%$.

Submitted 31 December, 2023; originally announced January 2024.

arXiv:2312.06399 [pdf]

Who Are Tweeting About Academic Publications? A Cochrane Systematic Review and Meta-Analysis of Altmetric Studies

Authors: Ashraf Maleki, Kim Holmberg

Abstract: Previous studies have developed different categorizations of Twitter users who interact with scientific publications online, reflecting the difficulty in creating a unified approach. Using Cochrane Review meta-analysis to analyse earlier research (including 79,014 Twitter users, over twenty million tweets, and over five million tweeted publications from 23 studies), we created a consolidated robust categorization consisting of 11 user categories, at different dimensions, covering most of any future needs for user categorizations on Twitter and possibly also other social media platforms. Our findings showed, with moderate certainty, covering all the earlier different approaches employed, that the predominant group of Twitter was individual users (66%), being responsible for the majority of tweets (55%) and tweeted publications (50%), while organizations (22%, 27%, and 28%, respectively) and science communicators (16%, 13%, and 30%) clearly contributed to a lesser degree. These individual users consisted of both academic individuals (33%) and other individuals (28%). While academic individuals shared more academic publications than other individuals (42% vs. 31%), they posted fewer tweets overall (22% vs. 30%), but these differences do not reach statistical significance. Despite significant heterogeneity arising from variations in earlier categorizations, the findings consistently indicate the importance of academics in disseminating academic publications on Twitter.

Submitted 14 May, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

arXiv:2310.17629 [pdf, ps, other]

Approximate Leave-one-out Cross Validation for Regression with $\ell_1$ Regularizers (extended version)

Authors: Arnab Auddy, Haolin Zou, Kamiar Rahnama Rad, Arian Maleki

Abstract: The out-of-sample error (OO) is the main quantity of interest in risk estimation and model selection. Leave-one-out cross validation (LO) offers a (nearly) distribution-free yet computationally demanding approach to estimate OO. Recent theoretical work showed that approximate leave-one-out cross validation (ALO) is a computationally efficient and statistically reliable estimate of LO (and OO) for generalized linear models with differentiable regularizers. For problems involving non-differentiable regularizers, despite significant empirical evidence, the theoretical understanding of ALO's error remains unknown. In this paper, we present a novel theory for a wide class of problems in the generalized linear model family with non-differentiable regularizers. We bound the error |ALO - LO| in terms of intuitive metrics such as the size of leave-i-out perturbations in active sets, sample size n, number of features p and regularization parameters. As a consequence, for the $\ell_1$-regularized problems, we show that |ALO - LO| goes to zero as p goes to infinity while n/p and SNR are fixed and bounded.

Submitted 26 October, 2023; originally announced October 2023.

arXiv:2310.10503 [pdf, other]

A Tutorial on Chirp Spread Spectrum for LoRaWAN: Basics and Key Advances

Authors: Alireza Maleki, Ha H. Nguyen, Ebrahim Bedeer, Robert Barton

Abstract: Chirp spread spectrum (CSS) modulation is the heart of long-range (LoRa) modulation used in the context of long-range wide area network (LoRaWAN) in internet of things (IoT) scenarios. Despite being a proprietary technology owned by Semtech Corp., LoRa modulation has drawn much attention from the research and industry communities in recent years. However, to the best of our knowledge, a comprehensive tutorial, investigating the CSS modulation in the LoRaWAN application, is missing in the literature. Therefore, in the first part of this paper, we provide a thorough analysis and tutorial of CSS modulation modified by LoRa specifications, discussing various aspects such as signal generation, detection, error performance, and spectral characteristics. Moreover, a summary of key recent advances in the context of CSS modulation applications in IoT networks is presented in the second part of this paper under four main categories of transceiver configuration and design, data rate improvement, interference modeling, and synchronization algorithms.

Submitted 16 October, 2023; originally announced October 2023.

Comments: This work is under review in IEEE Communications Surveys and Tutorials
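
To make the signal generation and detection steps mentioned in the abstract above concrete, here is a minimal sketch of a commonly used discrete-time baseband model of LoRa-style CSS. It is not taken from the tutorial: the function names `css_modulate` and `css_demodulate` are hypothetical, and the model assumes one sample per chip, perfect synchronization, and no noise. Each symbol is a cyclically shifted up-chirp; detection multiplies by a down-chirp and picks the strongest FFT bin.

```python
import numpy as np


def css_modulate(symbol, sf):
    """Return one CSS symbol of 2**sf complex samples (one per chip).

    The instantaneous frequency starts at `symbol` and wraps around
    modulo 2**sf, which is the cyclic shift that carries the data.
    """
    m = 2 ** sf
    k = np.arange(m)
    phase = 2.0 * np.pi * ((symbol + k) % m) * k / m
    return np.exp(1j * phase) / np.sqrt(m)


def css_demodulate(samples, sf):
    """Recover the symbol by dechirping and locating the FFT peak."""
    m = 2 ** sf
    k = np.arange(m)
    down_chirp = np.exp(-2j * np.pi * k * k / m)
    # After dechirping, the symbol becomes a pure tone at bin `symbol`.
    spectrum = np.fft.fft(samples * down_chirp)
    return int(np.argmax(np.abs(spectrum)))


sf = 7  # spreading factor; 2**7 = 128 possible symbols
recovered = [css_demodulate(css_modulate(s, sf), sf) for s in range(2 ** sf)]
print(recovered == list(range(2 ** sf)))
```

In this idealized setting every symbol is recovered exactly; adding noise or timing and frequency offsets to `samples` is the natural next step for studying error performance.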

arXiv:2305.18297 [pdf]

Using disturbance function for vibration analysis of a beam with an open edge crack

Authors: Mousa Rezaee, Saeed Lotfan, Vahid A. Maleki

Abstract: In this article, the model presented by Shen and Pierre to investigate the transverse vibration behavior of a simply supported beam has been revised. This is done by applying more realistic assumptions. The crack is modeled as a continuous disturbance and the disturbance function is provided based on fracture mechanics. Next, the natural frequencies corresponding to the model are extracted using the Galerkin method. The effect of crack parameters on the vibration behavior of the cracked beam is investigated. The obtained results show that the natural frequencies of the beam decrease with increasing crack depth. At the end, the obtained results are compared with the experimental results. The results show that the presented model is improved compared to previous models and predicts the vibration behavior of cracked beams with better accuracy for different crack parameters.

Submitted 12 March, 2023; originally announced May 2023.

Comments: in Persian language. 20th Annual International Conference of Iranian Society of Mechanical, Shiraz, Iran, 2012

arXiv:2305.09162 [pdf, other]

doi 10.1103/PhysRevFluids.9.013103

Activity-induced asymmetric dispersion in confined channels with constriction

Authors: Armin Maleki, Malihe Ghodrat, Ignacio Pagonabarraga

Abstract: Microorganisms, such as E. coli, are known to display upstream behavior and respond rheotactically to shear flows. In particular, E. coli suspensions have been shown to display strong sensitivity to spatial constrictions, leading to an anomalous densification past the constriction for incoming fluid velocities comparable to the microorganism's self-propulsion speed. We introduce a Brownian dynamics model for ellipsoidal self-propelling particles in a confined channel subject to a constriction. The model allows us to identify the relevant parameters that characterize the relevant dynamical regimes of the accumulation of the active particles at the constriction, and to clarify the mechanisms underlying the experimental observations. We find that particles are trapped in butterfly-like attractors in front of the constriction, which is the origin of the symmetry breaking in the emerging density profiles of active particles passing the constriction. In addition, the probability of trapping, and thus the strength of the asymmetry, is affected by the size of the particles and the geometry of the channel, as well as the ratio of fluid velocity to propulsion speed.

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2304.03654 [pdf, other]

doi 10.1038/s41467-024-48764-6

Hybrid THz architectures for molecular polaritonics

Authors: Ahmed Jaber, Michael Reitz, Avinash Singh, Ali Maleki, Yongbao Xin, Brian Sullivan, Ksenia Dolgaleva, Robert W. Boyd, Claudiu Genes, Jean-Michel Ménard

Abstract: Physical and chemical properties of materials can be modified by a resonant optical mode. Such recent demonstrations have mostly relied on a planar cavity geometry, others have relied on a plasmonic resonator. However, the combination of these two device architectures has remained largely unexplored, especially in the context of maximizing light-matter interactions. Here, we investigate several schemes of electromagnetic field confinement aimed at facilitating the collective coupling of a localized photonic mode to molecular vibrations in the terahertz region. The key aspects are the use of metasurface plasmonic structures combined with standard Fabry-Perot configurations and the deposition of a thin layer of glucose, via a spray coating technique, within a tightly focused electromagnetic mode volume. More importantly, we demonstrate enhanced vacuum Rabi splittings reaching up to 200 GHz when combining plasmonic resonances, photonic cavity modes and low-energy molecular resonances. Furthermore, we demonstrate how a cavity mode can be utilized to enhance the zero-point electric field amplitude of a plasmonic resonator. Our study provides key insight into the design of polaritonic platforms with organic molecules to harvest the unique properties of hybrid light-matter states.

Submitted 25 May, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

Comments: 7 pages (5 Figures) + 7 pages Appendix (5 Figures), updated version

Journal ref: Nature Communications 15, 4427 (2024)

arXiv:2212.04331 [pdf, other]

D2D-aided LoRaWAN LR-FHSS in Direct-to-Satellite IoT Networks

Authors: Alireza Maleki, Ha H. Nguyen, Ebrahim Bedeer, Robert Barton

Abstract: In this paper, we present a device-to-device (D2D) transmission scheme for aiding the long-range frequency hopping spread spectrum (LR-FHSS) LoRaWAN protocol with application in direct-to-satellite IoT networks. We consider a practical ground-to-satellite fading model, i.e. shadowed-Rice channel, and derive the outage performance of the LR-FHSS network. With the help of network coding, a D2D-aided LR-FHSS transmission scheme is proposed to improve the network capacity, for which a closed-form outage probability expression is also derived. The obtained analytical expressions for both LR-FHSS and D2D-aided LR-FHSS outage probabilities are validated by computer simulations for different parts of the analysis capturing the effects of noise, fading, unslotted ALOHA-based time scheduling, the receiver's capture effect, IoT device distributions, and distance from node to satellite. The total outage probability for the D2D-aided LR-FHSS shows a considerable increase of 249.9% and 150.1% in network capacity at a typical outage of 10^-2 for DR6 and DR5, respectively, when compared to LR-FHSS. This is obtained at the cost of a minimum of one and a maximum of two additional transmissions per IoT end device imposed by the D2D scheme in each time-slot.

Submitted 8 December, 2022; originally announced December 2022.

Comments: This paper is under review in IEEE Internet of Things Journal

arXiv:2211.11939 [pdf]

Evaluation of the Antibacterial and Wound Healing Properties of a Burn Ointment Containing Curcumin, Honey, and Potassium Aluminium

Authors: Mahsa Shahbandeh, Mahsa Amin Salehi, Maryam Soltanyzadeh, Mehrnaz Mirzaei, Ali Maleki, Abdolkarim Chehregani rad, Mohammad Javad Fatemi, Reza Mirnejad, Mostafa Dahmardehei

Abstract: Burn wounds can severely trouble the health system and life quality of patients. The present study aimed to analyze the synergistic healing properties of curcumin, honey, and potassium alum substances merged in a newly-devised burn ointment on second-degree burn wounds in rats. The MIC and MBC tests on 200 clinical isolates of Pseudomonas aeruginosa are compared to imipenem in vitro. Their killing time and cytotoxicity are also studied using a standard isolate of P. aeruginosa, fibroblast stem cells (FSC) and mouse embryonic fibroblasts (MEF). Furthermore, histopathological and histomorphological assessments are conducted on 150 male Wistar rats within four experimental groups to evaluate the efficiency of the prepared burn ointment. We found significant wound healing in both macroscopical observations and microscopical evaluations. Both curcumin and honey show strong antimicrobial effects with no cytotoxicity. Also, the histopathological results present a considerable and comparable wound re-epithelization in the group of rats treated with both honey and curcumin after 7 days. The burn ointment containing curcumin, honey, and potassium alum shows considerable efficacy in accelerating the healing of experimentally-induced burn wounds in animals. The novel ointment product is proposed as a powerful alternative for the topical treatment of burn injuries.

Submitted 21 November, 2022; originally announced November 2022.

arXiv:2211.05954 [pdf, other]

Signal-to-noise ratio aware minimaxity and higher-order asymptotics

Authors: Yilin Guo, Haolei Weng, Arian Maleki

Abstract: Since its development, the minimax framework has been one of the cornerstones of theoretical statistics, and has contributed to the popularity of many well-known estimators, such as the regularized M-estimators for high-dimensional problems. In this paper, we will first show, through the example of the sparse Gaussian sequence model, that the theoretical results under the classical minimax framework are insufficient for explaining empirical observations. In particular, both hard and soft thresholding estimators are (asymptotically) minimax; however, in practice they often exhibit sub-optimal performance at various signal-to-noise ratio (SNR) levels. The first contribution of this paper is to demonstrate that this issue can be resolved if the signal-to-noise ratio is taken into account in the construction of the parameter space. We call the resulting minimax framework the signal-to-noise ratio aware minimaxity. The second contribution of this paper is to showcase how one can use higher-order asymptotics to obtain accurate approximations of the SNR-aware minimax risk and discover minimax estimators. The theoretical findings obtained from this refined minimax framework provide new insights and practical guidance for the estimation of sparse signals.

Submitted 28 December, 2023; v1 submitted 10 November, 2022; originally announced November 2022.
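
The abstract above refers to hard and soft thresholding estimators in the sparse Gaussian sequence model. The sketch below shows the standard textbook definitions of the two estimators applied coordinate-wise to observations `y` with threshold `lam`; the function names and the example values are hypothetical, and nothing here reproduces the paper's SNR-aware analysis.

```python
import numpy as np


def soft_threshold(y, lam):
    """Shrink each coordinate toward zero by lam; small ones become 0."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)


def hard_threshold(y, lam):
    """Keep coordinates with |y_i| > lam unchanged; zero out the rest."""
    y = np.asarray(y, dtype=float)
    return np.where(np.abs(y) > lam, y, 0.0)


# Hypothetical observations: two large entries and two small ones.
y = np.array([-3.0, -0.5, 0.2, 2.0])
print(soft_threshold(y, 1.0))
print(hard_threshold(y, 1.0))
```

Soft thresholding biases every surviving coordinate by `lam`, while hard thresholding leaves survivors untouched, which is one intuitive reason the two can behave differently as the signal strength changes.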

arXiv:2208.08434 [pdf]

Metamaterial-based octave-wide terahertz bandpass filters

Authors: Ali Maleki, Avinash Singh, Ahmed Jaber, Wei Cui, Yongbao Xin, Brian T. Sullivan, Robert W. Boyd, Jean-Michel Menard

Abstract: We present octave-wide bandpass filters in the terahertz (THz) region based on bilayer-metamaterial (BLMM) structures. The passband region has a super-Gaussian shape with a maximum transmittance approaching 70% and a typical stopband rejection of 20 dB. The design is based on a metasurface consisting of a metallic square-hole array deposited on a transparent polymer, which is stacked on top of an identical metasurface with a sub-wavelength separation. The superimposed metasurface structures were designed using finite-difference time-domain (FDTD) simulations and fabricated using a photolithography process. Experimental characterization of these structures between 0.3 and 5.8 THz is performed with a time-domain THz spectroscopy system. Good agreement between experiment and simulation results is observed. We also demonstrate that two superimposed BLMM (2BLMM) devices increase the steepness of the roll-offs to more than 85 dB/octave and enable a superior stopband rejection approaching 40 dB while the maximum transmittance remains above 64%. This work paves the way toward new THz applications, including the detection of THz pulses centered at specific frequencies, and an enhanced time-resolved detection sensitivity towards molecular vibrations that are noise dominated by a strong, off-resonant, driving field.

Submitted 17 August, 2022; originally announced August 2022.

Comments: 8 pages, 4 figures

arXiv:2206.12605 [pdf]

Heterogeneous Multi-core Array-based DNN Accelerator

Authors: Mohammad Ali Maleki, Mehdi Kamal, Ali Afzali-Kusha

Abstract: In this article, we investigate the impact of the architectural parameters of array-based DNN accelerators on the accelerator's energy consumption and performance in a wide variety of network topologies. For this purpose, we have developed a tool that simulates the execution of neural networks on array-based accelerators and has the capability of testing different configurations for the estimation of energy consumption and processing latency. Based on our analysis of the behavior of benchmark networks under different architectural parameters, we offer a few recommendations for having an efficient yet high performance accelerator design. Next, we propose a heterogeneous multi-core chip scheme for deep neural network execution. The evaluations of a selective small search space indicate that the execution of neural networks on their near-optimal core configuration can save up to 36% and 67% of energy consumption and energy-delay product, respectively. Also, we suggest an algorithm to distribute the processing of a network's layers across multiple cores of the same type in order to speed up the computations through model parallelism. Evaluations on different networks and with different numbers of cores verify the effectiveness of the proposed algorithm in speeding up the processing to near-optimal values.

Submitted 25 June, 2022; originally announced June 2022.

Comments: This is the first version of the paper (V.0). We may revise the paper in the near future in order to better reflect its context. Please consider the latest version.

arXiv:2205.01967 [pdf, other]

doi 10.1103/PhysRevD.105.086024

Complementarity-Entanglement Tradeoff in Quantum Gravity

Authors: Yusef Maleki, Alireza Maleki

Abstract: Quantization of gravity remains one of the most important, yet extremely elusive, challenges at the heart of modern physics. Any attempt to resolve this long-standing problem seems to be doomed, as the route to any direct empirical evidence (i.e., detecting gravitons) for shedding light on the quantum aspect of gravity is far beyond the current capabilities. Recently, it has been discovered that gravitationally-induced entanglement, tailored in the interferometric frameworks, can be used to witness the quantum nature of gravity. Even though these schemes offer promising tools for investigating quantum gravity, many fundamental and empirical aspects of the schemes are yet to be discovered. Considering the fact that, besides quantum entanglement, quantum uncertainty and complementarity principles are the two other foundational aspects of quantum physics, the quantum nature of gravity needs to manifest all of these features. Here, we lay out an interferometric platform for testing these three nonclassical aspects of quantum mechanics in a quantum gravity setting, which connects gravity and quantum physics in a broader and deeper context. As we show in this work, all three of these fundamental features of quantum gravity can be framed and fully analyzed in an interferometric scheme.

Submitted 4 May, 2022; originally announced May 2022.

Comments: 9 pages, 5 figures

Journal ref: Phys. Rev. D 105, 086024 (2022)

arXiv:2111.03237 [pdf, other]

Towards Designing Optimal Sensing Matrices for Generalized Linear Inverse Problems

Authors: Junjie Ma, Ji Xu, Arian Maleki

Abstract: We consider an inverse problem $\mathbf{y}= f(\mathbf{Ax})$, where $\mathbf{x}\in\mathbb{R}^n$ is the signal of interest, $\mathbf{A}$ is the sensing matrix, $f$ is a nonlinear function and $\mathbf{y} \in \mathbb{R}^m$ is the measurement vector. In many applications, we have some level of freedom to design the sensing matrix $\mathbf{A}$, and in such circumstances we could optimize $\mathbf{A}$ to achieve better reconstruction performance. As a first step towards optimal design, it is important to understand the impact of the sensing matrix on the difficulty of recovering $\mathbf{x}$ from $\mathbf{y}$. In this paper, we study the performance of one of the most successful recovery methods, i.e., the expectation propagation (EP) algorithm. We define a notion of spikiness for the spectrum of $\mathbf{A}$ and show the importance of this measure for the performance of EP. We show that whether a spikier spectrum can hurt or help the recovery performance depends on $f$. Based on our framework, we are able to show that, in phase-retrieval problems, matrices with spikier spectrums are better for EP, while in 1-bit compressed sensing problems, less spiky spectrums lead to better performance. Our results unify and substantially generalize existing results that compare Gaussian and orthogonal matrices, and provide a platform towards designing optimal sensing systems.

Submitted 19 August, 2023; v1 submitted 4 November, 2021; originally announced November 2021.

Comments: to appear in IEEE transactions on information theory

arXiv:2110.14866 [pdf, other]

Quantum Steering Ellipsoid and Unruh Effect

Authors: Yusef Maleki, Bahram Ahansaz, Kangle Li, Alireza Maleki

Abstract: Quantum steering is a perplexing feature at the heart of quantum mechanics that provides profound implications in understanding the nature of physical reality. On the other hand, the effect of relativistic features on quantum systems is vital in understanding the underlying foundations of physics. In this work, we study the effects of Unruh acceleration on the quantum steering of a two-qubit system. In particular, we consider the so-called quantum steering ellipsoid and the maximally-steered coherence in a non-inertial frame and find closed-form analytic expressions for the role of the Unruh acceleration in these quantities. Analyzing the conditions for the steerability of the system, we develop a geometric description for the effect of Unruh acceleration on the quantum steering of a two-qubit system.

Submitted 27 October, 2021; originally announced October 2021.

Comments: 9 pages, 4 figures

arXiv:2110.03780 [pdf, other]

A composable autoencoder-based iterative algorithm for accelerating numerical simulations

Authors: Rishikesh Ranade, Chris Hill, Haiyang He, Amir Maleki, Norman Chang, Jay Pathak

Abstract: Numerical simulations for engineering applications solve partial differential equations (PDE) to model various physical processes. Traditional PDE solvers are very accurate but computationally costly. On the other hand, Machine Learning (ML) methods offer a significant computational speedup but face challenges with accuracy and generalization to different PDE conditions, such as geometry, boundary conditions, initial conditions and PDE source terms. In this work, we propose a novel ML-based approach, CoAE-MLSim (Composable AutoEncoder Machine Learning Simulation), which is an unsupervised, lower-dimensional, local method, that is motivated from key ideas used in commercial PDE solvers. This allows our approach to learn better with relatively fewer samples of PDE solutions. The proposed ML-approach is compared against commercial solvers for better benchmarks as well as latest ML-approaches for solving PDEs. It is tested for a variety of complex engineering cases to demonstrate its computational speed, accuracy, scalability, and generalization across different PDE conditions. The results show that our approach captures physics accurately across all metrics of comparison (including measures such as results on section cuts and lines).

Submitted 7 October, 2021; originally announced October 2021.

arXiv:2108.01220 [pdf, ps, other]

OVERT: An Algorithm for Safety Verification of Neural Network Control Policies for Nonlinear Systems

Authors: Chelsea Sidrane, Amir Maleki, Ahmed Irfan, Mykel J. Kochenderfer

Abstract: Deep learning methods can be used to produce control policies, but certifying their safety is challenging. The resulting networks are nonlinear and often very large. In response to this challenge, we present OVERT: a sound algorithm for safety verification of nonlinear discrete-time closed loop dynamical systems with neural network control policies. The novelty of OVERT lies in combining ideas from the classical formal methods literature with ideas from the newer neural network verification literature. The central concept of OVERT is to abstract nonlinear functions with a set of optimally tight piecewise linear bounds. Such piecewise linear bounds are designed for seamless integration into ReLU neural network verification tools. OVERT can be used to prove bounded-time safety properties by either computing reachable sets or solving feasibility queries directly. We demonstrate various examples of safety verification for several classical benchmark examples. OVERT compares favorably to existing methods both in computation time and in tightness of the reachable set.

Submitted 2 August, 2021; originally announced August 2021.

Comments: 44 pages, under review

MSC Class: 68Q60 (Primary) 68T07; 37N35 (Secondary) ACM Class: I.2.6; I.2.8; D.2.4

Journal ref: Journal of Machine Learning Research 23 (2022) 1-45

arXiv:2108.00329 [pdf, other]

Compressed sensing in the presence of speckle noise

Authors: Wenda Zhou, Shirin Jalali, Arian Maleki

Abstract: The problem of recovering a structured signal from its linear measurements in the presence of speckle noise is studied. This problem appears in many imaging systems such as synthetic aperture radar and optical coherence tomography. The current acquisition technology oversamples signals and converts the problem into a denoising problem with multiplicative noise. However, this paper explores the possibility of reducing the number of measurements below the ambient dimension of the signal. The sophistications that appear in the study of multiplicative noises have so far impeded theoretical analysis of such problems. This paper aims to present the first theoretical result regarding the recovery of signals from their undersampled measurements under the speckle noise. It is shown that if the signal class is structured, in the sense that the signals can be compressed efficiently, then one can obtain accurate estimates of the signal from fewer measurements than the ambient dimension. We demonstrate the effectiveness of the methods we propose through simulation results.

Submitted 31 July, 2021; originally announced August 2021.

arXiv:2104.07792 [pdf, other]

Geometry encoding for numerical simulations

Authors: Amir Maleki, Jan Heyse, Rishikesh Ranade, Haiyang He, Priya Kasimbeg, Jay Pathak

Abstract: We present a notion of geometry encoding suitable for machine learning-based numerical simulation. In particular, we delineate how this notion of encoding is different from other encoding algorithms commonly used in other disciplines such as computer vision and computer graphics. We also present a model comprised of multiple neural networks including a processor, a compressor and an evaluator. These parts each satisfy a particular requirement of our encoding. We compare our encoding model with the analogous models in the literature.

Submitted 15 April, 2021; originally announced April 2021.

arXiv:2104.02452 [pdf, other]

A Latent space solver for PDE generalization

Authors: Rishikesh Ranade, Chris Hill, Haiyang He, Amir Maleki, Jay Pathak

Abstract: In this work we propose a hybrid solver to solve partial differential equations (PDEs) in the latent space. The solver uses an iterative inferencing strategy combined with solution initialization to improve generalization of PDE solutions. The solver is tested on an engineering case and the results show that it can generalize well to several PDE conditions.

Submitted 6 April, 2021; originally announced April 2021.

arXiv:2103.02727 [pdf, other]

Preference-based Learning of Reward Function Features

Authors: Sydney M. Katz, Amir Maleki, Erdem Bıyık, Mykel J. Kochenderfer

Abstract: Preference-based learning of reward functions, where the reward function is learned using comparison data, has been well studied for complex robotic tasks such as autonomous driving. Existing algorithms have focused on learning reward functions that are linear in a set of trajectory features. The features are typically hand-coded, and preference-based learning is used to determine a particular user's relative weighting for each feature. Designing a representative set of features to encode reward is challenging and can result in inaccurate models that fail to model the users' preferences or perform the task properly. In this paper, we present a method to learn both the relative weighting among features as well as additional features that help encode a user's reward function. The additional features are modeled as a neural network that is trained on the data from pairwise comparison queries. We apply our methods to a driving scenario used in previous work and compare the predictive power of our method to that of only hand-coded features. We perform additional analysis to interpret the learned features and examine the optimal trajectories. Our results show that adding an additional learned feature to the reward model enhances both its predictive power and expressiveness, producing unique results for each user.

Submitted 3 March, 2021; originally announced March 2021.

Comments: 8 pages, 8 figures

arXiv:2008.02337 [pdf, other]

doi 10.1109/TSP.2021.3108095

Optimal Data Detection and Signal Estimation in Systems with Input Noise

Authors: Ramina Ghods, Charles Jeon, Arian Maleki, Christoph Studer

Abstract: Practical systems often suffer from hardware impairments that already appear during signal generation. Despite the limiting effect of such input-noise impairments on signal processing systems, they are routinely ignored in the literature. In this paper, we propose an algorithm for data detection and signal estimation, referred to as Approximate Message Passing with Input noise (AMPI), which takes into account input-noise impairments. To demonstrate the efficacy of AMPI, we investigate two applications: data detection in large multiple-input multiple-output (MIMO) wireless systems and sparse signal recovery in compressive sensing. For both applications, we provide precise conditions in the large-system limit for which AMPI achieves optimal performance. We furthermore use simulations to demonstrate that AMPI achieves near-optimal performance at low complexity in realistic, finite-dimensional systems.

Submitted 5 August, 2020; originally announced August 2020.

arXiv:2007.06491 [pdf, other]

doi 10.1109/TSP.2021.3121634

Mismatched Data Detection in Massive MU-MIMO

Authors: Charles Jeon, Arian Maleki, Christoph Studer

Abstract: We investigate mismatched data detection for massive multi-user (MU) multiple-input multiple-output (MIMO) wireless systems in which the prior distribution of the transmit signal used in the data detector differs from the true prior. In order to minimize the performance loss caused by the prior mismatch, we include a tuning stage into the recently proposed large-MIMO approximate message passing (LAMA) algorithm, which enables the development of data detectors with optimal as well as sub-optimal parameter tuning. We show that carefully-selected priors enable the design of simpler and computationally more efficient data detection algorithms compared to LAMA that uses the optimal prior, while achieving near-optimal error-rate performance. In particular, we demonstrate that a hardware-friendly approximation of the exact prior enables the design of low-complexity data detectors that achieve near individually-optimal performance. Furthermore, for Gaussian priors and uniform priors within a hypercube covering the quadrature amplitude modulation (QAM) constellation, our performance analysis recovers classical and recent results on linear and non-linear massive MU-MIMO data detection, respectively.

Submitted 18 October, 2021; v1 submitted 10 July, 2020; originally announced July 2020.

Comments: to appear in the IEEE Transactions on Signal Processing. arXiv admin note: text overlap with arXiv:1605.02324

arXiv:2003.13819 [pdf, other]

Sharp Concentration Results for Heavy-Tailed Distributions

Authors: Milad Bakhshizadeh, Arian Maleki, Victor H. de la Pena

Abstract: We obtain concentration and large deviation for the sums of independent and identically distributed random variables with heavy-tailed distributions. Our concentration results are concerned with random variables whose distributions satisfy $\mathbb{P}(X>t) \leq {\rm e}^{- I(t)}$, where $I: \mathbb{R} \rightarrow \mathbb{R}$ is an increasing function and $I(t)/t \rightarrow \alpha \in [0, \infty)$ as $t \rightarrow \infty$. Our main theorem can not only recover some of the existing results, such as the concentration of the sum of subWeibull random variables, but it can also produce new results for the sum of random variables with heavier tails. We show that the concentration inequalities we obtain are sharp enough to offer large deviation results for the sums of independent random variables as well. Our analyses which are based on standard truncation arguments simplify, unify and generalize the existing results on the concentration and large deviation of heavy-tailed random variables.

Submitted 25 July, 2022; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: 28 pages, 1 figure

arXiv:2003.01770 [pdf, other]

Error bounds in estimating the out-of-sample prediction error using leave-one-out cross validation in high-dimensions

Authors: Kamiar Rahnama Rad, Wenda Zhou, Arian Maleki

Abstract: We study the problem of out-of-sample risk estimation in the high dimensional regime where both the sample size $n$ and number of features $p$ are large, and $n/p$ can be less than one. Extensive empirical evidence confirms the accuracy of leave-one-out cross validation (LO) for out-of-sample risk estimation. Yet, a unifying theoretical evaluation of the accuracy of LO in high-dimensional problems has remained an open problem. This paper aims to fill this gap for penalized regression in the generalized linear family. With minor assumptions about the data generating process, and without any sparsity assumptions on the regression coefficients, our theoretical analysis obtains finite sample upper bounds on the expected squared error of LO in estimating the out-of-sample error. Our bounds show that the error goes to zero as $n,p \rightarrow \infty$, even when the dimension $p$ of the feature vectors is comparable with or greater than the sample size $n$. One technical advantage of the theory is that it can be used to clarify and connect some results from the recent literature on scalable approximate LO.

Submitted 3 March, 2020; originally announced March 2020.

Journal ref: AISTATS 2020

arXiv:2001.04454 [pdf, ps, other]

doi 10.1103/PhysRevD.101.103504

Constraint on the mass of fuzzy dark matter from the rotation curve of the Milky Way

Authors: Alireza Maleki, Shant Baghram, Sohrab Rahvar

Abstract: Fuzzy Dark Matter (FDM) is one of the recent models for dark matter. According to this model, dark matter is made of very light scalar particles with considerable quantum mechanical effects on the galactic scale, which solves many problems of the cold dark matter (CDM). Here we use the observed data from the rotation curve of the Milky Way (MW) Galaxy to compare the results from FDM and CDM models. We show FDM adds a local peak on the rotation curve close to the center of the bulge, where its position and amplitude depend on the mass of FDM particles. By fitting the observed rotation curve with our expectation from FDM, we find that the mass of FDM is $m = 2.5^{+3.6}_{-2.0} \times 10^{-21}$ eV. We note that the local peak of the rotation curve in MW can also be explained in the CDM model with an extra inner bulge model for the MW Galaxy. We conclude that the FDM model explains this peak without a need for extra structure for the bulge.

Submitted 12 May, 2020; v1 submitted 13 January, 2020; originally announced January 2020.

Comments: 7 pages, 4 figures, 2 tables. Final published version

Journal ref: Phys. Rev. D 101, 103504 (2020)

arXiv:1911.00486 [pdf, other]

doi 10.1103/PhysRevD.101.023508

Investigation of two colliding solitonic cores in Fuzzy Dark Matter models

Authors: Alireza Maleki, Shant Baghram, Sohrab Rahvar

Abstract: One of the challenging questions in cosmology is the nature of dark matter particles. Fuzzy Dark Matter (FDM) is one of the candidates which is made of very light ($m_{FDM}\simeq 10^{-22}-10^{-21}$ eV) bosonic particles with no self-interaction. It was introduced to solve the core-cusp problem in galactic halos. In this work, we investigate the observational features from FDM halo collisions. Taking into account the quantum wavelength of the condensed bosonic structure, we determine the interference of the wave function of cores after collision. The fringe formation in the wave function is associated with the density contrast of the dark matter inside the colliding galaxies. The observational signatures of the fringes of the distribution of the dark matter are (i) lensing of the background sources, (ii) accumulation of the baryonic plasma tracking the interference of the FDM potential and (iii) excess in the X-ray emission from dense regions. Finally, we provide prospects for the observations of quantum wave features of FDM in the colliding galaxies. The NGC6240 colliding galaxy at the redshift of $z=0.024$ is a suitable candidate for this study. No signal is detected from the fringes in the Chandra data and, taking into account the angular resolution of the telescope, we place a constraint of $m > 7 \times 10^{-23}$ eV on the mass of FDM particles.

Submitted 1 November, 2019; originally announced November 2019.

Comments: 15 pages, 16 figures. Comments are welcome

Journal ref: Phys. Rev. D 101, 023508 (2020)

arXiv:1910.11849 [pdf, ps, other]

Information Theoretic Limits for Phase Retrieval with Subsampled Haar Sensing Matrices

Authors: Rishabh Dudeja, Junjie Ma, Arian Maleki

Abstract: We study information theoretic limits of recovering an unknown $n$ dimensional, complex signal vector $\mathbf{x}_\star$ with unit norm from $m$ magnitude-only measurements of the form $y_i = |(\mathbf{A} \mathbf{x}_\star)_i|^2, \; i = 1,2 \dots , m$, where $\mathbf{A}$ is the sensing matrix. This is known as the Phase Retrieval problem and models practical imaging systems where measuring the phase of the observations is difficult. Since in a number of applications, the sensing matrix has orthogonal columns, we model the sensing matrix as a subsampled Haar matrix formed by picking $n$ columns of a uniformly random $m \times m$ unitary matrix. We study this problem in the high dimensional asymptotic regime, where $m,n \rightarrow \infty$, while $m/n \rightarrow \delta$ with $\delta$ being a fixed number, and show that if $m < (2-o_n(1))\cdot n$, then any estimator is asymptotically orthogonal to the true signal vector $\mathbf{x}_\star$. This lower bound is sharp since when $m > (2+o_n(1)) \cdot n $, estimators that achieve a non-trivial asymptotic correlation with the signal vector are known from previous works.

Submitted 4 August, 2020; v1 submitted 25 October, 2019; originally announced October 2019.

Comments: Some references added, reviewer comments addressed

arXiv:1909.09345 [pdf, other]

Does SLOPE outperform bridge regression?

Authors: Shuaiwen Wang, Haolei Weng, Arian Maleki

Abstract: A recently proposed SLOPE estimator (arXiv:1407.3824) has been shown to adaptively achieve the minimax $\ell_2$ estimation rate under high-dimensional sparse linear regression models (arXiv:1503.08393). Such minimax optimality holds in the regime where the sparsity level $k$, sample size $n$, and dimension $p$ satisfy $k/p \rightarrow 0$, $k\log p/n \rightarrow 0$. In this paper, we characterize the estimation error of SLOPE under the complementary regime where both $k$ and $n$ scale linearly with $p$, and provide new insights into the performance of SLOPE estimators. We first derive a concentration inequality for the finite sample mean square error (MSE) of SLOPE. The quantity that MSE concentrates around takes a complicated and implicit form. With delicate analysis of the quantity, we prove that among all SLOPE estimators, LASSO is optimal for estimating $k$-sparse parameter vectors that do not have tied non-zero components in the low noise scenario. On the other hand, in the large noise scenario, the family of SLOPE estimators is sub-optimal compared with bridge regression such as the Ridge estimator.

Submitted 22 September, 2021; v1 submitted 20 September, 2019; originally announced September 2019.

Comments: 50 pages, 18 figures

arXiv:1906.10894 [pdf, other]

doi 10.1016/j.physletb.2020.135700

Speed limit of quantum dynamics near the event horizon of black holes

Authors: Yusef Maleki, Alireza Maleki

Abstract: Quantum mechanics imposes a fundamental bound on the minimum time required for a quantum system to evolve between two states of interest. This bound introduces a limit on the speed of the dynamical evolution of the system, known as the quantum speed limit. We show that black holes can drastically affect the speed limit of a two-level fermionic quantum system subjected to open quantum dynamics. As we demonstrate, the quantum speed limit can be enhanced in the vicinity of a black hole's event horizon in the Schwarzschild spacetime.

Submitted 26 June, 2019; originally announced June 2019.

Comments: 6 pages, 5 figures

arXiv:1906.01855 [pdf, ps, other]

Orbital angular momentum transfer via spontaneously generated coherence

Authors: Zahra Amini Sabegh, Mohammad Mohammadi, Mohammad Ali Maleki, Mohammad Mahmoudi

Abstract: We study the orbital angular momentum (OAM) transfer from a weak Laguerre-Gaussian (LG) field to a weak plane-wave in two closed-loop three-level $V$-type atomic systems. In the first scheme, the atomic system has two non-degenerate upper levels whose corresponding transition is excited by a microwave plane-wave. It is analytically shown that the microwave field induces an OAM transfer from an LG field to a generated third field. In the second scheme, we consider a three-level $V$-type atomic system with two near-degenerate excited states and study the effect of the quantum interference due to the spontaneous emission on the OAM transfer. It is found that the spontaneously generated coherence (SGC) induces the OAM transfer from the LG field to the weak planar field, while the OAM transfer does not occur in the absence of the SGC. The suggested models provide a rather simple method for the OAM transfer which can be used in quantum information processing and data storage.

Submitted 5 June, 2019; originally announced June 2019.

arXiv:1904.05279 [pdf]

A Configurable Memristor-based Finite Impulse Response Filter

Authors: Mohammad Hemmati, Vahid Rashtchi, Ahmad Maleki, Siroos Toofan

Abstract: There are two main methods to implement FIR filters: software and hardware. In the software method, an FIR filter can be implemented within the processor by programming; this uses too much memory and is extremely time-consuming, but it gives the design more configurability. In most hardware-based implementations of FIR filters, Analog-to-Digital (A/D) and Digital-to-Analog (D/A) converters are mandatory and increase the cost. The most important advantage of a hardware implementation of an FIR filter is its higher speed compared to its software counterpart. In this work, considering the advantages of software and hardware approaches, a method to implement direct form FIR filters using analog components and memristors is proposed. Not only are the A/D and D/A converters omitted, but the use of memristors also provides configurability. A new circuit is presented to handle negative coefficients of the filter, and memristance values are calculated using a heuristic method in order to achieve better accuracy in setting coefficients. Moreover, an appropriate sample and delay topology is employed which overcomes the limitations of previous research in the implementation of high-order filters. Proper operation and usefulness of the proposed structures are all validated via simulation in Cadence.

Submitted 10 April, 2019; originally announced April 2019.

Comments: 9 pages, 18 figures, 4 tables, and 8 equations, 44 high quality references, brief biographies of the authors

arXiv:1903.02676 [pdf, other]

Analysis of Spectral Methods for Phase Retrieval with Random Orthogonal Matrices

Authors: Rishabh Dudeja, Milad Bakhshizadeh, Junjie Ma, Arian Maleki

Abstract: Phase retrieval refers to algorithmic methods for recovering a signal from its phaseless measurements. Local search algorithms that work directly on the non-convex formulation of the problem have been very popular recently. Due to the nonconvexity of the problem, the success of these local search algorithms depends heavily on their starting points. The most widely used initialization scheme is the spectral method, in which the leading eigenvector of a data-dependent matrix is used as a starting point. Recently, the performance of the spectral initialization was characterized accurately for measurement matrices with independent and identically distributed entries. This paper aims to obtain the same level of knowledge for isotropically random column-orthogonal matrices, which are substantially better models for practical phase retrieval systems. Towards this goal, we consider the asymptotic setting in which the number of measurements $m$, and the dimension of the signal, $n$, diverge to infinity with $m/n = \delta\in(1,\infty)$, and obtain a simple expression for the overlap between the spectral estimator and the true signal vector.

Submitted 4 March, 2020; v1 submitted 6 March, 2019; originally announced March 2019.
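
As a rough illustration of the spectral initialization described in the abstract above (leading eigenvector of a data-dependent matrix), here is a minimal sketch. It is not the authors' estimator: it assumes a real-valued signal, i.i.d. Gaussian measurement vectors rather than the column-orthogonal model studied in the paper, and the plain weighting D = (1/m) * sum_i y_i a_i a_i^T with no preprocessing of y_i. The names spectral_init and overlap_demo are hypothetical.

```python
import math
import random


def spectral_init(A, y, iters=200):
    """Leading eigenvector of D = (1/m) * sum_i y_i a_i a_i^T, via power iteration.

    A is a list of m real measurement vectors of length n; y holds the m
    phaseless (squared-magnitude) measurements. Real i.i.d. Gaussian rows are
    an assumption of this sketch, not the model of the paper.
    """
    m, n = len(A), len(A[0])
    # Build the n x n data-dependent matrix D.
    D = [[sum(y[i] * A[i][j] * A[i][k] for i in range(m)) / m
          for k in range(n)] for j in range(n)]
    v = [1.0 / math.sqrt(n)] * n  # arbitrary unit-norm starting vector
    for _ in range(iters):
        w = [sum(D[j][k] * v[k] for k in range(n)) for j in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def overlap_demo(n=3, m=2000, seed=0):
    """Return |<x_hat, x>| for a random unit-norm signal x (illustrative sizes)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = math.sqrt(sum(c * c for c in x))
    x = [c / scale for c in x]
    A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    y = [sum(a * b for a, b in zip(row, x)) ** 2 for row in A]
    x_hat = spectral_init(A, y)
    # The sign of x is not identifiable from y, so compare up to sign.
    return abs(sum(a * b for a, b in zip(x_hat, x)))
```

With many more measurements than unknowns, as in overlap_demo, the overlap should be close to 1; the regime of interest in the paper is the proportional one, where m/n stays fixed.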

arXiv:1903.02505 [pdf, other]

Spectral Method for Phase Retrieval: an Expectation Propagation Perspective

Authors: Junjie Ma, Rishabh Dudeja, Ji Xu, Arian Maleki, Xiaodong Wang

Abstract: Phase retrieval refers to the problem of recovering a signal $\mathbf{x}_{\star}\in\mathbb{C}^n$ from its phaseless measurements $y_i=|\mathbf{a}_i^{\mathrm{H}}\mathbf{x}_{\star}|$, where $\{\mathbf{a}_i\}_{i=1}^m$ are the measurement vectors. Many popular phase retrieval algorithms are based on the following two-step procedure: (i) initialize the algorithm based on a spectral method, (ii) refine the initial estimate by a local search algorithm (e.g., gradient descent). The quality of the spectral initialization step can have a major impact on the performance of the overall algorithm. In this paper, we focus on the model where the measurement matrix $\mathbf{A}=[\mathbf{a}_1,\ldots,\mathbf{a}_m]^{\mathrm{H}}$ has orthonormal columns, and study the spectral initialization under the asymptotic setting $m,n\to\infty$ with $m/n\to\delta\in(1,\infty)$. We use the expectation propagation framework to characterize the performance of spectral initialization for Haar distributed matrices. Our numerical results confirm that the predictions of the EP method are accurate not only for Haar distributed matrices, but also for realistic Fourier based models (e.g. the coded diffraction model). The main findings of this paper are the following: (1) There exists a threshold on $\delta$ (denoted as $\delta_{\mathrm{weak}}$) below which the spectral method cannot produce a meaningful estimate. We show that $\delta_{\mathrm{weak}}=2$ for the column-orthonormal model. In contrast, previous results by Mondelli and Montanari show that $\delta_{\mathrm{weak}}=1$ for the i.i.d. Gaussian model. (2) The optimal design for the spectral method coincides with that for the i.i.d. Gaussian model, where the latter was recently introduced by Luo, Alghamdi and Lu.

Submitted 9 September, 2020; v1 submitted 6 March, 2019; originally announced March 2019.

Comments: Accepted by IEEE Transactions on Information Theory

arXiv:1902.01753 [pdf, other]

Consistent Risk Estimation in Moderately High-Dimensional Linear Regression

Authors: Ji Xu, Arian Maleki, Kamiar Rahnama Rad, Daniel Hsu

Abstract: Risk estimation is at the core of many learning systems. The importance of this problem has motivated researchers to propose different schemes, such as cross validation, generalized cross validation, and Bootstrap. The theoretical properties of such estimates have been extensively studied in the low-dimensional settings, where the number of predictors $p$ is much smaller than the number of observations $n$. However, a unifying methodology accompanied with a rigorous theory is lacking in high-dimensional settings. This paper studies the problem of risk estimation under the moderately high-dimensional asymptotic setting $n,p \rightarrow \infty$ and $n/p \rightarrow \delta>1$ ($\delta$ is a fixed number), and proves the consistency of three risk estimates that have been successful in numerical studies, i.e., leave-one-out cross validation (LOOCV), approximate leave-one-out (ALO), and approximate message passing (AMP)-based techniques. A cornerstone of our analysis is a bound that we obtain on the discrepancy of the `residuals' obtained from AMP and LOOCV. This connection not only enables us to obtain more refined information on the estimates of AMP, ALO, and LOOCV, but also offers an upper bound on the convergence rate of each estimate.

Submitted 18 January, 2021; v1 submitted 5 February, 2019; originally announced February 2019.

arXiv:1901.10296 [pdf, ps, other]

Minimax Linear Estimation of the Retargeted Mean

Authors: David A. Hirshberg, Arian Maleki, Jose R. Zubizarreta

Abstract: Evaluating treatments received by one population for application to a different target population of scientific interest is a central problem in causal inference from observational studies. We study the minimax linear estimator of the treatment-specific mean outcome on a target population and provide a theoretical basis for inference based on it. In particular, we provide a justification for the common practice of ignoring bias when building confidence intervals with these linear estimators. Focusing on the case that the class of the unknown outcome function is the unit ball of a reproducing kernel Hilbert space, we show that the resulting linear estimator is asymptotically optimal under conditions only marginally stronger than those used with augmented estimators. We establish bounds attesting to the estimator's good finite sample properties. In an extensive simulation study, we observe promising performance of the estimator throughout a wide range of sample sizes, noise levels, and levels of overlap between the covariate distributions of the treated and target populations.

Submitted 26 February, 2021; v1 submitted 10 January, 2019; originally announced January 2019.

Comments: 25 pages, 4 figures

arXiv:1901.08932 [pdf, other]

Theories and Practice of Agent based Modeling: Some practical Implications for Economic Planners

Authors: Hossein Sabzian, Mohammad Ali Shafia, Ali Maleki, Seyeed Mostapha Seyeed Hashemi, Ali Baghaei, Hossein Gharib

Abstract: Nowadays, we are surrounded by a large number of complex phenomena ranging from rumor spreading and social norms formation to the rise of new economic trends and the disruption of traditional businesses. To deal with such phenomena, the Complex Adaptive System (CAS) framework has been found very influential among social scientists, especially economists. As the most powerful methodology of CAS modeling, agent-based modeling (ABM) has gained a growing application among academicians and practitioners. ABMs show how simple behavioral rules of agents and local interactions among them at micro-scale can generate surprisingly complex patterns at macro-scale. Despite a growing number of ABM publications, those researchers unfamiliar with this methodology have to study a number of works to understand (1) the why and what of ABMs and (2) the ways they are rigorously developed. Therefore, the major focus of this paper is to help social sciences researchers, especially economists, get a big picture of ABMs and know how to develop them both systematically and rigorously.

Submitted 23 January, 2019; originally announced January 2019.

Comments: 54 Pages, 11 Tables, 16 Figures. arXiv admin note: substantial text overlap with arXiv:1804.09284

arXiv:1811.03881 [pdf, ps, other]

Microwave-induced orbital angular momentum transfer

Authors: Zahra Amini Sabegh, Mohammad Ali Maleki, Mohammad Mahmoudi

Abstract: The microwave-induced orbital angular momentum (OAM) transfer from a Laguerre-Gaussian (LG) beam to a weak plane-wave is studied in a closed-loop four-level ladder-type atomic system. The analytical investigation shows that the generated fourth field is an LG beam with the same OAM as the applied LG field. Moreover, the microwave-induced subluminal generated pulse can be switched to a superluminal one only by changing the relative phase of the applied fields. It is shown that the OAM transfer in the subluminal regime is accompanied by slight absorption; however, it switches to slight gain in the superluminal regime. The transfer of light's OAM and control of the group velocity of the generated pulse can prepare a high-dimensional Hilbert space which has a major role in quantum communication and information processing.

Submitted 9 November, 2018; originally announced November 2018.

arXiv:1811.01917 [pdf, other]

Optimal Data Detection in Large MIMO

Authors: Charles Jeon, Ramina Ghods, Arian Maleki, Christoph Studer

Abstract: Large multiple-input multiple-output (MIMO) appears in massive multi-user MIMO and randomly-spread code-division multiple access (CDMA)-based wireless systems. In order to cope with the excessively high complexity of optimal data detection in such systems, a variety of efficient yet sub-optimal algorithms have been proposed in the past. In this paper, we propose a data detection algorithm that is computationally efficient and optimal in a sense that it is able to achieve the same error-rate performance as the individually optimal (IO) data detector under certain assumptions on the MIMO system matrix and constellation alphabet. Our algorithm, which we refer to as LAMA (short for large MIMO AMP), builds on complex-valued Bayesian approximate message passing (AMP), which enables an exact analytical characterization of the performance and complexity in the large-system limit via the state-evolution framework. We derive optimality conditions for LAMA and investigate performance/complexity trade-offs. As a byproduct of our analysis, we recover classical results of IO data detection for randomly-spread CDMA. We furthermore provide practical ways for LAMA to approach the theoretical performance limits in realistic, finite-dimensional systems at low computational complexity.

Submitted 5 November, 2018; originally announced November 2018.

arXiv:1810.11344 [pdf, other]

Benefits of over-parameterization with EM

Authors: Ji Xu, Daniel Hsu, Arian Maleki

Abstract: Expectation Maximization (EM) is among the most popular algorithms for maximum likelihood estimation, but it is generally only guaranteed to find stationary points of the log-likelihood objective. The goal of this article is to present theoretical and empirical evidence that over-parameterization can help EM avoid spurious local optima in the log-likelihood. We consider the problem of estimating the mean vectors of a Gaussian mixture model in a scenario where the mixing weights are known. Our study shows that the global behavior of EM, when one uses an over-parameterized model in which the mixing weights are treated as unknown, is better than that when one uses the (correct) model with the mixing weights fixed to the known values. For symmetric Gaussian mixtures with two components, we prove that introducing the (statistically redundant) weight parameters enables EM to find the global maximizer of the log-likelihood starting from almost any initial mean parameters, whereas EM without this over-parameterization may very often fail. For other Gaussian mixtures, we provide empirical evidence that shows similar behavior. Our results corroborate the value of over-parameterization in solving non-convex optimization problems, previously observed in other domains.

Submitted 26 October, 2018; originally announced October 2018.

Comments: Accepted at NIPS 2018

arXiv:1810.05813 [pdf, ps, other]

The absolutely Koszul and Backelin-Roos properties for spaces of quadrics of small codimension

Authors: Rasoul Ahangari Maleki, Liana M. Şega

Abstract: Let $\kk$ be a field, $R$ a standard graded quadratic $\kk$-algebra with $\dim_{\kk}R_2\le 3$, and let $\ov\kk$ denote an algebraic closure of $\kk$. We construct a graded surjective Golod homomorphism $\varphi \colon P\to R\otimes_{\kk}\ov{\kk}$ such that $P$ is a complete intersection of codimension at most $3$. Furthermore, we show that $R$ is absolutely Koszul (that is, every finitely generated $R$-module has finite linearity defect) if and only if $R$ is Koszul if and only if $R$ is not a trivial fiber extension of a standard graded $\kk$-algebra with Hilbert series $(1+2t-2t^3)(1-t)^{-1}$. In particular, we recover earlier results on the Koszul property of Backelin, Conca and D'Alì.

Submitted 19 January, 2020; v1 submitted 13 October, 2018; originally announced October 2018.

Comments: 38 pages, revised version, To appear in Journal of Algebra

MSC Class: 13D02

arXiv:1810.02716 [pdf, other]

Approximate Leave-One-Out for High-Dimensional Non-Differentiable Learning Problems

Authors: Shuaiwen Wang, Wenda Zhou, Arian Maleki, Haihao Lu, Vahab Mirrokni

Abstract: Consider the following class of learning schemes: \begin{equation} \label{eq:main-problem1} \hat{\boldsymbol{\beta}} := \underset{\boldsymbol{\beta} \in \mathcal{C}}{\arg\min} \;\sum_{j=1}^n \ell(\boldsymbol{x}_j^\top\boldsymbol{\beta}; y_j) + \lambda R(\boldsymbol{\beta}), \qquad \qquad \qquad (1) \end{equation} where $\boldsymbol{x}_i \in \mathbb{R}^p$ and $y_i \in \mathbb{R}$ denote the $i^{\rm th}$ feature and response variable respectively. Let $\ell$ and $R$ be the convex loss function and regularizer, $\boldsymbol{\beta}$ denote the unknown weights, and $\lambda$ be a regularization parameter. $\mathcal{C} \subset \mathbb{R}^{p}$ is a closed convex set. Finding the optimal choice of $\lambda$ is a challenging problem in high-dimensional regimes where both $n$ and $p$ are large. We propose three frameworks to obtain a computationally efficient approximation of the leave-one-out cross validation (LOOCV) risk for nonsmooth losses and regularizers. Our three frameworks are based on the primal, dual, and proximal formulations of (1). Each framework shows its strength in certain types of problems. We prove the equivalence of the three approaches under smoothness conditions. This equivalence enables us to justify the accuracy of the three methods under such conditions. We use our approaches to obtain a risk estimate for several standard problems, including generalized LASSO, nuclear norm regularization, and support vector machines. We empirically demonstrate the effectiveness of our results for non-differentiable cases.

Submitted 4 October, 2018; originally announced October 2018.

Comments: 63 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1807.02694

arXiv:1807.03542 [pdf]

A strategic framework for identifying the critical factors of 4G technology diffusion in I.R. Iran - A Fuzzy DEMATEL approach

Authors: Hossein Sabzian, Hossein Gharib, Seyyed Mostafa Seyyed Hashemi, Ali Maleki

Abstract: As the most prominent representative of 4G, Long Term Evolution (LTE) technology has become a focal point for mobile network operators all over the world. However, although main Iranian operators like MCI and Irancell have invested heavily in the deployment of this technology, its diffusion has been very slow, with a penetration rate of 0.06 at the end of spring 2017. If this rate doesn't increase, it will yield some negative unintended consequences for telecom operators such as (I) failure to provide a large number of high quality services, (II) inability to compete with OTT technologies, (III) loss of many revenue opportunities, (IV) prolongation of the payback period and (V) lack of technological integrability with fifth generation networks (5G) and loss of many IoT opportunities. Through discussing the literature of technology adoption and diffusion both generally and specifically, identifying the major limitations of these studies and establishing a comprehensive factor set based on four major groups of (I) mobile handset and operators-related factors, (II) subscribers-related biological factors, (III) subscribers-related perceptual factors and (IV) subscribers-related contextual factors, a novel fuzzy DEMATEL model has been developed by which all ICT policy makers can not only get a clear knowledge of factors influencing technology adoption but also know the critical success factors (CSFs) influencing Iranians' mindsets towards LTE adoption. Therefore, they can make effective and actionable policies to scale up the diffusion of LTE or other ICT-related technologies throughout the society.

Submitted 10 July, 2018; originally announced July 2018.

Comments: 20 pages, 5 figures, 7 tables, 14 equations

arXiv:1807.02694 [pdf, other]

Approximate Leave-One-Out for Fast Parameter Tuning in High Dimensions

Authors: Shuaiwen Wang, Wenda Zhou, Haihao Lu, Arian Maleki, Vahab Mirrokni

Abstract: Consider the following class of learning schemes: $$\hat{\boldsymbol{\beta}} := \arg\min_{\boldsymbol{\beta}}\;\sum_{j=1}^n \ell(\boldsymbol{x}_j^\top\boldsymbol{\beta}; y_j) + \lambda R(\boldsymbol{\beta}),\qquad\qquad (1) $$ where $\boldsymbol{x}_i \in \mathbb{R}^p$ and $y_i \in \mathbb{R}$ denote the $i^{\text{th}}$ feature and response variable respectively. Let $\ell$ and $R$ be the loss function and regularizer, $\boldsymbol{\beta}$ denote the unknown weights, and $\lambda$ be a regularization parameter. Finding the optimal choice of $\lambda$ is a challenging problem in high-dimensional regimes where both $n$ and $p$ are large. We propose two frameworks to obtain a computationally efficient approximation ALO of the leave-one-out cross validation (LOOCV) risk for nonsmooth losses and regularizers. Our two frameworks are based on the primal and dual formulations of (1). We prove the equivalence of the two approaches under smoothness conditions. This equivalence enables us to justify the accuracy of both methods under such conditions. We use our approaches to obtain a risk estimate for several standard problems, including generalized LASSO, nuclear norm regularization, and support vector machines. We empirically demonstrate the effectiveness of our results for non-differentiable cases.

Submitted 7 July, 2018; originally announced July 2018.

Comments: The paper is published on ICML 2018
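
To make concrete the idea in the abstract above, of approximating leave-one-out risk without refitting, the following sketch covers only the simplest smooth instance of (1): squared loss with a ridge penalty and no intercept. In that special case the leave-one-out residual has a well-known closed form, (y_i - x_i^T beta) / (1 - h_i) with h_i = x_i^T (X^T X + lambda I)^{-1} x_i, so the shortcut is exact rather than approximate. This is not the ALO construction of the paper, which targets nonsmooth losses and regularizers; all function names are hypothetical and the small linear solver is included only to keep the sketch self-contained.

```python
import random


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def ridge_fit(X, y, lam):
    """Return (G, beta) with G = X^T X + lam * I and beta = G^{-1} X^T y."""
    p = len(X[0])
    G = [[sum(row[a] * row[b] for row in X) + (lam if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return G, solve(G, rhs)


def loo_shortcut(X, y, lam):
    """Leave-one-out squared error from a single fit (exact for ridge)."""
    G, beta = ridge_fit(X, y, lam)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sum(a * b for a, b in zip(xi, solve(G, xi)))  # leverage h_i
        resid = yi - sum(a * b for a, b in zip(xi, beta))
        total += (resid / (1.0 - h)) ** 2
    return total / len(X)


def loo_brute_force(X, y, lam):
    """Leave-one-out squared error by refitting n times, for comparison."""
    total = 0.0
    for i in range(len(X)):
        _, beta = ridge_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        pred = sum(a * b for a, b in zip(X[i], beta))
        total += (y[i] - pred) ** 2
    return total / len(X)
```

The shortcut needs one fit plus n small solves, whereas the brute-force version refits n times; for nonsmooth problems no such exact identity is available, which is the gap the abstract's frameworks address.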
