Showing 1–50 of 79 results for author: Liu, K

Searching in archive stat.
  1. arXiv:2406.06903  [pdf, ps, other]

    stat.ML cs.LG math.ST

    On the Limitation of Kernel Dependence Maximization for Feature Selection

    Authors: Keli Liu, Feng Ruan

    Abstract: A simple and intuitive method for feature selection consists of choosing the feature subset that maximizes a nonparametric measure of dependence between the response and the features. A popular proposal from the literature uses the Hilbert-Schmidt Independence Criterion (HSIC) as the nonparametric dependence measure. The rationale behind this approach to feature selection is that important feature…

    Submitted 10 June, 2024; originally announced June 2024.

  2. arXiv:2404.19145  [pdf, other]

    stat.ME cs.LG econ.EM math.ST stat.ML

    Orthogonal Bootstrap: Efficient Simulation of Input Uncertainty

    Authors: Kaizhao Liu, Jose Blanchet, Lexing Ying, Yiping Lu

    Abstract: Bootstrap is a popular methodology for simulating input uncertainty. However, it can be computationally expensive when the number of samples is large. We propose a new approach called \textbf{Orthogonal Bootstrap} that reduces the number of required Monte Carlo replications. We decompose the target being simulated into two parts: the \textit{non-orthogonal part} which has a closed-form result kno…

    Submitted 30 April, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  3. arXiv:2404.15207  [pdf, other]

    cs.CE cond-mat.mtrl-sci cs.LG stat.AP

    Simulation-Free Determination of Microstructure Representative Volume Element Size via Fisher Scores

    Authors: Wei Liu, Satyajit Mojumder, Wing Kam Liu, Wei Chen, Daniel W. Apley

    Abstract: A representative volume element (RVE) is a reasonably small unit of microstructure that can be simulated to obtain the same effective properties as the entire microstructure sample. Finite element (FE) simulation of RVEs, as opposed to much larger samples, saves computational expense, especially in multiscale modeling. Therefore, it is desirable to have a framework that determines RVE size prior t…

    Submitted 7 April, 2024; originally announced April 2024.

    Journal ref: APL Mach. Learn. 2(2): 026101 (2024)

  4. arXiv:2404.10004  [pdf]

    cs.LG physics.soc-ph stat.AP

    A Strategy Transfer and Decision Support Approach for Epidemic Control in Experience Shortage Scenarios

    Authors: X. Xiao, P. Chen, X. Cao, K. Liu, L. Deng, D. Zhao, Z. Chen, Q. Deng, F. Yu, H. Zhang

    Abstract: Epidemic outbreaks can cause critical health concerns and severe global economic crises. For countries or regions with new infectious disease outbreaks, it is essential to generate preventive strategies by learning lessons from others with similar risk profiles. A Strategy Transfer and Decision Support Approach (STDSA) is proposed based on the profile similarity evaluation. There are four steps in…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 20 pages, 9 figures

  5. arXiv:2404.00220  [pdf, other]

    stat.ML cs.LG

    Partially-Observable Sequential Change-Point Detection for Autocorrelated Data via Upper Confidence Region

    Authors: Haijie Xu, Xiaochen Xian, Chen Zhang, Kaibo Liu

    Abstract: Sequential change point detection for multivariate autocorrelated data is a very common problem in practice. However, when the sensing resources are limited, only a subset of variables from the multivariate system can be observed at each sensing time point. This raises the problem of partially observable multi-sensor sequential change point detection. For it, we propose a detection scheme called a…

    Submitted 29 March, 2024; originally announced April 2024.

  6. arXiv:2403.07185  [pdf, other]

    cs.LG stat.ML

    Uncertainty in Graph Neural Networks: A Survey

    Authors: Fangxin Wang, Yuqing Liu, Kay Liu, Yibo Wang, Sourav Medya, Philip S. Yu

    Abstract: Graph Neural Networks (GNNs) have been extensively used in various real-world applications. However, the predictive uncertainty of GNNs stemming from diverse sources such as inherent randomness in data and model training errors can lead to unstable and erroneous predictions. Therefore, identifying, quantifying, and utilizing uncertainty are essential to enhance the performance of the model for the…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 13 main pages, 3 figures, 1 table. Under review

  7. arXiv:2402.10062  [pdf, other]

    cs.LG stat.ML

    Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection

    Authors: Chao Chen, Zhihang Fu, Kai Liu, Ze Chen, Mingyuan Tao, Jieping Ye

    Abstract: For a machine learning model deployed in real world scenarios, the ability of detecting out-of-distribution (OOD) samples is indispensable and challenging. Most existing OOD detection methods focused on exploring advanced training skills or training-free tricks to prevent the model from yielding overconfident confidence score for unknown samples. The training-based methods require expensive traini…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted by NeurIPS 2023. 19 pages

    Journal ref: NeurIPS 2023

  8. arXiv:2402.05395  [pdf, other]

    stat.ME

    Efficient Estimation for Functional Accelerated Failure Time Model

    Authors: Changyu Liu, Wen Su, Kin-Yat Liu, Guosheng Yin, Xingqiu Zhao

    Abstract: We propose a functional accelerated failure time model to characterize effects of both functional and scalar covariates on the time to event of interest, and provide regularity conditions to guarantee model identifiability. For efficient estimation of model parameters, we develop a sieve maximum likelihood approach where parametric and nonparametric coefficients are bundled with an unknown baselin…

    Submitted 7 February, 2024; originally announced February 2024.

  9. arXiv:2312.00359  [pdf, other]

    cs.LG stat.ML

    Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training

    Authors: Yefan Zhou, Tianyu Pang, Keqin Liu, Charles H. Martin, Michael W. Mahoney, Yaoqing Yang

    Abstract: Regularization in modern machine learning is crucial, and it can take various forms in algorithmic design: training set, model family, error function, regularization terms, and optimizations. In particular, the learning rate, which can be interpreted as a temperature-like parameter within the statistical mechanics of learning, plays a crucial role in neural network training. Indeed, many widely ad…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: NeurIPS 2023 Spotlight, first two authors contributed equally

  10. arXiv:2311.15221  [pdf, other]

    cs.IT cs.LG eess.SP math.OC math.ST stat.ML

    The Local Landscape of Phase Retrieval Under Limited Samples

    Authors: Kaizhao Liu, Zihao Wang, Lei Wu

    Abstract: In this paper, we provide a fine-grained analysis of the local landscape of phase retrieval under the regime with limited samples. Our aim is to ascertain the minimal sample size necessary to guarantee a benign local landscape surrounding global minima in high dimensions. Let $n$ and $d$ denote the sample size and input dimension, respectively. We first explore the local convexity and establish th…

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: 41 pages

  11. arXiv:2310.11736  [pdf, ps, other]

    math.ST math.OC stat.ML

    Kernel Learning in Ridge Regression "Automatically" Yields Exact Low Rank Solution

    Authors: Yunlu Chen, Yang Li, Keli Liu, Feng Ruan

    Abstract: We consider kernels of the form $(x,x') \mapsto φ(\|x-x'\|^2_Σ)$ parametrized by $Σ$. For such kernels, we study a variant of the kernel ridge regression problem which simultaneously optimizes the prediction function and the parameter $Σ$ of the reproducing kernel Hilbert space. The eigenspace of the $Σ$ learned from this kernel ridge regression problem can inform us which directions in covariate…

    Submitted 27 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Add code links and correct a figure

  12. arXiv:2309.05925  [pdf, other]

    cs.LG cs.AI stat.ML

    On Regularized Sparse Logistic Regression

    Authors: Mengyuan Zhang, Kai Liu

    Abstract: Sparse logistic regression performs classification and feature selection simultaneously. Although many studies have been done to solve $\ell_1$-regularized logistic regression, there is no equivalently abundant work on solving sparse logistic regression with nonconvex regularization term. In this paper, we propose a unified framework to solve $\ell_1$-regularized logistic regression, which can be na…

    Submitted 11 October, 2023; v1 submitted 11 September, 2023; originally announced September 2023.

    Comments: Accepted to ICDM2023

  13. arXiv:2307.03034  [pdf, ps, other]

    stat.ML cs.LG

    PCL-Indexability and Whittle Index for Restless Bandits with General Observation Models

    Authors: Keqin Liu, Chengzhong Zhang

    Abstract: In this paper, we consider a general observation model for restless multi-armed bandit problems. The operation of the player needs to be based on certain feedback mechanism that is error-prone due to resource constraints or environmental or intrinsic noises. By establishing a general probabilistic model for dynamics of feedback/observation, we formulate the problem as a restless bandit with a coun…

    Submitted 3 July, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

  14. arXiv:2305.01034  [pdf, other]

    cs.LG cs.AI stat.ML

    Model-agnostic Measure of Generalization Difficulty

    Authors: Akhilan Boopathy, Kevin Liu, Jaedong Hwang, Shu Ge, Asaad Mohammedsaleh, Ila Fiete

    Abstract: The measure of a machine learning algorithm is the difficulty of the tasks it can perform, and sufficiently difficult tasks are critical drivers of strong machine learning models. However, quantifying the generalization difficulty of machine learning benchmarks has remained challenging. We propose what is to our knowledge the first model-agnostic measure of the inherent generalization difficulty o…

    Submitted 2 June, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

    Comments: Published at ICML 2023, 28 pages, 6 figures

  15. arXiv:2303.00288   

    stat.AP

    The Race of mRNA therapy: Evidence from Patent Landscape

    Authors: Jianxiong Ren, Xiaoming Zhang, Xingyong Si, Xiangjun Kong, **yu Cong, **** Wang, Xiang Li, Qianru Zhang, Peifen Yao, Mengyao Li, Yuanqi Cai, Zhaocai Sun, Kunmeng Liu, Benzheng Wei

    Abstract: mRNA therapy is gaining worldwide attention as an emerging therapeutic approach. The widespread use of mRNA vaccines during the COVID-19 outbreak has demonstrated the potential of mRNA therapy. As mRNA-based drugs have expanded and their indications have broadened, more patents for mRNA innovations have emerged. The global patent landscape for mRNA therapy has not yet been analyzed, indicating a r…

    Submitted 15 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: I have received requests from co-authors and funding agencies to withdraw the manuscript

  16. arXiv:2212.03515  [pdf, other]

    eess.SP cs.IT cs.LG stat.ML

    FPGA Implementation of Multi-Layer Machine Learning Equalizer with On-Chip Training

    Authors: Keren Liu, Erik Börjeson, Christian Häger, Per Larsson-Edefors

    Abstract: We design and implement an adaptive machine learning equalizer that alternates multiple linear and nonlinear computational layers on an FPGA. On-chip training via gradient backpropagation is shown to allow for real-time adaptation to time-varying channel impairments.

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: To be presented at the 2023 Optical Fiber Communication Conference (OFC)

  17. arXiv:2210.02192  [pdf, other]

    cs.LG cs.AI cs.IT math.OC stat.ML

    Are All Losses Created Equal: A Neural Collapse Perspective

    Authors: Jinxin Zhou, Chong You, Xiao Li, Kangning Liu, Sheng Liu, Qing Qu, Zhihui Zhu

    Abstract: While cross entropy (CE) is the most commonly used loss to train deep neural networks for classification tasks, many alternative losses have been developed to obtain better empirical performance. Among them, which one is the best to use is still a mystery, because there seem to be multiple factors affecting the answer, such as properties of the dataset, the choice of network architecture, and so o…

    Submitted 8 October, 2022; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 32 pages, 10 figures, NeurIPS 2022

  18. arXiv:2209.05356  [pdf, ps, other]

    stat.ME math.ST

    The E-Bayesian Estimation and its E-MSE of Lomax distribution under different loss functions

    Authors: Kaiwei Liu, Yuxuan Zhang

    Abstract: This paper studies the E-Bayesian (expectation of the Bayesian estimation) estimation of the parameter of Lomax distribution based on different loss functions. Under different loss functions, we calculate the Bayesian estimation of the parameter and then calculate the expectation of the estimated value to get the E-Bayesian estimation. To measure the estimated error, the E-MSE (expected mean squar…

    Submitted 2 September, 2022; originally announced September 2022.

  19. arXiv:2202.08695  [pdf, other]

    cs.DL stat.AP

    Article's Scientific Prestige: measuring the impact of individual articles in the Web of Science

    Authors: Ying Chen, Thorsten Koch, Nazgul Zakiyeva, Kailiang Liu, Zhitong Xu, Chun-houh Chen, Junji Nakano, Keisuke Honda

    Abstract: We performed a citation analysis on the Web of Science publications consisting of more than 63 million articles and 1.45 billion citations on 254 subjects from 1981 to 2020. We proposed the Article's Scientific Prestige (ASP) metric and compared this metric to number of citations (#Cit) and journal grade in measuring the scientific impact of individual articles in the large-scale hierarchical and…

    Submitted 17 February, 2022; originally announced February 2022.

  20. arXiv:2111.09473  [pdf]

    stat.AP

    Number of New Top 2% Researchers from China and USA Over Time

    Authors: Lei Liu, Song Yao, Kevin Liu

    Abstract: In this paper we compare the numbers of new top 2% researchers from China and USA annually since 1980. We find that the log ratio of the numbers decreases almost linearly over time. As early as 2009, the total number of new top 2% researchers across all subfields from China exceeds that of USA. In particular, this trend is more striking in many subfields, e.g., Engineering, Chemistry, and Enabling…

    Submitted 17 November, 2021; originally announced November 2021.

  21. arXiv:2110.05852  [pdf, other]

    stat.ML cs.LG math.ST

    On the Self-Penalization Phenomenon in Feature Selection

    Authors: Michael I. Jordan, Keli Liu, Feng Ruan

    Abstract: We describe an implicit sparsity-inducing mechanism based on minimization over a family of kernels: \begin{equation*} \min_{β, f}~\widehat{\mathbb{E}}[L(Y, f(β^{1/q} \odot X))] + λ_n \|f\|_{\mathcal{H}_q}^2~~\text{subject to}~~β\ge 0, \end{equation*} where $L$ is the loss, $\odot$ is coordinate-wise multiplication and $\mathcal{H}_q$ is the reproducing kernel Hilbert space based on the kernel…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: 54 pages

  22. arXiv:2106.09387  [pdf, other]

    math.ST stat.ME stat.ML

    Taming Nonconvexity in Kernel Feature Selection -- Favorable Properties of the Laplace Kernel

    Authors: Feng Ruan, Keli Liu, Michael I. Jordan

    Abstract: Kernel-based feature selection is an important tool in nonparametric statistics. Despite many practical applications of kernel-based feature selection, there is little statistical theory available to support the method. A core challenge is that the objective functions of the optimization problems used to define kernel-based feature selection are nonconvex. The literature has only studied the statistical…

    Submitted 25 May, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: 26 pages main text; 74 pages total; appendix rewritten (typo fixed; proof structure reorganized)

  23. arXiv:2105.09557  [pdf, other]

    cs.LG cond-mat.dis-nn cond-mat.stat-mech stat.ML

    Power-law escape rate of SGD

    Authors: Takashi Mori, Liu Ziyin, Kangqiao Liu, Masahito Ueda

    Abstract: Stochastic gradient descent (SGD) undergoes complicated multiplicative noise for the mean-square loss. We use this property of SGD noise to derive a stochastic differential equation (SDE) with simpler additive noise by performing a random time change. Using this formalism, we show that the log loss barrier $Δ\log L=\log[L(θ^s)/L(θ^*)]$ between a local minimum $θ^*$ and a saddle $θ^s$ determines th…

    Submitted 29 January, 2022; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: 17+8 pages

  24. arXiv:2102.05375  [pdf, other]

    cs.LG stat.ML

    Strength of Minibatch Noise in SGD

    Authors: Liu Ziyin, Kangqiao Liu, Takashi Mori, Masahito Ueda

    Abstract: The noise in stochastic gradient descent (SGD), caused by minibatch sampling, is poorly understood despite its practical importance in deep learning. This work presents the first systematic study of the SGD noise and fluctuations close to a local minimum. We first analyze the SGD noise in linear regression in detail and then derive a general formula for approximating SGD noise in different types o…

    Submitted 8 March, 2022; v1 submitted 10 February, 2021; originally announced February 2021.

    Comments: ICLR 2022 spotlight

  25. arXiv:2012.03636  [pdf, other]

    stat.ML cs.LG

    Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent

    Authors: Kangqiao Liu, Liu Ziyin, Masahito Ueda

    Abstract: In the vanishing learning rate regime, stochastic gradient descent (SGD) is now relatively well understood. In this work, we propose to study the basic properties of SGD and its variants in the non-vanishing learning rate regime. The focus is on deriving exactly solvable results and discussing their implications. The main contributions of this work are to derive the stationary distribution for dis…

    Submitted 11 June, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: Camera-ready version for the Thirty-eighth International Conference on Machine Learning (ICML 2021). 12 + 14 pages, 6 + 3 figures, 1 + 0 table. *First two authors contributed equally

  26. arXiv:2011.12215  [pdf, other]

    stat.ME cs.LG math.ST

    A Self-Penalizing Objective Function for Scalable Interaction Detection

    Authors: Keli Liu, Feng Ruan

    Abstract: We tackle the problem of nonparametric variable selection with a focus on discovering interactions between variables. With $p$ variables there are $O(p^s)$ possible order-$s$ interactions making exhaustive search infeasible. It is nonetheless possible to identify the variables involved in interactions with only linear computation cost, $O(p)$. The trick is to maximize a class of parametrized nonpa…

    Submitted 12 December, 2020; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: 34 pages; the Appendix can be found on the authors' personal websites (the url is in the pdf)

  27. arXiv:2010.02506  [pdf, other]

    cs.LG stat.ML

    Interactive Reinforcement Learning for Feature Selection with Decision Tree in the Loop

    Authors: Wei Fan, Kunpeng Liu, Hao Liu, Yong Ge, Hui Xiong, Yanjie Fu

    Abstract: We study the problem of balancing effectiveness and efficiency in automated feature selection. After exploring many feature selection methods, we observe a computational dilemma: 1) traditional feature selection is mostly efficient, but difficult to identify the best subset; 2) the emerging reinforced feature selection automatically navigates to the best subset, but is usually inefficient. Can we…

    Submitted 2 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:2008.12001

  28. arXiv:2009.10337  [pdf, other]

    cs.LG cs.RO eess.SY stat.ML

    Learning Task-Agnostic Action Spaces for Movement Optimization

    Authors: Amin Babadi, Michiel van de Panne, C. Karen Liu, Perttu Hämäläinen

    Abstract: We propose a novel method for exploring the dynamics of physically based animated characters, and learning a task-agnostic action space that makes movement optimization easier. Like several previous papers, we parameterize actions as target states, and learn a short-horizon goal-conditioned low-level control policy that drives the agent's state towards the targets. Our novel contribution is that w…

    Submitted 23 July, 2021; v1 submitted 22 September, 2020; originally announced September 2020.

    Comments: Accepted as a regular paper by IEEE Transactions on Visualization and Computer Graphics (TVCG) in July 2021

  29. arXiv:2009.09283  [pdf, other]

    cs.CV cs.AI cs.CR cs.LG stat.ML

    Subverting Privacy-Preserving GANs: Hiding Secrets in Sanitized Images

    Authors: Kang Liu, Benjamin Tan, Siddharth Garg

    Abstract: Unprecedented data collection and sharing have exacerbated privacy concerns and led to increasing interest in privacy-preserving tools that remove sensitive attributes from images while maintaining useful information for other tasks. Currently, state-of-the-art approaches use privacy-preserving generative adversarial networks (PP-GANs) for this purpose, for instance, to enable reliable facial expr…

    Submitted 19 September, 2020; originally announced September 2020.

  30. arXiv:2009.09230  [pdf, other]

    cs.LG stat.ML

    Simplifying Reinforced Feature Selection via Restructured Choice Strategy of Single Agent

    Authors: Xiaosa Zhao, Kunpeng Liu, Wei Fan, Lu Jiang, Xiaowei Zhao, Minghao Yin, Yanjie Fu

    Abstract: Feature selection aims to select a subset of features to optimize the performances of downstream predictive tasks. Recently, multi-agent reinforced feature selection (MARFS) has been introduced to automate feature selection, by creating agents for each feature to select or deselect corresponding features. Although MARFS enjoys the automation of the selection process, MARFS suffers from not just th…

    Submitted 19 September, 2020; originally announced September 2020.

  31. arXiv:2008.12001  [pdf, other]

    cs.LG cs.AI stat.ML

    AutoFS: Automated Feature Selection via Diversity-aware Interactive Reinforcement Learning

    Authors: Wei Fan, Kunpeng Liu, Hao Liu, Pengyang Wang, Yong Ge, Yanjie Fu

    Abstract: In this paper, we study the problem of balancing effectiveness and efficiency in automated feature selection. Feature selection is a fundamental intelligence for machine learning and predictive analysis. After exploring many feature selection methods, we observe a computational dilemma: 1) traditional feature selection methods (e.g., mRMR) are mostly efficient, but difficult to identify the best s…

    Submitted 16 September, 2020; v1 submitted 27 August, 2020; originally announced August 2020.

    Comments: Accepted by ICDM 2020. In this version, we revised some typos or mistakes for camera-ready

  32. arXiv:2008.03392  [pdf, other]

    stat.ME cs.LG

    Grouping effects of sparse CCA models in variable selection

    Authors: Kefei Liu, Qi Long, Li Shen

    Abstract: The sparse canonical correlation analysis (SCCA) is a bi-multivariate association model that finds sparse linear combinations of two sets of variables that are maximally correlated with each other. In addition to the standard SCCA model, a simplified SCCA criterion which maximizes the cross-covariance between a pair of canonical variables instead of their cross-correlation, is widely used in the l…

    Submitted 7 August, 2020; originally announced August 2020.

  33. arXiv:2007.03383  [pdf, other]

    cs.LG cs.IR stat.ML

    RGCF: Refined Graph Convolution Collaborative Filtering with concise and expressive embedding

    Authors: Kang Liu, Feng Xue, Richang Hong

    Abstract: Graph Convolution Network (GCN) has attracted significant attention and become the most popular method for learning graph representations. In recent years, many efforts have been focused on integrating GCN into the recommender tasks and have made remarkable progress. At its core is to explicitly capture high-order connectivities between the nodes in user-item bipartite graph. However, we theoretic…

    Submitted 11 July, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

  34. Hybrid Spatio-Temporal Graph Convolutional Network: Improving Traffic Prediction with Navigation Data

    Authors: Rui Dai, Shenkun Xu, Qian Gu, Chenguang Ji, Kaikui Liu

    Abstract: Traffic forecasting has recently attracted increasing interest due to the popularity of online navigation services, ridesharing and smart city projects. Owing to the non-stationary nature of road traffic, forecasting accuracy is fundamentally limited by the lack of contextual information. To address this issue, we propose the Hybrid Spatio-Temporal Graph Convolutional Network (H-STGCN), which is a…

    Submitted 22 June, 2020; originally announced June 2020.

  35. arXiv:2004.12492  [pdf, other]

    cs.LG cs.CR stat.ML

    Bias Busters: Robustifying DL-based Lithographic Hotspot Detectors Against Backdooring Attacks

    Authors: Kang Liu, Benjamin Tan, Gaurav Rajavendra Reddy, Siddharth Garg, Yiorgos Makris, Ramesh Karri

    Abstract: Deep learning (DL) offers potential improvements throughout the CAD tool-flow, one promising application being lithographic hotspot detection. However, DL techniques have been shown to be especially vulnerable to inference and training time adversarial attacks. Recent work has demonstrated that a small fraction of malicious physical designers can stealthily "backdoor" a DL-based hotspot detector d…

    Submitted 26 April, 2020; originally announced April 2020.

  36. arXiv:2003.10484  [pdf, other]

    stat.AP

    Large-P Variable Selection in Two-Stage Models

    Authors: Haim Bar, Kangyan Liu

    Abstract: Model selection in the large-P small-N scenario is discussed in the framework of two-stage models. Two specific models are considered, namely, two-stage least squares (TSLS) involving instrumental variables (IVs), and mediation models. In both cases, the number of putative variables (e.g. instruments or mediators) is large, but only a small subset should be included in the two-stage model. We use…

    Submitted 23 March, 2020; originally announced March 2020.

  37. arXiv:2002.07613  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization

    Authors: Yiqiu Shen, Nan Wu, Jason Phang, Jungkyu Park, Kangning Liu, Sudarshini Tyagi, Laura Heacock, S. Gene Kim, Linda Moy, Kyunghyun Cho, Krzysztof J. Geras

    Abstract: Medical images differ from natural images in significantly higher resolutions and smaller regions of interest. Because of these differences, neural network architectures that work well for natural images might not be applicable to medical image analysis. In this work, we extend the globally-aware multiple instance classifier, a framework we proposed to address these unique properties of medical im…

    Submitted 13 February, 2020; originally announced February 2020.

  38. arXiv:2002.03419  [pdf, other]

    q-bio.PE stat.AP

    The Alzheimer's Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up

    Authors: Razvan V. Marinescu, Neil P. Oxtoby, Alexandra L. Young, Esther E. Bron, Arthur W. Toga, Michael W. Weiner, Frederik Barkhof, Nick C. Fox, Arman Eshaghi, Tina Toni, Marcin Salaterski, Veronika Lunina, Manon Ansart, Stanley Durrleman, Pascal Lu, Samuel Iddi, Dan Li, Wesley K. Thompson, Michael C. Donohue, Aviv Nahon, Yarden Levy, Dan Halbersberg, Mariya Cohen, Huiling Liao, Tengfei Li, et al. (71 additional authors not shown)

    Abstract: We present the findings of "The Alzheimer's Disease Prediction Of Longitudinal Evolution" (TADPOLE) Challenge, which compared the performance of 92 algorithms from 33 international teams at predicting the future trajectory of 219 individuals at risk of Alzheimer's disease. Challenge participants were required to make a prediction, for each month of a 5-year future time period, of three key outcome…

    Submitted 27 December, 2021; v1 submitted 9 February, 2020; originally announced February 2020.

    Comments: Presents final results of the TADPOLE competition. 60 pages, 7 tables, 14 figures

    Journal ref: Machine Learning for Biomedical Imaging (MELBA), Dec 2021

  39. arXiv:2001.07646  [pdf]

    stat.AP

    How Fast You Can Actually Fly: A Comparative Investigation of Flight Airborne Time in China and the U.S

    Authors: Ke Liu, Zhe Zheng, Bo Zou, Mark Hansen

    Abstract: Actual airborne time (AAT) is the time between wheels-off and wheels-on of a flight. Understanding the behavior of AAT is increasingly important given the ever growing demand for air travel and flight delays becoming more rampant. As no research on AAT exists, this paper performs the first empirical analysis of AAT behavior, comparatively for the U.S. and China. The focus is on how AAT is affected…

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: 44 pages, 11 figures

    MSC Class: 62P30

  40. arXiv:1911.05142  [pdf, other]

    cs.LG stat.ML

    Incentivized Exploration for Multi-Armed Bandits under Reward Drift

    Authors: Zhiyuan Liu, Huazheng Wang, Fan Shen, Kai Liu, Lijun Chen

    Abstract: We study incentivized exploration for the multi-armed bandit (MAB) problem where the players receive compensation for exploring arms other than the greedy choice and may provide biased feedback on reward. We seek to understand the impact of this drifted reward feedback by analyzing the performance of three instantiations of the incentivized MAB algorithm: UCB, $\varepsilon$-Greedy, and Thompson Sa…

    Submitted 15 December, 2019; v1 submitted 12 November, 2019; originally announced November 2019.

    Comments: 10 pages, 2 figures, AAAI 2020

  41. arXiv:1909.07869  [pdf, other]

    cs.LG stat.ML

    Visualizing Movement Control Optimization Landscapes

    Authors: Perttu Hämäläinen, Juuso Toikka, Amin Babadi, C. Karen Liu

    Abstract: A large body of animation research focuses on optimization of movement control, either as action sequences or policy parameters. However, as closed-form expressions of the objective functions are often not available, our understanding of the optimization problems is limited. Building on recent work on analyzing neural network training, we contribute novel visualizations of high-dimensional control…

    Submitted 22 August, 2020; v1 submitted 17 September, 2019; originally announced September 2019.

    Comments: Accepted to IEEE Transactions on Visualization and Computer Graphics (IEEE TVCG)

  42. arXiv:1908.10999  [pdf, other]

    cs.LG stat.ML

    Spectral Regularization for Combating Mode Collapse in GANs

    Authors: Kanglin Liu, Wenming Tang, Fei Zhou, Guoping Qiu

    Abstract: Despite excellent progress in recent years, mode collapse remains a major unsolved problem in generative adversarial networks (GANs). In this paper, we present spectral regularization for GANs (SR-GANs), a new and robust method for combating the mode collapse problem in GANs. Theoretical analysis shows that the optimal solution to the discriminator has a strong relationship to the spectral distribu…

    Submitted 12 October, 2019; v1 submitted 28 August, 2019; originally announced August 2019.

    Comments: 24 pages, 33 figures

  43. arXiv:1908.01052  [pdf, other]

    cs.LG cs.NE stat.ML

    Weight Friction: A Simple Method to Overcome Catastrophic Forgetting and Enable Continual Learning

    Authors: Gabrielle K. Liu

    Abstract: In recent years, deep neural networks have found success in replicating human-level cognitive skills, yet they suffer from several major obstacles. One significant limitation is the inability to learn new tasks without forgetting previously learned tasks, a shortcoming known as catastrophic forgetting. In this research, we propose a simple method to overcome catastrophic forgetting and enable cont…

    Submitted 17 August, 2019; v1 submitted 2 August, 2019; originally announced August 2019.

    Comments: 9 pages, 6 figures, 1 table

  44. arXiv:1907.07129  [pdf, other]

    cs.LG stat.ML

    Topology Based Scalable Graph Kernels

    Authors: Kin Sum Liu, Chien-Chun Ni, Yu-Yao Lin, Jie Gao

    Abstract: We propose a new graph kernel for graph classification and comparison using Ollivier Ricci curvature. The Ricci curvature of an edge in a graph describes the connectivity in the local neighborhood. An edge in a densely connected neighborhood has positive curvature and an edge serving as a local bridge has negative curvature. We use the edge curvature distribution to form a graph kernel which is th…

    Submitted 14 July, 2019; originally announced July 2019.

  45. arXiv:1906.10773  [pdf, other]

    cs.LG cs.CR stat.ML

    Are Adversarial Perturbations a Showstopper for ML-Based CAD? A Case Study on CNN-Based Lithographic Hotspot Detection

    Authors: Kang Liu, Haoyu Yang, Yuzhe Ma, Benjamin Tan, Bei Yu, Evangeline F. Y. Young, Ramesh Karri, Siddharth Garg

    Abstract: There is substantial interest in the use of machine learning (ML) based techniques throughout the electronic computer-aided design (CAD) flow, particularly those based on deep learning. However, while deep learning methods have surpassed state-of-the-art performance in several applications, they have exhibited intrinsic susceptibility to adversarial perturbations --- small but deliberate alteratio…

    Submitted 25 June, 2019; originally announced June 2019.

    Journal ref: ACM Trans. Des. Autom. Electron. Syst. 25, 5, Article 48 (August 2020)

  46. arXiv:1904.13007  [pdf, other]

    q-bio.NC cs.LG stat.ML

    Reconstruction of Natural Visual Scenes from Neural Spikes with Deep Neural Networks

    Authors: Yichen Zhang, Shanshan Jia, Yajing Zheng, Zhaofei Yu, Yonghong Tian, Siwei Ma, Tiejun Huang, Jian K. Liu

    Abstract: Neural coding is one of the central questions in systems neuroscience for understanding how the brain processes stimulus from the environment; moreover, it is also a cornerstone for designing algorithms of brain-machine interface, where decoding incoming stimulus is highly demanded for better performance of physical devices. Traditionally researchers have focused on functional magnetic resonance i…

    Submitted 28 January, 2020; v1 submitted 29 April, 2019; originally announced April 2019.

    Comments: 35 pages, 10 figures

    ACM Class: I.2.6

  47. arXiv:1903.06877  [pdf, other]

    cs.LG math.OC stat.ML

    Spherical Principal Component Analysis

    Authors: Kai Liu, Qiuwei Li, Hua Wang, Gongguo Tang

    Abstract: Principal Component Analysis (PCA) is one of the most important methods to handle high dimensional data. However, most of the studies on PCA aim to minimize the loss after projection, which usually measures the Euclidean distance, though in some fields, angle distance is known to be more important and critical for analysis. In this paper, we propose a method by adding constraints on factors to uni…

    Submitted 16 March, 2019; originally announced March 2019.

  48. arXiv:1903.01048  [pdf, other]

    stat.AP q-bio.PE

    Early Detection of Influenza outbreaks in the United States

    Authors: Kai Liu, Ravi Srinivasan, Lauren Ancel Meyers

    Abstract: Public health surveillance systems often fail to detect emerging infectious diseases, particularly in resource limited settings. By integrating relevant clinical and internet-source data, we can close critical gaps in coverage and accelerate outbreak detection. Here, we present a multivariate algorithm that uses freely available online data to provide early warning of emerging influenza epidemics…

    Submitted 3 March, 2019; originally announced March 2019.

  49. arXiv:1902.08411  [pdf, other]

    q-bio.NC cs.LG stat.ML

    Probabilistic Inference of Binary Markov Random Fields in Spiking Neural Networks through Mean-field Approximation

    Authors: Yajing Zheng, Shanshan Jia, Zhaofei Yu, Tiejun Huang, Jian K. Liu, Yonghong Tian

    Abstract: Recent studies have suggested that the cognitive process of the human brain is realized as probabilistic inference and can be further modeled by probabilistic graphical models like Markov random fields. Nevertheless, it remains unclear how probabilistic inference can be implemented by a network of spiking neurons in the brain. Previous studies have tried to relate the inference equation of binary…

    Submitted 12 March, 2020; v1 submitted 22 February, 2019; originally announced February 2019.

    Comments: Accepted in Neural Networks

  50. arXiv:1811.05642  [pdf, other]

    cs.LG stat.ML

    Dropping Symmetry for Fast Symmetric Nonnegative Matrix Factorization

    Authors: Zhihui Zhu, Xiao Li, Kai Liu, Qiuwei Li

    Abstract: Symmetric nonnegative matrix factorization (NMF), a special but important class of the general NMF, is demonstrated to be useful for data analysis and in particular for various clustering tasks. Unfortunately, designing fast algorithms for Symmetric NMF is not as easy as for the nonsymmetric counterpart, the latter admitting the splitting property that allows efficient alternating-type algorithms.…

    Submitted 14 November, 2018; originally announced November 2018.

    Comments: Accepted in NIPS 2018

    MSC Class: 65K10; 90C26; 68Q25; 68W40