Showing 1–15 of 15 results for author: Yin, P

Searching in archive stat. Search in all archives.
  1. arXiv:2402.15086  [pdf, other]

    stat.ME

    A modified debiased inverse-variance weighted estimator in two-sample summary-data Mendelian randomization

    Authors: Youpeng Su, Siqi Xu, Yilei Ma, Ping Yin, Wing Kam Fung, Hongwei Jiang, Peng Wang

    Abstract: Mendelian randomization uses genetic variants as instrumental variables to make causal inferences about the effects of modifiable risk factors on diseases from observational data. One of the major challenges in Mendelian randomization is that many genetic variants are only modestly or even weakly associated with the risk factor of interest, a setting known as many weak instruments. Many existing m…

    Submitted 18 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: 33 pages, 6 figures

  2. arXiv:2204.03758  [pdf, other]

    cs.LG cs.PL stat.ML

    Compositional Generalization and Decomposition in Neural Program Synthesis

    Authors: Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

    Abstract: When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks. While it is difficult to measure whether neural program synthesis methods have similar capabilities, what we can measure is whether they compositionally generalize, that is, whether a model that has been trained on the simpler subtasks is subsequently able to solve…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: Published at the Deep Learning for Code (DL4C) Workshop at ICLR 2022

  3. arXiv:2002.12563  [pdf, other]

    cs.LG math.OC stat.ML

    Global Convergence and Geometric Characterization of Slow to Fast Weight Evolution in Neural Network Training for Classifying Linearly Non-Separable Data

    Authors: Ziang Long, Penghang Yin, Jack Xin

    Abstract: In this paper, we study the dynamics of gradient descent in learning neural networks for classification problems. Unlike in existing works, we consider the linearly non-separable case where the training data of different classes lie in orthogonal subspaces. We show that when the network has sufficient (but not exceedingly large) number of neurons, (1) the corresponding minimization problem has a d…

    Submitted 10 December, 2020; v1 submitted 28 February, 2020; originally announced February 2020.

  4. arXiv:1903.05662  [pdf, other]

    cs.LG math.OC stat.ML

    Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets

    Authors: Penghang Yin, Jiancheng Lyu, Shuai Zhang, Stanley Osher, Yingyong Qi, Jack Xin

    Abstract: Training activation quantized neural networks involves minimizing a piecewise constant function whose gradient vanishes almost everywhere, which is undesirable for the standard back-propagation or chain rule. An empirical way around this issue is to use a straight-through estimator (STE) (Bengio et al., 2013) in the backward pass only, so that the "gradient" through the modified chain rule becomes…

    Submitted 25 September, 2019; v1 submitted 13 March, 2019; originally announced March 2019.

    Comments: in International Conference on Learning Representations (ICLR) 2019

  5. arXiv:1811.01777  [pdf, other]

    math.OC stat.ML

    Non-ergodic Convergence Analysis of Heavy-Ball Algorithms

    Authors: Tao Sun, Penghang Yin, Dongsheng Li, Chun Huang, Lei Guan, Hao Jiang

    Abstract: In this paper, we revisit the convergence of the Heavy-ball method, and present improved convergence complexity results in the convex setting. We provide the first non-ergodic O(1/k) rate result of the Heavy-ball algorithm with constant step size for coercive objective functions. For objective functions satisfying a relaxed strongly convex condition, the linear convergence is established under wea…

    Submitted 9 November, 2018; v1 submitted 5 November, 2018; originally announced November 2018.

  6. arXiv:1810.13337  [pdf, other]

    cs.LG cs.SE stat.ML

    Learning to Represent Edits

    Authors: Pengcheng Yin, Graham Neubig, Miltiadis Allamanis, Marc Brockschmidt, Alexander L. Gaunt

    Abstract: We introduce the problem of learning distributed representations of edits. By combining a "neural editor" with an "edit encoder", our models learn to represent the salient information of an edit and can be used to apply edits to new inputs. We experiment on natural language and source code edit data. Our evaluation yields promising results that suggest that our neural network models learn to captu…

    Submitted 22 February, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

    Comments: ICLR 2019

  7. arXiv:1809.08516  [pdf, other]

    cs.LG math.NA stat.ML

    Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

    Authors: Bao Wang, Alex T. Lin, Wei Zhu, Penghang Yin, Andrea L. Bertozzi, Stanley J. Osher

    Abstract: We improve the robustness of Deep Neural Net (DNN) to adversarial attacks by using an interpolating function as the output activation. This data-dependent activation remarkably improves both the generalization and robustness of DNN. In the CIFAR10 benchmark, we raise the robust accuracy of the adversarially trained ResNet20 from $\sim 46\%$ to $\sim 69\%$ under the state-of-the-art Iterative Fast…

    Submitted 29 April, 2020; v1 submitted 22 September, 2018; originally announced September 2018.

    Comments: 17 pages, 6 figures

    MSC Class: 68Pxx

    Journal ref: Inverse Problems and Imaging, 2020

  8. arXiv:1808.05240  [pdf, other]

    cs.LG cs.CV stat.ML

    Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks

    Authors: Penghang Yin, Shuai Zhang, Jiancheng Lyu, Stanley Osher, Yingyong Qi, Jack Xin

    Abstract: Quantized deep neural networks (QDNNs) are attractive due to their much lower memory storage and faster inference speed than their regular full precision counterparts. To maintain the same performance level especially at low bit-widths, QDNNs must be retrained. Their training involves piecewise constant activation functions and discrete weights, hence mathematical challenges arise. We introduce th…

    Submitted 6 January, 2019; v1 submitted 15 August, 2018; originally announced August 2018.

  9. arXiv:1806.06317  [pdf, other]

    cs.LG math.NA stat.ML

    Laplacian Smoothing Gradient Descent

    Authors: Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, Alex Lin

    Abstract: We propose a class of very simple modifications of gradient descent and stochastic gradient descent. We show that when applied to a large variety of machine learning problems, ranging from logistic regression to deep neural nets, the proposed surrogates can dramatically reduce the variance, allow to take a larger step size, and improve the generalization accuracy. The methods only involve multiply…

    Submitted 27 April, 2019; v1 submitted 16 June, 2018; originally announced June 2018.

    Comments: 28 pages, 15 figures

    MSC Class: 65-06

  10. arXiv:1711.08833  [pdf, other]

    cs.LG math.NA stat.ML

    Deep Learning for Real-Time Crime Forecasting and its Ternarization

    Authors: Bao Wang, Penghang Yin, Andrea L. Bertozzi, P. Jeffrey Brantingham, Stanley J. Osher, Jack Xin

    Abstract: Real-time crime forecasting is important. However, accurate prediction of when and where the next crime will happen is difficult. No known physical model provides a reasonable approximation to such a complex system. Historical crime data are sparse in both space and time and the signal of interests is weak. In this work, we first present a proper representation of crime data. We then adapt the spa…

    Submitted 23 November, 2017; originally announced November 2017.

    Comments: 14 pages, 7 figures

    MSC Class: 62-07

  11. arXiv:1710.07746  [pdf, other]

    math.OC cs.LG stat.ML

    Stochastic Backward Euler: An Implicit Gradient Descent Algorithm for $k$-means Clustering

    Authors: Penghang Yin, Minh Pham, Adam Oberman, Stanley Osher

    Abstract: In this paper, we propose an implicit gradient descent algorithm for the classic $k$-means problem. The implicit gradient step or backward Euler is solved via stochastic fixed-point iteration, in which we randomly sample a mini-batch gradient in every iteration. It is the average of the fixed-point trajectory that is carried over to the next gradient step. We draw connections between the proposed…

    Submitted 21 May, 2018; v1 submitted 20 October, 2017; originally announced October 2017.

  12. arXiv:1705.07136  [pdf, other]

    cs.LG cs.CL stat.ML

    Softmax Q-Distribution Estimation for Structured Prediction: A Theoretical Interpretation for RAML

    Authors: Xuezhe Ma, Pengcheng Yin, Jingzhou Liu, Graham Neubig, Eduard Hovy

    Abstract: Reward augmented maximum likelihood (RAML), a simple and effective learning framework to directly optimize towards the reward function in structured prediction tasks, has led to a number of impressive empirical successes. RAML incorporates task-specific reward by performing maximum-likelihood updates on candidate outputs sampled according to an exponentiated payoff distribution, which gives higher…

    Submitted 27 October, 2017; v1 submitted 19 May, 2017; originally announced May 2017.

    Comments: Under Review of ICLR 2018

  13. arXiv:1701.03980  [pdf, other]

    stat.ML cs.CL cs.MS

    DyNet: The Dynamic Neural Network Toolkit

    Authors: Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

    Abstract: We describe DyNet, a toolkit for implementing neural network models based on dynamic declaration of network structure. In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its deriva…

    Submitted 14 January, 2017; originally announced January 2017.

    Comments: 33 pages

  14. arXiv:1501.05788  [pdf, ps, other]

    stat.ME

    Simulation-based Sensitivity Analysis for Non-ignorable Missing Data

    Authors: Peng Yin, Jian Qing Shi

    Abstract: Sensitivity analysis is popular in dealing with missing data problems particularly for non-ignorable missingness. It analyses how sensitively the conclusions may depend on assumptions about missing data e.g. missing data mechanism (MDM). We called models under certain assumptions sensitivity models. To make sensitivity analysis useful in practice we need to define some simple and interpretable sta…

    Submitted 23 January, 2015; originally announced January 2015.

    Comments: 18 pages, two additional examples at Appendix. Novel approach for sensitivity analysis

  15. arXiv:1301.0339  [pdf, ps, other]

    math.NA stat.ML

    A Geometric Blind Source Separation Method Based on Facet Component Analysis

    Authors: P. Yin, Y. Sun, J. Xin

    Abstract: Given a set of mixtures, blind source separation attempts to retrieve the source signals without or with very little information of the mixing process. We present a geometric approach for blind separation of nonnegative linear mixtures termed {\em facet component analysis} (FCA). The approach is based on facet identification of the underlying cone structure of the data. Earlier works focus on…

    Submitted 2 January, 2013; originally announced January 2013.