Showing 1–11 of 11 results for author: Zhong, K

  1. arXiv:2208.07105  [pdf, other]

    cs.LG cs.AI stat.ML

    Grasping Core Rules of Time Series through Pure Models

    Authors: Gedi Liu, Yifeng Jiang, Yi Ouyang, Keyang Zhong, Yang Wang

    Abstract: Time series underwent the transition from statistics to deep learning, as did many other machine learning fields. Although it appears that the accuracy has been increasing as the model is updated in a number of publicly available datasets, it typically only increases the scale by several times in exchange for a slight difference in accuracy. Through this experiment, we point out a different line o…

    Submitted 15 August, 2022; originally announced August 2022.

    Comments: To be submitted to the conference

  2. arXiv:2006.02804  [pdf, other]

    cs.LG stat.ML

    Exploring the Potential of Low-bit Training of Convolutional Neural Networks

    Authors: Kai Zhong, Xuefei Ning, Guohao Dai, Zhenhua Zhu, Tianchen Zhao, Shulin Zeng, Yu Wang, Huazhong Yang

    Abstract: In this work, we propose a low-bit training framework for convolutional neural networks, which is built around a novel multi-level scaling (MLS) tensor format. Our framework focuses on reducing the energy consumption of convolution operations by quantizing all the convolution operands to low bit-width format. Specifically, we propose the MLS tensor format, in which the element-wise bit-width can b…

    Submitted 14 July, 2021; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: 13 pages, 7 figures

  3. arXiv:2003.12101  [pdf, other]

    cs.DC cs.AR cs.LG stat.ML

    Enabling Efficient and Flexible FPGA Virtualization for Deep Learning in the Cloud

    Authors: Shulin Zeng, Guohao Dai, Hanbo Sun, Kai Zhong, Guangjun Ge, Kaiyuan Guo, Yu Wang, Huazhong Yang

    Abstract: FPGAs have shown great potential in providing low-latency and energy-efficient solutions for deep neural network (DNN) inference applications. Currently, the majority of FPGA-based DNN accelerators in the cloud run in a time-division multiplexing way for multiple users sharing a single FPGA, and require re-compilation with $\sim$100 s overhead. Such designs lead to poor isolation and heavy perform…

    Submitted 26 March, 2020; originally announced March 2020.

  4. arXiv:1905.02331  [pdf, other]

    cs.LG cs.AI cs.IR stat.ML

    Taming Pretrained Transformers for Extreme Multi-label Text Classification

    Authors: Wei-Cheng Chang, Hsiang-Fu Yu, Kai Zhong, Yiming Yang, Inderjit Dhillon

    Abstract: We consider the extreme multi-label text classification (XMC) problem: given an input text, return the most relevant labels from a large label collection. For example, the input text could be a product description on Amazon.com and the labels could be product categories. XMC is an important yet challenging problem in the NLP community. Recently, deep pretrained transformer models have achieved sta…

    Submitted 23 June, 2020; v1 submitted 6 May, 2019; originally announced May 2019.

    Comments: KDD 2020 Applied Data Track

  5. arXiv:1806.00640  [pdf, ps, other]

    stat.ML cs.LG

    Binary Classification with Karmic, Threshold-Quasi-Concave Metrics

    Authors: Bowei Yan, Oluwasanmi Koyejo, Kai Zhong, Pradeep Ravikumar

    Abstract: Complex performance measures, beyond the popular measure of accuracy, are increasingly being used in the context of binary classification. These complex performance measures are typically not even decomposable, that is, the loss evaluated on a batch of samples cannot typically be expressed as a sum or average of losses evaluated at individual samples, which in turn requires new theoretical and met…

    Submitted 2 June, 2018; originally announced June 2018.

    Comments: ICML 2018

  6. arXiv:1805.10477  [pdf, other]

    cs.LG stat.ML

    Nonlinear Inductive Matrix Completion based on One-layer Neural Networks

    Authors: Kai Zhong, Zhao Song, Prateek Jain, Inderjit S. Dhillon

    Abstract: The goal of a recommendation system is to predict the interest of a user in a given item by exploiting the existing set of ratings as well as certain user/item features. A standard approach to modeling this problem is Inductive Matrix Completion where the predicted rating is modeled as an inner product of the user and the item features projected onto a latent space. In order to learn the parameter…

    Submitted 26 May, 2018; originally announced May 2018.

  7. arXiv:1711.03440  [pdf, other]

    cs.LG cs.DS stat.ML

    Learning Non-overlapping Convolutional Neural Networks with Multiple Kernels

    Authors: Kai Zhong, Zhao Song, Inderjit S. Dhillon

    Abstract: In this paper, we consider parameter recovery for non-overlapping convolutional neural networks (CNNs) with multiple kernels. We show that when the inputs follow Gaussian distribution and the sample size is sufficiently large, the squared loss of such CNNs is $\mathit{~locally~strongly~convex}$ in a basin of attraction near the global optima for most popular activation functions, like ReLU, Leaky…

    Submitted 8 November, 2017; originally announced November 2017.

    Comments: arXiv admin note: text overlap with arXiv:1706.03175

  8. arXiv:1706.03175  [pdf, other]

    cs.LG cs.DS stat.ML

    Recovery Guarantees for One-hidden-layer Neural Networks

    Authors: Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon

    Abstract: In this paper, we consider regression problems with one-hidden-layer neural networks (1NNs). We distill some properties of activation functions that lead to $\mathit{local~strong~convexity}$ in the neighborhood of the ground-truth parameters for the 1NN squared-loss objective. Most popular nonlinear activation functions satisfy the distilled properties, including rectified linear units (ReLUs), le…

    Submitted 9 June, 2017; originally announced June 2017.

    Comments: ICML 2017

  9. arXiv:1610.07116

    stat.ML cs.LG

    Online Classification with Complex Metrics

    Authors: Bowei Yan, Oluwasanmi Koyejo, Kai Zhong, Pradeep Ravikumar

    Abstract: We present a framework and analysis of consistent binary classification for complex and non-decomposable performance metrics such as the F-measure and the Jaccard measure. The proposed framework is general, as it applies to both batch and online learning, and to both linear and non-linear models. Our work follows recent results showing that the Bayes optimal classifier for many complex metrics is…

    Submitted 10 February, 2018; v1 submitted 22 October, 2016; originally announced October 2016.

    Comments: An error was found in the proof

  10. arXiv:1509.01404  [pdf, ps, other]

    math.NA cs.CV cs.LG math.OC stat.ML

    Coordinate Descent Methods for Symmetric Nonnegative Matrix Factorization

    Authors: Arnaud Vandaele, Nicolas Gillis, Qi Lei, Kai Zhong, Inderjit Dhillon

    Abstract: Given a symmetric nonnegative matrix $A$, symmetric nonnegative matrix factorization (symNMF) is the problem of finding a nonnegative matrix $H$, usually with much fewer columns than $A$, such that $A \approx HH^T$. SymNMF can be used for data analysis and in particular for various clustering tasks. In this paper, we propose simple and very efficient coordinate descent schemes to solve this proble…

    Submitted 31 May, 2016; v1 submitted 4 September, 2015; originally announced September 2015.

    Comments: 25 pages, 5 figures, 7 tables. Main changes: comparison with another symNMF algorithm (namely, BetaSNMF), and correction of an error in the convergence proof

    Journal ref: IEEE Transactions on Signal Processing 64 (21), pp. 5571-5584, 2016

  11. arXiv:1406.7321  [pdf, other]

    stat.ML

    Proximal Quasi-Newton for Computationally Intensive L1-regularized M-estimators

    Authors: Kai Zhong, Ian E. H. Yen, Inderjit S. Dhillon, Pradeep Ravikumar

    Abstract: We consider the class of optimization problems arising from computationally intensive L1-regularized M-estimators, where the function or gradient values are very expensive to compute. A particular instance of interest is the L1-regularized MLE for learning Conditional Random Fields (CRFs), which are a popular class of statistical models for varied structured prediction problems such as sequence la…

    Submitted 23 January, 2015; v1 submitted 27 June, 2014; originally announced June 2014.