Showing 1–10 of 10 results for author: Cha, J

  1. arXiv:2303.07160  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

    Authors: Jaeyoung Cha, Jaewook Lee, Chulhee Yun

    Abstract: We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components $n$ and the number of epochs $K$, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the…

    Submitted 9 June, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: 58 pages

  2. Spam four ways: Making sense of text data

    Authors: Nicholas J. Horton, Jie Chao, William Finzer, Phebe Palmer

    Abstract: The world is full of text data, yet text analytics has not traditionally played a large part in statistics education. We consider four different ways to provide students with opportunities to explore whether email messages are unwanted correspondence (spam). Text from subject lines is used to identify features that can be used in classification. The approaches include use of a Model Eliciting Act…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: in press, CHANCE

  3. arXiv:2105.03855  [pdf]

    cs.LG stat.ML

    GMOTE: Gaussian based minority oversampling technique for imbalanced classification adapting tail probability of outliers

    Authors: Seung Jee Yang, Kyung Joon Cha

    Abstract: Classification of imbalanced data is one of the common problems in the recent field of data mining. Imbalanced data substantially affects the performance of standard classification models. Data-level approaches mainly use oversampling methods, such as the synthetic minority oversampling technique (SMOTE), to solve the problem. However, since methods such as SMOTE generate instances by linear in…

    Submitted 9 May, 2021; originally announced May 2021.

    Comments: 20 pages, 6 figures

    MSC Class: 62P99

  4. arXiv:2011.08930  [pdf, other]

    cs.LG stat.ML

    Distributed Online Learning with Multiple Kernels

    Authors: Jeongmin Chae, Songnam Hong

    Abstract: In Internet-of-Things (IoT) systems, there is plenty of informative data provided by a massive number of IoT devices (e.g., sensors). Learning a function from such data is of great interest in machine learning tasks for IoT systems. Focusing on streaming (or sequential) data, we present a privacy-preserving distributed online learning framework with multiple kernels (named DOMKL). The proposed…

    Submitted 17 November, 2020; originally announced November 2020.

  5. arXiv:2005.03188  [pdf, other]

    cs.LG cs.IT stat.ML

    Active Learning with Multiple Kernels

    Authors: Songnam Hong, Jeongmin Chae

    Abstract: Online multiple kernel learning (OMKL) has provided an attractive performance in nonlinear function learning tasks. Leveraging a random feature approximation, the major drawback of OMKL, known as the curse of dimensionality, has been recently alleviated. In this paper, we introduce a new research problem, termed (stream-based) active multiple kernel learning (AMKL), in which a learner is allowed t…

    Submitted 6 May, 2020; originally announced May 2020.

  6. arXiv:1909.12291  [pdf, other]

    cs.LG cs.DC stat.ML

    Exascale Deep Learning to Accelerate Cancer Research

    Authors: Robert M. Patton, J. Travis Johnston, Steven R. Young, Catherine D. Schuman, Thomas E. Potok, Derek C. Rose, Seung-Hwan Lim, Junghoon Chae, Le Hou, Shahira Abousamra, Dimitris Samaras, Joel Saltz

    Abstract: Deep learning, through the use of neural networks, has demonstrated remarkable ability to automate many routine tasks when presented with sufficient data for training. The neural network architecture (e.g. number of layers, types of layers, connections between layers, etc.) plays a critical role in determining what, if anything, the neural network is able to learn from the training data. The trend…

    Submitted 26 September, 2019; originally announced September 2019.

    Comments: Submitted to IEEE Big Data

  7. arXiv:1906.00852  [pdf, other]

    cs.LG stat.ML

    Hierarchical Auxiliary Learning

    Authors: Jaehoon Cha, Kyeong Soo Kim, Sanghyuk Lee

    Abstract: Conventional application of convolutional neural networks (CNNs) for image classification and recognition is based on the assumption that all target classes are equal (i.e., no hierarchy) and exclusive of one another (i.e., no overlap). CNN-based image classifiers built on this assumption, therefore, cannot take into account an innate hierarchy among target classes (e.g., cats and dogs in animal im…

    Submitted 3 June, 2019; originally announced June 2019.

  8. arXiv:1901.08479  [pdf, other]

    cs.LG stat.ML

    On the Transformation of Latent Space in Autoencoders

    Authors: Jaehoon Cha, Kyeong Soo Kim, Sanghyuk Lee

    Abstract: Noting the importance of the latent variables in inference and learning, we propose a novel framework for autoencoders based on the homeomorphic transformation of latent variables, which could reduce the distance between vectors in the transformed space, while preserving the topological properties of the original space, and investigate the effect of the latent space transformation on learning gene…

    Submitted 3 June, 2019; v1 submitted 24 January, 2019; originally announced January 2019.

  9. arXiv:1801.05755  [pdf]

    stat.OT math.OC

    Discussions on non-probabilistic convex modelling for uncertain problems

    Authors: Ni Bingyu, Jiang Chao, Huang Zhiliang

    Abstract: The non-probabilistic convex model utilizes a convex set to quantify the uncertainty domain of uncertain-but-bounded parameters, which is very effective for structural uncertainty analysis with limited or poor-quality experimental data. To overcome the complexity and diversity of the formulations of current convex models, in this paper, a unified framework for construction of the non-probabilistic con…

    Submitted 29 November, 2017; originally announced January 2018.

  10. arXiv:1710.06618   

    stat.AP

    On Optimal Operational Sequence of Components in a Warm Standby System

    Authors: M. Finkelstein, N. K. Hazra, J. H. Cha

    Abstract: We consider an open problem of optimal operational sequence for the $1$-out-of-$n$ system with warm standby. Using the virtual age concept and the cumulative exposure model, we show that the components should be activated in accordance with the increasing sequence of their lifetimes. Lifetimes of the components and the system are compared with respect to the stochastic precedence order. Only speci…

    Submitted 6 December, 2018; v1 submitted 18 October, 2017; originally announced October 2017.

    Comments: The proof of one of the theorems is erroneous. Apart from this, there are some other technical issues