
Showing 1–8 of 8 results for author: Yeh, C M

  1. arXiv:2009.10989  [pdf, other]

    cs.LG cs.AI cs.DB cs.IR stat.ML

    Towards a Flexible Embedding Learning Framework

    Authors: Chin-Chia Michael Yeh, Dhruv Gelda, Zhongfang Zhuang, Yan Zheng, Liang Gou, Wei Zhang

    Abstract: Representation learning is a fundamental building block for analyzing entities in a database. While the existing embedding learning methods are effective in various data mining problems, their applicability is often limited because these methods have pre-determined assumptions on the type of semantics captured by the learned embeddings, and the assumptions may not well align with specific downstre…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: 10 pages

  2. arXiv:2008.01670  [pdf, other]

    q-fin.ST cs.LG stat.ML

    Multi-stream RNN for Merchant Transaction Prediction

    Authors: Zhongfang Zhuang, Chin-Chia Michael Yeh, Liang Wang, Wei Zhang, Junpeng Wang

    Abstract: Recently, digital payment systems have significantly changed people's lifestyles. New challenges have surfaced in monitoring and guaranteeing the integrity of payment processing systems. One important task is to predict the future transaction statistics of each merchant. These predictions can thus be used to steer other tasks, ranging from fraud detection to recommendation. This problem is challen…

    Submitted 24 July, 2020; originally announced August 2020.

    Comments: Accepted by KDD 2020 Workshop on Machine Learning in Finance

  3. arXiv:2007.05303  [pdf, other]

    cs.LG cs.AI stat.ML

    Multi-future Merchant Transaction Prediction

    Authors: Chin-Chia Michael Yeh, Zhongfang Zhuang, Wei Zhang, Liang Wang

    Abstract: The multivariate time series generated from merchant transaction history can provide critical insights for payment processing companies. The capability of predicting merchants' future is crucial for fraud detection and recommendation systems. Conventionally, this problem is formulated to predict one multivariate time series under the multi-horizon setting. However, real-world applications often re…

    Submitted 10 July, 2020; originally announced July 2020.

  4. arXiv:1910.05862  [pdf, other]

    cs.LG stat.ML

    Constrained Non-Affine Alignment of Embeddings

    Authors: Yuwei Wang, Yan Zheng, Yanqing Peng, Chin-Chia Michael Yeh, Zhongfang Zhuang, Das Mahashweta, Bendre Mangesh, Feifei Li, Wei Zhang, Jeff M. Phillips

    Abstract: Embeddings are one of the fundamental building blocks for data analysis tasks. Embeddings are already essential tools for large language models and image analysis, and their use is being extended to many other research domains. The generation of these distributed representations is often a data- and computation-expensive process; yet the holistic analysis and adjustment of them after they have bee…

    Submitted 19 November, 2021; v1 submitted 13 October, 2019; originally announced October 2019.

  5. arXiv:1811.03149  [pdf]

    cs.LG stat.ML

    Time Series Classification to Improve Poultry Welfare

    Authors: Alireza Abdoli, Amy C. Murillo, Chin-Chia M. Yeh, Alec C. Gerry, Eamonn J. Keogh

    Abstract: Poultry farms are an important contributor to the human food chain. Worldwide, humankind keeps an enormous number of domesticated birds (e.g. chickens) for their eggs and their meat, providing rich sources of low-fat protein. However, around the world, there have been growing concerns about the quality of life for the livestock in poultry farms; and increasingly vocal demands for improved standard…

    Submitted 7 November, 2018; originally announced November 2018.

  6. arXiv:1811.03064  [pdf, other]

    cs.LG cs.AI stat.ML

    Towards a Near Universal Time Series Data Mining Tool: Introducing the Matrix Profile

    Authors: Chin-Chia Michael Yeh

    Abstract: The last decade has seen a flurry of research on all-pairs-similarity-search (or, self-join) for text, DNA, and a handful of other datatypes, and these systems have been applied to many diverse data mining problems. Surprisingly, however, little progress has been made on addressing this problem for time series subsequences. In this thesis, we have introduced a near universal time series data minin…

    Submitted 11 July, 2020; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: PhD dissertation (2018)

  7. arXiv:1811.01557  [pdf, other]

    cs.LG cs.AI stat.ML

    Representation Learning by Reconstructing Neighborhoods

    Authors: Chin-Chia Michael Yeh, Yan Zhu, Evangelos E. Papalexakis, Abdullah Mueen, Eamonn Keogh

    Abstract: Since its introduction, unsupervised representation learning has attracted a lot of attention from the research community, as it is demonstrated to be highly effective and easy-to-apply in tasks such as dimension reduction, clustering, visualization, information retrieval, and semi-supervised learning. In this work, we propose a novel unsupervised representation learning framework called neighbor-…

    Submitted 6 November, 2018; v1 submitted 5 November, 2018; originally announced November 2018.

  8. arXiv:1810.07758  [pdf, other]

    cs.LG stat.ML

    The UCR Time Series Archive

    Authors: Hoang Anh Dau, Anthony Bagnall, Kaveh Kamgar, Chin-Chia Michael Yeh, Yan Zhu, Shaghayegh Gharghabi, Chotirat Ann Ratanamahatana, Eamonn Keogh

    Abstract: The UCR Time Series Archive, introduced in 2002, has become an important resource in the time series data mining community, with at least one thousand published papers making use of at least one data set from the archive. The original incarnation of the archive had sixteen data sets but since that time, it has gone through periodic expansions. The last expansion took place in the summer of 2015 w…

    Submitted 8 September, 2019; v1 submitted 17 October, 2018; originally announced October 2018.