Showing 1–24 of 24 results for author: Keogh, E

Searching in archive cs.

  1. arXiv:2401.09489  [pdf]

    cs.LG cs.AI

    PUPAE: Intuitive and Actionable Explanations for Time Series Anomalies

    Authors: Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn J. Keogh

    Abstract: In recent years there has been significant progress in time series anomaly detection. However, after detecting an (perhaps tentative) anomaly, can we explain it? Such explanations would be useful to triage anomalies. For example, in an oil refinery, should we respond to an anomaly by dispatching a hydraulic engineer, or an intern to replace the battery on a sensor? There have been some parallel ef…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 9 Page Manuscript, 1 Page Supplementary (Supplement not published in conference proceedings.)

    Journal ref: SIAM SDM 2024

  2. arXiv:2311.03393  [pdf, other]

    cs.DB cs.AI

    Sketching Multidimensional Time Series for Fast Discord Mining

    Authors: Chin-Chia Michael Yeh, Yan Zheng, Menghai Pan, Huiyuan Chen, Zhongfang Zhuang, Junpeng Wang, Liang Wang, Wei Zhang, Jeff M. Phillips, Eamonn Keogh

    Abstract: Time series discords are a useful primitive for time series anomaly detection, and the matrix profile is capable of capturing discord effectively. There exist many research efforts to improve the scalability of discord discovery with respect to the length of time series. However, there is surprisingly little work focused on reducing the time complexity of matrix profile computation associated with…

    Submitted 7 December, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

  3. arXiv:2311.02563  [pdf, other]

    cs.DB cs.AI cs.CR cs.LG

    Time Series Synthesis Using the Matrix Profile for Anonymization

    Authors: Audrey Der, Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Huiyuan Chen, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn Keogh

    Abstract: Publishing and sharing data is crucial for the data mining community, allowing collaboration and driving open innovation. However, many researchers cannot release their data due to privacy regulations or fear of leaking confidential business information. To alleviate such issues, we propose the Time Series Synthesis Using the Matrix Profile (TSSUMP) method, where synthesized time series can be rel…

    Submitted 5 November, 2023; originally announced November 2023.

  4. arXiv:2311.02561  [pdf, other]

    cs.LG cs.AI

    Ego-Network Transformer for Subsequence Classification in Time Series Data

    Authors: Chin-Chia Michael Yeh, Huiyuan Chen, Yujie Fan, Xin Dai, Yan Zheng, Vivian Lai, Junpeng Wang, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn Keogh

    Abstract: Time series classification is a widely studied problem in the field of time series data mining. Previous research has predominantly focused on scenarios where relevant or foreground subsequences have already been extracted, with each subsequence corresponding to a single label. However, real-world time series data often contain foreground subsequences that are intertwined with background subsequen…

    Submitted 5 November, 2023; originally announced November 2023.

  5. arXiv:2212.06146  [pdf]

    cs.LG cs.AI

    Matrix Profile XXVII: A Novel Distance Measure for Comparing Long Time Series

    Authors: Audrey Der, Chin-Chia Michael Yeh, Renjie Wu, Junpeng Wang, Yan Zheng, Zhongfang Zhuang, Liang Wang, Wei Zhang, Eamonn Keogh

    Abstract: The most useful data mining primitives are distance measures. With an effective distance measure, it is possible to perform classification, clustering, anomaly detection, segmentation, etc. For single-event time series Euclidean Distance and Dynamic Time Warping distance are known to be extremely effective. However, for time series containing cyclical behaviors, the semantic meaningfulness of such…

    Submitted 9 December, 2022; originally announced December 2022.

    Comments: Accepted at IEEE ICKG 2022. (Previously entitled IEEE ICBK.) Abridged abstract as per arxiv's requirements

  6. arXiv:2112.12965  [pdf, other]

    cs.DB cs.LG

    Error-bounded Approximate Time Series Joins Using Compact Dictionary Representations of Time Series

    Authors: Chin-Chia Michael Yeh, Yan Zheng, Junpeng Wang, Huiyuan Chen, Zhongfang Zhuang, Wei Zhang, Eamonn Keogh

    Abstract: The matrix profile is an effective data mining tool that provides similarity join functionality for time series data. Users of the matrix profile can either join a time series with itself using intra-similarity join (i.e., self-join) or join a time series with another time series using inter-similarity join. By invoking either or both types of joins, the matrix profile can help users discover both…

    Submitted 5 November, 2023; v1 submitted 24 December, 2021; originally announced December 2021.

  7. When is Early Classification of Time Series Meaningful?

    Authors: Renjie Wu, Audrey Der, Eamonn J. Keogh

    Abstract: Since its introduction two decades ago, there has been increasing interest in the problem of early classification of time series. This problem generalizes classic time series classification to ask if we can classify a time series subsequence with sufficient accuracy and confidence after seeing only some prefix of a target pattern. The idea is that the earlier classification would allow us to take…

    Submitted 3 September, 2022; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: Full paper accepted by IEEE TKDE, extended abstract accepted by IEEE ICDE 2022

    Journal ref: 38th IEEE International Conference on Data Engineering (ICDE), 2022, pp. 1477-1478

  8. Current Time Series Anomaly Detection Benchmarks are Flawed and are Creating the Illusion of Progress

    Authors: Renjie Wu, Eamonn J. Keogh

    Abstract: Time series anomaly detection has been a perennially important topic in data science, with papers dating back to the 1950s. However, in recent years there has been an explosion of interest in this topic, much of it driven by the success of deep learning in other domains and for other time series tasks. Most of these papers test on one or more of a handful of popular benchmark datasets, created by…

    Submitted 3 September, 2022; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: Full paper accepted by IEEE TKDE, extended abstract accepted by IEEE ICDE 2022

    Journal ref: 38th IEEE International Conference on Data Engineering (ICDE), 2022, pp. 1479-1480

  9. arXiv:2009.07907  [pdf]

    cs.LG stat.ML

    Matrix Profile XXII: Exact Discovery of Time Series Motifs under DTW

    Authors: Sara Alaee, Kaveh Kamgar, Eamonn Keogh

    Abstract: Over the last decade, time series motif discovery has emerged as a useful primitive for many downstream analytical tasks, including clustering, classification, rule discovery, segmentation, and summarization. In parallel, there has been an increased understanding that Dynamic Time Warping (DTW) is the best time series similarity measure in a host of settings. Surprisingly however, there has been v…

    Submitted 16 September, 2020; originally announced September 2020.

  10. arXiv:2008.13447  [pdf, other]

    cs.DB

    Matrix Profile Goes MAD: Variable-Length Motif And Discord Discovery in Data Series

    Authors: Michele Linardi, Yan Zhu, Themis Palpanas, Eamonn Keogh

    Abstract: In the last fifteen years, data series motif and discord discovery have emerged as two useful and well-used primitives for data series mining, with applications to many domains, including robotics, entomology, seismology, medicine, and climatology. Nevertheless, the state-of-the-art motif and discord discovery tools still require the user to provide the relative length. Yet, in several cases, the…

    Submitted 31 August, 2020; originally announced August 2020.

  11. arXiv:2008.13432  [pdf, other]

    cs.DB

    VALMOD: A Suite for Easy and Exact Detection of Variable Length Motifs in Data Series

    Authors: Michele Linardi, Yan Zhu, Themis Palpanas, Eamonn Keogh

    Abstract: Data series motif discovery represents one of the most useful primitives for data series mining, with applications to many domains, such as robotics, entomology, seismology, medicine, and climatology, and others. The state-of-the-art motif discovery tools still require the user to provide the motif length. Yet, in several cases, the choice of motif length is critical for their detection. Unfortuna…

    Submitted 31 August, 2020; originally announced August 2020.

  12. FastDTW is approximate and Generally Slower than the Algorithm it Approximates

    Authors: Renjie Wu, Eamonn J. Keogh

    Abstract: Many time series data mining problems can be solved with repeated use of distance measure. Examples of such tasks include similarity search, clustering, classification, anomaly detection and segmentation. For over two decades it has been known that the Dynamic Time Warping (DTW) distance measure is the best measure to use for most tasks, in most domains. Because the classic DTW algorithm has quadr…

    Submitted 3 September, 2022; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: Full paper accepted by IEEE TKDE, extended abstract accepted by IEEE ICDE 2021

    Journal ref: IEEE Transactions on Knowledge and Data Engineering, vol. 34, no. 8, pp. 3779-3785; 37th IEEE International Conference on Data Engineering (ICDE), 2021, pp. 2327-2328

  13. arXiv:1912.09614  [pdf]

    cs.LG stat.ML

    Features or Shape? Tackling the False Dichotomy of Time Series Classification

    Authors: Sara Alaee, Alireza Abdoli, Christian Shelton, Amy C. Murillo, Alec C. Gerry, Eamonn Keogh

    Abstract: Time series classification is an important task in its own right, and it is often a precursor to further downstream analytics. To date, virtually all works in the literature have used either shape-based classification using a distance measure or feature-based classification after finding some suitable features for the domain. It seems to be underappreciated that in many datasets it is the case tha…

    Submitted 19 December, 2019; originally announced December 2019.

  14. Time Series Classification: Lessons Learned in the (Literal) Field while Studying Chicken Behavior

    Authors: Alireza Abdoli, Amy C. Murillo, Alec C. Gerry, Eamonn J. Keogh

    Abstract: Poultry farms are a major contributor to the human food chain. However, around the world, there have been growing concerns about the quality of life for the livestock in poultry farms; and increasingly vocal demands for improved standards of animal welfare. Recent advances in sensing technologies and machine learning allow the possibility of monitoring birds, and employing the lessons learned to i…

    Submitted 20 December, 2019; v1 submitted 21 November, 2019; originally announced December 2019.

    Comments: arXiv admin note: text overlap with arXiv:1811.03149

  15. arXiv:1910.04341  [pdf, other]

    cs.LG stat.ML

    Time series classification for varying length series

    Authors: Chang Wei Tan, Francois Petitjean, Eamonn Keogh, Geoffrey I. Webb

    Abstract: Research into time series classification has tended to focus on the case of series of uniform length. However, it is common for real-world time series data to have unequal lengths. Differing time series lengths may arise from a number of fundamentally different mechanisms. In this work, we identify and evaluate two classes of such mechanisms -- variations in sampling rate relative to the relevant…

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: 23 pages

  16. arXiv:1811.03149  [pdf]

    cs.LG stat.ML

    Time Series Classification to Improve Poultry Welfare

    Authors: Alireza Abdoli, Amy C. Murillo, Chin-Chia M. Yeh, Alec C. Gerry, Eamonn J. Keogh

    Abstract: Poultry farms are an important contributor to the human food chain. Worldwide, humankind keeps an enormous number of domesticated birds (e.g. chickens) for their eggs and their meat, providing rich sources of low-fat protein. However, around the world, there have been growing concerns about the quality of life for the livestock in poultry farms; and increasingly vocal demands for improved standard…

    Submitted 7 November, 2018; originally announced November 2018.

  17. arXiv:1811.01557  [pdf, other]

    cs.LG cs.AI stat.ML

    Representation Learning by Reconstructing Neighborhoods

    Authors: Chin-Chia Michael Yeh, Yan Zhu, Evangelos E. Papalexakis, Abdullah Mueen, Eamonn Keogh

    Abstract: Since its introduction, unsupervised representation learning has attracted a lot of attention from the research community, as it is demonstrated to be highly effective and easy-to-apply in tasks such as dimension reduction, clustering, visualization, information retrieval, and semi-supervised learning. In this work, we propose a novel unsupervised representation learning framework called neighbor-…

    Submitted 6 November, 2018; v1 submitted 5 November, 2018; originally announced November 2018.

  18. arXiv:1811.00075  [pdf, other]

    cs.LG stat.ML

    The UEA multivariate time series classification archive, 2018

    Authors: Anthony Bagnall, Hoang Anh Dau, Jason Lines, Michael Flynn, James Large, Aaron Bostrom, Paul Southam, Eamonn Keogh

    Abstract: In 2002, the UCR time series classification archive was first released with sixteen datasets. It gradually expanded, until 2015 when it increased in size from 45 datasets to 85 datasets. In October 2018 more datasets were added, bringing the total to 128. The new archive contains a wide range of problems, including variable length series, but it still only contains univariate time series classific…

    Submitted 31 October, 2018; originally announced November 2018.

  19. arXiv:1810.07758  [pdf, other]

    cs.LG stat.ML

    The UCR Time Series Archive

    Authors: Hoang Anh Dau, Anthony Bagnall, Kaveh Kamgar, Chin-Chia Michael Yeh, Yan Zhu, Shaghayegh Gharghabi, Chotirat Ann Ratanamahatana, Eamonn Keogh

    Abstract: The UCR Time Series Archive - introduced in 2002, has become an important resource in the time series data mining community, with at least one thousand published papers making use of at least one data set from the archive. The original incarnation of the archive had sixteen data sets but since that time, it has gone through periodic expansions. The last expansion took place in the summer of 2015 w…

    Submitted 8 September, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

  20. arXiv:1802.05472  [pdf]

    cs.LG cs.AI stat.ML

    Admissible Time Series Motif Discovery with Missing Data

    Authors: Yan Zhu, Abdullah Mueen, Eamonn Keogh

    Abstract: The discovery of time series motifs has emerged as one of the most useful primitives in time series data mining. Researchers have shown its utility for exploratory data mining, summarization, visualization, segmentation, classification, clustering, and rule discovery. Although there has been more than a decade of extensive research, there is still no technique to allow the discovery of time series…

    Submitted 15 February, 2018; originally announced February 2018.

  21. arXiv:1711.05586  [pdf, other]

    cs.CV

    People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting

    Authors: Mark Marsden, Kevin McGuinness, Suzanne Little, Ciara E. Keogh, Noel E. O'Connor

    Abstract: In this paper we propose a technique to adapt a convolutional neural network (CNN) based object counter to additional visual domains and object types while still preserving the original counting function. Domain-specific normalisation and scaling operators are trained to allow the model to adjust to the statistical distributions of the various visual domains. The developed adaptation technique is…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: 10 pages

  22. arXiv:1612.00637  [pdf]

    cs.LG

    A General Framework for Density Based Time Series Clustering Exploiting a Novel Admissible Pruning Strategy

    Authors: Nurjahan Begum, Liudmila Ulanova, Hoang Anh Dau, Jun Wang, Eamonn Keogh

    Abstract: Time Series Clustering is an important subroutine in many higher-level data mining analyses, including data editing for classifiers, summarization, and outlier detection. It is well known that for similarity search the superiority of Dynamic Time Warping (DTW) over Euclidean distance gradually diminishes as we consider ever larger datasets. However, as we shall show, the same is not true for clust…

    Submitted 2 December, 2016; originally announced December 2016.

  23. arXiv:1403.2654  [pdf]

    cs.LG cs.CE

    Flying Insect Classification with Inexpensive Sensors

    Authors: Yanping Chen, Adena Why, Gustavo Batista, Agenor Mafra-Neto, Eamonn Keogh

    Abstract: The ability to use inexpensive, noninvasive sensors to accurately classify flying insects would have significant implications for entomological research, and allow for the development of many useful applications in vector control for both medical and agricultural entomology. Given this, the last sixty years have seen many research efforts on this task. To date, however, none of this research has h…

    Submitted 11 March, 2014; originally announced March 2014.

    MSC Class: 68T00; ACM Class: I.2.6

  24. arXiv:1012.2789  [pdf, ps, other]

    cs.AI

    Experimental Comparison of Representation Methods and Distance Measures for Time Series Data

    Authors: Xiaoyue Wang, Hui Ding, Goce Trajcevski, Peter Scheuermann, Eamonn Keogh

    Abstract: The previous decade has brought a remarkable increase of the interest in applications that deal with querying and mining of time series data. Many of the research efforts in this context have focused on introducing new representation methods for dimensionality reduction or novel similarity measures for the underlying data. In the vast majority of cases, each individual work introducing a particula…

    Submitted 9 December, 2010; originally announced December 2010.