Showing 1–9 of 9 results for author: Güttel, S

Searching in archive cs.
  1. arXiv:2212.07679  [pdf, other]

    cs.IR cs.DS

    Fast and exact fixed-radius neighbor search based on sorting

    Authors: Xinye Chen, Stefan Güttel

    Abstract: Fixed-radius near neighbor search is a fundamental data operation that retrieves all data points within a user-specified distance to a query point. There are efficient algorithms that can provide fast approximate query responses, but they often have a very compute-intensive indexing phase and require careful parameter tuning. Therefore, exact brute force and tree-based search methods are still wid…

    Submitted 29 January, 2024; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: arXiv admin note: text overlap with arXiv:2202.01456

  2. arXiv:2202.01456  [pdf, other]

    cs.LG cs.DS stat.CO stat.ML

    Fast and explainable clustering based on sorting

    Authors: Xinye Chen, Stefan Güttel

    Abstract: We introduce a fast and explainable clustering method called CLASSIX. It consists of two phases, namely a greedy aggregation phase of the sorted data into groups of nearby data points, followed by the merging of groups into clusters. The algorithm is controlled by two scalar parameters, namely a distance parameter for the aggregation and another parameter controlling the minimal cluster size. Exte…

    Submitted 15 February, 2024; v1 submitted 3 February, 2022; originally announced February 2022.

  3. arXiv:2201.05697  [pdf, other]

    cs.LG cs.MS

    An efficient aggregation method for the symbolic representation of temporal data

    Authors: Xinye Chen, Stefan Güttel

    Abstract: Symbolic representations are a useful tool for the dimension reduction of temporal data, allowing for the efficient storage of and information retrieval from time series. They can also enhance the training of machine learning algorithms on time series data through noise reduction and reduced sensitivity to hyperparameters. The adaptive Brownian bridge-based aggregation (ABBA) method is one such ef…

    Submitted 14 January, 2022; originally announced January 2022.

  4. arXiv:2111.11251  [pdf]

    cs.LG eess.SY

    Machine Learning-Based Soft Sensors for Vacuum Distillation Unit

    Authors: Kamil Oster, Stefan Güttel, Lu Chen, Jonathan L. Shapiro, Megan Jobson

    Abstract: Product quality assessment in the petroleum processing industry can be difficult and time-consuming, e.g. due to a manual collection of liquid samples from the plant and subsequent chemical laboratory analysis of the samples. The product quality is an important property that informs whether the products of the process are within the specifications. In particular, the delays caused by sample proces…

    Submitted 19 November, 2021; originally announced November 2021.

    Comments: 9 pages; 7 figures; Conference Proceedings of 2021 AIChE Annual Meeting (7th - 19th November 2021)

  5. A comparison of LSTM and GRU networks for learning symbolic sequences

    Authors: Roberto Cahuantzi, Xinye Chen, Stefan Güttel

    Abstract: We explore the architecture of recurrent neural networks (RNNs) by studying the complexity of string sequences it is able to memorize. Symbolic sequences of different complexity are generated to simulate RNN training and study parameter configurations with a view to the network's capability of learning and inference. We compare Long Short-Term Memory (LSTM) networks and gated recurrent units (GRUs…

    Submitted 4 January, 2023; v1 submitted 5 July, 2021; originally announced July 2021.

    MSC Class: 68T10 ACM Class: I.2.6; I.5.1

  6. arXiv:2106.14641  [pdf]

    stat.AP cs.LG

    Pre-treatment of outliers and anomalies in plant data: Methodology and case study of a Vacuum Distillation Unit

    Authors: Kamil Oster, Stefan Güttel, Jonathan L. Shapiro, Lu Chen, Megan Jobson

    Abstract: Data pre-treatment plays a significant role in improving data quality, thus allowing extraction of accurate information from raw data. One of the data pre-treatment techniques commonly used is outliers detection. The so-called 3$σ$ method is a common practice to identify the outliers. As shown in the manuscript, it does not identify all outliers, resulting in possible distortion of the overall sta…

    Submitted 17 June, 2021; originally announced June 2021.

    Comments: 33 pages, 20 figures, submitted to the Journal of Process Control (ref: JPROCONT-D-21-00332)

  7. arXiv:2003.12469  [pdf, other]

    cs.LG stat.ML

    ABBA: Adaptive Brownian bridge-based symbolic aggregation of time series

    Authors: Steven Elsworth, Stefan Güttel

    Abstract: A new symbolic representation of time series, called ABBA, is introduced. It is based on an adaptive polygonal chain approximation of the time series into a sequence of tuples, followed by a mean-based clustering to obtain the symbolic representation. We show that the reconstruction error of this representation can be modelled as a random walk with pinned start and end points, a so-called Brownian…

    Submitted 27 March, 2020; originally announced March 2020.

    Comments: 18 pages, 13 figures

  8. arXiv:2003.05672  [pdf, other]

    cs.LG stat.ML

    Time Series Forecasting Using LSTM Networks: A Symbolic Approach

    Authors: Steven Elsworth, Stefan Güttel

    Abstract: Machine learning methods trained on raw numerical time series data exhibit fundamental limitations such as a high sensitivity to the hyper parameters and even to the initialization of random weights. A combination of a recurrent neural network with a dimension-reducing symbolic representation is proposed and applied for the purpose of time series forecasting. It is shown that the symbolic represen…

    Submitted 12 March, 2020; originally announced March 2020.

    Comments: 12 pages, 17 figures

  9. arXiv:1407.8078  [pdf, ps, other]

    math.NA cs.MS

    Zolotarev Quadrature Rules and Load Balancing for the FEAST Eigensolver

    Authors: Stefan Guettel, Eric Polizzi, Ping Tak Peter Tang, Gautier Viaud

    Abstract: The FEAST method for solving large sparse eigenproblems is equivalent to subspace iteration with an approximate spectral projector and implicit orthogonalization. This relation allows to characterize the convergence of this method in terms of the error of a certain rational approximant to an indicator function. We propose improved rational approximants leading to FEAST variants with faster converg…

    Submitted 30 July, 2014; originally announced July 2014.

    Comments: 22 pages, 8 figures