Showing 1–2 of 2 results for author: Loh, J M

  1. arXiv:1907.13276  [pdf, other]

    cs.LG stat.ML

    Are Outlier Detection Methods Resilient to Sampling?

    Authors: Laure Berti-Equille, Ji Meng Loh, Saravanan Thirumuruganathan

    Abstract: Outlier detection is a fundamental task in data mining and has many applications including detecting errors in databases. While there has been extensive prior work on methods for outlier detection, modern datasets often have sizes that are beyond the ability of commonly used methods to process the data within a reasonable time. To overcome this issue, outlier detection methods can be trained over…

    Submitted 30 July, 2019; originally announced July 2019.

    Comments: 18 pages

  2. arXiv:1208.1932  [pdf, other]

    cs.DB

    Statistical Distortion: Consequences of Data Cleaning

    Authors: Tamraparni Dasu, Ji Meng Loh

    Abstract: We introduce the notion of statistical distortion as an essential metric for measuring the effectiveness of data cleaning strategies. We use this metric to propose a widely applicable yet scalable experimental framework for evaluating data cleaning strategies along three dimensions: glitch improvement, statistical distortion and cost-related criteria. Existing metrics focus on glitch improvement a…

    Submitted 9 August, 2012; originally announced August 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 11, pp. 1674-1683 (2012)