Showing 1–4 of 4 results for author: Lorena, A C

Searching in archive cs.
  1. arXiv:2212.01897  [pdf, other]

    cs.LG

    Characterizing instance hardness in classification and regression problems

    Authors: Gustavo P. Torquette, Victor S. Nunes, Pedro Y. A. Paiva, Lourenço B. C. Neto, Ana C. Lorena

    Abstract: Some recent pieces of work in the Machine Learning (ML) literature have demonstrated the usefulness of assessing which observations are hardest to have their label predicted accurately. By identifying such instances, one may inspect whether they have any quality issues that should be addressed. Learning strategies based on the difficulty level of the observations can also be devised. This paper pr…

    Submitted 4 December, 2022; originally announced December 2022.

  2. arXiv:2201.09936  [pdf, other]

    cs.SI cs.LG

    Community-based anomaly detection using spectral graph filtering

    Authors: Rodrigo Francisquini, Ana Carolina Lorena, Mariá C. V. Nascimento

    Abstract: Several applications have a community structure where the nodes of the same community share similar attributes. Anomaly or outlier detection in networks is a relevant and widely studied research topic with applications in various domains. Despite a significant amount of anomaly detection frameworks, there is a dearth on the literature of methods that consider both attributed graphs and the communi…

    Submitted 24 January, 2022; originally announced January 2022.

  3. arXiv:2109.14430  [pdf, other]

    cs.LG cs.HC

    PyHard: a novel tool for generating hardness embeddings to support data-centric analysis

    Authors: Pedro Yuri Arbs Paiva, Kate Smith-Miles, Maria Gabriela Valeriano, Ana Carolina Lorena

    Abstract: For building successful Machine Learning (ML) systems, it is imperative to have high quality data and well tuned learning models. But how can one assess the quality of a given dataset? And how can the strengths and weaknesses of a model on a dataset be revealed? Our new tool PyHard employs a methodology known as Instance Space Analysis (ISA) to produce a hardness embedding of a dataset relating th…

    Submitted 29 September, 2021; originally announced September 2021.

  4. arXiv:1808.03591  [pdf, other]

    cs.LG stat.ML

    How Complex is your classification problem? A survey on measuring classification complexity

    Authors: Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, Tin K. Ho

    Abstract: Characteristics extracted from the training datasets of classification problems have proven to be effective predictors in a number of meta-analyses. Among them, measures of classification complexity can be used to estimate the difficulty in separating the data points into their expected classes. Descriptors of the spatial distribution of the data and estimates of the shape and size of the decision…

    Submitted 30 December, 2020; v1 submitted 10 August, 2018; originally announced August 2018.

    Comments: Survey paper