-
Insect Identification in the Wild: The AMI Dataset
Authors:
Aditya Jain,
Fagner Cunha,
Michael James Bunsen,
Juan Sebastián Cañas,
Léonard Pasi,
Nathan Pinoy,
Flemming Helsing,
JoAnne Russo,
Marc Botham,
Michael Sabourin,
Jonathan Fréchette,
Alexandre Anctil,
Yacksecari Lopez,
Eduardo Navarro,
Filonila Perez Pimentel,
Ana Cecilia Zamora,
José Alejandro Ramirez Silva,
Jonathan Gagnon,
Tom August,
Kim Bjerge,
Alba Gomez Segura,
Marc Bélisle,
Yves Basset,
Kent P. McFarland,
David Roy
, et al. (3 additional authors not shown)
Abstract:
Insects represent half of all global biodiversity, yet many of the world's insects are disappearing, with severe implications for ecosystems and agriculture. Despite this crisis, data on insect diversity and abundance remain woefully inadequate, due to the scarcity of human experts and the lack of scalable tools for monitoring. Ecologists have started to adopt camera traps to record and study insects, and have proposed computer vision algorithms as an answer for scalable data processing. However, insect monitoring in the wild poses unique challenges that have not yet been addressed within computer vision, including the combination of long-tailed data, extremely similar classes, and significant distribution shifts. We provide the first large-scale machine learning benchmarks for fine-grained insect recognition, designed to match real-world tasks faced by ecologists. Our contributions include a curated dataset of images from citizen science platforms and museums, and an expert-annotated dataset drawn from automated camera traps across multiple continents, designed to test out-of-distribution generalization under field conditions. We train and evaluate a variety of baseline algorithms and introduce a combination of data augmentation techniques that enhance generalization across geographies and hardware setups. Code and datasets are made publicly available.
Submitted 18 June, 2024;
originally announced June 2024.
-
AnuraSet: A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring
Authors:
Juan Sebastián Cañas,
Maria Paula Toro-Gómez,
Larissa Sayuri Moreira Sugai,
Hernán Darío Benítez Restrepo,
Jorge Rudas,
Breyner Posso Bautista,
Luís Felipe Toledo,
Simone Dena,
Adão Henrique Rosa Domingos,
Franco Leandro de Souza,
Selvino Neckel-Oliveira,
Anderson da Rosa,
Vítor Carvalho-Rocha,
José Vinícius Bernardy,
José Luiz Massao Moreira Sugai,
Carolina Emília dos Santos,
Rogério Pereira Bastos,
Diego Llusia,
Juan Sebastián Ulloa
Abstract:
Global change is predicted to induce shifts in anuran acoustic behavior, which can be studied through passive acoustic monitoring (PAM). Understanding changes in calling behavior requires the identification of anuran species, which is challenging due to the particular characteristics of neotropical soundscapes. In this paper, we introduce a large-scale multi-species dataset of anuran amphibian calls recorded by PAM, comprising 27 hours of expert annotations for 42 different species from two Brazilian biomes. We provide open access to the dataset, including the raw recordings, experimental setup code, and a benchmark with a baseline model of the fine-grained categorization problem. Additionally, we highlight the challenges of the dataset to encourage machine learning researchers to solve the problem of anuran call identification towards conservation policy. All our experiments and resources can be found on our GitHub repository https://github.com/soundclim/anuraset.
Submitted 11 July, 2023;
originally announced July 2023.
-
Counterfactual Explanations and Predictive Models to Enhance Clinical Decision-Making in Schizophrenia using Digital Phenotyping
Authors:
Juan Sebastian Canas,
Francisco Gomez,
Omar Costilla-Reyes
Abstract:
Clinical practice in psychiatry is burdened with the increased demand for healthcare services and the scarce resources available. New paradigms of health data powered with machine learning techniques could open the possibility to improve clinical workflow in critical stages of clinical assessment and treatment in psychiatry. In this work, we propose a machine learning system capable of predicting, detecting, and explaining individual changes in symptoms of patients with schizophrenia by using behavioral digital phenotyping data. We forecast symptoms of patients with an error rate below 10%. The system detects decreases in symptoms using change-point algorithms and uses counterfactual explanations as a recourse in a simulated continuous monitoring scenario in healthcare. Overall, this study offers valuable insights into the performance and potential of counterfactual explanations, predictive models, and change-point detection within a simulated clinical workflow. These findings lay the foundation for further research to explore additional facets of the workflow, aiming to enhance its effectiveness and applicability in real-world healthcare settings. By leveraging these components, the goal is to develop an actionable, interpretable, and trustworthy integrative decision support system that combines real-time clinical assessments with sensor-based inputs.
Submitted 6 June, 2023;
originally announced June 2023.