-
AI Competitions and Benchmarks, Practical issues: Proposals, grant money, sponsors, prizes, dissemination, publicity
Authors:
Magali Richard,
Yuna Blum,
Justin Guinney,
Gustavo Stolovitzky,
Adrien Pavão
Abstract:
This chapter provides a comprehensive overview of the pragmatic aspects involved in organizing AI competitions. We begin by discussing strategies to incentivize participation, touching upon effective communication techniques, aligning with trending topics in the field, structuring awards, potential recruitment opportunities, and more. We then shift to the essence of community engagement, and into…
▽ More
This chapter provides a comprehensive overview of the pragmatic aspects involved in organizing AI competitions. We begin by discussing strategies to incentivize participation, touching upon effective communication techniques, aligning with trending topics in the field, structuring awards, potential recruitment opportunities, and more. We then shift to the essence of community engagement, and into organizational best practices and effective means of disseminating challenge outputs. Lastly, the chapter addresses the logistics, exposing on costs, required manpower, and resource allocation for effectively managing and executing a challenge. By examining these practical problems, readers will gain actionable insights to navigate the multifaceted landscape of AI competition organization, from inception to completion.
△ Less
Submitted 9 January, 2024;
originally announced January 2024.
-
Unmasking information manipulation: A quantitative approach to detecting Copy-pasta, Rewording, and Translation on Social Media
Authors:
Manon Richard,
Lisa Giordani,
Cristian Brokate,
Jean Liénard
Abstract:
This study proposes a comprehensive methodology for identifying three techniques utilized in foreign-operated information manipulation campaigns: Copy-Pasta, Rewording, and Translation. Our approach, dubbed the ``$3Δ$-space duplicate methodology'', quantifies the semantic, grapheme, and language aspects of messages. Computing pairwise distances within these dimensions enables detection of abnormal…
▽ More
This study proposes a comprehensive methodology for identifying three techniques utilized in foreign-operated information manipulation campaigns: Copy-Pasta, Rewording, and Translation. Our approach, dubbed the ``$3Δ$-space duplicate methodology'', quantifies the semantic, grapheme, and language aspects of messages. Computing pairwise distances within these dimensions enables detection of abnormally close messages that are likely part of a coordinated campaign. We validate our approach using a synthetic dataset generated with ChatGPT and DeepL, further applying it to a real-world dataset on Venezuelan actors from Twitter Transparency. Our method successfully identifies all three types of inauthentic duplicates in the synthetic dataset, and is able to uncover inauthentic duplicates across political, commercial, and entertainment contexts in the Twitter dataset. The distinct focus on clustered alterations to messages, rather than individual messages, makes our approach efficient and effective at detecting large-scale instances of textual manipulation, including AI-generated ones. Moreover, our method offers a robust tool for identifying translated content, overlooked in previous research. This research also represents the first comprehensive analysis of copy-pasta detection, providing a reliable technique for tracking duplicate textual content across social networks.
△ Less
Submitted 28 December, 2023;
originally announced December 2023.
-
Codabench: Flexible, Easy-to-Use and Reproducible Benchmarking Platform
Authors:
Zhen Xu,
Sergio Escalera,
Isabelle Guyon,
Adrien Pavão,
Magali Richard,
Wei-Wei Tu,
Quanming Yao,
Huan Zhao
Abstract:
Obtaining standardized crowdsourced benchmark of computational methods is a major issue in data science communities. Dedicated frameworks enabling fair benchmarking in a unified environment are yet to be developed. Here we introduce Codabench, an open-source, community-driven platform for benchmarking algorithms or software agents versus datasets or tasks. A public instance of Codabench (https://w…
▽ More
Obtaining standardized crowdsourced benchmark of computational methods is a major issue in data science communities. Dedicated frameworks enabling fair benchmarking in a unified environment are yet to be developed. Here we introduce Codabench, an open-source, community-driven platform for benchmarking algorithms or software agents versus datasets or tasks. A public instance of Codabench (https://www.codabench.org/) is open to everyone, free of charge, and allows benchmark organizers to compare fairly submissions, under the same setting (software, hardware, data, algorithms), with custom protocols and data formats. Codabench has unique features facilitating the organization of benchmarks flexibly, easily and reproducibly, such as the possibility of re-using templates of benchmarks, and supplying compute resources on-demand. Codabench has been used internally and externally on various applications, receiving more than 130 users and 2500 submissions. As illustrative use cases, we introduce 4 diverse benchmarks covering Graph Machine Learning, Cancer Heterogeneity, Clinical Diagnosis and Reinforcement Learning.
△ Less
Submitted 25 February, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
-
A Deep Learning Approach for Determining Effects of Tuta Absoluta in Tomato Plants
Authors:
Denis P. Rubanga,
Loyani K. Loyani,
Mgaya Richard,
Sawahiko Shimada
Abstract:
Early quantification of Tuta absoluta pest's effects in tomato plants is a very important factor in controlling and preventing serious damages of the pest. The invasion of Tuta absoluta is considered a major threat to tomato production causing heavy loss ranging from 80 to 100 percent when not properly managed. Therefore, real-time and early quantification of tomato leaf miner Tuta absoluta, can p…
▽ More
Early quantification of Tuta absoluta pest's effects in tomato plants is a very important factor in controlling and preventing serious damages of the pest. The invasion of Tuta absoluta is considered a major threat to tomato production causing heavy loss ranging from 80 to 100 percent when not properly managed. Therefore, real-time and early quantification of tomato leaf miner Tuta absoluta, can play an important role in addressing the issue of pest management and enhance farmers' decisions. In this study, we propose a Convolutional Neural Network (CNN) approach in determining the effects of Tuta absoluta in tomato plants. Four CNN pre-trained architectures (VGG16, VGG19, ResNet and Inception-V3) were used in training classifiers on a dataset containing health and infested tomato leaves collected from real field experiments. Among the pre-trained architectures, experimental results showed that Inception-V3 yielded the best results with an average accuracy of 87.2 percent in estimating the severity status of Tuta absoluta in tomato plants. The pre-trained models could also easily identify High Tuta severity status compared to other severity status (Low tuta and No tuta)
△ Less
Submitted 8 April, 2020;
originally announced April 2020.