Showing 1–17 of 17 results for author: Leite, A

Searching in archive cs.
  1. arXiv:2406.12614  [pdf, other]

    cs.CL cs.LG

    EUvsDisinfo: a Dataset for Multilingual Detection of Pro-Kremlin Disinformation in News Articles

    Authors: João A. Leite, Olesya Razuvayevskaya, Kalina Bontcheva, Carolina Scarton

    Abstract: This work introduces EUvsDisinfo, a multilingual dataset of trustworthy and disinformation articles related to pro-Kremlin themes. It is sourced directly from the debunk articles written by experts leading the EUvsDisinfo project. Our dataset is the largest to-date resource in terms of the overall number of articles and distinct languages. It also provides the largest topical and temporal coverage…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 4 pages, 3 figures, 2 tables

  2. arXiv:2405.05025  [pdf, other]

    stat.ML cs.LG

    Learning Structural Causal Models through Deep Generative Models: Methods, Guarantees, and Challenges

    Authors: Audrey Poinsot, Alessandro Leite, Nicolas Chesneau, Michèle Sébag, Marc Schoenauer

    Abstract: This paper provides a comprehensive review of deep structural causal models (DSCMs), particularly focusing on their ability to answer counterfactual queries using observational data within known causal structures. It delves into the characteristics of DSCMs by analyzing the hypotheses, guarantees, and applications inherent to the underlying deep learning components and structural causal models, fo…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted to the 33rd International Joint Conference on Artificial Intelligence

  3. arXiv:2401.07733  [pdf, other]

    stat.ML cs.LG

    Conformal Approach To Gaussian Process Surrogate Evaluation With Coverage Guarantees

    Authors: Edgar Jaber, Vincent Blot, Nicolas Brunel, Vincent Chabridon, Emmanuel Remy, Bertrand Iooss, Didier Lucor, Mathilde Mougeot, Alessandro Leite

    Abstract: Gaussian processes (GPs) are a Bayesian machine learning approach widely used to construct surrogate models for the uncertainty quantification of computer simulation codes in industrial applications. It provides both a mean predictor and an estimate of the posterior prediction variance, the latter being used to produce Bayesian credibility intervals. Interpreting these intervals relies on the Gaus…

    Submitted 15 January, 2024; originally announced January 2024.

  4. arXiv:2312.03817  [pdf, other]

    cs.CV

    Diffusion Illusions: Hiding Images in Plain Sight

    Authors: Ryan Burgert, Xiang Li, Abe Leite, Kanchana Ranasinghe, Michael S. Ryoo

    Abstract: We explore the problem of computationally generating special `prime' images that produce optical illusions when physically arranged and viewed in a certain way. First, we propose a formal definition for this problem. Next, we introduce Diffusion Illusions, the first comprehensive pipeline designed to automatically generate a wide range of these illusions. Specifically, we both adapt the existing `…

    Submitted 6 December, 2023; originally announced December 2023.

  5. arXiv:2311.10885  [pdf, ps, other]

    cs.CV cs.RO

    A Video-Based Activity Classification of Human Pickers in Agriculture

    Authors: Abhishesh Pal, Antonio C. Leite, Jon G. O. Gjevestad, Pål J. From

    Abstract: In farming systems, harvesting operations are tedious, time- and resource-consuming tasks. Based on this, deploying a fleet of autonomous robots to work alongside farmworkers may provide vast productivity and logistics benefits. Then, an intelligent robotic system should monitor human behavior, identify the ongoing activities and anticipate the worker's needs. In this work, the main contribution c…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 4 pages, 6 figures, 3 tables

  6. arXiv:2310.08982  [pdf]

    cs.AI cs.CE eess.SY

    Big data-driven prediction of airspace congestion

    Authors: Samet Ayhan, Ítalo Romani de Oliveira, Glaucia Balvedi, Pablo Costas, Alexandre Leite, Felipe C. F. de Azevedo

    Abstract: Air Navigation Service Providers (ANSP) worldwide have been making a considerable effort for the development of a better method to measure and predict aircraft counts within a particular airspace, also referred to as airspace density. An accurate measurement and prediction of airspace density is crucial for a better managed airspace, both strategically and tactically, yielding a higher level of au…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: Submitted to the 2023 IEEE/AIAA Digital Aviation Systems Conference (DASC)

  7. arXiv:2309.07601  [pdf, other]

    cs.CL cs.AI cs.LG

    Detecting Misinformation with LLM-Predicted Credibility Signals and Weak Supervision

    Authors: João A. Leite, Olesya Razuvayevskaya, Kalina Bontcheva, Carolina Scarton

    Abstract: Credibility signals represent a wide range of heuristics that are typically used by journalists and fact-checkers to assess the veracity of online content. Automating the task of credibility signal extraction, however, is very challenging as it requires high-accuracy signal-specific extractors to be trained, while there are currently no sufficiently large datasets annotated with all credibility si…

    Submitted 14 September, 2023; originally announced September 2023.

  8. Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification

    Authors: Olesya Razuvayevskaya, Ben Wu, Joao A. Leite, Freddy Heppell, Ivan Srba, Carolina Scarton, Kalina Bontcheva, Xingyi Song

    Abstract: Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient. Previous results demonstrated that these methods can even improve performance on some classification tasks. This paper complements the existing research by investigating how these techniques influence the classification performance and computation…

    Submitted 8 April, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Journal ref: PLOS ONE 2024

  9. arXiv:2307.16609  [pdf, other]

    cs.CL cs.LG cs.SI

    Noisy Self-Training with Data Augmentations for Offensive and Hate Speech Detection Tasks

    Authors: João A. Leite, Carolina Scarton, Diego F. Silva

    Abstract: Online social media is rife with offensive and hateful comments, prompting the need for their automatic detection given the sheer amount of posts created every second. Creating high-quality human-labelled datasets for this task is difficult and costly, especially because non-offensive posts are significantly more frequent than offensive ones. However, unlabelled data is abundant, easier, and cheap…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted to RANLP 2023

  10. arXiv:2304.01237  [pdf, other]

    cs.LG stat.ME

    A Guide for Practical Use of ADMG Causal Data Augmentation

    Authors: Audrey Poinsot, Alessandro Leite

    Abstract: Data augmentation is essential when applying Machine Learning in small-data regimes. It generates new samples following the observed data distribution while increasing their diversity and variability to help researchers and practitioners improve their models' robustness and, thus, deploy them in the real world. Nevertheless, its usage in tabular data still needs to be improved, as prior knowledge…

    Submitted 3 April, 2023; originally announced April 2023.

    Journal ref: Workshop on the pitfalls of limited data and computation for Trustworthy ML, ICLR 2023, Kigali, Rwanda

  11. SheffieldVeraAI at SemEval-2023 Task 3: Mono and multilingual approaches for news genre, topic and persuasion technique classification

    Authors: Ben Wu, Olesya Razuvayevskaya, Freddy Heppell, João A. Leite, Carolina Scarton, Kalina Bontcheva, Xingyi Song

    Abstract: This paper describes our approach for SemEval-2023 Task 3: Detecting the category, the framing, and the persuasion techniques in online news in a multi-lingual setup. For Subtask 1 (News Genre), we propose an ensemble of fully trained and adapter mBERT models which was ranked joint-first for German, and had the highest mean rank of multi-language teams. For Subtask 2 (Framing), we achieved first p…

    Submitted 9 May, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Journal ref: Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 1995-2008, Toronto, Canada. Association for Computational Linguistics

  12. arXiv:2208.00087  [pdf, ps, other]

    cs.LG cs.CC cs.CV eess.SP stat.ME

    Low-complexity Approximate Convolutional Neural Networks

    Authors: R. J. Cintra, S. Duffner, C. Garcia, A. Leite

    Abstract: In this paper, we present an approach for minimizing the computational complexity of trained Convolutional Neural Networks (ConvNet). The idea is to approximate all elements of a given ConvNet and replace the original convolutional filters and parameters (pooling and bias coefficients; and activation function) with efficient approximations capable of extreme reductions in computational complexity.…

    Submitted 29 July, 2022; originally announced August 2022.

    Comments: 13 pages, 4 figures, 8 tables

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, v. 29, n. 12, Dec. 2018

  13. Effects of Human vs. Automatic Feedback on Students' Understanding of AI Concepts and Programming Style

    Authors: Abe Leite, Saúl A. Blanco

    Abstract: The use of automatic grading tools has become nearly ubiquitous in large undergraduate programming courses, and recent work has focused on improving the quality of automatically generated feedback. However, there is a relative lack of data directly comparing student outcomes when receiving computer-generated feedback and human-written feedback. This paper addresses this gap by splitting one 90-stu…

    Submitted 20 November, 2020; originally announced November 2020.

    Comments: Published in SIGCSE '20: Proceedings of the 51st ACM Technical Symposium on Computer Science Education

    ACM Class: K.3.2

    Journal ref: SIGCSE '20: Proceedings of the 51st ACM Technical Symposium on Computer Science Education (Feb 2020) 44-50

  14. arXiv:2010.04543  [pdf, other]

    cs.CL cs.LG cs.SI

    Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis

    Authors: João A. Leite, Diego F. Silva, Kalina Bontcheva, Carolina Scarton

    Abstract: Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, the minority in these platforms, they are still capable of causing harm. Therefore, identifying these comments is an important task for studying and preventing the proliferation of toxicity in social media. Previous work in automatically detecting toxic comments focuses mainl…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: Accepted to AACL-IJCNLP 2020

  15. Sabrina: Modeling and Visualization of Economy Data with Incremental Domain Knowledge

    Authors: Alessio Arleo, Christos Tsigkanos, Chao Jia, Roger A. Leite, Ilir Murturi, Manfred Klaffenboeck, Schahram Dustdar, Michael Wimmer, Silvia Miksch, Johannes Sorger

    Abstract: Investment planning requires knowledge of the financial landscape on a large scale, both in terms of geo-spatial and industry sector distribution. There is plenty of data available, but it is scattered across heterogeneous sources (newspapers, open data, etc.), which makes it difficult for financial analysts to understand the big picture. In this paper, we present Sabrina, a financial data analysi…

    Submitted 8 January, 2020; v1 submitted 5 August, 2019; originally announced August 2019.

  16. arXiv:1901.00761  [pdf, other]

    cs.RO

    Robotic Tankette for Intelligent BioEnergy Agriculture: Design, Development and Field Tests

    Authors: Marco F. S. Xaud, Antonio C. Leite, Evelyn S. Barbosa, Henrique D. Faria, Gabriel S. M. Loureiro, Pål J. From

    Abstract: In recent years, the use of robots in agriculture has been increasing mainly due to the high demand of productivity, precision and efficiency, which follow the climate change effects and world population growth. Unlike conventional agriculture, sugarcane farms are usually regions with dense vegetation, gigantic areas, and subjected to extreme weather conditions, such as intense heat, moisture and…

    Submitted 3 January, 2019; originally announced January 2019.

    Comments: 9 pages, 15 figures

    MSC Class: 68T40 (Primary) 70B15 (Secondary)

    Journal ref: The XXII Brazilian Conference on Automation. SBA, 2018

  17. arXiv:1808.02950  [pdf, other]

    eess.IV cs.MM eess.SP stat.CO

    Low-complexity 8-point DCT Approximation Based on Angle Similarity for Image and Video Coding

    Authors: R. S. Oliveira, R. J. Cintra, F. M. Bayer, T. L. T. da Silveira, A. Madanayake, A. Leite

    Abstract: The principal component analysis (PCA) is widely used for data decorrelation and dimensionality reduction. However, the use of PCA may be impractical in real-time applications, or in situations where energy and computing constraints are severe. In this context, the discrete cosine transform (DCT) becomes a low-cost alternative to data decorrelation. This paper presents a method to derive computatio…

    Submitted 30 January, 2024; v1 submitted 8 August, 2018; originally announced August 2018.

    Comments: Corrected typo in formula for the coding gain. 16 pages, 12 figures, 10 tables

    Journal ref: Multidimensional Systems and Signal Processing, 1-32, 2018