-
EUvsDisinfo: a Dataset for Multilingual Detection of Pro-Kremlin Disinformation in News Articles
Authors:
João A. Leite,
Olesya Razuvayevskaya,
Kalina Bontcheva,
Carolina Scarton
Abstract:
This work introduces EUvsDisinfo, a multilingual dataset of trustworthy and disinformation articles related to pro-Kremlin themes. It is sourced directly from the debunk articles written by experts leading the EUvsDisinfo project. Our dataset is the largest to-date resource in terms of the overall number of articles and distinct languages. It also provides the largest topical and temporal coverage. Using this dataset, we investigate the dissemination of pro-Kremlin disinformation across different languages, uncovering language-specific patterns targeting specific disinformation topics. We further analyse the evolution of topic distribution over an eight-year period, noting a significant surge in disinformation content before the full-scale invasion of Ukraine in 2022. Lastly, we demonstrate the dataset's applicability in training models to effectively distinguish between disinformation and trustworthy content in multilingual settings.
Submitted 18 June, 2024;
originally announced June 2024.
-
Learning Structural Causal Models through Deep Generative Models: Methods, Guarantees, and Challenges
Authors:
Audrey Poinsot,
Alessandro Leite,
Nicolas Chesneau,
Michèle Sébag,
Marc Schoenauer
Abstract:
This paper provides a comprehensive review of deep structural causal models (DSCMs), particularly focusing on their ability to answer counterfactual queries using observational data within known causal structures. It delves into the characteristics of DSCMs by analyzing the hypotheses, guarantees, and applications inherent to the underlying deep learning components and structural causal models, fostering a finer understanding of their capabilities and limitations in addressing different counterfactual queries. Furthermore, it highlights the challenges and open questions in the field of deep structural causal modeling. It sets the stage for researchers to identify future work directions and for practitioners to get an overview in order to find the most appropriate methods for their needs.
Submitted 8 May, 2024;
originally announced May 2024.
-
Conformal Approach To Gaussian Process Surrogate Evaluation With Coverage Guarantees
Authors:
Edgar Jaber,
Vincent Blot,
Nicolas Brunel,
Vincent Chabridon,
Emmanuel Remy,
Bertrand Iooss,
Didier Lucor,
Mathilde Mougeot,
Alessandro Leite
Abstract:
Gaussian processes (GPs) are a Bayesian machine learning approach widely used to construct surrogate models for the uncertainty quantification of computer simulation codes in industrial applications. It provides both a mean predictor and an estimate of the posterior prediction variance, the latter being used to produce Bayesian credibility intervals. Interpreting these intervals relies on the Gaussianity of the simulation model as well as the well-specification of the priors which are not always appropriate. We propose to address this issue with the help of conformal prediction. In the present work, a method for building adaptive cross-conformal prediction intervals is proposed by weighting the non-conformity score with the posterior standard deviation of the GP. The resulting conformal prediction intervals exhibit a level of adaptivity akin to Bayesian credibility sets and display a significant correlation with the surrogate model local approximation error, while being free from the underlying model assumptions and having frequentist coverage guarantees. These estimators can thus be used for evaluating the quality of a GP surrogate model and can assist a decision-maker in the choice of the best prior for the specific application of the GP. The performance of the method is illustrated through a panel of numerical examples based on various reference databases. Moreover, the potential applicability of the method is demonstrated in the context of surrogate modeling of an expensive-to-evaluate simulator of the clogging phenomenon in steam generators of nuclear reactors.
Submitted 15 January, 2024;
originally announced January 2024.
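The idea of weighting the non-conformity score by the GP posterior standard deviation can be illustrated with a minimal sketch. Note the assumptions: this uses a simpler split-conformal scheme with a held-out calibration set, not the cross-conformal procedure named in the abstract, and the function name and arguments are hypothetical. The GP posterior means and standard deviations are assumed to have been computed elsewhere.

```python
import numpy as np

def normalized_conformal_interval(y_cal, mu_cal, sigma_cal,
                                  mu_new, sigma_new, alpha=0.1):
    """Split-conformal interval with sigma-normalized scores (illustrative).

    y_cal, mu_cal, sigma_cal: calibration targets, GP posterior means and
    GP posterior standard deviations (assumed precomputed).
    mu_new, sigma_new: GP posterior mean and std at the new input.
    """
    y_cal = np.asarray(y_cal, dtype=float)
    mu_cal = np.asarray(mu_cal, dtype=float)
    sigma_cal = np.asarray(sigma_cal, dtype=float)

    # Non-conformity score: absolute residual scaled by the posterior std.
    scores = np.abs(y_cal - mu_cal) / sigma_cal
    n = scores.size

    # Finite-sample rank ceil((n + 1) * (1 - alpha)) of the sorted scores.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.inf if k > n else np.sort(scores)[k - 1]

    # The interval width adapts to the local posterior uncertainty.
    return mu_new - q * sigma_new, mu_new + q * sigma_new
```

Because the calibrated quantile multiplies `sigma_new`, the interval widens where the GP is uncertain and narrows where it is confident, which is the adaptivity the abstract describes.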
-
Diffusion Illusions: Hiding Images in Plain Sight
Authors:
Ryan Burgert,
Xiang Li,
Abe Leite,
Kanchana Ranasinghe,
Michael S. Ryoo
Abstract:
We explore the problem of computationally generating special `prime' images that produce optical illusions when physically arranged and viewed in a certain way. First, we propose a formal definition for this problem. Next, we introduce Diffusion Illusions, the first comprehensive pipeline designed to automatically generate a wide range of these illusions. Specifically, we both adapt the existing `score distillation loss' and propose a new `dream target loss' to optimize a group of differentially parametrized prime images, using a frozen text-to-image diffusion model. We study three types of illusions, each where the prime images are arranged in different ways and optimized using the aforementioned losses such that images derived from them align with user-chosen text prompts or images. We conduct comprehensive experiments on these illusions and verify the effectiveness of our proposed method qualitatively and quantitatively. Additionally, we showcase the successful physical fabrication of our illusions -- as they are all designed to work in the real world. Our code and examples are publicly available at our interactive project website: https://diffusionillusions.com
Submitted 6 December, 2023;
originally announced December 2023.
-
A Video-Based Activity Classification of Human Pickers in Agriculture
Authors:
Abhishesh Pal,
Antonio C. Leite,
Jon G. O. Gjevestad,
Pål J. From
Abstract:
In farming systems, harvesting operations are tedious, time- and resource-consuming tasks. Deploying a fleet of autonomous robots to work alongside farmworkers may therefore provide vast productivity and logistics benefits. Such an intelligent robotic system should monitor human behavior, identify the ongoing activities and anticipate the worker's needs. In this work, the main contribution consists of creating a benchmark model for video-based detection of human pickers, classifying their activities to serve in harvesting operations for different agricultural scenarios. Our solution uses the combination of a Mask Region-based Convolutional Neural Network (Mask R-CNN) for object detection and optical flow for motion estimation with newly added statistical attributes of flow motion descriptors, named Correlation Sensitivity (CS). A classification criterion is defined based on Kernel Density Estimation (KDE) analysis and the K-means clustering algorithm, which are applied to an in-house dataset collected from different crop fields such as strawberry polytunnels and apple tree orchards. The proposed framework is quantitatively analyzed using sensitivity, specificity, and accuracy measures and shows satisfactory results amidst various dataset challenges such as lighting variation, blur, and occlusions.
Submitted 17 November, 2023;
originally announced November 2023.
-
Big data-driven prediction of airspace congestion
Authors:
Samet Ayhan,
Ítalo Romani de Oliveira,
Glaucia Balvedi,
Pablo Costas,
Alexandre Leite,
Felipe C. F. de Azevedo
Abstract:
Air Navigation Service Providers (ANSP) worldwide have been making a considerable effort for the development of a better method to measure and predict aircraft counts within a particular airspace, also referred to as airspace density. An accurate measurement and prediction of airspace density is crucial for a better managed airspace, both strategically and tactically, yielding a higher level of automation and thereby reducing the air traffic controller's workload. Although the prior approaches have been able to address the problem to some extent, data management and query processing of ever-increasing vast volume of air traffic data at high rates, for various analytics purposes such as predicting aircraft counts, still remains a challenge especially when only linear prediction models are used.
In this paper, we present a novel data management and prediction system that accurately predicts aircraft counts for a particular airspace sector within the National Airspace System (NAS). The incoming Traffic Flow Management (TFM) data is streaming, big, uncorrelated and noisy. In the preprocessing step, the system continuously processes the incoming raw data, reduces it to a compact size, and stores it in a NoSQL database, where it makes the data available for efficient query processing. In the prediction step, the system learns from historical trajectories and uses their segments to collect key features such as sector boundary crossings, weather parameters, and other air traffic data. The features are fed into various regression models, including linear, non-linear and ensemble models, and the best performing model is used for prediction. Evaluation on an extensive set of real track, weather, and air traffic data including boundary crossings in the U.S. verify that our system efficiently and accurately predicts aircraft counts in each airspace sector.
Submitted 13 October, 2023;
originally announced October 2023.
-
Detecting Misinformation with LLM-Predicted Credibility Signals and Weak Supervision
Authors:
João A. Leite,
Olesya Razuvayevskaya,
Kalina Bontcheva,
Carolina Scarton
Abstract:
Credibility signals represent a wide range of heuristics that are typically used by journalists and fact-checkers to assess the veracity of online content. Automating the task of credibility signal extraction, however, is very challenging as it requires high-accuracy signal-specific extractors to be trained, while there are currently no sufficiently large datasets annotated with all credibility signals. This paper investigates whether large language models (LLMs) can be prompted effectively with a set of 18 credibility signals to produce weak labels for each signal. We then aggregate these potentially noisy labels using weak supervision in order to predict content veracity. We demonstrate that our approach, which combines zero-shot LLM credibility signal labeling and weak supervision, outperforms state-of-the-art classifiers on two misinformation datasets without using any ground-truth labels for training. We also analyse the contribution of the individual credibility signals towards predicting content veracity, which provides new valuable insights into their role in misinformation detection.
Submitted 14 September, 2023;
originally announced September 2023.
-
Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification
Authors:
Olesya Razuvayevskaya,
Ben Wu,
Joao A. Leite,
Freddy Heppell,
Ivan Srba,
Carolina Scarton,
Kalina Bontcheva,
Xingyi Song
Abstract:
Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient. Previous results demonstrated that these methods can even improve performance on some classification tasks. This paper complements the existing research by investigating how these techniques influence the classification performance and computation costs compared to full fine-tuning when applied to multilingual text classification tasks (genre, framing, and persuasion techniques detection; with different input lengths, number of predicted classes and classification difficulty), some of which have limited training data. In addition, we conduct in-depth analyses of their efficacy across different training scenarios (training on the original multilingual data; on the translations into English; and on a subset of English-only data) and different languages. Our findings provide valuable insights into the applicability of the parameter-efficient fine-tuning techniques, particularly to complex multilingual and multilabel classification tasks.
Submitted 8 April, 2024; v1 submitted 14 August, 2023;
originally announced August 2023.
-
Noisy Self-Training with Data Augmentations for Offensive and Hate Speech Detection Tasks
Authors:
João A. Leite,
Carolina Scarton,
Diego F. Silva
Abstract:
Online social media is rife with offensive and hateful comments, prompting the need for their automatic detection given the sheer amount of posts created every second. Creating high-quality human-labelled datasets for this task is difficult and costly, especially because non-offensive posts are significantly more frequent than offensive ones. However, unlabelled data is abundant, easier, and cheaper to obtain. In this scenario, self-training methods, using weakly-labelled examples to increase the amount of training data, can be employed. Recent "noisy" self-training approaches incorporate data augmentation techniques to ensure prediction consistency and increase robustness against noisy data and adversarial attacks. In this paper, we experiment with default and noisy self-training using three different textual data augmentation techniques across five different pre-trained BERT architectures varying in size. We evaluate our experiments on two offensive/hate-speech datasets and demonstrate that (i) self-training consistently improves performance regardless of model size, resulting in up to +1.5% F1-macro on both datasets, and (ii) noisy self-training with textual data augmentations, despite being successfully applied in similar settings, decreases performance on offensive and hate-speech domains when compared to the default method, even with state-of-the-art augmentations such as backtranslation.
Submitted 31 July, 2023;
originally announced July 2023.
-
A Guide for Practical Use of ADMG Causal Data Augmentation
Authors:
Audrey Poinsot,
Alessandro Leite
Abstract:
Data augmentation is essential when applying Machine Learning in small-data regimes. It generates new samples following the observed data distribution while increasing their diversity and variability to help researchers and practitioners improve their models' robustness and, thus, deploy them in the real world. Nevertheless, its usage in tabular data still needs to be improved, as prior knowledge about the underlying data mechanism is seldom considered, limiting the fidelity and diversity of the generated data. Causal data augmentation strategies have been pointed out as a solution to handle these challenges by relying on conditional independence encoded in a causal graph. In this context, this paper experimentally analyzed the ADMG causal augmentation method considering different settings to support researchers and practitioners in understanding under which conditions prior knowledge helps generate new data points and, consequently, enhances the robustness of their models. The results highlighted that the studied method (a) is independent of the underlying model mechanism, (b) requires a minimal number of observations that may be challenging in a small-data regime to improve an ML model's accuracy, (c) propagates outliers to the augmented set degrading the performance of the model, and (d) is sensitive to its hyperparameter's value.
Submitted 3 April, 2023;
originally announced April 2023.
-
SheffieldVeraAI at SemEval-2023 Task 3: Mono and multilingual approaches for news genre, topic and persuasion technique classification
Authors:
Ben Wu,
Olesya Razuvayevskaya,
Freddy Heppell,
João A. Leite,
Carolina Scarton,
Kalina Bontcheva,
Xingyi Song
Abstract:
This paper describes our approach for SemEval-2023 Task 3: Detecting the category, the framing, and the persuasion techniques in online news in a multi-lingual setup. For Subtask 1 (News Genre), we propose an ensemble of fully trained and adapter mBERT models which was ranked joint-first for German, and had the highest mean rank of multi-language teams. For Subtask 2 (Framing), we achieved first place in 3 languages, and the best average rank across all the languages, by using two separate ensembles: a monolingual RoBERTa-MUPPETLARGE and an ensemble of XLM-RoBERTaLARGE with adapters and task adaptive pretraining. For Subtask 3 (Persuasion Techniques), we train a monolingual RoBERTa-Base model for English and a multilingual mBERT model for the remaining languages, which achieved top 10 for all languages, including 2nd for English. For each subtask, we compared monolingual and multilingual approaches, and considered class imbalance techniques.
Submitted 9 May, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Low-complexity Approximate Convolutional Neural Networks
Authors:
R. J. Cintra,
S. Duffner,
C. Garcia,
A. Leite
Abstract:
In this paper, we present an approach for minimizing the computational complexity of trained Convolutional Neural Networks (ConvNet). The idea is to approximate all elements of a given ConvNet and replace the original convolutional filters and parameters (pooling and bias coefficients; and activation function) with efficient approximations capable of extreme reductions in computational complexity. Low-complexity convolution filters are obtained through a binary (zero-one) linear programming scheme based on the Frobenius norm over sets of dyadic rationals. The resulting matrices allow for multiplication-free computations requiring only addition and bit-shifting operations. Such low-complexity structures pave the way for low-power, efficient hardware designs. We applied our approach on three use cases of different complexity: (i) a "light" but efficient ConvNet for face detection (with around 1000 parameters); (ii) another one for hand-written digit classification (with more than 180000 parameters); and (iii) a significantly larger ConvNet: AlexNet with $\approx$1.2 million matrices. We evaluated the overall performance on the respective tasks for different levels of approximations. In all considered applications, very low-complexity approximations have been derived maintaining an almost equal classification performance.
Submitted 29 July, 2022;
originally announced August 2022.
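The claim that dyadic-rational coefficients allow multiplication-free computation can be made concrete with a small sketch. As an assumption for illustration, the coefficient below is obtained by plain rounding to the nearest multiple of 1/2^bits rather than by the binary linear programming scheme the abstract describes, and both function names are hypothetical.

```python
def to_dyadic(w, bits=3):
    """Round w to the nearest k / 2**bits and return the integer numerator k."""
    return round(w * (1 << bits))

def shift_add_multiply(x, k, bits=3):
    """Approximate x * (k / 2**bits) for an integer x with shifts and adds only."""
    negative = k < 0
    k = -k if negative else k
    acc, shift = 0, 0
    while k:
        if k & 1:
            # Each set bit of the numerator contributes one shifted copy of x.
            acc += x << shift
        k >>= 1
        shift += 1
    # Dividing by 2**bits is a right shift (this floors the result).
    acc >>= bits
    return -acc if negative else acc

k = to_dyadic(0.62)                # nearest multiple of 1/8 is 5/8
result = shift_add_multiply(16, k) # 16 * 5/8 without a multiplier
```

The number of additions equals the number of set bits in the numerator, which is why coefficients with few nonzero bits are attractive for low-power hardware.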
-
Effects of Human vs. Automatic Feedback on Students' Understanding of AI Concepts and Programming Style
Authors:
Abe Leite,
Saúl A. Blanco
Abstract:
The use of automatic grading tools has become nearly ubiquitous in large undergraduate programming courses, and recent work has focused on improving the quality of automatically generated feedback. However, there is a relative lack of data directly comparing student outcomes when receiving computer-generated feedback and human-written feedback. This paper addresses this gap by splitting one 90-student class into two feedback groups and analyzing differences in the two cohorts' performance. The class is an introductory AI course with programming homework assignments. One group of students received detailed computer-generated feedback on their programming assignments describing which parts of the algorithms' logic were missing; the other group additionally received human-written feedback describing how their programs' syntax relates to issues with their logic, and qualitative (style) recommendations for improving their code. Results on quizzes and exam questions suggest that human feedback helps students obtain a better conceptual understanding, but analyses found no difference between the groups' ability to collaborate on the final project. The course grade distribution revealed that students who received human-written feedback performed better overall; this effect was the most pronounced in the middle two quartiles of each group. These results suggest that feedback about the syntax-logic relation may be a primary mechanism by which human feedback improves student outcomes.
Submitted 20 November, 2020;
originally announced November 2020.
-
Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis
Authors:
João A. Leite,
Diego F. Silva,
Kalina Bontcheva,
Carolina Scarton
Abstract:
Hate speech and toxic comments are a common concern of social media platform users. Although these comments are, fortunately, the minority on these platforms, they are still capable of causing harm. Therefore, identifying these comments is an important task for studying and preventing the proliferation of toxicity in social media. Previous work on automatically detecting toxic comments focuses mainly on English, with very little work on languages like Brazilian Portuguese. In this paper, we propose a new large-scale dataset for Brazilian Portuguese with tweets annotated either as toxic or non-toxic or with different types of toxicity. We present our dataset collection and annotation process, where we aimed to select candidates covering multiple demographic groups. State-of-the-art BERT models were able to achieve 76% macro-F1 score using monolingual data in the binary case. We also show that large-scale monolingual data is still needed to create more accurate models, despite recent advances in multilingual approaches. An error analysis and experiments with multi-label classification show the difficulty of classifying certain types of toxic comments that appear less frequently in our data and highlight the need to develop models that are aware of different categories of toxicity.
Submitted 9 October, 2020;
originally announced October 2020.
-
Sabrina: Modeling and Visualization of Economy Data with Incremental Domain Knowledge
Authors:
Alessio Arleo,
Christos Tsigkanos,
Chao Jia,
Roger A. Leite,
Ilir Murturi,
Manfred Klaffenboeck,
Schahram Dustdar,
Michael Wimmer,
Silvia Miksch,
Johannes Sorger
Abstract:
Investment planning requires knowledge of the financial landscape on a large scale, both in terms of geo-spatial and industry sector distribution. There is plenty of data available, but it is scattered across heterogeneous sources (newspapers, open data, etc.), which makes it difficult for financial analysts to understand the big picture. In this paper, we present Sabrina, a financial data analysis and visualization approach that incorporates a pipeline for the generation of firm-to-firm financial transaction networks. The pipeline is capable of fusing the ground truth on individual firms in a region with (incremental) domain knowledge on general macroscopic aspects of the economy. Sabrina unites these heterogeneous data sources within a uniform visual interface that enables the visual analysis process. In a user study with three domain experts, we illustrate the usefulness of Sabrina, which eases their analysis process.
Submitted 8 January, 2020; v1 submitted 5 August, 2019;
originally announced August 2019.
-
Robotic Tankette for Intelligent BioEnergy Agriculture: Design, Development and Field Tests
Authors:
Marco F. S. Xaud,
Antonio C. Leite,
Evelyn S. Barbosa,
Henrique D. Faria,
Gabriel S. M. Loureiro,
Pål J. From
Abstract:
In recent years, the use of robots in agriculture has been increasing mainly due to the high demand for productivity, precision and efficiency, which follows from the effects of climate change and world population growth. Unlike conventional agriculture, sugarcane farms are usually regions with dense vegetation, gigantic areas, and subject to extreme weather conditions, such as intense heat, moisture and rain. TIBA - Tankette for Intelligent BioEnergy Agriculture - is the first result of an R&D project which strives to develop an autonomous mobile robotic system for carrying out a number of agricultural tasks in sugarcane fields. The proposed concept consists of a semi-autonomous, low-cost, dust and waterproof tankette-type vehicle, capable of infiltrating dense vegetation in plantation tunnels and carrying several sensing systems, in order to perform mapping of hard-to-access areas and collect samples. This paper presents an overview of the robot's mechanical design, the embedded electronics and software architecture, and the construction of a first prototype. Preliminary results obtained in field tests validate the proposed conceptual design and bring about several challenges and potential applications for robot autonomous navigation, as well as the need to build a new prototype with additional functionality.
Submitted 3 January, 2019;
originally announced January 2019.
-
Low-complexity 8-point DCT Approximation Based on Angle Similarity for Image and Video Coding
Authors:
R. S. Oliveira,
R. J. Cintra,
F. M. Bayer,
T. L. T. da Silveira,
A. Madanayake,
A. Leite
Abstract:
The principal component analysis (PCA) is widely used for data decorrelation and dimensionality reduction. However, the use of PCA may be impractical in real-time applications, or in situations where energy and computing constraints are severe. In this context, the discrete cosine transform (DCT) becomes a low-cost alternative to data decorrelation. This paper presents a method to derive computationally efficient approximations to the DCT. The proposed method aims at the minimization of the angle between the rows of the exact DCT matrix and the rows of the approximated transformation matrix. The resulting transformation matrices are orthogonal and have extremely low arithmetic complexity. Considering popular performance measures, one of the proposed transformation matrices outperforms the best competitors in both matrix error and coding capabilities. Practical applications in image and video coding demonstrate the relevance of the proposed transformation. In fact, we show that the proposed approximate DCT can outperform the exact DCT for image encoding under certain compression ratios. The proposed transform and its direct competitors are also physically realized as digital prototype circuits using FPGA technology.
Submitted 30 January, 2024; v1 submitted 8 August, 2018;
originally announced August 2018.
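The angle criterion mentioned in the abstract can be evaluated with a few lines of NumPy. This sketch only measures the row-wise angle for one candidate matrix; it does not perform the minimization over candidates that the paper proposes. The candidate used here, the elementwise sign of the exact DCT, is an assumption chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # row (frequency) index
    j = np.arange(n).reshape(1, -1)   # column (sample) index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)           # scale the DC row for orthonormality
    return c

def row_angles(exact, approx):
    """Angle in radians between corresponding rows of two matrices."""
    dots = np.sum(exact * approx, axis=1)
    norms = np.linalg.norm(exact, axis=1) * np.linalg.norm(approx, axis=1)
    # Clip guards against rounding pushing the cosine slightly above 1.
    return np.arccos(np.clip(dots / norms, -1.0, 1.0))

C = dct_matrix(8)
T = np.sign(C)            # illustrative low-complexity candidate: entries in {-1, +1}
angles = row_angles(C, T)
```

A smaller angle for a row means the candidate row points in nearly the same direction as the exact DCT row, so the total or maximum of `angles` could serve as a score when comparing candidates.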