-
Labeling Comic Mischief Content in Online Videos with a Multimodal Hierarchical-Cross-Attention Model
Authors:
Elaheh Baharlouei,
Mahsa Shafaei,
Yigeng Zhang,
Hugo Jair Escalante,
Thamar Solorio
Abstract:
We address the challenge of detecting questionable content in online media, specifically the subcategory of comic mischief. This type of content combines elements such as violence, adult content, or sarcasm with humor, making it difficult to detect. Employing a multimodal approach is vital to capture the subtle details inherent in comic mischief content. To tackle this problem, we propose a novel…
▽ More
We address the challenge of detecting questionable content in online media, specifically the subcategory of comic mischief. This type of content combines elements such as violence, adult content, or sarcasm with humor, making it difficult to detect. Employing a multimodal approach is vital to capture the subtle details inherent in comic mischief content. To tackle this problem, we propose a novel end-to-end multimodal system for the task of comic mischief detection. As part of this contribution, we release a novel dataset for the targeted task consisting of three modalities: video, text (video captions and subtitles), and audio. We also design a HIerarchical Cross-attention model with CAPtions (HICCAP) to capture the intricate relationships among these modalities. The results show that the proposed approach makes a significant improvement over robust baselines and state-of-the-art models for comic mischief detection and its type classification. This emphasizes the potential of our system to empower users, to make informed decisions about the online content they choose to see. In addition, we conduct experiments on the UCF101, HMDB51, and XD-Violence datasets, comparing our model against other state-of-the-art approaches showcasing the outstanding performance of our proposed model in various scenarios.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
Unified Physical-Digital Attack Detection Challenge
Authors:
Haocheng Yuan,
Ajian Liu,
Junze Zheng,
Jun Wan,
Jiankang Deng,
Sergio Escalera,
Hugo Jair Escalante,
Isabelle Guyon,
Zhen Lei
Abstract:
Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attac…
▽ More
Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attack Detection (UAD) algorithms, a large-scale UniAttackData dataset has been collected. UniAttackData is the largest public dataset for Unified Attack Detection, with a total of 28,706 videos, where each unique identity encompasses all advanced attack types. Based on this dataset, we organized a Unified Physical-Digital Face Attack Detection Challenge to boost the research in Unified Attack Detections. It attracted 136 teams for the development phase, with 13 qualifying for the final round. The results re-verified by the organizing team were used for the final ranking. This paper comprehensively reviews the challenge, detailing the dataset introduction, protocol definition, evaluation criteria, and a summary of published results. Finally, we focus on the detailed analysis of the highest-performing algorithms and offer potential directions for unified physical-digital attack detection inspired by this competition. Challenge Website: https://sites.google.com/view/face-anti-spoofing-challenge/welcome/challengecvpr2024.
△ Less
Submitted 18 April, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Analysis of Systems' Performance in Natural Language Processing Competitions
Authors:
Sergio Nava-Muñoz,
Mario Graff,
Hugo Jair Escalante
Abstract:
Collaborative competitions have gained popularity in the scientific and technological fields. These competitions involve defining tasks, selecting evaluation scores, and devising result verification methods. In the standard scenario, participants receive a training set and are expected to provide a solution for a held-out dataset kept by organizers. An essential challenge for organizers arises whe…
▽ More
Collaborative competitions have gained popularity in the scientific and technological fields. These competitions involve defining tasks, selecting evaluation scores, and devising result verification methods. In the standard scenario, participants receive a training set and are expected to provide a solution for a held-out dataset kept by organizers. An essential challenge for organizers arises when comparing algorithms' performance, assessing multiple participants, and ranking them. Statistical tools are often used for this purpose; however, traditional statistical methods often fail to capture decisive differences between systems' performance. This manuscript describes an evaluation methodology for statistically analyzing competition results and competition. The methodology is designed to be universally applicable; however, it is illustrated using eight natural language competitions as case studies involving classification and regression problems. The proposed methodology offers several advantages, including off-the-shell comparisons with correction mechanisms and the inclusion of confidence intervals. Furthermore, we introduce metrics that allow organizers to assess the difficulty of competitions. Our analysis shows the potential usefulness of our methodology for effectively evaluating competition results.
△ Less
Submitted 7 March, 2024;
originally announced March 2024.
-
Academic competitions
Authors:
Hugo Jair Escalante,
Aleksandra Kruchinina
Abstract:
Academic challenges comprise effective means for (i) advancing the state of the art, (ii) putting in the spotlight of a scientific community specific topics and problems, as well as (iii) closing the gap for under represented communities in terms of accessing and participating in the sha** of research fields. Competitions can be traced back for centuries and their achievements have had great inf…
▽ More
Academic challenges comprise effective means for (i) advancing the state of the art, (ii) putting in the spotlight of a scientific community specific topics and problems, as well as (iii) closing the gap for under represented communities in terms of accessing and participating in the sha** of research fields. Competitions can be traced back for centuries and their achievements have had great influence in our modern world. Recently, they (re)gained popularity, with the overwhelming amounts of data that is being generated in different domains, as well as the need of pushing the barriers of existing methods, and available tools to handle such data. This chapter provides a survey of academic challenges in the context of machine learning and related fields. We review the most influential competitions in the last few years and analyze challenges per area of knowledge. The aims of scientific challenges, their goals, major achievements and expectations for the next few years are reviewed.
△ Less
Submitted 30 November, 2023;
originally announced December 2023.
-
TurboGP: A flexible and advanced python based GP library
Authors:
Lino Rodriguez-Coayahuitl,
Alicia Morales-Reyes,
Hugo Jair Escalante
Abstract:
We introduce TurboGP, a Genetic Programming (GP) library fully written in Python and specifically designed for machine learning tasks. TurboGP implements modern features not available in other GP implementations, such as island and cellular population schemes, different types of genetic operations (migration, protected crossovers), online learning, among other features. TurboGP's most distinctive…
▽ More
We introduce TurboGP, a Genetic Programming (GP) library fully written in Python and specifically designed for machine learning tasks. TurboGP implements modern features not available in other GP implementations, such as island and cellular population schemes, different types of genetic operations (migration, protected crossovers), online learning, among other features. TurboGP's most distinctive characteristic is its native support for different types of GP nodes to allow different abstraction levels, this makes TurboGP particularly useful for processing a wide variety of data sources.
△ Less
Submitted 31 August, 2023;
originally announced September 2023.
-
Continuous Cartesian Genetic Programming based representation for Multi-Objective Neural Architecture Search
Authors:
Cosijopii Garcia-Garcia,
Alicia Morales-Reyes,
Hugo Jair Escalante
Abstract:
We propose a novel approach for the challenge of designing less complex yet highly effective convolutional neural networks (CNNs) through the use of cartesian genetic programming (CGP) for neural architecture search (NAS). Our approach combines real-based and block-chained CNNs representations based on CGP for optimization in the continuous domain using multi-objective evolutionary algorithms (MOE…
▽ More
We propose a novel approach for the challenge of designing less complex yet highly effective convolutional neural networks (CNNs) through the use of cartesian genetic programming (CGP) for neural architecture search (NAS). Our approach combines real-based and block-chained CNNs representations based on CGP for optimization in the continuous domain using multi-objective evolutionary algorithms (MOEAs). Two variants are introduced that differ in the granularity of the search space they consider. The proposed CGP-NASV1 and CGP-NASV2 algorithms were evaluated using the non-dominated sorting genetic algorithm II (NSGA-II) on the CIFAR-10 and CIFAR-100 datasets. The empirical analysis was extended to assess the crossover operator from differential evolution (DE), the multi-objective evolutionary algorithm based on decomposition (MOEA/D) and S metric selection evolutionary multi-objective algorithm (SMS-EMOA) using the same representation. Experimental results demonstrate that our approach is competitive with state-of-the-art proposals in terms of classification performance and model complexity.
△ Less
Submitted 5 June, 2023;
originally announced June 2023.
-
Comparison of classifiers in challenge scheme
Authors:
Sergio Nava-Muñoz,
Mario Graff Guerrero,
Hugo Jair Escalante
Abstract:
In recent decades, challenges have become very popular in scientific research as these are crowdsourcing schemes. In particular, challenges are essential for develo** machine learning algorithms. For the challenges settings, it is vital to establish the scientific question, the dataset (with adequate quality, quantity, diversity, and complexity), performance metrics, as well as a way to authenti…
▽ More
In recent decades, challenges have become very popular in scientific research as these are crowdsourcing schemes. In particular, challenges are essential for develo** machine learning algorithms. For the challenges settings, it is vital to establish the scientific question, the dataset (with adequate quality, quantity, diversity, and complexity), performance metrics, as well as a way to authenticate the participants' results (Gold Standard). This paper addresses the problem of evaluating the performance of different competitors (algorithms) under the restrictions imposed by the challenge scheme, such as the comparison of multiple competitors with a unique dataset (with fixed size), a minimal number of submissions and, a set of metrics chosen to assess performance. The algorithms are sorted according to the performance metric. Still, it is common to observe performance differences among competitors as small as hundredths or even thousandths, so the question is whether the differences are significant. This paper analyzes the results of the MeOffendEs@IberLEF 2021 competition and proposes to make inference through resampling techniques (bootstrap) to support Challenge organizers' decision-making.
△ Less
Submitted 16 May, 2023;
originally announced May 2023.
-
Surveillance Face Presentation Attack Detection Challenge
Authors:
Hao Fang,
Ajian Liu,
Jun Wan,
Sergio Escalera,
Hugo Jair Escalante,
Zhen Lei
Abstract:
Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, most of the studies lacked consideration of long-distance scenarios. Specifically, compared with FAS in traditional scenes such as phone unlocking, face payment, and self-service security inspection, FAS in long-distance such as station squares, parks, and self-service supermarkets are…
▽ More
Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, most of the studies lacked consideration of long-distance scenarios. Specifically, compared with FAS in traditional scenes such as phone unlocking, face payment, and self-service security inspection, FAS in long-distance such as station squares, parks, and self-service supermarkets are equally important, but it has not been sufficiently explored yet. In order to fill this gap in the FAS community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask). SuHiFiMask contains $10,195$ videos from $101$ subjects of different age groups, which are collected by $7$ mainstream surveillance cameras. Based on this dataset and protocol-$3$ for evaluating the robustness of the algorithm under quality changes, we organized a face presentation attack detection challenge in surveillance scenarios. It attracted 180 teams for the development phase with a total of 37 teams qualifying for the final round. The organization team re-verified and re-ran the submitted code and used the results as the final ranking. In this paper, we present an overview of the challenge, including an introduction to the dataset used, the definition of the protocol, the evaluation metrics, and the announcement of the competition results. Finally, we present the top-ranked algorithms and the research ideas provided by the competition for attack detection in long-range surveillance scenarios.
△ Less
Submitted 15 April, 2023;
originally announced April 2023.
-
Wild Face Anti-Spoofing Challenge 2023: Benchmark and Results
Authors:
Dong Wang,
Jia Guo,
Qiqi Shao,
Haochi He,
Zhian Chen,
Chuanbao Xiao,
Ajian Liu,
Sergio Escalera,
Hugo Jair Escalante,
Zhen Lei,
Jun Wan,
Jiankang Deng
Abstract:
Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems. Despite substantial advancements, the generalization of existing approaches to real-world applications remains challenging. This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets, which often leads to overfitting during trainin…
▽ More
Face anti-spoofing (FAS) is an essential mechanism for safeguarding the integrity of automated face recognition systems. Despite substantial advancements, the generalization of existing approaches to real-world applications remains challenging. This limitation can be attributed to the scarcity and lack of diversity in publicly available FAS datasets, which often leads to overfitting during training or saturation during testing. In terms of quantity, the number of spoof subjects is a critical determinant. Most datasets comprise fewer than 2,000 subjects. With regard to diversity, the majority of datasets consist of spoof samples collected in controlled environments using repetitive, mechanical processes. This data collection methodology results in homogenized samples and a dearth of scenario diversity. To address these shortcomings, we introduce the Wild Face Anti-Spoofing (WFAS) dataset, a large-scale, diverse FAS dataset collected in unconstrained settings. Our dataset encompasses 853,729 images of 321,751 spoof subjects and 529,571 images of 148,169 live subjects, representing a substantial increase in quantity. Moreover, our dataset incorporates spoof data obtained from the internet, spanning a wide array of scenarios and various commercial sensors, including 17 presentation attacks (PAs) that encompass both 2D and 3D forms. This novel data collection strategy markedly enhances FAS data diversity. Leveraging the WFAS dataset and Protocol 1 (Known-Type), we host the Wild Face Anti-Spoofing Challenge at the CVPR2023 workshop. Additionally, we meticulously evaluate representative methods using Protocol 1 and Protocol 2 (Unknown-Type). Through an in-depth examination of the challenge outcomes and benchmark baselines, we provide insightful analyses and propose potential avenues for future research. The dataset is released under Insightface.
△ Less
Submitted 4 May, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Predicting students' performance in online courses using multiple data sources
Authors:
Mélina Verger,
Hugo Jair Escalante
Abstract:
Data-driven decision making is serving and transforming education. We approached the problem of predicting students' performance by using multiple data sources which came from online courses, including one we created. Experimental results show preliminary conclusions towards which data are to be considered for the task.
Data-driven decision making is serving and transforming education. We approached the problem of predicting students' performance by using multiple data sources which came from online courses, including one we created. Experimental results show preliminary conclusions towards which data are to be considered for the task.
△ Less
Submitted 7 September, 2021;
originally announced September 2021.
-
Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition
Authors:
Fernanda Hernández-Luquin,
Hugo Jair Escalante
Abstract:
Emotion recognition (ER) from facial images is one of the landmark tasks in affective computing with major developments in the last decade. Initial efforts on ER relied on handcrafted features that were used to characterize facial images and then feed to standard predictive models. Recent methodologies comprise end-to-end trainable deep learning methods that simultaneously learn both, features and…
▽ More
Emotion recognition (ER) from facial images is one of the landmark tasks in affective computing with major developments in the last decade. Initial efforts on ER relied on handcrafted features that were used to characterize facial images and then feed to standard predictive models. Recent methodologies comprise end-to-end trainable deep learning methods that simultaneously learn both, features and predictive model. Perhaps the most successful models are based on convolutional neural networks (CNNs). While these models have excelled at this task, they still fail at capturing local patterns that could emerge in the learning process. We hypothesize these patterns could be captured by variants based on locally weighted learning. Specifically, in this paper we propose a CNN based architecture enhanced with multiple branches formed by radial basis function (RBF) units that aims at exploiting local information at the final stage of the learning process. Intuitively, these RBF units capture local patterns shared by similar instances using an intermediate representation, then the outputs of the RBFs are feed to a softmax layer that exploits this information to improve the predictive performance of the model. This feature could be particularly advantageous in ER as cultural / ethnicity differences may be identified by the local units. We evaluate the proposed method in several ER datasets and show the proposed methodology achieves state-of-the-art in some of them, even when we adopt a pre-trained VGG-Face model as backbone. We show it is the incorporation of local information what makes the proposed model competitive.
△ Less
Submitted 7 September, 2021;
originally announced September 2021.
-
3D High-Fidelity Mask Face Presentation Attack Detection Challenge
Authors:
Ajian Liu,
Chenxu Zhao,
Zitong Yu,
Anyang Su,
Xing Liu,
Zijian Kong,
Jun Wan,
Sergio Escalera,
Hugo Jair Escalante,
Zhen Lei,
Guodong Guo
Abstract:
The threat of 3D masks to face recognition systems is increasingly serious and has been widely concerned by researchers. To facilitate the study of the algorithms, a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask) has been collected. Specifically, it consists of a total amount of 54, 600 videos which are recorded from 75 subjects with 225 realistic masks under…
▽ More
The threat of 3D masks to face recognition systems is increasingly serious and has been widely concerned by researchers. To facilitate the study of the algorithms, a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask) has been collected. Specifically, it consists of a total amount of 54, 600 videos which are recorded from 75 subjects with 225 realistic masks under 7 new kinds of sensors. Based on this dataset and Protocol 3 which evaluates both the discrimination and generalization ability of the algorithm under the open set scenarios, we organized a 3D High-Fidelity Mask Face Presentation Attack Detection Challenge to boost the research of 3D mask-based attack detection. It attracted 195 teams for the development phase with a total of 18 teams qualifying for the final round. All the results were verified and re-run by the organizing team, and the results were used for the final ranking. This paper presents an overview of the challenge, including the introduction of the dataset used, the definition of the protocol, the calculation of the evaluation criteria, and the summary and publication of the competition results. Finally, we focus on introducing and analyzing the top ranking algorithms, the conclusion summary, and the research ideas for mask attack detection provided by this competition.
△ Less
Submitted 16 August, 2021;
originally announced August 2021.
-
ChaLearn Looking at People: Inpainting and Denoising challenges
Authors:
Sergio Escalera,
Marti Soler,
Stephane Ayache,
Umut Guclu,
Jun Wan,
Meysam Madadi,
Xavier Baro,
Hugo Jair Escalante,
Isabelle Guyon
Abstract:
Dealing with incomplete information is a well studied problem in the context of machine learning and computational intelligence. However, in the context of computer vision, the problem has only been studied in specific scenarios (e.g., certain types of occlusions in specific types of images), although it is common to have incomplete information in visual data. This chapter describes the design of…
▽ More
Dealing with incomplete information is a well studied problem in the context of machine learning and computational intelligence. However, in the context of computer vision, the problem has only been studied in specific scenarios (e.g., certain types of occlusions in specific types of images), although it is common to have incomplete information in visual data. This chapter describes the design of an academic competition focusing on inpainting of images and video sequences that was part of the competition program of WCCI2018 and had a satellite event collocated with ECCV2018. The ChaLearn Looking at People Inpainting Challenge aimed at advancing the state of the art on visual inpainting by promoting the development of methods for recovering missing and occluded information from images and video. Three tracks were proposed in which visual inpainting might be helpful but still challenging: human body pose estimation, text overlays removal and fingerprint denoising. This chapter describes the design of the challenge, which includes the release of three novel datasets, and the description of evaluation metrics, baselines and evaluation protocol. The results of the challenge are analyzed and discussed in detail and conclusions derived from this event are outlined.
△ Less
Submitted 24 June, 2021;
originally announced June 2021.
-
Automated Machine Learning -- a brief review at the end of the early years
Authors:
Hugo Jair Escalante
Abstract:
Automated machine learning (AutoML) is the sub-field of machine learning that aims at automating, to some extend, all stages of the design of a machine learning system. In the context of supervised learning, AutoML is concerned with feature extraction, pre processing, model design and post processing. Major contributions and achievements in AutoML have been taking place during the recent decade. W…
▽ More
Automated machine learning (AutoML) is the sub-field of machine learning that aims at automating, to some extend, all stages of the design of a machine learning system. In the context of supervised learning, AutoML is concerned with feature extraction, pre processing, model design and post processing. Major contributions and achievements in AutoML have been taking place during the recent decade. We are therefore in perfect timing to look back and realize what we have learned. This chapter aims to summarize the main findings in the early years of AutoML. More specifically, in this chapter an introduction to AutoML for supervised learning is provided and an historical review of progress in this field is presented. Likewise, the main paradigms of AutoML are described and research opportunities are outlined.
△ Less
Submitted 24 August, 2020; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Cross-ethnicity Face Anti-spoofing Recognition Challenge: A Review
Authors:
Ajian Liu,
Xuan Li,
Jun Wan,
Sergio Escalera,
Hugo Jair Escalante,
Meysam Madadi,
Yi **,
Zhuoyuan Wu,
Xiaogang Yu,
Zichang Tan,
Qi Yuan,
Ruikun Yang,
Benjia Zhou,
Guodong Guo,
Stan Z. Li
Abstract:
Face anti-spoofing is critical to prevent face recognition systems from a security breach. The biometrics community has %possessed achieved impressive progress recently due the excellent performance of deep neural networks and the availability of large datasets. Although ethnic bias has been verified to severely affect the performance of face recognition systems, it still remains an open research…
▽ More
Face anti-spoofing is critical to prevent face recognition systems from a security breach. The biometrics community has %possessed achieved impressive progress recently due the excellent performance of deep neural networks and the availability of large datasets. Although ethnic bias has been verified to severely affect the performance of face recognition systems, it still remains an open research problem in face anti-spoofing. Recently, a multi-ethnic face anti-spoofing dataset, CASIA-SURF CeFA, has been released with the goal of measuring the ethnic bias. It is the largest up to date cross-ethnicity face anti-spoofing dataset covering $3$ ethnicities, $3$ modalities, $1,607$ subjects, 2D plus 3D attack types, and the first dataset including explicit ethnic labels among the recently released datasets for face anti-spoofing. We organized the Chalearn Face Anti-spoofing Attack Detection Challenge which consists of single-modal (e.g., RGB) and multi-modal (e.g., RGB, Depth, Infrared (IR)) tracks around this novel resource to boost research aiming to alleviate the ethnic bias. Both tracks have attracted $340$ teams in the development stage, and finally 11 and 8 teams have submitted their codes in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively. All the results were verified and re-ran by the organizing team, and the results were used for the final ranking. This paper presents an overview of the challenge, including its design, evaluation protocol and a summary of results. We analyze the top ranked solutions and draw conclusions derived from the competition. In addition we outline future work directions.
△ Less
Submitted 23 April, 2020;
originally announced April 2020.
-
Causal Games and Causal Nash Equilibrium
Authors:
Mauricio Gonzalez-Soto,
Luis E. Sucar,
Hugo J. Escalante
Abstract:
Classical results of Decision Theory, and its extension to a multi-agent setting: Game Theory, operate only at the associative level of information; this is, classical decision makers only take into account probabilities of events; we go one step further and consider causal information: in this work, we define Causal Decision Problems and extend them to a multi-agent decision problem, which we cal…
▽ More
Classical results of Decision Theory, and its extension to a multi-agent setting: Game Theory, operate only at the associative level of information; this is, classical decision makers only take into account probabilities of events; we go one step further and consider causal information: in this work, we define Causal Decision Problems and extend them to a multi-agent decision problem, which we call a causal game. For such games, we study belief updating in a class of strategic games in which any player's action causes some consequence via a causal model, which is unknown by all players; for this reason, the most suitable model is Harsanyi's Bayesian Game. We propose a probability updating for the Bayesian Game in such a way that the knowledge of any player in terms of probabilistic beliefs about the causal model, as well as what is caused by her actions as well as the actions of every other player are taken into account. Based on such probability updating we define a Nash equilibria for Causal Games.
△ Less
Submitted 11 October, 2019;
originally announced October 2019.
-
CASIA-SURF: A Large-scale Multi-modal Benchmark for Face Anti-spoofing
Authors:
Shifeng Zhang,
Ajian Liu,
Jun Wan,
Yanyan Liang,
Guogong Guo,
Sergio Escalera,
Hugo Jair Escalante,
Stan Z. Li
Abstract:
Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects ($\le\negmedspace170$) and modalities ($\leq\negmedspace2$), which hinder the further development of the academi…
▽ More
Face anti-spoofing is essential to prevent face recognition systems from a security breach. Much of the progresses have been made by the availability of face anti-spoofing benchmark datasets in recent years. However, existing face anti-spoofing benchmarks have limited number of subjects ($\le\negmedspace170$) and modalities ($\leq\negmedspace2$), which hinder the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of $1,000$ subjects with $21,000$ videos and each sample has $3$ modalities (i.e., RGB, Depth and IR). We also provide comprehensive evaluation metrics, diverse evaluation protocols, training/validation/testing subsets and a measurement tool, develo** a new benchmark for face anti-spoofing. Moreover, we present a novel multi-modal multi-scale fusion method as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2019?authuser=0
△ Less
Submitted 4 February, 2020; v1 submitted 28 August, 2019;
originally announced August 2019.
-
Towards AutoML in the presence of Drift: first results
Authors:
Jorge G. Madrid,
Hugo Jair Escalante,
Eduardo F. Morales,
Wei-Wei Tu,
Yang Yu,
Lisheng Sun-Hosoya,
Isabelle Guyon,
Michele Sebag
Abstract:
Research progress in AutoML has lead to state of the art solutions that can cope quite wellwith supervised learning task, e.g., classification with AutoSklearn. However, so far thesesystems do not take into account the changing nature of evolving data over time (i.e., theystill assume i.i.d. data); even when this sort of domains are increasingly available in realapplications (e.g., spam filtering,…
▽ More
Research progress in AutoML has lead to state of the art solutions that can cope quite wellwith supervised learning task, e.g., classification with AutoSklearn. However, so far thesesystems do not take into account the changing nature of evolving data over time (i.e., theystill assume i.i.d. data); even when this sort of domains are increasingly available in realapplications (e.g., spam filtering, user preferences, etc.). We describe a first attempt to de-velop an AutoML solution for scenarios in which data distribution changes relatively slowlyover time and in which the problem is approached in a lifelong learning setting. We extendAuto-Sklearn with sound and intuitive mechanisms that allow it to cope with this sort ofproblems. The extended Auto-Sklearn is combined with concept drift detection techniquesthat allow it to automatically determine when the initial models have to be adapted. Wereport experimental results in benchmark data from AutoML competitions that adhere tothis scenario. Results demonstrate the effectiveness of the proposed methodology.
△ Less
Submitted 24 July, 2019;
originally announced July 2019.
-
Meta-learning of textual representations
Authors:
Jorge Madrid,
Hugo Jair Escalante,
Eduardo Morales
Abstract:
Recent progress in AutoML has lead to state-of-the-art methods (e.g., AutoSKLearn) that can be readily used by non-experts to approach any supervised learning problem. Whereas these methods are quite effective, they are still limited in the sense that they work for tabular (matrix formatted) data only. This paper describes one step forward in trying to automate the design of supervised learning me…
▽ More
Recent progress in AutoML has lead to state-of-the-art methods (e.g., AutoSKLearn) that can be readily used by non-experts to approach any supervised learning problem. Whereas these methods are quite effective, they are still limited in the sense that they work for tabular (matrix formatted) data only. This paper describes one step forward in trying to automate the design of supervised learning methods in the context of text mining. We introduce a meta learning methodology for automatically obtaining a representation for text mining tasks starting from raw text. We report experiments considering 60 different textual representations and more than 80 text mining datasets associated to a wide variety of tasks. Experimental results show the proposed methodology is a promising solution to obtain highly effective off the shell text classification pipelines.
△ Less
Submitted 19 July, 2019; v1 submitted 20 June, 2019;
originally announced June 2019.
-
AutoML @ NeurIPS 2018 challenge: Design and Results
Authors:
Hugo Jair Escalante,
Wei-Wei Tu,
Isabelle Guyon,
Daniel L. Silver,
Evelyne Viegas,
Yuqiang Chen,
Wenyuan Dai,
Qiang Yang
Abstract:
We organized a competition on Autonomous Lifelong Machine Learning with Drift that was part of the competition program of NeurIPS 2018. This data driven competition asked participants to develop computer programs capable of solving supervised learning problems where the i.i.d. assumption did not hold. Large data sets were arranged in a lifelong learning and evaluation scenario and CodaLab was used…
▽ More
We organized a competition on Autonomous Lifelong Machine Learning with Drift that was part of the competition program of NeurIPS 2018. This data driven competition asked participants to develop computer programs capable of solving supervised learning problems where the i.i.d. assumption did not hold. Large data sets were arranged in a lifelong learning and evaluation scenario and CodaLab was used as the challenge platform. The challenge attracted more than 300 participants in its two month duration. This chapter describes the design of the challenge and summarizes its main results.
△ Less
Submitted 13 March, 2019; v1 submitted 12 March, 2019;
originally announced March 2019.
-
A Guiding Principle for Causal Decision Problems
Authors:
M. Gonzalez-Soto,
L. E. Sucar,
H. J. Escalante
Abstract:
We define a Causal Decision Problem as a Decision Problem where the available actions, the family of uncertain events and the set of outcomes are related through the variables of a Causal Graphical Model $\mathcal{G}$. A solution criteria based on Pearl's Do-Calculus and the Expected Utility criteria for rational preferences is proposed. The implementation of this criteria leads to an on-line deci…
▽ More
We define a Causal Decision Problem as a Decision Problem where the available actions, the family of uncertain events and the set of outcomes are related through the variables of a Causal Graphical Model $\mathcal{G}$. A solution criteria based on Pearl's Do-Calculus and the Expected Utility criteria for rational preferences is proposed. The implementation of this criteria leads to an on-line decision making procedure that has been shown to have similar performance to classic Reinforcement Learning algorithms while allowing for a causal model of an environment to be learned. Thus, we aim to provide the theoretical guarantees of the usefulness and optimality of a decision making procedure based on causal information.
△ Less
Submitted 6 February, 2019;
originally announced February 2019.
-
Playing against Nature: causal discovery for decision making under uncertainty
Authors:
M. Gonzalez-Soto,
L. E. Sucar,
H. J. Escalante
Abstract:
We consider decision problems under uncertainty where the options available to a decision maker and the resulting outcome are related through a causal mechanism which is unknown to the decision maker. We ask how a decision maker can learn about this causal mechanism through sequential decision making as well as using current causal knowledge inside each round in order to make better choices had sh…
▽ More
We consider decision problems under uncertainty where the options available to a decision maker and the resulting outcome are related through a causal mechanism which is unknown to the decision maker. We ask how a decision maker can learn about this causal mechanism through sequential decision making as well as using current causal knowledge inside each round in order to make better choices had she not considered causal knowledge and propose a decision making procedure in which an agent holds \textit{beliefs} about her environment which are used to make a choice and are updated using the observed outcome. As proof of concept, we present an implementation of this causal decision making model and apply it in a simple scenario. We show that the model achieves a performance similar to the classic Q-learning while it also acquires a causal model of the environment.
△ Less
Submitted 3 July, 2018;
originally announced July 2018.
-
A visual approach for age and gender identification on Twitter
Authors:
Miguel A. Alvarez-Carmona,
Luis Pellegrin,
Manuel Montes-y-Gómez,
Fernando Sánchez-Vega,
Hugo Jair Escalante,
A. Pastor López-Monroy,
Luis Villaseñor-Pineda,
Esaú Villatoro-Tello
Abstract:
The goal of Author Profiling (AP) is to identify demographic aspects (e.g., age, gender) from a given set of authors by analyzing their written texts. Recently, the AP task has gained interest in many problems related to computer forensics, psychology, marketing, but specially in those related with social media exploitation. As known, social media data is shared through a wide range of modalities…
▽ More
The goal of Author Profiling (AP) is to identify demographic aspects (e.g., age, gender) from a given set of authors by analyzing their written texts. Recently, the AP task has gained interest in many problems related to computer forensics, psychology, marketing, but specially in those related with social media exploitation. As known, social media data is shared through a wide range of modalities (e.g., text, images and audio), representing valuable information to be exploited for extracting valuable insights from users. Nevertheless, most of the current work in AP using social media data has been devoted to analyze textual information only, and there are very few works that have started exploring the gender identification using visual information. Contrastingly, this paper focuses in exploiting the visual modality to perform both age and gender identification in social media, specifically in Twitter. Our goal is to evaluate the pertinence of using visual information in solving the AP task. Accordingly, we have extended the Twitter corpus from PAN 2014, incorporating posted images from all the users, making a distinction between tweeted and retweeted images. Performed experiments provide interesting evidence on the usefulness of visual information in comparison with traditional textual representations for the AP task.
△ Less
Submitted 28 May, 2018;
originally announced May 2018.
-
First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis
Authors:
Julio C. S. Jacques Junior,
Yağmur Güçlütürk,
Marc Pérez,
Umut Güçlü,
Carlos Andujar,
Xavier Baró,
Hugo Jair Escalante,
Isabelle Guyon,
Marcel A. J. van Gerven,
Rob van Lier,
Sergio Escalera
Abstract:
Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. From the past few years, it also became an attractive research area in visual computing. From the computational point of view, by far speech and text have been the most considered cues of information for analyzing personality. However, recently there has been an increasing inter…
▽ More
Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. From the past few years, it also became an attractive research area in visual computing. From the computational point of view, by far speech and text have been the most considered cues of information for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use these information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of methods could have in society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting edge works on the subject, discussing and comparing their distinctive features and limitations. Future venues of research in the field are identified and discussed. Furthermore, aspects on the subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push the research on the field are reviewed.
△ Less
Submitted 17 July, 2019; v1 submitted 21 April, 2018;
originally announced April 2018.
-
Towards Deep Representation Learning with Genetic Programming
Authors:
Lino Rodriguez-Coayahuitl,
Alicia Morales-Reyes,
Hugo Jair Escalante
Abstract:
Genetic Programming (GP) is an evolutionary algorithm commonly used for machine learning tasks. In this paper we present a method that allows GP to transform the representation of a large-scale machine learning dataset into a more compact representation, by means of processing features from the original representation at individual level. We develop as a proof of concept of this method an autoenco…
▽ More
Genetic Programming (GP) is an evolutionary algorithm commonly used for machine learning tasks. In this paper we present a method that allows GP to transform the representation of a large-scale machine learning dataset into a more compact representation, by means of processing features from the original representation at individual level. We develop as a proof of concept of this method an autoencoder. We tested a preliminary version of our approach in a variety of well-known machine learning image datasets. We speculate that this method, used in an iterative manner, can produce results competitive with state-of-art deep neural networks.
△ Less
Submitted 20 February, 2018;
originally announced February 2018.
-
Explaining First Impressions: Modeling, Recognizing, and Explaining Apparent Personality from Videos
Authors:
Hugo Jair Escalante,
Heysem Kaya,
Albert Ali Salah,
Sergio Escalera,
Yagmur Gucluturk,
Umut Guclu,
Xavier Baro,
Isabelle Guyon,
Julio Jacques Junior,
Meysam Madadi,
Stephane Ayache,
Evelyne Viegas,
Furkan Gurpinar,
Achmadnoer Sukma Wicaksana,
Cynthia C. S. Liem,
Marcel A. J. van Gerven,
Rob van Lier
Abstract:
Explainability and interpretability are two critical aspects of decision support systems. Within computer vision, they are critical in certain tasks related to human behavior analysis such as in health care applications. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in…
▽ More
Explainability and interpretability are two critical aspects of decision support systems. Within computer vision, they are critical in certain tasks related to human behavior analysis such as in health care applications. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of computer vision with an emphasis on looking at people tasks. Specifically, we review and study those mechanisms in the context of first impressions analysis. To the best of our knowledge, this is the first effort in this direction. Additionally, we describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, the evaluation protocol, and summarize the results of the challenge. Finally, derived from our study, we outline research opportunities that we foresee will be decisive in the near future for the development of the explainable computer vision field.
△ Less
Submitted 28 September, 2019; v1 submitted 2 February, 2018;
originally announced February 2018.
-
Fractal dimension analysis for automatic morphological galaxy classification
Authors:
Jorge de la Calleja,
Elsa M. de la Calleja,
Hugo Jair Escalante
Abstract:
In this report we present experimental results using \emph{Haussdorf-Besicovich} fractal dimension for performing morphological galaxy classification. The fractal dimension is a topological, structural and spatial property that give us information about the space were an object lives. We have calculated the fractal dimension value of the main types of galaxies: ellipticals, spirals and irregulars;…
▽ More
In this report we present experimental results using \emph{Haussdorf-Besicovich} fractal dimension for performing morphological galaxy classification. The fractal dimension is a topological, structural and spatial property that give us information about the space were an object lives. We have calculated the fractal dimension value of the main types of galaxies: ellipticals, spirals and irregulars; and we use it as a feature for classifying them. Also, we have performed an image analysis process in order to standardize the galaxy images, and we have used principal component analysis to obtain the main attributes in the images. Galaxy classification was performed using machine learning algorithms: C4.5, k-nearest neighbors, random forest and support vector machines. Preliminary experimental results using 10-fold cross-validation show that fractal dimension helps to improve classification, with over 88 per cent accuracy for elliptical galaxies, 100 per cent accuracy for spiral galaxies and over 40 per cent for irregular galaxies.
△ Less
Submitted 22 June, 2017;
originally announced June 2017.
-
ChaLearn Looking at People: A Review of Events and Resources
Authors:
Sergio Escalera,
Xavier Baró,
Hugo Jair Escalante,
Isabelle Guyon
Abstract:
This paper reviews the historic of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challe…
▽ More
This paper reviews the historic of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challenges and events in this field. This paper reviews associated events, and introduces the ChaLearn LAP platform where public resources (including code, data and preprints of papers) related to the organized events are available. We also provide a discussion on perspectives of ChaLearn LAP activities.
△ Less
Submitted 15 February, 2017; v1 submitted 10 January, 2017;
originally announced January 2017.
-
Early text classification: a Naive solution
Authors:
Hugo Jair Escalante,
Manuel Montes-y-Gómez,
Luis Villaseñor-Pineda,
Marcelo Luis Errecalde
Abstract:
Text classification is a widely studied problem, and it can be considered solved for some domains and under certain circumstances. There are scenarios, however, that have received little or no attention at all, despite its relevance and applicability. One of such scenarios is early text classification, where one needs to know the category of a document by using partial information only. A document…
▽ More
Text classification is a widely studied problem, and it can be considered solved for some domains and under certain circumstances. There are scenarios, however, that have received little or no attention at all, despite its relevance and applicability. One of such scenarios is early text classification, where one needs to know the category of a document by using partial information only. A document is processed as a sequence of terms, and the goal is to devise a method that can make predictions as fast as possible. The importance of this variant of the text classification problem is evident in domains like sexual predator detection, where one wants to identify an offender as early as possible. This paper analyzes the suitability of the standard naive Bayes classifier for approaching this problem. Specifically, we assess its performance when classifying documents after seeing an increasingly number of terms. A simple modification to the standard naive Bayes implementation allows us to make predictions with partial information. To the best of our knowledge naive Bayes has not been used for this purpose before. Throughout an extensive experimental evaluation we show the effectiveness of the classifier for early text classification. What is more, we show that this simple solution is very competitive when compared with state of the art methodologies that are more elaborated. We foresee our work will pave the way for the development of more effective early text classification techniques based in the naive Bayes formulation.
△ Less
Submitted 20 September, 2015;
originally announced September 2015.
-
Term-Weighting Learning via Genetic Programming for Text Classification
Authors:
Hugo Jair Escalante,
Mauricio A. García-Limón,
Alicia Morales-Reyes,
Mario Graff,
Manuel Montes-y-Gómez,
Eduardo F. Morales
Abstract:
This paper describes a novel approach to learning term-weighting schemes (TWSs) in the context of text classification. In text mining a TWS determines the way in which documents will be represented in a vector space model, before applying a classifier. Whereas acceptable performance has been obtained with standard TWSs (e.g., Boolean and term-frequency schemes), the definition of TWSs has been tra…
▽ More
This paper describes a novel approach to learning term-weighting schemes (TWSs) in the context of text classification. In text mining a TWS determines the way in which documents will be represented in a vector space model, before applying a classifier. Whereas acceptable performance has been obtained with standard TWSs (e.g., Boolean and term-frequency schemes), the definition of TWSs has been traditionally an art. Further, it is still a difficult task to determine what is the best TWS for a particular problem and it is not clear yet, whether better schemes, than those currently available, can be generated by combining known TWS. We propose in this article a genetic program that aims at learning effective TWSs that can improve the performance of current schemes in text classification. The genetic program learns how to combine a set of basic units to give rise to discriminative TWSs. We report an extensive experimental study comprising data sets from thematic and non-thematic text classification as well as from image classification. Our study shows the validity of the proposed method; in fact, we show that TWSs learned with the genetic program outperform traditional schemes and other TWSs proposed in recent works. Further, we show that TWSs learned from a specific domain can be effectively used for other tasks.
△ Less
Submitted 6 October, 2014; v1 submitted 2 October, 2014;
originally announced October 2014.
-
Principal motion components for gesture recognition using a single-example
Authors:
Hugo Jair Escalante,
Isabelle Guyon,
Vassilis Athitsos,
Pat Jangyodsuk,
Jun Wan
Abstract:
This paper introduces principal motion components (PMC), a new method for one-shot gesture recognition. In the considered scenario a single training-video is available for each gesture to be recognized, which limits the application of traditional techniques (e.g., HMMs). In PMC, a 2D map of motion energy is obtained per each pair of consecutive frames in a video. Motion maps associated to a video…
▽ More
This paper introduces principal motion components (PMC), a new method for one-shot gesture recognition. In the considered scenario a single training-video is available for each gesture to be recognized, which limits the application of traditional techniques (e.g., HMMs). In PMC, a 2D map of motion energy is obtained per each pair of consecutive frames in a video. Motion maps associated to a video are processed to obtain a PCA model, which is used for recognition under a reconstruction-error approach. The main benefits of the proposed approach are its simplicity, easiness of implementation, competitive performance and efficiency. We report experimental results in one-shot gesture recognition using the ChaLearn Gesture Dataset; a benchmark comprising more than 50,000 gestures, recorded as both RGB and depth video with a Kinect camera. Results obtained with PMC are competitive with alternative methods proposed for the same data set.
△ Less
Submitted 31 January, 2014; v1 submitted 17 October, 2013;
originally announced October 2013.
-
An estimation of distribution algorithm with adaptive Gibbs sampling for unconstrained global optimization
Authors:
Jonás Velasco,
Mario A. Saucedo-Espinosa,
Hugo Jair Escalante,
Karlo Mendoza,
César Emilio Villarreal-Rodríguez,
Óscar L. Chacón-Mondragón,
Adrián Rodríguez,
Arturo Berrones
Abstract:
In this paper is proposed a new heuristic approach belonging to the field of evolutionary Estimation of Distribution Algorithms (EDAs). EDAs builds a probability model and a set of solutions is sampled from the model which characterizes the distribution of such solutions. The main framework of the proposed method is an estimation of distribution algorithm, in which an adaptive Gibbs sampling is us…
▽ More
In this paper is proposed a new heuristic approach belonging to the field of evolutionary Estimation of Distribution Algorithms (EDAs). EDAs builds a probability model and a set of solutions is sampled from the model which characterizes the distribution of such solutions. The main framework of the proposed method is an estimation of distribution algorithm, in which an adaptive Gibbs sampling is used to generate new promising solutions and, in combination with a local search strategy, it improves the individual solutions produced in each iteration. The Estimation of Distribution Algorithm with Adaptive Gibbs Sampling we are proposing in this paper is called AGEDA. We experimentally evaluate and compare this algorithm against two deterministic procedures and several stochastic methods in three well known test problems for unconstrained global optimization. It is empirically shown that our heuristic is robust in problems that involve three central aspects that mainly determine the difficulty of global optimization problems, namely high-dimensionality, multi-modality and non-smoothness.
△ Less
Submitted 29 May, 2013; v1 submitted 11 July, 2011;
originally announced July 2011.