
Implementing engrams from a machine learning perspective: XOR as a basic motif

Authors: Jesus Marco de Lucas, Maria Peña Fernandez, Lara Lloret Iglesias

Abstract: We have previously presented the idea of how complex multimodal information could be represented in our brains in a compressed form, following mechanisms similar to those employed in machine learning tools, like autoencoders. In this short comment note we reflect, mainly with a didactic purpose, upon the basic question for a biological implementation: what could be the mechanism working as a loss function, and how it could be connected to a neuronal network providing the required feedback to build a simple training configuration. We present our initial ideas based on a basic motif that implements an XOR switch, using a few excitatory and inhibitory neurons. This motif is guided by a principle of homeostasis, and it implements a loss function that could provide feedback to other neuronal structures, establishing a control system. We analyse the presence of this XOR motif in the connectome of C. elegans, and indicate the relationship with the well-known lateral inhibition motif. We then explore how to build a basic biological neuronal structure with learning capacity integrating this XOR motif. Guided by the computational analogy, we show an initial example that indicates the feasibility of this approach, applied to learning binary sequences, as is the case for simple melodies.
In summary, we provide didactic examples exploring the parallelism between biological and computational learning mechanisms, identifying basic motifs and training procedures, and how an engram encoding a melody could be built using a simple recurrent network involving both excitatory and inhibitory neurons.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 9 pages, short comment

arXiv:2402.14904

Watermarking Makes Language Models Radioactive

Authors: Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon

Abstract: This paper investigates the radioactivity of LLM-generated texts, i.e. whether it is possible to detect that such input was used as training data. Conventional methods like membership inference can carry out this detection with some level of accuracy. We show that watermarked training data leaves traces easier to detect and much more reliable than membership inference. We link the contamination level to the watermark robustness, its proportion in the training set, and the fine-tuning process. We notably demonstrate that training on watermarked synthetic instructions can be detected with high confidence (p-value < 1e-5) even when as little as 5% of training text is watermarked. Thus, LLM watermarking, originally designed for detecting machine-generated text, gives the ability to easily identify if the outputs of a watermarked LLM were used to fine-tune another LLM.

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.13673

doi 10.3390/axioms13020083

Computing Transiting Exoplanet Parameters with 1D Convolutional Neural Networks

Authors: Santiago Iglesias Álvarez, Enrique Díez Alonso, María Luisa Sánchez Rodríguez, Javier Rodríguez Rodríguez, Saúl Pérez Fernández, Francisco Javier de Cos Juez

Abstract: The transit method allows the detection and characterization of planetary systems by analyzing stellar light curves. Convolutional neural networks appear to offer a viable solution for automating these analyses. In this research, two 1D convolutional neural network models, which work with simulated light curves in which transit-like signals were injected, are presented. One model operates on complete light curves and estimates the orbital period, and the other one operates on phase-folded light curves and estimates the semimajor axis of the orbit and the square of the planet-to-star radius ratio. Both models were tested on real data from TESS light curves with confirmed planets to ensure that they are able to work with real data. The results obtained show that 1D CNNs are able to characterize transiting exoplanets from their host star's detrended light curve and, furthermore, reduce both the required time and computational costs compared with the current detection and characterization algorithms.

Submitted 21 February, 2024; originally announced February 2024.

arXiv:2401.17264

Proactive Detection of Voice Cloning with Localized Watermarking

Authors: Robin San Roman, Pierre Fernandez, Alexandre Défossez, Teddy Furon, Tuan Tran, Hady Elsahar

Abstract: In the rapidly evolving field of speech generative models, there is a pressing need to ensure audio authenticity against the risks of voice cloning. We present AudioSeal, the first audio watermarking technique designed specifically for localized detection of AI-generated speech. AudioSeal employs a generator/detector architecture trained jointly with a localization loss to enable localized watermark detection up to the sample level, and a novel perceptual loss inspired by auditory masking that enables AudioSeal to achieve better imperceptibility. AudioSeal achieves state-of-the-art performance in terms of robustness to real-life audio manipulations and imperceptibility based on automatic and human evaluation metrics. Additionally, AudioSeal is designed with a fast, single-pass detector that significantly surpasses existing models in speed, achieving detection up to two orders of magnitude faster, making it ideal for large-scale and real-time applications.

Submitted 6 June, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: Published at ICML 2024. Code at https://github.com/facebookresearch/audioseal - webpage at https://pierrefdz.github.io/publications/audioseal/

arXiv:2312.05187

Seamless: Multilingual Expressive and Streaming Speech Translation

Authors: Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek, et al. (40 additional authors not shown)

Abstract: Large-scale automatic speech translation systems today lack key features that help machine-mediated communication feel seamless when compared to human-to-human dialogue. In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion. First, we contribute an improved version of the massively multilingual and multimodal SeamlessM4T model: SeamlessM4T v2. This newer model, incorporating an updated UnitY2 framework, was trained on more low-resource language data. SeamlessM4T v2 provides the foundation on which our next two models are initiated. SeamlessExpressive enables translation that preserves vocal styles and prosody. Compared to previous efforts in expressive speech research, our work addresses certain underexplored aspects of prosody, such as speech rate and pauses, while also preserving the style of one's voice. As for SeamlessStreaming, our model leverages the Efficient Monotonic Multihead Attention mechanism to generate low-latency target translations without waiting for complete source utterances. As the first of its kind, SeamlessStreaming enables simultaneous speech-to-speech/text translation for multiple source and target languages. To ensure that our models can be used safely and responsibly, we implemented the first known red-teaming effort for multimodal machine translation, a system for the detection and mitigation of added toxicity, a systematic evaluation of gender bias, and an inaudible localized watermarking mechanism designed to dampen the impact of deepfakes.
Consequently, we bring major components from SeamlessExpressive and SeamlessStreaming together to form Seamless, the first publicly available system that unlocks expressive cross-lingual communication in real-time. The contributions to this work are publicly released and accessible at https://github.com/facebookresearch/seamless_communication

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2311.12485

doi 10.1016/j.csi.2024.103878

Pricing4APIs: A Rigorous Model for RESTful API Pricings

Authors: Rafael Fresno-Aranda, Pablo Fernandez, Antonio Gamez-Diaz, Amador Duran, Antonio Ruiz-Cortes

Abstract: APIs are increasingly becoming new business assets for organizations and consequently, API functionality and its pricing should be precisely defined for customers. Pricing is typically composed of different plans that specify a range of limitations, e.g., a Free plan allows 100 monthly requests while a Gold plan has 10000 requests per month. In this context, the OpenAPI Specification (OAS) has emerged to model the functional part of an API, becoming a de facto industry standard and boosting a rich ecosystem of vendor-neutral tools to assist API providers and consumers. In contrast, there is no proposal for modeling API pricings (i.e. their plans and limitations) and this lack hinders the creation of tools that can leverage this information. To deal with this gap, this paper presents a pricing modeling framework that includes: (a) the Pricing4APIs model, a comprehensive and rigorous model of API pricings, along with SLA4OAI, a serialization that extends OAS; (b) an operation to validate the description of API pricings, with a toolset (sla4oai-analyzer) that has been developed to automate this operation. Additionally, we analyzed 268 real-world APIs to assess the expressiveness of our proposal and created a representative dataset of 54 pricing models to validate our framework.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2311.06156

Triad: Trusted Timestamps in Untrusted Environments

Authors: Gabriel P. Fernandez, Andrey Brito, Christof Fetzer

Abstract: We aim to provide trusted time measurement mechanisms to applications and cloud infrastructure deployed in environments that could harbor potential adversaries, including the hardware infrastructure provider. Despite Trusted Execution Environments (TEEs) providing multiple security functionalities, timestamps from the Operating System are not covered. Nevertheless, some services require time for validating permissions or ordering events. To address that need, we introduce Triad, a trusted timestamp dispatcher of time readings. The solution provides trusted timestamps enforced by mutually supportive enclave-based clock servers that create a continuous trusted timeline. We leverage enclave properties such as forced exits and CPU-based counters to mitigate attacks on the server's timestamp counters. Triad produces trusted, confidential, monotonically-increasing timestamps with bounded error and desirable, non-trivial properties. Our implementation relies on Intel SGX and SCONE, allowing transparent usage. We evaluate Triad's error and behavior in multiple dimensions.

Submitted 26 February, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.06154

A Last-Level Defense for Application Integrity and Confidentiality

Authors: Gabriel P. Fernandez, Andrey Brito, Ardhi Putra Pratama Hartono, Muhammad Usama Sardar, Christof Fetzer

Abstract: Our objective is to protect the integrity and confidentiality of applications operating in untrusted environments. Trusted Execution Environments (TEEs) are not a panacea. Hardware TEEs fail to protect applications against Sybil, Fork and Rollback Attacks and, consequently, fail to preserve the consistency and integrity of applications. We introduce a novel system, LLD, that enforces the integrity and consistency of applications in a transparent and scalable fashion. Our solution augments TEEs with instantiation control and rollback protection. Instantiation control, enforced with TEE-supported leases, mitigates Sybil/Fork Attacks without incurring the high costs of solving crypto-puzzles. Our rollback detection mechanism does not need excessive replication, nor does it sacrifice durability. We show that implementing these functionalities in the LLD runtime automatically protects applications and services such as a popular DBMS.

Submitted 10 November, 2023; originally announced November 2023.

arXiv:2310.11446

Functional Invariants to Watermark Large Transformers

Authors: Pierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze

Abstract: The rapid growth of transformer-based models increases the concerns about their integrity and ownership insurance. Watermarking addresses this issue by embedding a unique identifier into the model, while preserving its performance. However, most existing approaches require optimizing the weights to imprint the watermark signal, which is not suitable at scale due to the computational cost. This paper explores watermarks with virtually no computational cost, applicable to a non-blind white-box setting (assuming access to both the original and watermarked networks). They generate functionally equivalent copies by leveraging the models' invariance, via operations like dimension permutations or scaling/unscaling. This makes it possible to watermark models without any change in their outputs while remaining stealthy. Experiments demonstrate the effectiveness of the approach and its robustness against various model transformations (fine-tuning, quantization, pruning), making it a practical solution to protect the integrity of large models.

Submitted 18 January, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: Published at ICASSP 2024. Webpage at https://pierrefdz.github.io/publications/invariancewm/

arXiv:2308.02199

A Survey of Spanish Clinical Language Models

Authors: Guillem García Subies, Álvaro Barbero Jiménez, Paloma Martínez Fernández

Abstract: This survey focuses on encoder Language Models for solving tasks in the clinical domain in the Spanish language. We review the contributions of 17 corpora focused mainly on clinical tasks, then list the most relevant Spanish Language Models and Spanish Clinical Language models. We perform a thorough comparison of these models by benchmarking them over a curated subset of the available corpora, in order to find the best-performing ones; in total more than 3000 models were fine-tuned for this study. All the tested corpora and the best models are made publicly available in an accessible way, so that the results can be reproduced by independent teams or challenged in the future when new Spanish Clinical Language models are created.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2308.00113

Three Bricks to Consolidate Watermarks for Large Language Models

Authors: Pierre Fernandez, Antoine Chaffin, Karim Tit, Vivien Chappelier, Teddy Furon

Abstract: The task of discerning between generated and natural texts is increasingly challenging. In this context, watermarking emerges as a promising technique for ascribing generated text to a specific model. It alters the sampling generation process so as to leave an invisible trace in the generated output, facilitating later detection. This research consolidates watermarks for large language models based on three theoretical and empirical considerations. First, we introduce new statistical tests that offer robust theoretical guarantees which remain valid even at low false-positive rates (less than $10^{-6}$). Second, we compare the effectiveness of watermarks using classical benchmarks in the field of natural language processing, gaining insights into their real-world applicability. Third, we develop advanced detection schemes for scenarios where access to the LLM is available, as well as multi-bit watermarking.

Submitted 8 November, 2023; v1 submitted 26 July, 2023; originally announced August 2023.

Comments: Published at WIFS 2023. Code at https://github.com/facebookresearch/three_bricks - webpage at https://pierrefdz.github.io/publications/threebricks/

arXiv:2304.12210

A Cookbook of Self-Supervised Learning

Authors: Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Geiping, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, Micah Goldblum

Abstract: Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning. Yet, much like cooking, training SSL methods is a delicate art with a high barrier to entry. While many components are familiar, successfully training a SSL method involves a dizzying set of choices from the pretext tasks to training hyper-parameters. Our goal is to lower the barrier to entry into SSL research by laying the foundations and latest SSL recipes in the style of a cookbook. We hope to empower the curious researcher to navigate the terrain of methods, understand the role of the various knobs, and gain the know-how required to explore how delicious SSL can be.

Submitted 28 June, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

arXiv:2304.07193

DINOv2: Learning Robust Visual Features without Supervision

Authors: Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, et al. (1 additional author not shown)

Abstract: The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could greatly simplify the use of images in any system by producing all-purpose visual features, i.e., features that work across image distributions and tasks without finetuning. This work shows that existing pretraining methods, especially self-supervised methods, can produce such features if trained on enough curated data from diverse sources. We revisit existing approaches and combine different techniques to scale our pretraining in terms of data and model size. Most of the technical contributions aim at accelerating and stabilizing the training at scale. In terms of data, we propose an automatic pipeline to build a dedicated, diverse, and curated image dataset instead of uncurated data, as typically done in the self-supervised literature. In terms of models, we train a ViT model (Dosovitskiy et al., 2020) with 1B parameters and distill it into a series of smaller models that surpass the best available all-purpose features, OpenCLIP (Ilharco et al., 2021) on most of the benchmarks at image and pixel levels.

Submitted 2 February, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

arXiv:2303.15435

The Stable Signature: Rooting Watermarks in Latent Diffusion Models

Authors: Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon

Abstract: Generative image modeling enables a wide range of applications but raises ethical concerns about responsible deployment. This paper introduces an active strategy combining image watermarking and Latent Diffusion Models. The goal is for all generated images to conceal an invisible watermark allowing for future detection and/or identification. The method quickly fine-tunes the latent decoder of the image generator, conditioned on a binary signature. A pre-trained watermark extractor recovers the hidden signature from any generated image and a statistical test then determines whether it comes from the generative model. We evaluate the invisibility and robustness of the watermarks on a variety of generation tasks, showing that Stable Signature works even after the images are modified. For instance, it detects the origin of an image generated from a text prompt, then cropped to keep 10% of the content, with over 90% accuracy at a false positive rate below $10^{-6}$.

Submitted 26 July, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

Comments: Published at ICCV 2023. Code at https://github.com/facebookresearch/stable_signature - webpage at https://pierrefdz.github.io/publications/stablesignature

arXiv:2303.05995

Exploring Gender Bias in Remote Pair Programming among Software Engineering Students: The twincode Original Study and First External Replication

Authors: Amador Durán, Pablo Fernández, Beatriz Bernárdez, Nathaniel Weinman, Aslıhan Akalın, Armando Fox

Abstract: Context. Software Engineering (SE) has low female representation due to gender bias that men are better at programming. Pair programming (PP) is common in industry and can increase student interest in SE, especially women; but if gender bias affects PP, it may discourage women from joining the field. Objective. We explore gender bias in PP. In a remote setting where students cannot see their peers' gender, we study how perceived productivity, technical competency and collaboration/interaction behaviors of SE students vary by perceived gender of their remote partner. Method. We developed an online PP platform (twincode) with a collaborative editing window and a chat pane. Control group had no gender information about their partner, while treatment group saw a gendered avatar as a man or woman. Avatar gender was swapped between tasks to analyze 45 variables on collaborative coding behavior, chat utterances and questionnaire responses of 46 pairs in original study at the University of Seville and 23 pairs in the replication at the University of California, Berkeley. Results. No significant effect of gender bias treatment or interaction between perceived partner's gender and subject's gender in any variable in original study. In replication, significant effects with moderate to large sizes in four variables within experimental group comparing subjects' actions when partner was male vs female.

Submitted 10 March, 2023; originally announced March 2023.

arXiv:2211.15999

doi 10.5121/csit.2022.122106

Impact of Automatic Image Classification and Blind Deconvolution in Improving Text Detection Performance of the CRAFT Algorithm

Authors: Clarisa V. Albarillo, Proceso L. Fernandez Jr

Abstract: Text detection in natural scenes has been a significant and active research subject in computer vision and document analysis because of its wide range of applications, as evidenced by the emergence of the Robust Reading Competition. One of the algorithms with good text detection performance in that competition is Character Region Awareness for Text Detection (CRAFT). Employing the ICDAR 2013 dataset, this study investigates the impact of automatic image classification and blind deconvolution as image pre-processing steps to further enhance the text detection performance of CRAFT. The proposed technique automatically classifies the scene images into two categories, blurry and non-blurry, by using a Laplacian operator with 100 as the threshold. Prior to applying the CRAFT algorithm, images that are categorized as blurry are further pre-processed using blind deconvolution to reduce the blur. The results revealed that the proposed method significantly enhanced the detection performance of CRAFT, as demonstrated by its IoU h-mean of 94.47% compared to the original 91.42% h-mean of CRAFT, and this even outperformed the top-ranked SenseTime, whose h-mean is 93.62%.

Submitted 29 November, 2022; originally announced November 2022.

Comments: 18 pages, 7 figures, 3rd International Conference on Machine Learning Techniques and Data Science

Journal ref: Vol. 12, No. 21, 2022, 58-75

arXiv:2210.10620

Active Image Indexing

Authors: Pierre Fernandez, Matthijs Douze, Hervé Jégou, Teddy Furon

Abstract: Image copy detection and retrieval from large databases leverage two components. First, a neural network maps an image to a vector representation that is relatively robust to various transformations of the image. Second, an efficient but approximate similarity search algorithm trades scalability (size and speed) against quality of the search, thereby introducing a source of error. This paper improves the robustness of image copy detection with active indexing, which optimizes the interplay of these two components. We reduce the quantization loss of a given image representation by making imperceptible changes to the image before its release. The loss is back-propagated through the deep neural network back to the image, under perceptual constraints. These modifications make the image more retrievable. Our experiments show that the retrieval and copy detection of activated images is significantly improved. For instance, activation improves Recall@1 by +40% on various image transformations, and for several popular indexing structures based on product quantization and locality-sensitive hashing.

Submitted 5 October, 2022; originally announced October 2022.

arXiv:2210.02833 [pdf, other]

Matching Text and Audio Embeddings: Exploring Transfer-learning Strategies for Language-based Audio Retrieval

Authors: Benno Weck, Miguel Pérez Fernández, Holger Kirchhoff, Xavier Serra

Abstract: We present an analysis of large-scale pretrained deep learning models used for cross-modal (text-to-audio) retrieval. We use embeddings extracted by these models in a metric learning framework to connect matching pairs of audio and text. Shallow neural networks map the embeddings to a common dimensionality. Our system, which is an extension of our submission to the Language-based Audio Retrieval Task of the DCASE Challenge 2022, employs the RoBERTa foundation model as the text embedding extractor. A pretrained PANNs model extracts the audio embeddings. To improve the generalisation of our model, we investigate how pretraining with audio and associated noisy text collected from the online platform Freesound improves the performance of our method. Furthermore, our ablation study reveals that the proper choice of the loss function and fine-tuning the pretrained models are essential in training a competitive retrieval system.

Submitted 6 October, 2022; originally announced October 2022.

Comments: 5 pages, 2 figures. Accepted at Detection and Classification of Acoustic Scenes and Events 2022 (DCASE2022)

arXiv:2112.09581 [pdf, other]

Watermarking Images in Self-Supervised Latent Spaces

Authors: Pierre Fernandez, Alexandre Sablayrolles, Teddy Furon, Hervé Jégou, Matthijs Douze

Abstract: We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches. We present a way to embed both marks and binary messages into their latent spaces, leveraging data augmentation at marking time. Our method can operate at any resolution and creates watermarks robust to a broad range of transformations (rotations, crops, JPEG, contrast, etc). It significantly outperforms the previous zero-bit methods, and its performance on multi-bit watermarking is on par with state-of-the-art encoder-decoder architectures trained end-to-end for watermarking. The code is available at github.com/facebookresearch/ssl_watermarking

Submitted 23 March, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

arXiv:2110.01962 [pdf, other]

Gender Bias in Remote Pair Programming among Software Engineering Students: The twincode Exploratory Study

Authors: Amador Durán, Pablo Fernández, Beatriz Bernárdez, Nathaniel Weinman, Aslı Akalın, Armando Fox

Abstract: Context. Pair programming (PP) has been found to increase student interest in Computer Science, particularly so for women, and would therefore appear to be a way to help remedy their under-representation, which could be partially motivated by gender stereotypes applied to software engineers, assuming that men perform better than their women peers. If this same bias is present in pair programming, it could work against the goal of improving gender balance. Objective. In a remote setting in which students cannot directly observe their peers, we aim to explore whether they behave differently when the perceived gender of their remote PP partners changes, searching for differences in (i) the perceived productivity compared to solo programming; (ii) the partner's perceived technical competency compared to their own; (iii) the partner's perceived skill level; (iv) the interaction behavior, such as the frequency of source code additions, deletions, etc.; and (v) the type and relative frequencies of dialog messages in a chat window. Method. Using the twincode platform, several behaviors are automatically measured during the remote PP process, together with two questionnaires and a semantic tagging of the pairs' chats. A series of experiments to identify the effect, if any, of possible gender bias shall be performed. The control group will have no information about their partner's gender, whereas the treatment group will receive such information but will be selectively deceived about their partner's gender.
For each response variable we will (i) compare control and experimental groups for the score distance between two in-pair tasks; then, using the data from the experimental group only, we will (ii) compare scores using the partner's perceived gender as a within-subjects variable; and (iii) analyze the interaction between the partner's perceived gender and the subject's gender.

Submitted 5 October, 2021; originally announced October 2021.

Comments: Accepted at the ESEM 2021 Registered Report track

arXiv:2103.06798 [pdf, other]

Bluejay: A Cross-Tooling Audit Framework For Agile Software Teams

Authors: Cesar Garcia, Alejandro Guerrero, Joshua Zeitsoff, Srujay Korlakunta, Pablo Fernandez, Armando Fox, Antonio Ruiz-Cortes

Abstract: Agile software teams are expected to follow a number of specific Team Practices (TPs) during each iteration, such as estimating the effort ("points") required to complete user stories and coordinating the management of the codebase with the delivery of features. For software engineering instructors trying to teach such TPs to student teams, manually auditing whether teams are following the TPs and improving over time is tedious, time-consuming and error-prone. It is even more difficult when those TPs involve two or more tools. For example, starting work on a feature in a project-management tool such as Pivotal Tracker should usually be followed relatively quickly by the creation of a feature branch on GitHub. Merging a feature branch on GitHub should usually be followed relatively quickly by deploying the new feature to a staging server for customer feedback. Few systems are designed specifically to audit such TPs, and existing ones, as far as we know, are limited to a single specific tool. We present Bluejay, an open-source extensible platform that uses the APIs of multiple tools to collect raw data, synthesize it into TP measurements, and present dashboards to audit the TPs. A key insight in Bluejay's design is that TPs can be expressed in terminology similar to that used for modeling and auditing Service Level Agreement (SLA) compliance. Bluejay therefore builds on mature tools used in that ecosystem and adapts them for describing, auditing, and reporting on TPs.
Bluejay currently consumes data from five different widely-used development tools, and can be customized by connecting it to any service with a REST API. Video showcase available at governify.io/showcase/bluejay

Submitted 11 March, 2021; originally announced March 2021.

Comments: 6 pages

MSC Class: 68U99 ACM Class: D.2.9

arXiv:2101.04240 [pdf, other]

Lesion2Vec: Deep Metric Learning for Few-Shot Multiple Lesions Recognition in Wireless Capsule Endoscopy Video

Authors: Sodiq Adewole, Philip Fernandez, Michelle Yeghyayan, James Jablonski, Andrew Copland, Michael Porter, Sana Syed, Donald Brown

Abstract: Effective and rapid detection of lesions in the gastrointestinal (GI) tract is critical to gastroenterologists' response to some life-threatening diseases. Wireless Capsule Endoscopy (WCE) has revolutionized the traditional endoscopy procedure by allowing gastroenterologists to visualize the entire GI tract non-invasively. Once the tiny capsule is swallowed, it sequentially captures images of the GI tract at about 2 to 6 frames per second (fps). A single video can last up to 8 hours, producing between 30,000 and 100,000 images. Automating the detection of frames containing specific lesions in WCE video would relieve gastroenterologists of the arduous task of reviewing the entire video before making a diagnosis. While WCE produces a large volume of images, only about 5\% of the frames contain lesions that aid the diagnosis process. Convolutional Neural Network (CNN) based models have been very successful in various image classification tasks. However, they suffer from excessive parameters, are sample inefficient and rely on very large amounts of training data. Deploying a CNN classifier for the lesion detection task will require periodic fine-tuning to generalize to any unforeseen category. In this paper, we propose a metric-based learning framework followed by few-shot lesion recognition in WCE data. Metric-based learning is a meta-learning framework designed to establish similarity or dissimilarity between concepts, while few-shot learning (FSL) aims to identify new concepts from only a small number of examples.
We train a feature extractor to learn a representation for different small bowel lesions using metric-based learning. At the testing stage, the category of an unseen sample is predicted from only a few support examples, thereby allowing the model to generalize to a new category that has never been seen before. We demonstrated the efficacy of this method on real patient capsule endoscopy data.

Submitted 15 January, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

arXiv:2007.07188 [pdf, ps, other]

Nonclassical truth with classical strength. A proof-theoretic analysis of compositional truth over HYPE

Authors: Martin Fischer, Carlo Nicolai, Pablo Dopico Fernandez

Abstract: Questions concerning the proof-theoretic strength of classical versus non-classical theories of truth have received some attention recently. A particularly convenient case study concerns classical and nonclassical axiomatizations of fixed-point semantics. It is known that nonclassical axiomatizations in four- or three-valued logics are substantially weaker than their classical counterparts. In this paper we consider the addition of a suitable conditional to First-Degree Entailment -- a logic recently studied by Hannes Leitgeb under the label `HYPE'. We show in particular that, by formulating the theory PKF over HYPE one obtains a theory that is sound with respect to fixed-point models, while being proof-theoretically on a par with its classical counterpart KF. Moreover, we establish that also its schematic extension -- in the sense of Feferman -- is as strong as the schematic extension of KF, thus matching the strength of predicative analysis.

Submitted 13 August, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: Fixed a gap in the proof of the lower bound for KFL^*

arXiv:2001.06612 [pdf, other]

Deep Metric Structured Learning For Facial Expression Recognition

Authors: Pedro D. Marrero Fernandez, Tsang Ing Ren, Tsang Ing Jyh, Fidel A. Guerrero Peña, Alexandre Cunha

Abstract: We propose a deep metric learning model to create embedded sub-spaces with a well defined structure. A new loss function that imposes Gaussian structures on the output space is introduced to create these sub-spaces thus shaping the distribution of the data. Having a mixture of Gaussians solution space is advantageous given its simplified and well established structure. It allows fast discovering of classes within classes and the identification of mean representatives at the centroids of individual classes. We also propose a new semi-supervised method to create sub-classes. We illustrate our methods on the facial expression recognition problem and validate results on the FER+, AffectNet, Extended Cohn-Kanade (CK+), BU-3DFE, and JAFFE datasets. We experimentally demonstrate that the learned embedding can be successfully used for various applications including expression retrieval and emotion recognition.

Submitted 5 January, 2022; v1 submitted 18 January, 2020; originally announced January 2020.

arXiv:1910.09783 [pdf, other]

J Regularization Improves Imbalanced Multiclass Segmentation

Authors: Fidel A. Guerrero Peña, Pedro D. Marrero Fernandez, Paul T. Tarr, Tsang Ing Ren, Elliot M. Meyerowitz, Alexandre Cunha

Abstract: We propose a new loss formulation to further advance the multiclass segmentation of cluttered cells under weakly supervised conditions. We improve the separation of touching and immediate cells, obtaining sharp segmentation boundaries with high adequacy, when we add Youden's $J$ statistic regularization term to the cross entropy loss. This regularization intrinsically supports class imbalance thus eliminating the necessity of explicitly using weights to balance training. Simulations demonstrate this capability and show how the regularization leads to better results by helping advance the optimization when cross entropy stalls. We build upon our previous work on multiclass segmentation by adding yet another training class representing gaps between adjacent cells. This addition helps the classifier identify narrow gaps as background and no longer as touching regions. We present results of our methods for 2D and 3D images, from bright field to confocal stacks containing different types of cells, and we show that they accurately segment individual cells after training with a limited number of annotated images, some of which are poorly annotated.

Submitted 22 October, 2019; originally announced October 2019.

Comments: Submitted to ISBI 2020

arXiv:1908.10945 [pdf, other]

A Multiple Source Hourglass Deep Network for Multi-Focus Image Fusion

Authors: Fidel Alejandro Guerrero Peña, Pedro Diamel Marrero Fernández, Tsang Ing Ren, Germano Crispim Vasconcelos, Alexandre Cunha

Abstract: Multi-Focus Image Fusion seeks to improve the quality of an acquired burst of images with different focus planes. For solving the task, an activity level measurement and a fusion rule are typically established to select and fuse the most relevant information from the sources. However, the design of this kind of method by hand is really hard and sometimes restricted to solution spaces where the optimal all-in-focus images are not contained. Then, we propose here two fast and straightforward approaches for image fusion based on deep neural networks. Our solution uses a multiple source Hourglass architecture trained in an end-to-end fashion. Models are data-driven and can be easily generalized for other kinds of fusion problems. A segmentation approach is used for recognition of the focus map, while the weighted average rule is used for fusion. We designed a training loss function for our regression-based fusion function, which allows the network to learn both the activity level measurement and the fusion rule. Experimental results show our approach has comparable results to the state-of-the-art methods with a 60X increase of computational efficiency for 520X520 resolution images.

Submitted 28 August, 2019; originally announced August 2019.

arXiv:1908.09891 [pdf, other]

A Weakly Supervised Method for Instance Segmentation of Biological Cells

Authors: Fidel A. Guerrero-Peña, Pedro D. Marrero Fernandez, Tsang Ing Ren, Alexandre Cunha

Abstract: We present a weakly supervised deep learning method to perform instance segmentation of cells present in microscopy images. Annotation of biomedical images in the lab can be scarce, incomplete, and inaccurate. This is of concern when supervised learning is used for image analysis as the discriminative power of a learning model might be compromised in these situations. To overcome the curse of poor labeling, our method focuses on three aspects to improve learning: i) we propose a loss function operating in three classes to facilitate separating adjacent cells and to drive the optimizer to properly classify underrepresented regions; ii) a contour-aware weight map model is introduced to strengthen contour detection while improving the network generalization capacity; and iii) we augment data by carefully modulating local intensities on edges shared by adjoining regions to account for possibly weak signals on these edges. Generated probability maps are segmented using different methods, with the watershed based one generally offering the best solutions, especially in those regions where the prevalence of a single class is not clear. The combination of these contributions allows segmenting individual cells on challenging images. We demonstrate our methods in sparse and crowded cell images, showing improvements in the learning process for a fixed network architecture.

Submitted 26 August, 2019; originally announced August 2019.

Comments: Accepted at MICCAI Workshop 2019

arXiv:1902.03284 [pdf, other]

FERAtt: Facial Expression Recognition with Attention Net

Authors: Pedro D. Marrero Fernandez, Fidel A. Guerrero Peña, Tsang Ing Ren, Alexandre Cunha

Abstract: We present a new end-to-end network architecture for facial expression recognition with an attention model. It focuses attention in the human face and uses a Gaussian space representation for expression recognition. We devise this architecture based on two fundamental complementary components: (1) facial image correction and attention and (2) facial expression representation and classification. The first component uses an encoder-decoder style network and a convolutional feature extractor that are pixel-wise multiplied to obtain a feature attention map. The second component is responsible for obtaining an embedded representation and classification of the facial expression. We propose a loss function that creates a Gaussian structure on the representation space. To demonstrate the proposed method, we create two larger and more comprehensive synthetic datasets using the traditional BU3DFE and CK+ facial datasets. We compared results with the PreActResNet18 baseline. Our experiments on these datasets have shown the superiority of our approach in recognizing facial expressions.

Submitted 8 February, 2019; originally announced February 2019.

arXiv:1810.12121 [pdf, other]

Burst ranking for blind multi-image deblurring

Authors: Fidel A. Guerrero Peña, Pedro D. Marrero Fernández, Tsang Ing Ren, Jorge J. G. Leandro, Ricardo Nishihara

Abstract: We propose a new incremental aggregation algorithm for multi-image deblurring with automatic image selection. The primary motivation is that current bursts deblurring methods do not handle well situations in which misalignment or out-of-context frames are present in the burst. These real-life situations result in poor reconstructions or manual selection of the images that will be used to deblur. Automatically selecting best frames within the burst to improve the base reconstruction is challenging because the amount of possible images fusions is equal to the power set cardinal. Here, we approach the multi-image deblurring problem as a two steps process. First, we successfully learn a comparison function to rank a burst of images using a deep convolutional neural network. Then, an incremental Fourier burst accumulation with a reconstruction degradation mechanism is applied fusing only less blurred images that are sufficient to maximize the reconstruction quality. Experiments with the proposed algorithm have shown superior results when compared to other similar approaches, outperforming other methods described in the literature in previously described situations. We validate our findings on several synthetic and real datasets.

Submitted 30 October, 2018; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: Submitted to IEEE Transactions on Image Processing. 11 pages, 9 figures

arXiv:1810.09435 [pdf, other]

On the ability of discontinuous Galerkin methods to simulate under-resolved turbulent flows

Authors: Pablo Fernandez, Ngoc-Cuong Nguyen, Jaime Peraire

Abstract: We investigate the ability of discontinuous Galerkin (DG) methods to simulate under-resolved turbulent flows in large-eddy simulation. The role of the Riemann solver and the subgrid-scale model in the prediction of a variety of flow regimes, including transition to turbulence, wall-free turbulence and wall-bounded turbulence, are examined. Numerical and theoretical results show the Riemann solver in the DG scheme plays the role of an implicit subgrid-scale model and introduces numerical dissipation in under-resolved turbulent regions of the flow. This implicit model behaves like a dynamic model and vanishes for flows that do not contain subgrid scales, such as laminar flows, which is a critical feature to accurately predict transition to turbulence. In addition, for the moderate-Reynolds-number turbulence problems considered, the implicit model provides a more accurate representation of the actual subgrid scales in the flow than state-of-the-art explicit eddy viscosity models, including dynamic Smagorinsky, WALE and Vreman. The results in this paper indicate new best practices for subgrid-scale modeling are needed with high-order DG methods.

Submitted 19 October, 2018; originally announced October 2018.

MSC Class: 65M60; 76Fxx; 76Hxx

arXiv:1810.08639 [pdf, other]

doi 10.1016/j.imavis.2018.11.001

Fast and Robust Multiple ColorChecker Detection using Deep Convolutional Neural Networks

Authors: Pedro D. Marrero Fernandez, Fidel A. Guerrero-Peña, Tsang Ing Ren, Jorge J. G. Leandro

Abstract: ColorCheckers are reference standards that professional photographers and filmmakers use to ensure predictable results under every lighting condition. The objective of this work is to propose a new fast and robust method for automatic ColorChecker detection. The process is divided into two steps: (1) ColorCheckers localization and (2) ColorChecker patches recognition. For the ColorChecker localization, we trained a detection convolutional neural network using synthetic images. The synthetic images are created with the 3D models of the ColorChecker and different background images. The output of the neural network is the bounding box of each possible ColorChecker candidate in the input image. Each bounding box defines a cropped image which is evaluated by a recognition system, and each image is canonized with regards to color and dimensions. Subsequently, all possible color patches are extracted and grouped with respect to the center's distance. Each group is evaluated as a candidate for a ColorChecker part, and its position in the scene is estimated. Finally, a cost function is applied to evaluate the accuracy of the estimation. The method is tested using real and synthetic images. The proposed method is fast, robust to overlaps and invariant to affine projections. The algorithm also performs well in case of multiple ColorCheckers detection.

Submitted 19 October, 2018; originally announced October 2018.

Comments: Submitted to Image and Vision Computing

arXiv:1807.11318 [pdf, other]

doi 10.1007/s10723-018-9454-2

umd-verification: Automation of Software Validation for the EGI federated e-Infrastructure

Authors: Pablo Orviz Fernandez, Joao Pina, Alvaro Lopez Garcia, Isabel Campos Plasencia, Mario David, Jorge Gomes

Abstract: Supporting e-Science in the EGI e-Infrastructure requires extensive and reliable software, for advanced computing use, deployed across approximately 300 European and worldwide data centers. The Unified Middleware Distribution (UMD) and Cloud Middleware Distribution (CMD) are the channels to deliver the software for the EGI e-Infrastructure consumption. The software is compiled, validated and distributed following the Software Provisioning Process (SWPP), where the Quality Criteria (QC) definition sets the minimum quality requirements for EGI acceptance. The growing number of software components currently existing within UMD and CMD distributions hinders the application of the traditional, manual-based validation mechanisms, thus driving the adoption of automated solutions. This paper presents umd-verification, an open-source tool that enforces the fulfillment of the QC requirements in an automated way for the continuous validation of the software products for scientific disposal. The umd-verification tool has been successfully integrated within the SWPP pipeline and is progressively supporting the full validation of the products in the UMD and CMD repositories. While the cost of supporting new products is dependent on the availability of Infrastructure as Code solutions to take over the deployment and high test coverage, the results obtained for the already integrated products are promising, as the time invested in the validation of products has been drastically reduced.
Furthermore, automation adoption has brought along benefits for the reliability of the process, such as the removal of human-associated errors or the risk of regression of previously tested functionalities.

Submitted 30 July, 2018; originally announced July 2018.

Comments: This is the author's pre-print version of this work. The final publication is available at http://dx.doi.org/10.1007/s10723-018-9454-2

Journal ref: Journal of Grid Computing (2018) 1-14

arXiv:1807.01748 [pdf]

Significant acceleration of development by automating quality assurance of a medical particle accelerator safety system using a formal language driven test stand

Authors: Pablo Fernandez Carmona, Michael Eichin, Alexandre Mayor, Harald Regele, Martin Grossmann, Damien Charles Weber

Abstract: At the Centre for Proton Therapy at the Paul Scherrer Institute cancer patients are treated with a fixed beamline and in two gantries for ocular and non-ocular malignancies, respectively. For the installation of a third gantry a new patient safety system (PaSS) was developed and is sequentially being rolled out to update the existing areas. The aim of PaSS is to interrupt the treatment whenever any sub-system detects a hazardous condition. To ensure correct treatment delivery, this system needs to be thoroughly tested as part of the regular quality assurance (QA) protocols as well as after any upgrade. In the legacy safety systems, unit testing required an extensive use of resources: two weeks of work per area in the laboratory in addition to QA beam time. In order to significantly reduce the time, an automated PaSS test stand for unit testing was developed based on a PXI chassis with virtually unlimited IOs that are synchronously stimulated or sampled at 1 MHz. It can emulate the rest of the facility using adapters to connect each type of interface. With it PaSS can be tested under arbitrary conditions. A VHDL-based formal language was developed to describe stimuli, expected behaviour and specific measurements, interpreted by a LabView runtime environment. This article describes the tools and methodology being applied for unit testing and QA release tests for the new PaSS. It shows how automation and formalization made possible an increase in test coverage while significantly cutting down the laboratory testing time and facility's beam usage.

Submitted 23 June, 2018; originally announced July 2018.

Comments: 6 pages, 9 figures, 21st IEEE Real Time Conference, 9-15 June 2018 Colonial Williamsburg, USA

MSC Class: J.3 LIFE AND MEDICAL SCIENCES

arXiv:1802.07465 [pdf, other]

doi 10.1109/ICIP.2018.8451187

Multiclass Weighted Loss for Instance Segmentation of Cluttered Cells

Authors: Fidel A. Guerrero-Pena, Pedro D. Marrero Fernandez, Tsang Ing Ren, Mary Yui, Ellen Rothenberg, Alexandre Cunha

Abstract: We propose a new multiclass weighted loss function for instance segmentation of cluttered cells. We are primarily motivated by the need of developmental biologists to quantify and model the behavior of blood T-cells, which might help us in understanding their regulation mechanisms and ultimately help researchers in their quest for developing an effective immuno-therapy cancer treatment. Segmenting individual touching cells in cluttered regions is challenging because the feature distributions on shared borders and cell foreground are similar, making it difficult to discriminate pixels into proper classes. We present two novel weight maps applied to the weighted cross entropy loss function which take into account both class imbalance and cell geometry. Binary ground truth training data is augmented so the learning model can handle not only foreground and background but also a third touching class. This framework allows training using U-Net. Experiments with our formulations have shown superior results when compared to other similar schemes, outperforming binary class models with significant improvement of boundary adequacy and instance detection. We validate our results on manually annotated microscope images of T-cells.

Submitted 21 February, 2018; originally announced February 2018.

Comments: Submitted to IEEE ICIP 2018

ACM Class: I.4.6; I.2.10; J.3
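
The weighted cross entropy mentioned in the abstract above (arXiv:1802.07465) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the nested-list image layout, the normalisation by the sum of weights, and the example weight values are all assumptions made for illustration. The paper's actual weight maps, which encode class imbalance and cell geometry, are not reproduced here.

```python
import math

def weighted_cross_entropy(probs, labels, weights):
    """Per-pixel cross entropy scaled by a weight map (illustrative sketch).

    probs:   H x W x C nested lists of predicted class probabilities
    labels:  H x W nested lists of integer class ids
             (e.g. 0 = background, 1 = cell, 2 = touching)
    weights: H x W nested lists of non-negative weights (hypothetical values)
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    norm = 0.0
    for p_row, l_row, w_row in zip(probs, labels, weights):
        for p, label, w in zip(p_row, l_row, w_row):
            # Penalise low probability on the true class, scaled by the weight
            total += -w * math.log(p[label] + eps)
            norm += w
    # Normalising by the weight sum is one possible choice, assumed here
    return total / norm

# A 1 x 2 image: the second pixel is weighted three times as heavily
probs = [[[0.5, 0.25, 0.25], [0.25, 0.25, 0.5]]]
labels = [[0, 0]]
weights = [[1.0, 3.0]]
loss = weighted_cross_entropy(probs, labels, weights)
```

Raising the weight on a pixel (for example one on a border between touching cells) makes errors there contribute more to the loss, which is the general idea behind a weight map.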

arXiv:1711.08045 [pdf, ps, other]

doi 10.1016/j.csi.2016.02.002

Standards for enabling heterogeneous IaaS cloud federations

Authors: Álvaro López García, Enol Fernández del Castillo, Pablo Orviz Fernández

Abstract: The technology market is continuing a rapid growth phase where different resource providers and Cloud Management Frameworks are positioning to provide ad-hoc solutions -in terms of management interfaces, information discovery or billing- trying to differentiate from competitors but that as a result remain incompatible between them when addressing more complex scenarios like federated clouds. Grasping interoperability problems present in current infrastructures is then a must-do, tackled by studying how existing and emerging standards could enhance user experience in the cloud ecosystem. In this paper we will review the current open challenges in Infrastructure as a Service cloud interoperability and federation, as well as point to the potential standards that should alleviate these problems.

Submitted 21 November, 2017; originally announced November 2017.

Journal ref: Computer Standards & Interfaces, Volume 47, 2016, Pages 19-23

arXiv:1711.03334 [pdf, other]

doi 10.1007/s10723-017-9418-y

Orchestrating Complex Application Architectures in Heterogeneous Clouds

Authors: Miguel Caballer, Sahdev Zala, Álvaro López García, Germán Moltó, Pablo Orviz Fernández, Mathieu Velten

Abstract: Private cloud infrastructures are now widely deployed and adopted across technology industries and research institutions. Although cloud computing has emerged as a reality, it is now known that a single cloud provider cannot fully satisfy complex user requirements. This has resulted in a growing interest in developing hybrid cloud solutions that bind together distinct and heterogeneous cloud infrastructures. In this paper we describe the orchestration approach for heterogeneous clouds that has been implemented and used within the INDIGO-DataCloud project. This orchestration model uses existing open-source software like OpenStack and leverages the OASIS Topology and Orchestration Specification for Cloud Applications (TOSCA) open standard as the modeling language. Our approach uses virtual machines and Docker containers in a homogeneous and transparent way providing consistent application deployment for the users. This approach is illustrated by means of two different use cases in different scientific communities, implemented using the INDIGO-DataCloud solutions.

Submitted 9 November, 2017; originally announced November 2017.

Journal ref: J Grid Computing (2017)

arXiv:1709.08526 [pdf, other]

doi 10.1002/spe.2544

Resource provisioning in Science Clouds: Requirements and challenges

Authors: Álvaro López García, Enol Fernández-del-Castillo, Pablo Orviz Fernández, Isabel Campos Plasencia, Jesús Marco de Lucas

Abstract: Cloud computing has permeated into the information technology industry in the last few years, and it is emerging nowadays in scientific environments. Science user communities are demanding a broad range of computing power to satisfy the needs of high-performance applications, such as local clusters, high-performance computing systems, and computing grids. Different workloads are needed from different computational models, and the cloud is already considered as a promising paradigm. The scheduling and allocation of resources is always a challenging matter in any form of computation and clouds are not an exception. Science applications have unique features that differentiate their workloads, hence, their requirements have to be taken into consideration to be fulfilled when building a Science Cloud. This paper will discuss what are the main scheduling and resource allocation challenges for any Infrastructure as a Service provider supporting scientific applications.

Submitted 25 September, 2017; originally announced September 2017.

Journal ref: Software: Practice and Experience. 2017;1-13

arXiv:1605.01020 [pdf, other]

Implicit large-eddy simulation of compressible flows using the Interior Embedded Discontinuous Galerkin method

Authors: Pablo Fernandez, Ngoc-Cuong Nguyen, Xevi Roca, Jaime Peraire

Abstract: We present a high-order implicit large-eddy simulation (ILES) approach for simulating transitional turbulent flows. The approach consists of an Interior Embedded Discontinuous Galerkin (IEDG) method for the discretization of the compressible Navier-Stokes equations and a parallel preconditioned Newton-GMRES solver for the resulting nonlinear system of equations. The IEDG method arises from the marriage of the Embedded Discontinuous Galerkin (EDG) method and the Hybridizable Discontinuous Galerkin (HDG) method. As such, the IEDG method inherits the advantages of both the EDG method and the HDG method to make itself well-suited for turbulence simulations. We propose a minimal residual Newton algorithm for solving the nonlinear system arising from the IEDG discretization of the Navier-Stokes equations. The preconditioned GMRES algorithm is based on a restricted additive Schwarz (RAS) preconditioner in conjunction with a block incomplete LU factorization at the subdomain level. The proposed approach is applied to the ILES of transitional turbulent flows over a NACA 65-(18)10 compressor cascade at Reynolds number 250,000 in both design and off-design conditions. The high-order ILES results show good agreement with a subgrid-scale LES model discretized with a second-order finite volume code while using significantly fewer degrees of freedom. This work shows that high-order accuracy is key for predicting transitional turbulent flows without a SGS model.

Submitted 3 May, 2016; originally announced May 2016.

Comments: 54th AIAA Aerospace Sciences Meeting, AIAA SciTech, 2016

arXiv:1510.01276 [pdf]

Hadamard Product Decomposition and Mutually Exclusive Matrices on Network Structure and Utilization

Authors: Michael Ybañez, Kardi Teknomo, Proceso Fernandez

Abstract: Graphs are very important mathematical structures used in many applications, one of which is transportation science. When dealing with transportation networks, one deals not only with the network structure, but also with information related to the utilization of the elements of the network, which can be shown using flow and origin-destination matrices. This paper extends an algebraic model used to relate all these components by deriving additional relationships and constructing a more structured understanding of the model. Specifically, the paper introduces the concept of mutually exclusive matrices, and shows their effect when decomposing the components of a Hadamard product on matrices.

Submitted 1 October, 2015; originally announced October 2015.

Comments: Proceeding of the International Conference on Innovation Challenges in Multidisciplinary Research and Practice (ICMRP 2013), Kuala Lumpur, Dec 13-14, 2013

arXiv:1510.00889 [pdf]

Background Image Generation Using Boolean Operations

Authors: Kardi Teknomo, Proceso Fernandez

Abstract: Tracking moving objects from a video sequence requires segmentation of these objects from the background image. However, getting the actual background image automatically without object detection and using only the video is difficult. In this paper, we describe a novel algorithm that generates background from real world images without foreground detection. The algorithm assumes that the background image is shown in the majority of the video. Given this simple assumption, the method described in this paper is able to accurately generate, with high probability, the background image from a video using only a small number of binary operations.

Submitted 3 October, 2015; originally announced October 2015.

ACM Class: I.4.6

Journal ref: Philippine Computing Journal Vol 4 No 2, December 2009, pp. 43-49
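
The abstract above (arXiv:1510.00889) relies on the background being visible in the majority of frames and on a small number of binary operations. One way to picture this is a bitwise majority vote over three sampled frames. The sketch below is only an illustration under that assumption: the function names, the three-frame setup, and the nested-list frame layout are hypothetical, and the paper's actual procedure may differ.

```python
def majority_of_three(a, b, c):
    """Bitwise majority: each output bit is the value held by at least two inputs."""
    return (a & b) | (b & c) | (a & c)

def background_estimate(frames):
    """Combine three equally sized frames pixel by pixel with the bitwise majority.

    frames: a sequence of exactly three frames, each a list of rows of
            integer pixel intensities (hypothetical layout for illustration)
    """
    f1, f2, f3 = frames
    return [
        [majority_of_three(x, y, z) for x, y, z in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(f1, f2, f3)
    ]

# Three 1 x 2 frames: a foreground object covers a different pixel
# in the second and third frame, the background shows in the other two
frames = [
    [[200, 90]],
    [[200, 12]],
    [[37, 90]],
]
background = background_estimate(frames)
```

When two of the three inputs agree exactly, the bitwise majority returns that shared value, so a pixel that shows the background in at least two of the three frames is recovered regardless of what the third frame contains.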
