
CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages

Authors: Gabriel Oliveira dos Santos, Diego A. B. Moreira, Alef Iury Ferreira, Jhessica Silva, Luiz Pereira, Pedro Bueno, Thiago Sousa, Helena Maia, Nádia Da Silva, Esther Colombini, Helio Pedrini, Sandra Avila

Abstract: This work introduces CAPIVARA, a cost-efficient framework designed to enhance the performance of multilingual CLIP models in low-resource languages. While CLIP has excelled in zero-shot vision-language tasks, the resource-intensive nature of model training remains challenging. Many datasets lack linguistic diversity, featuring solely English descriptions for images. CAPIVARA addresses this by augmenting text data using image captioning and machine translation to generate multiple synthetic captions in low-resource languages. We optimize the training pipeline with LiT, LoRA, and gradient checkpointing to alleviate the computational cost. Through extensive experiments, CAPIVARA emerges as state of the art in zero-shot tasks involving images and Portuguese texts. We show the potential for significant improvements in other low-resource languages, achieved by fine-tuning the pre-trained multilingual CLIP using CAPIVARA on a single GPU for 2 hours. Our model and code is available at https://github.com/hiaac-nlp/CAPIVARA.

Submitted 23 October, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

arXiv:2301.10835 [pdf, other]

When Layers Play the Lottery, all Tickets Win at Initialization

Authors: Artur Jordao, George Correa de Araujo, Helena de Almeida Maia, Helio Pedrini

Abstract: Pruning is a standard technique for reducing the computational cost of deep networks. Many advances in pruning leverage concepts from the Lottery Ticket Hypothesis (LTH). LTH reveals that inside a trained dense network exists sparse subnetworks (tickets) able to achieve similar accuracy (i.e., win the lottery - winning tickets). Pruning at initialization focuses on finding winning tickets without training a dense network. Studies on these concepts share the trend that subnetworks come from weight or filter pruning. In this work, we investigate LTH and pruning at initialization from the lens of layer pruning. First, we confirm the existence of winning tickets when the pruning process removes layers. Leveraged by this observation, we propose to discover these winning tickets at initialization, eliminating the requirement of heavy computational resources for training the initial (over-parameterized) dense network. Extensive experiments show that our winning tickets notably speed up the training phase and reduce up to 51% of carbon emission, an important step towards democratization and green Artificial Intelligence. Beyond computational benefits, our winning tickets exhibit robustness against adversarial and out-of-distribution examples. Finally, we show that our subnetworks easily win the lottery at initialization while tickets from filter removal (the standard structured LTH) hardly become winning tickets.

Submitted 19 March, 2024; v1 submitted 25 January, 2023; originally announced January 2023.

Comments: Published at International Conference on Computer Vision Workshop (ICCV), 2023

arXiv:2210.04103 [pdf, other]

doi 10.1016/j.chaos.2023.113287

Controversy-seeking fuels rumor-telling activity in polarized opinion networks

Authors: Hugo P. Maia, Silvio C. Ferreira, Marcelo L. Martins

Abstract: Rumors have ignited revolutions, undermined the trust in political parties, or threatened the stability of human societies. Such destructive potential has been significantly enhanced by the development of on-line social networks. Several theoretical and computational studies have been devoted to understanding the dynamics and to control rumor spreading. In the present work, a model of rumor-telling in opinion polarized networks was investigated through extensive computer simulations. The key mechanism is the coupling between ones' opinions and their leaning to spread a given information, either by supporting or opposing its content. We report that a highly modular topology of polarized networks strongly impairs rumor spreading, but the couplings between agent's opinions and their spreading/stifling rates can either further inhibit or, conversely, foster information propagation, depending on the nature of those couplings. In particular, a controversy-seeking mechanism, in which agents are stimulated to postpone their transitions to the stiffer state upon interactions with other agents of confronting opinions, enhances the rumor spreading. Therefore such a mechanism is capable of overcoming the propagation bottlenecks imposed by loosely connected modular structures.

Submitted 8 October, 2022; originally announced October 2022.

Comments: 11 pages, 7 figures

arXiv:2206.02607 [pdf, other]

CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations

Authors: Peter Yichen Chen, Jinxu Xiang, Dong Heon Cho, Yue Chang, G A Pershing, Henrique Teles Maia, Maurizio M. Chiaramonte, Kevin Carlberg, Eitan Grinspun

Abstract: The long runtime of high-fidelity partial differential equation (PDE) solvers makes them unsuitable for time-critical applications. We propose to accelerate PDE solvers using reduced-order modeling (ROM). Whereas prior ROM approaches reduce the dimensionality of discretized vector fields, our continuous reduced-order modeling (CROM) approach builds a low-dimensional embedding of the continuous vector fields themselves, not their discretization. We represent this reduced manifold using continuously differentiable neural fields, which may train on any and all available numerical solutions of the continuous system, even when they are obtained using diverse methods or discretizations. We validate our approach on an extensive range of PDEs with training data from voxel grids, meshes, and point clouds. Compared to prior discretization-dependent ROM methods, such as linear subspace proper orthogonal decomposition (POD) and nonlinear manifold neural-network-based autoencoders, CROM features higher accuracy, lower memory consumption, dynamically adaptive resolutions, and applicability to any discretization. For equal latent space dimension, CROM exhibits 79$\times$ and 49$\times$ better accuracy, and 39$\times$ and 132$\times$ smaller memory footprint, than POD and autoencoder methods, respectively. Experiments demonstrate 109$\times$ and 89$\times$ wall-clock speedups over unreduced models on CPUs and GPUs, respectively. Videos and codes are available on the project page: https://crom-pde.github.io

Submitted 3 March, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

arXiv:2110.00881 [pdf, other]

Weakly Supervised Attention-based Models Using Activation Maps for Citrus Mite and Insect Pest Classification

Authors: Edson Bollis, Helena Maia, Helio Pedrini, Sandra Avila

Abstract: Citrus juices and fruits are commodities with great economic potential in the international market, but productivity losses caused by mites and other pests are still far from being a good mark. Despite the integrated pest mechanical aspect, only a few works on automatic classification have handled images with orange mite characteristics, which means tiny and noisy regions of interest. On the computational side, attention-based models have gained prominence in deep learning research, and, along with weakly supervised learning algorithms, they have improved tasks performed with some label restrictions. In agronomic research of pests and diseases, these techniques can improve classification performance while pointing out the location of mites and insects without specific labels, reducing deep learning development costs related to generating bounding boxes. In this context, this work proposes an attention-based activation map approach developed to improve the classification of tiny regions called Two-Weighted Activation Mapping, which also produces locations using feature map scores learned from class labels. We apply our method in a two-stage network process called Attention-based Multiple Instance Learning Guided by Saliency Maps. We analyze the proposed approach in two challenging datasets, the Citrus Pest Benchmark, which was captured directly in the field using magnifying glasses, and the Insect Pest, a large pest image benchmark. In addition, we evaluate and compare our models with weakly supervised methods, such as Attention-based Deep MIL and WILDCAT. The results show that our classifier is superior to literature methods that use tiny regions in their classification tasks, surpassing them in all scenarios by at least 16 percentage points. Moreover, our approach infers bounding box locations for salient insects, even training without any location labels.

Submitted 2 October, 2021; originally announced October 2021.

Comments: 18 pages, 9 figures, 5 tables. Paper under review

arXiv:2110.00717 [pdf, other]

Mobile Manipulation Leveraging Multiple Views

Authors: David Watkins, Peter K Allen, Henrique Maia, Madhavan Seshadri, Jonathan Sanabria, Nicholas Waytowich, Jacob Varley

Abstract: While both navigation and manipulation are challenging topics in isolation, many tasks require the ability to both navigate and manipulate in concert. To this end, we propose a mobile manipulation system that leverages novel navigation and shape completion methods to manipulate an object with a mobile robot. Our system utilizes uncertainty in the initial estimation of a manipulation target to calculate a predicted next-best-view. Without the need of localization, the robot then uses the predicted panoramic view at the next-best-view location to navigate to the desired location, capture a second view of the object, create a new model that predicts the shape of object more accurately than a single image alone, and uses this model for grasp planning. We show that the system is highly effective for mobile manipulation tasks through simulation experiments using real world data, as well as ablations on each component of our system.

Submitted 7 March, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

Comments: 6 pages, 2 pages of references, 5 figures, 5 tables

arXiv:2109.07395 [pdf, other]

Can one hear the shape of a neural network?: Snooping the GPU via Magnetic Side Channel

Authors: Henrique Teles Maia, Chang Xiao, Dingzeyu Li, Eitan Grinspun, Changxi Zheng

Abstract: Neural network applications have become popular in both enterprise and personal settings. Network solutions are tuned meticulously for each task, and designs that can robustly resolve queries end up in high demand. As the commercial value of accurate and performant machine learning models increases, so too does the demand to protect neural architectures as confidential investments. We explore the vulnerability of neural networks deployed as black boxes across accelerated hardware through electromagnetic side channels. We examine the magnetic flux emanating from a graphics processing unit's power cable, as acquired by a cheap $3 induction sensor, and find that this signal betrays the detailed topology and hyperparameters of a black-box neural network model. The attack acquires the magnetic signal for one query with unknown input values, but known input dimensions. The network reconstruction is possible due to the modular layer sequence in which deep neural networks are evaluated. We find that each layer component's evaluation produces an identifiable magnetic signal signature, from which layer topology, width, function type, and sequence order can be inferred using a suitably trained classifier and a joint consistency optimization based on integer programming. We study the extent to which network specifications can be recovered, and consider metrics for comparing network similarity. We demonstrate the potential accuracy of this side channel attack in recovering the details for a broad range of network architectures, including random designs. We consider applications that may exploit this novel side channel exposure, such as adversarial transfer attacks. In response, we discuss countermeasures to protect against our method and other similar snooping techniques.

Submitted 15 September, 2021; originally announced September 2021.

Comments: 14 pages, accepted to USENIX Security 2022

arXiv:2103.08688 [pdf, other]

Self-Adaptive Microservice-based Systems -- Landscape and Research Opportunities

Authors: Messias Filho, Eliaquim Pimentel, Wellington Pereira, Paulo Henrique M. Maia, Mariela I. Cortés

Abstract: Microservices have become popular in the past few years, attracting the interest of both academia and industry. Despite of its benefits, this new architectural style still poses important challenges, such as resilience, performance and evolution. Self-adaptation techniques have been applied recently as an alternative to solve or mitigate those problems. However, due to the range of quality attributes that affect microservice architectures, many different self-adaptation strategies can be used. Thus, to understand the state-of-the-art of the use of self-adaptation techniques and mechanisms in microservice-based systems, this work conducted a systematic mapping, in which 21 primary studies were analyzed considering qualitative and quantitative research questions. The results show that most studies focus on the Monitor phase (28.57%) of the adaptation control loop, address the self-healing property (23.81%), apply a reactive adaptation strategy (80.95%) in the system infrastructure level (47.62%) and use a centralized approach (38.10%). From those, it was possible to propose some research directions to fill existing gaps.

Submitted 29 March, 2021; v1 submitted 15 March, 2021; originally announced March 2021.

arXiv:2010.08635 [pdf, other]

doi 10.1016/j.physa.2020.125588

Adaptive network approach for emergence of societal bubbles

Authors: Hugo P. Maia, Silvio C. Ferreira, Marcelo L. Martins

Abstract: Far beyond its relevance for commercial and political marketings, opinion formation and decision making processes are central for representative democracy, government functioning, and state organization. In the present report, a stochastic agent-based model is investigated. The model assumes that bounded confidence and homophily mechanisms drive both opinion dynamics and social network evolution through either rewiring or breakage of social contacts. In addition to the classical transition from global consensus to opinion polarization, our main findings are (i) a cascade of fragmentation of the social network into echo chambers (modules) holding distinct opinions and rupture of the bridges interconnecting these modules as the tolerance for opinion differences increases. There are multiple surviving opinions associated to these modules within which consensus is formed; and (ii) the adaptive social network exhibits a hysteresis-like behavior characterized by irreversible changes in its topology as the opinion tolerance cycles from radicalization towards consensus and backward to radicalization.

Submitted 16 October, 2020; originally announced October 2020.

Comments: 9 pages, 7 figures

Journal ref: Physica A: Statistical Mechanics and its Applications 572 (2021) 125588

arXiv:2005.11322 [pdf, ps, other]

Implications of a Quillen Model Structures-Based Framework for Locality under Logical Equivalence

Authors: Hendrick Maia

Abstract: In [15] a homotopic variation for locality of logics was presented, namely a Quillen model category-based framework for locality under logical equivalence, for every primitive-positive sentence of quantifier-rank $k$. In this paper, we will present some of the implications and possible themes for investigations that arise from the aforementioned framework.

Submitted 22 May, 2020; originally announced May 2020.

Comments: arXiv admin note: text overlap with arXiv:2005.09135, arXiv:1103.0400

arXiv:2005.09135 [pdf, ps, other]

Quillen Model Structures-Based Notions of Locality of Logics over Finite Models

Authors: Hendrick Maia

Abstract: Locality is a property of logics, based on Hanf's and Gaifman's theorems, and that was shown to be very useful in the context of finite model theory. In this paper I present a homotopic variation for locality, namely a Quillen model category-based framework for locality under k-logical equivalence, for every primitive-positive sentence of quantifier-rank k.

Submitted 18 May, 2020; originally announced May 2020.

arXiv:2005.07415 [pdf, other]

doi 10.1016/j.cor.2020.104995

MineReduce: an approach based on data mining for problem size reduction

Authors: Marcelo Rodrigues de Holanda Maia, Alexandre Plastino, Puca Huachi Vaz Penna

Abstract: Hybrid variations of metaheuristics that include data mining strategies have been utilized to solve a variety of combinatorial optimization problems, with superior and encouraging results. Previous hybrid strategies applied mined patterns to guide the construction of initial solutions, leading to more effective exploration of the solution space. Solving a combinatorial optimization problem is usually a hard task because its solution space grows exponentially with its size. Therefore, problem size reduction is also a useful strategy in this context, especially in the case of large-scale problems. In this paper, we build upon these ideas by presenting an approach named MineReduce, which uses mined patterns to perform problem size reduction. We present an application of MineReduce to improve a heuristic for the heterogeneous fleet vehicle routing problem. The results obtained in computational experiments show that this proposed heuristic demonstrates superior performance compared to the original heuristic and other state-of-the-art heuristics, achieving better solution costs with shorter run times.

Submitted 22 May, 2020; v1 submitted 15 May, 2020; originally announced May 2020.

Showing 1–12 of 12 results for author: Maia, H