Search | arXiv e-print repository

doi 10.1145/3632620.3671098

Insights from Social Shaping Theory: The Appropriation of Large Language Models in an Undergraduate Programming Course

Authors: Aadarsh Padiyath, Xinying Hou, Amy Pang, Diego Viramontes Vargas, Xingjian Gu, Tamara Nelson-Fromm, Zihan Wu, Mark Guzdial, Barbara Ericson

Abstract: The capability of large language models (LLMs) to generate, debug, and explain code has sparked the interest of researchers and educators in undergraduate programming, with many anticipating their transformative potential in programming education. However, decisions about why and how to use LLMs in programming education may involve more than just the assessment of an LLM's technical capabilities. Using the social shaping of technology theory as a guiding framework, our study explores how students' social perceptions influence their own LLM usage. We then examine the correlation of self-reported LLM usage with students' self-efficacy and midterm performances in an undergraduate programming course. Triangulating data from an anonymous end-of-course student survey (n = 158), a mid-course self-efficacy survey (n = 158), student interviews (n = 10), self-reported LLM usage on homework, and midterm performances, we discovered that students' use of LLMs was associated with their expectations for their future careers and their perceptions of peer usage. Additionally, early self-reported LLM usage in our context correlated with lower self-efficacy and lower midterm scores, while students' perceived over-reliance on LLMs, rather than their usage itself, correlated with decreased self-efficacy later in the course.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Accepted to the ACM Conference on International Computing Education Research V.1 (ICER '24 Vol. 1)

arXiv:2404.05168 [pdf, other]

Adapting to Covariate Shift in Real-time by Encoding Trees with Motion Equations

Authors: Tham Yik Foong, Heng Zhang, Mao Po Yuan, Danilo Vasconcellos Vargas

Abstract: Input distribution shift presents a significant problem in many real-world systems. Here we present Xenovert, an adaptive algorithm that can dynamically adapt to changes in input distribution. It is a perfect binary tree that adaptively divides a continuous input space into several intervals of uniform density while receiving a continuous stream of input. This process indirectly maps the source distribution to the shifted target distribution, preserving the data's relationship with the downstream decoder/operation, even after the shift occurs. In this paper, we demonstrated how a neural network integrated with Xenovert achieved better results in 4 out of 5 shifted datasets, saving the hurdle of retraining a machine learning model. We anticipate that Xenovert can be applied to many more applications that require adaptation to unforeseen input distribution shifts, even when the distribution shift is drastic.
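The idea of a perfect binary tree that splits a stream into intervals of roughly uniform density can be illustrated with a minimal sketch. This is not the Xenovert update rule, which the abstract does not specify: the class name `QuantileTree`, the fixed `step` size, and the sign-based median tracking are all assumptions chosen for illustration. Each internal node nudges its split point toward the median of the samples routed to it, so the leaves end up covering intervals of about equal probability mass, and the splits drift when the input distribution shifts.

```python
class QuantileTree:
    """Hypothetical sketch: heap-ordered perfect binary tree of split points.

    Node i has children 2*i + 1 and 2*i + 2. Each node tracks the running
    median of the samples that reach it with a fixed-step sign update, which
    is an illustrative stand-in, not the published Xenovert rule.
    """

    def __init__(self, depth, lo, hi, step=0.002):
        self.step = step
        self.split = [0.0] * (2 ** depth - 1)
        self._init(0, lo, hi)

    def _init(self, i, lo, hi):
        # Start from evenly spaced splits over the initial range [lo, hi].
        if i >= len(self.split):
            return
        mid = (lo + hi) / 2
        self.split[i] = mid
        self._init(2 * i + 1, lo, mid)
        self._init(2 * i + 2, mid, hi)

    def update(self, x):
        """Route x to a leaf, moving every visited split one step toward x."""
        i = 0
        while i < len(self.split):
            go_right = x >= self.split[i]
            self.split[i] += self.step if go_right else -self.step
            i = 2 * i + 2 if go_right else 2 * i + 1
        return i - len(self.split)  # interval index, left to right

    def encode(self, x):
        """Return the interval index of x without adapting the tree."""
        i = 0
        while i < len(self.split):
            i = 2 * i + 2 if x >= self.split[i] else 2 * i + 1
        return i - len(self.split)

    def boundaries(self):
        """Collect split points by in-order traversal (left to right)."""
        out = []

        def walk(i):
            if i < len(self.split):
                walk(2 * i + 1)
                out.append(self.split[i])
                walk(2 * i + 2)

        walk(0)
        return out
```

A downstream model would consume `encode(x)` rather than `x` itself, so that a shifted input still lands in the interval holding the same share of the data. A fixed step keeps the tree adaptive indefinitely at the cost of some jitter around each quantile.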

Submitted 7 April, 2024; originally announced April 2024.

Comments: 7 figures, 2 tables

arXiv:2403.14932 [pdf, other]

Extending Token Computation for LLM Reasoning

Authors: Bingli Liao, Danilo Vasconcellos Vargas

Abstract: Large Language Models (LLMs) are pivotal in advancing natural language processing but often struggle with complex reasoning tasks due to inefficient attention distributions. In this paper, we explore the effect of increased computed tokens on LLM performance and introduce a novel method for extending computed tokens in the Chain-of-Thought (CoT) process, utilizing attention mechanism optimization. By fine-tuning an LLM on a domain-specific, highly structured dataset, we analyze attention patterns across layers, identifying inefficiencies caused by non-semantic tokens with outlier high attention scores. To address this, we propose an algorithm that emulates early layer attention patterns across downstream layers to re-balance skewed attention distributions and enhance knowledge abstraction. Our findings demonstrate that our approach not only facilitates a deeper understanding of the internal dynamics of LLMs but also significantly improves their reasoning capabilities, particularly in non-STEM domains. Our study lays the groundwork for further innovations in LLM design, aiming to create more powerful, versatile, and responsible models capable of tackling a broad range of real-world applications.

Submitted 23 June, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

arXiv:2402.04699 [pdf, other]

Breaking Free: How to Hack Safety Guardrails in Black-Box Diffusion Models!

Authors: Shashank Kotyan, Po-Yuan Mao, Pin-Yu Chen, Danilo Vasconcellos Vargas

Abstract: Deep neural networks can be exploited using natural adversarial samples, which do not impact human perception. Current approaches often rely on deep neural networks' white-box nature to generate these adversarial samples or synthetically alter the distribution of adversarial samples compared to the training distribution. In contrast, we propose EvoSeed, a novel evolutionary strategy-based algorithmic framework for generating photo-realistic natural adversarial samples. Our EvoSeed framework uses auxiliary Conditional Diffusion and Classifier models to operate in a black-box setting. We employ CMA-ES to optimize the search for an initial seed vector, which, when processed by the Conditional Diffusion Model, results in the natural adversarial sample misclassified by the Classifier Model. Experiments show that generated adversarial images are of high image quality, raising concerns about generating harmful content bypassing safety classifiers. Our research opens new avenues to understanding the limitations of current safety mechanisms and the risk of plausible attacks against classifier systems using image generation. Project Website can be accessed at: https://shashankkotyan.github.io/EvoSeed.

Submitted 22 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

arXiv:2312.11473 [pdf, other]

Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models

Authors: Mao Po-Yuan, Shashank Kotyan, Tham Yik Foong, Danilo Vasconcellos Vargas

Abstract: Recent advances in Conditional Diffusion Models have led to substantial capabilities in various domains. However, understanding the impact of variations in the initial seed vector remains an underexplored area of concern. Particularly, latent-based diffusion models display inconsistencies in image generation under standard conditions when initialized with suboptimal initial seed vectors. To understand the impact of the initial seed vector on generated samples, we propose a reliability evaluation framework that evaluates the generated samples of a diffusion model when the initial seed vector is subjected to various synthetic shifts. Our results indicate that slight manipulations to the initial seed vector of the state-of-the-art Stable Diffusion (Rombach et al., 2022) can lead to significant disturbances in the generated samples, consequently creating images without the effect of conditioning variables. In contrast, GLIDE (Nichol et al., 2022) stands out in generating reliable samples even when the initial seed vector is transformed. Thus, our study sheds light on the importance of the selection and the impact of the initial seed vector in the latent-based diffusion model.

Submitted 24 November, 2023; originally announced December 2023.

arXiv:2312.04024 [pdf, other]

k* Distribution: Evaluating the Latent Space of Deep Neural Networks using Local Neighborhood Analysis

Authors: Shashank Kotyan, Ueda Tatsuya, Danilo Vasconcellos Vargas

Abstract: Most examinations of neural networks' learned latent spaces typically employ dimensionality reduction techniques such as t-SNE or UMAP. While these methods effectively capture the overall sample distribution in the entire learned latent space, they tend to distort the structure of sample distributions within specific classes in the subset of the latent space. This distortion complicates the task of easily distinguishing classes identifiable by neural networks. In response to this challenge, we introduce the k* Distribution methodology. This approach focuses on capturing the characteristics and structure of sample distributions for individual classes within the subset of the learned latent space using local neighborhood analysis. The key concept is to facilitate easy comparison of different k* distributions, enabling analysis of how various classes are processed by the same neural network. This provides a more profound understanding of existing contemporary visualizations. Our study reveals three distinct distributions of samples within the learned latent space subset: a) Fractured, b) Overlapped, and c) Clustered. We note and demonstrate that the distribution of samples within the network's learned latent space significantly varies depending on the class. Furthermore, we illustrate that our analysis can be applied to explore the latent space of diverse neural network architectures, various layers within neural networks, transformations applied to input samples, and the distribution of training and testing data for neural networks. We anticipate that our approach will facilitate more targeted investigations into neural networks by collectively examining the distribution of different samples within the learned latent space.

Submitted 6 December, 2023; originally announced December 2023.

arXiv:2311.13620 [pdf, other]

The Challenges of Image Generation Models in Generating Multi-Component Images

Authors: Tham Yik Foong, Shashank Kotyan, Po Yuan Mao, Danilo Vasconcellos Vargas

Abstract: Recent advances in text-to-image generators have led to substantial capabilities in image generation. However, the complexity of prompts acts as a bottleneck in the quality of images generated. A particular under-explored facet is the ability of generative models to create high-quality images comprising multiple components given as a prior. In this paper, we propose and validate a metric called Components Inclusion Score (CIS) to evaluate the extent to which a model can correctly generate multiple components. Our results reveal that the evaluated models struggle to incorporate all the visual elements from prompts with multiple components (8.53% drop in CIS per component for all evaluated models). We also identify a significant decline in the quality of the images and context awareness within an image as the number of components increased (15.91% decrease in Inception Score and 9.62% increase in Frechet Inception Distance). To remedy this issue, we fine-tuned Stable Diffusion V2 on a custom-created test dataset with multiple components, outperforming its vanilla counterpart. To conclude, these findings reveal a critical limitation in existing text-to-image generators, shedding light on the challenge of generating multiple components within a single image using a complex prompt.

Submitted 22 November, 2023; originally announced November 2023.

Comments: 10 pages, 6 figures, and 3 tables

arXiv:2311.10177 [pdf, other]

Towards Improving Robustness Against Common Corruptions using Mixture of Class Specific Experts

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas

Abstract: Neural networks have demonstrated significant accuracy across various domains, yet their vulnerability to subtle input alterations remains a persistent challenge. Conventional methods like data augmentation, while effective to some extent, fall short in addressing unforeseen corruptions, limiting the adaptability of neural networks in real-world scenarios. In response, this paper introduces a novel paradigm known as the Mixture of Class-Specific Expert Architecture. The approach involves disentangling feature learning for individual classes, offering a nuanced enhancement in scalability and overall performance. By training dedicated network segments for each class and subsequently aggregating their outputs, the proposed architecture aims to mitigate vulnerabilities associated with common neural network structures. The study underscores the importance of comprehensive evaluation methodologies, advocating for the incorporation of benchmarks like the common corruptions benchmark. This inclusion provides nuanced insights into the vulnerabilities of neural networks, especially concerning their generalization capabilities and robustness to unforeseen distortions. The research aligns with the broader objective of advancing the development of highly robust learning systems capable of nuanced reasoning across diverse and challenging real-world scenarios. Through this contribution, the paper aims to foster a deeper understanding of neural network limitations and proposes a practical approach to enhance their resilience in the face of evolving and unpredictable conditions.

Submitted 16 November, 2023; originally announced November 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2311.07928; text overlap with arXiv:1903.12261 by other authors

arXiv:2311.07928 [pdf, other]

Towards Improving Robustness Against Common Corruptions in Object Detectors Using Adversarial Contrastive Learning

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas

Abstract: Neural networks have revolutionized various domains, exhibiting remarkable accuracy in tasks like natural language processing and computer vision. However, their vulnerability to slight alterations in input samples poses challenges, particularly in safety-critical applications like autonomous driving. Current approaches, such as introducing distortions during training, fall short in addressing unforeseen corruptions. This paper proposes an innovative adversarial contrastive learning framework to enhance neural network robustness simultaneously against adversarial attacks and common corruptions. By generating instance-wise adversarial examples and optimizing contrastive loss, our method fosters representations that resist adversarial perturbations and remain robust in real-world scenarios. Subsequent contrastive learning then strengthens the similarity between clean samples and their adversarial counterparts, fostering representations resistant to both adversarial attacks and common distortions. By focusing on improving performance under adversarial and real-world conditions, our approach aims to bolster the robustness of neural networks in safety-critical applications, such as autonomous vehicles navigating unpredictable weather conditions. We anticipate that this framework will contribute to advancing the reliability of neural networks in challenging environments, facilitating their widespread adoption in mission-critical scenarios.

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.00441 [pdf, other]

Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas

Abstract: Vision Transformer (ViT) has demonstrated promising performance in computer vision tasks, comparable to state-of-the-art neural networks. Yet, this new type of deep neural network architecture is vulnerable to adversarial attacks limiting its capabilities in terms of robustness. This article presents a novel contribution aimed at further improving the accuracy and robustness of ViT, particularly in the face of adversarial attacks. We propose an augmentation technique called 'Dynamic Scanning Augmentation' that leverages dynamic input sequences to adaptively focus on different patches, thereby maintaining performance and robustness. Our detailed investigations reveal that this adaptability to the input sequence induces significant changes in the attention mechanism of ViT, even for the same image. We introduce four variations of Dynamic Scanning Augmentation, outperforming ViT in terms of both robustness to adversarial attacks and accuracy against natural images, with one variant showing comparable results. By integrating our augmentation technique, we observe a substantial increase in ViT's robustness, improving it from 17% to 92% measured across different types of adversarial attacks. These findings, together with other comprehensive tests, indicate that Dynamic Scanning Augmentation enhances accuracy and robustness by promoting a more adaptive type of attention. In conclusion, this work contributes to the ongoing research on Vision Transformers by introducing Dynamic Scanning Augmentation as a technique for improving the accuracy and robustness of ViT. The observed results highlight the potential of this approach in advancing computer vision tasks and merit further exploration in future studies.

Submitted 1 November, 2023; originally announced November 2023.

Comments: Accepted in Neurocomputing

arXiv:2310.10768 [pdf]

Security in Cryptocurrency

Authors: Chelsea Medina, Lily Shaw, Dissy Vargas, Sundar Krishnan

Abstract: This paper discusses the mechanisms of cryptocurrency, the idea of using security in the system, and the popularity of it. To begin, the authors provide a background on cryptocurrency and how it works. The authors understand that while most people may be familiar with the concept, they may not know how it works. Next, the authors discuss the security of cryptocurrency in-depth within the paper. The authors also provide examples of attacks on cryptocurrency systems to show the vulnerabilities within the system. Lastly, the authors discuss the popularity of the system to further express the need for security in cryptocurrency.

Submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.10045 [pdf, ps, other]

doi 10.1016/j.physd.2023.133923

Symmetrical SyncMap for Imbalanced General Chunking Problems

Authors: Heng Zhang, Danilo Vasconcellos Vargas

Abstract: Recently, SyncMap pioneered an approach to learn complex structures from sequences as well as adapt to any changes in underlying structures. This is achieved by using only nonlinear dynamical equations inspired by neuron group behaviors, i.e., without loss functions. Here we propose Symmetrical SyncMap that goes beyond the original work to show how to create dynamical equations and attractor-repeller points which are stable over the long run, even dealing with imbalanced continual general chunking problems (CGCPs). The main idea is to apply equal updates from negative and positive feedback loops by symmetrical activation. We then introduce the concept of memory window to allow for more positive updates. Our algorithm surpasses or ties other unsupervised state-of-the-art baselines in all 12 imbalanced CGCPs with various difficulties, including dynamically changing ones. To verify its performance in real-world scenarios, we conduct experiments on several well-studied structure learning problems. The proposed method surpasses substantially other methods in 3 out of 4 scenarios, suggesting that symmetrical activation plays a critical role in uncovering topological structures and even hierarchies encoded in temporal data.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 40 pages, 19 figures

Journal ref: Physica D: Nonlinear Phenomena, Volume 456, 2023, 133923, ISSN 0167-2789

arXiv:2307.15092 [pdf, other]

doi 10.1109/ACCESS.2023.3299296

A Survey on Reservoir Computing and its Interdisciplinary Applications Beyond Traditional Machine Learning

Authors: Heng Zhang, Danilo Vasconcellos Vargas

Abstract: Reservoir computing (RC), first applied to temporal signal processing, is a recurrent neural network in which neurons are randomly connected. Once initialized, the connection strengths remain unchanged. Such a simple structure turns RC into a non-linear dynamical system that maps low-dimensional inputs into a high-dimensional space. The model's rich dynamics, linear separability, and memory capacity then enable a simple linear readout to generate adequate responses for various applications. RC spans areas far beyond machine learning, since it has been shown that the complex dynamics can be realized in various physical hardware implementations and biological devices. This yields greater flexibility and shorter computation time. Moreover, the neuronal responses triggered by the model's dynamics shed light on understanding brain mechanisms that also exploit similar dynamical processes. While the literature on RC is vast and fragmented, here we conduct a unified review of RC's recent developments from machine learning to physics, biology, and neuroscience. We first review the early RC models, and then survey the state-of-the-art models and their applications. We further introduce studies on modeling the brain's mechanisms by RC. Finally, we offer new perspectives on RC development, including reservoir design, coding frameworks unification, physical RC implementations, and interaction between RC, cognitive neuroscience and evolution.
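The structure described above (fixed random recurrent weights, a nonlinear high-dimensional state, and a trained linear readout) can be sketched as a small echo state network in plain Python. This is a generic illustration under stated assumptions, not code from the survey: the function names, the reservoir size, the `tanh` nonlinearity, the ridge penalty, and the rough weight scaling `rho * sqrt(3 / n)` (which only approximately targets a spectral radius of `rho`) are all choices made here for the example.

```python
import math
import random


def make_reservoir(n, rho=0.9, seed=0):
    """Draw fixed random weights; they are never trained afterwards."""
    rng = random.Random(seed)
    a = rho * math.sqrt(3.0 / n)  # heuristic scale, assumed for this sketch
    w = [[rng.uniform(-a, a) for _ in range(n)] for _ in range(n)]
    w_in = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    return w, w_in


def run_reservoir(w, w_in, inputs):
    """Drive the reservoir with a scalar series and record its states."""
    n = len(w_in)
    x = [0.0] * n
    states = []
    for u in inputs:
        x = [math.tanh(w_in[i] * u + sum(w[i][j] * x[j] for j in range(n)))
             for i in range(n)]
        states.append(x + [1.0])  # append a constant bias feature
    return states


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def train_readout(states, targets, ridge=1e-4):
    """Fit the linear readout by ridge regression (normal equations)."""
    d = len(states[0])
    a = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(d)]
    return solve(a, b)


def predict(states, w_out):
    """Apply the trained linear readout to recorded reservoir states."""
    return [sum(wi * si for wi, si in zip(w_out, s)) for s in states]
```

To use it for one-step-ahead prediction, run the reservoir over a series, discard an initial washout of states, and fit `train_readout` against the series shifted by one step. Only `w_out` is learned, which is what makes training a single linear solve.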

Submitted 27 July, 2023; originally announced July 2023.

Comments: 51 pages, 19 figures, IEEE Access

Journal ref: IEEE Access, vol. 11, pp. 81033-81070 (2023)

arXiv:2306.10927 [pdf, other]

Generating Oscillation Activity with Echo State Network to Mimic the Behavior of a Simple Central Pattern Generator

Authors: Tham Yik Foong, Danilo Vasconcellos Vargas

Abstract: This paper presents a method for reproducing a simple central pattern generator (CPG) using a modified Echo State Network (ESN). Conventionally, the dynamical reservoir needs to be damped to stabilize and preserve memory. However, we find that a reservoir that develops oscillatory activity without any external excitation can mimic the behaviour of a simple CPG in biological systems. We define the specific neuron ensemble required for generating oscillations in the reservoir and demonstrate how adjustments to the leaking rate, spectral radius, topology, and population size can increase the probability of reproducing these oscillations. The results of the experiments, conducted on the time series simulation tasks, demonstrate that the ESN is able to generate the desired waveform without any input. This approach offers a promising solution for the development of bio-inspired controllers for robotic systems.

Submitted 19 June, 2023; originally announced June 2023.

Comments: 12 pages, 6 figures, COGSCI 2023 in-press

arXiv:2305.17178 [pdf, other]

Rate-Splitting Multiple Access: Finite Constellations, Receiver Design, and SIC-free Implementation

Authors: Sibo Zhang, Bruno Clerckx, David Vargas, Oliver Haffenden, Andrew Murphy

Abstract: Rate-Splitting Multiple Access (RSMA) has emerged as a novel multiple access technique that enlarges the achievable rate region of Multiple-Input Multiple-Output (MIMO) broadcast channels with linear precoding. In this work, we jointly address three practical but fundamental questions: (1) How to exploit the benefit of RSMA under finite constellations? (2) What are the potential and promising ways to implement RSMA receivers? (3) Can RSMA still retain its superiority in the absence of successive interference cancellers (SIC)? To address these concerns, we first propose low-complexity precoder designs taking finite constellations into account and show that the potential of RSMA is better achieved with such designs than those assuming Gaussian signalling. We then consider some practical receiver designs that can be applied to RSMA. We notice that these receiver designs follow one of two principles: (1) SIC: cancelling upper layer signals before decoding the lower layer and (2) non-SIC: treating upper layer signals as noise when decoding the lower layer. In light of this, we propose to alter the precoder design according to the receiver category. Through link-level simulations, the effectiveness of the proposed precoder and receiver designs is verified. More importantly, we show that it is possible to preserve the superiority of RSMA over Spatial Domain Multiple Access (SDMA), including SDMA with advanced receivers, even without SIC at the receivers. Those results therefore open the door to competitive implementable RSMA strategies for 6G and beyond communications.

Submitted 6 December, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: Submitted to IEEE for publication

arXiv:2305.15630 [pdf]

doi 10.1109/TBC.2023.3311330

Multicast and Unicast Superposition Transmission in MIMO OFDMA Systems with Statistical CSIT

Authors: Yong ** Daniel Kim, David Vargas

Abstract: We consider a downlink multicast and unicast superposition transmission in multi-layer Multiple-Input Multiple-Output (MIMO) Orthogonal Frequency Division Multiple Access (OFDMA) systems when only the statistical channel state information is available at the transmitter (CSIT). Multiple users can be scheduled by using the time/frequency resources in OFDMA, while for each scheduled user MIMO spatial multiplexing is used to transmit multiple information layers, i.e., single user (SU)-MIMO. The users only need to feed back to the base station the rank indicator and the long-term average channel signal-to-noise ratio to indicate a suitable number of transmission layers and a suitable modulation and coding scheme, and to allow the base station to perform user scheduling. This approach is especially relevant for the delivery of common (e.g., popular live event) and independent (e.g., user personalized) content to a high number of users in deployments in the lower frequency bands operating in Frequency-Division-Duplex (FDD) mode, e.g., sub-1 GHz. We show that the optimal resource allocation that maximizes the ergodic sum-rate involves greedy user selection per OFDM subchannel and superposition transmission of one multicast signal across all subchannels and a single unicast signal per subchannel. Degree-of-freedom (DoF) analysis shows that while the lack of instantaneous CSI limits DoF of unicast messages to the minimum number of transmit antennas and receiver antennas, the multicast message obtains full DoF that increases linearly with the number of users. We present resource allocation algorithms consisting of user selection and power allocation between multicast and unicast signals in each OFDM subchannel. System level simulations in 5G rural macro-cell scenarios show overall network throughput gains in realistic network environments by superposition transmission of multicast and unicast signals.

Submitted 29 September, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: 17 pages, 10 figures, 2 tables

arXiv:2302.02140 [pdf, other]

Dynamical Equations With Bottom-up Self-Organizing Properties Learn Accurate Dynamical Hierarchies Without Any Loss Function

Authors: Danilo Vasconcellos Vargas, Tham Yik Foong, Heng Zhang

Abstract: Self-organization is ubiquitous in nature and mind. However, machine learning and theories of cognition still barely touch the subject. The hurdle is that general patterns are difficult to define in terms of dynamical equations and designing a system that could learn by reordering itself is still to be seen. Here, we propose a learning system, where patterns are defined within the realm of nonlinear dynamics with positive and negative feedback loops, allowing attractor-repeller pairs to emerge for each pattern observed. Experiments reveal that such a system can map temporal to spatial correlation, enabling hierarchical structures to be learned from sequential data. The results are accurate enough to surpass state-of-the-art unsupervised learning algorithms in seven out of eight experiments as well as two real-world problems. Interestingly, the dynamic nature of the system makes it inherently adaptive, giving rise to phenomena similar to phase transitions in chemistry/thermodynamics when the input structure changes. Thus, the work here sheds light on how self-organization can allow for pattern recognition and hints at how intelligent behavior might emerge from simple dynamic equations without any objective/loss function.

Submitted 4 February, 2023; originally announced February 2023.

Comments: 29 pages, 17 figures

arXiv:2106.05657 [pdf, other]

Deep neural network loses attention to adversarial images

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas

Abstract: Adversarial algorithms have shown to be effective against neural networks for a variety of tasks. Some adversarial algorithms perturb all the pixels in the image minimally for the image classification task in image classification. In contrast, some algorithms perturb few pixels strongly. However, very little information is available regarding why these adversarial samples so diverse from each other exist. Recently, Vargas et al. showed that the existence of these adversarial samples might be due to conflicting saliency within the neural network. We test this hypothesis of conflicting saliency by analysing the Saliency Maps (SM) and Gradient-weighted Class Activation Maps (Grad-CAM) of original and few different types of adversarial samples. We also analyse how different adversarial samples distort the attention of the neural network compared to original samples. We show that in the case of Pixel Attack, perturbed pixels either calls the network attention to themselves or divert the attention from them. Simultaneously, the Projected Gradient Descent Attack perturbs pixels so that intermediate layers inside the neural network lose attention for the correct class. We also show that both attacks affect the saliency map and activation maps differently. Thus, shedding light on why some defences successful against some attacks remain vulnerable against other attacks. We hope that this analysis will improve understanding of the existence and the effect of adversarial samples and enable the community to develop more robust neural networks.

Submitted 10 June, 2021; originally announced June 2021.

Comments: Accepted in Workshop on Artificial Intelligence Safety (AISafety 2021), IJCAI-2021

arXiv:2009.01110 [pdf, other]

Perceptual Deep Neural Networks: Adversarial Robustness through Input Recreation

Authors: Danilo Vasconcellos Vargas, Bingli Liao, Takahiro Kanzaki

Abstract: Adversarial examples have shown that albeit highly accurate, models learned by machines, differently from humans, have many weaknesses. However, humans' perception is also fundamentally different from machines, because we do not see the signals which arrive at the retina but a rather complex recreation of them. In this paper, we explore how machines could recreate the input as well as investigate the benefits of such an augmented perception. In this regard, we propose Perceptual Deep Neural Networks ($\varphi$DNN) which also recreate their own input before further processing. The concept is formalized mathematically and two variations of it are developed (one based on inpainting the whole image and the other based on a noisy resized super resolution recreation). Experiments reveal that $\varphi$DNNs and their adversarial training variations can increase the robustness substantially, surpassing both state-of-the-art defenses and pre-processing types of defenses in 100% of the tests. $\varphi$DNNs are shown to scale well to bigger image sizes, keeping a similar high accuracy throughout; while the state-of-the-art worsen up to 35%. Moreover, the recreation process intentionally corrupts the input image. Interestingly, we show by ablation tests that corrupting the input is, although counter-intuitive, beneficial.
Thus, $\varphi$DNNs reveal that input recreation has strong benefits for artificial neural networks similar to biological ones, shedding light into the importance of purposely corrupting the input as well as pioneering an area of perception models based on GANs and autoencoders for robust recognition in artificial intelligence.

Submitted 30 November, 2020; v1 submitted 2 September, 2020; originally announced September 2020.

arXiv:2008.10820 [pdf, other]

Simple Unsupervised Similarity-Based Aspect Extraction

Authors: Danny Suarez Vargas, Lucas R. C. Pessutto, Viviane Pereira Moreira

Abstract: In the context of sentiment analysis, there has been growing interest in performing a finer granularity analysis focusing on the specific aspects of the entities being evaluated. This is the goal of Aspect-Based Sentiment Analysis (ABSA) which basically involves two tasks: aspect extraction and polarity detection. The first task is responsible for discovering the aspects mentioned in the review text and the second task assigns a sentiment orientation (positive, negative, or neutral) to that aspect. Currently, the state-of-the-art in ABSA consists of the application of deep learning methods such as recurrent, convolutional and attention neural networks. The limitation of these techniques is that they require a lot of training data and are computationally expensive. In this paper, we propose a simple approach called SUAEx for aspect extraction. SUAEx is unsupervised and relies solely on the similarity of word embeddings. Experimental results on datasets from three different domains have shown that SUAEx achieves results that can outperform the state-of-the-art attention-based approach at a fraction of the time.

Submitted 25 August, 2020; originally announced August 2020.

Comments: 12 pages, 3 figures, paper to be published in 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019)

arXiv:2006.07853 [pdf, other]

Continual General Chunking Problem and SyncMap

Authors: Danilo Vasconcellos Vargas, Toshitake Asabuki

Abstract: Humans possess an inherent ability to chunk sequences into their constituent parts. In fact, this ability is thought to bootstrap language skills and learning of image patterns which might be a key to a more animal-like type of intelligence. Here, we propose a continual generalization of the chunking problem (an unsupervised problem), encompassing fixed and probabilistic chunks, discovery of temporal and causal structures and their continual variations. Additionally, we propose an algorithm called SyncMap that can learn and adapt to changes in the problem by creating a dynamic map which preserves the correlation between variables. Results of SyncMap suggest that the proposed algorithm learn near optimal solutions, despite the presence of many types of structures and their continual variation. When compared to Word2vec, PARSER and MRIL, SyncMap surpasses or ties with the best algorithm on $66\%$ of the scenarios while being the second best in the remaining $34\%$. SyncMap's model-free simple dynamics and the absence of loss functions reveal that, perhaps surprisingly, much can be done with self-organization alone. Code available at https://github.com/zweifel/SyncMap.

Submitted 5 April, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

Journal ref: AAAI2021

arXiv:2001.02758 [pdf, other]

doi 10.1109/SAM.2018.8448955

Pioneering Studies on LTE eMBMS: Towards 5G Point-to-Multipoint Transmissions

Authors: Hongzhi Chen, De Mi, Manuel Fuentes, David Vargas, Eduardo Garro, Jose Luis Carcel, Belkacem Mouhouche, Pei Xiao, Rahim Tafazolli

Abstract: The first 5G (5th generation wireless systems) New Radio Release-15 was recently completed. However, the specification only considers the use of unicast technologies and the extension to point-to-multipoint (PTM) scenarios is not yet considered. To this end, we first present in this work a technical overview of the state-of-the-art LTE (Long Term Evolution) PTM technology, i.e., eMBMS (evolved Multimedia Broadcast Multicast Services), and investigate the physical layer performance via link-level simulations. Then based on the simulation analysis, we discuss potential improvements for the two current eMBMS solutions, i.e., MBSFN (MBMS over Single Frequency Networks) and SC-PTM (Single-Cell PTM). This work explicitly focus on equipping the current eMBMS solutions with 5G candidate techniques, e.g., multiple antennas and millimeter wave, and its potentials to meet the requirements of next generation PTM transmissions.

Submitted 29 November, 2019; originally announced January 2020.

Comments: SAM 2018, 5 pages, 4 figs

arXiv:1906.11667 [pdf, other]

Evolving Robust Neural Architectures to Defend from Adversarial Attacks

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas

Abstract: Neural networks are prone to misclassify slightly modified input images. Recently, many defences have been proposed, but none have improved the robustness of neural networks consistently. Here, we propose to use adversarial attacks as a function evaluation to search for neural architectures that can resist such attacks automatically. Experiments on neural architecture search algorithms from the literature show that although accurate, they are not able to find robust architectures. A significant reason for this lies in their limited search space. By creating a novel neural architecture search with options for dense layers to connect with convolution layers and vice-versa as well as the addition of concatenation layers in the search, we were able to evolve an architecture that is inherently accurate on adversarial samples. Interestingly, this inherent robustness of the evolved architecture rivals state-of-the-art defences such as adversarial training while being trained only on the non-adversarial samples. Moreover, the evolved architecture makes use of some peculiar traits which might be useful for developing even more robust ones. Thus, the results here confirm that more robust architectures exist as well as opens up a new realm of feasibilities for the development and exploration of neural networks. Code available at http://bit.ly/RobustArchitectureSearch.

Submitted 16 July, 2020; v1 submitted 27 June, 2019; originally announced June 2019.

Comments: Pre-print of the published article in Proceedings of the Workshop on Artificial Intelligence Safety 2020, co-located with the 29th International Joint Conference on Artificial Intelligence and the 17th Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI 2020)

arXiv:1906.06627 [pdf, other]

Representation Quality Of Neural Networks Links To Adversarial Attacks and Defences

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas, Moe Matsuki

Abstract: Neural networks have been shown vulnerable to a variety of adversarial algorithms. A crucial step to understanding the rationale for this lack of robustness is to assess the potential of the neural networks' representation to encode the existing features. Here, we propose a method to understand the representation quality of the neural networks using a novel test based on Zero-Shot Learning, entitled Raw Zero-Shot. The principal idea is that, if an algorithm learns rich features, such features should be able to interpret "unknown" classes as an aggregate of previously learned features. This is because unknown classes usually share several regular features with recognised classes, given the features learned are general enough. We further introduce two metrics to assess these learned features to interpret unknown classes. One is based on inter-cluster validation technique (Davies-Bouldin Index), and the other is based on the distance to an approximated ground-truth. Experiments suggest that adversarial defences improve the representation of the classifiers, further suggesting that to improve the robustness of the classifiers, one has to improve the representation quality also. Experiments also reveal a strong association (a high Pearson Correlation and low p-value) between the metrics and adversarial attacks. Interestingly, the results indicate that dynamic routing networks such as CapsNet have better representation while current deeper neural networks are trading off representation quality for accuracy.
Code available at http://bit.ly/RepresentationMetrics.

Submitted 16 July, 2020; v1 submitted 15 June, 2019; originally announced June 2019.

arXiv:1906.06026 [pdf, other]

Adversarial Robustness Assessment: Why both $L_0$ and $L_\infty$ Attacks Are Necessary

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas

Abstract: There exists a vast number of adversarial attacks and defences for machine learning algorithms of various types which makes assessing the robustness of algorithms a daunting task. To make matters worse, there is an intrinsic bias in these adversarial algorithms. Here, we organise the problems faced: a) Model Dependence, b) Insufficient Evaluation, c) False Adversarial Samples, and d) Perturbation Dependent Results. Based on this, we propose a model agnostic dual quality assessment method, together with the concept of robustness levels to tackle them. We validate the dual quality assessment on state-of-the-art neural networks (WideResNet, ResNet, AllConv, DenseNet, NIN, LeNet and CapsNet) as well as adversarial defences for image classification problem. We further show that current networks and defences are vulnerable at all levels of robustness. The proposed robustness assessment reveals that depending on the metric used (i.e., $L_0$ or $L_\infty$), the robustness may vary significantly. Hence, the duality should be taken into account for a correct evaluation. Moreover, a mathematical derivation, as well as a counter-example, suggest that $L_1$ and $L_2$ metrics alone are not sufficient to avoid spurious adversarial samples. Interestingly, the threshold attack of the proposed assessment is a novel $L_\infty$ black-box adversarial method which requires even less perturbation than the One-Pixel Attack (only $12\%$ of One-Pixel Attack's amount of perturbation) to achieve similar results. Code is available at http://bit.ly/DualQualityAssessment.

Submitted 16 July, 2020; v1 submitted 14 June, 2019; originally announced June 2019.

arXiv:1904.12738 [pdf, other]

Self Training Autonomous Driving Agent

Authors: Shashank Kotyan, Danilo Vasconcellos Vargas, Venkanna U

Abstract: Intrinsically, driving is a Markov Decision Process which suits well the reinforcement learning paradigm. In this paper, we propose a novel agent which learns to drive a vehicle without any human assistance. We use the concept of reinforcement learning and evolutionary strategies to train our agent in a 2D simulation environment. Our model's architecture goes beyond the World Model's by introducing difference images in the auto encoder. This novel involvement of difference images in the auto-encoder gives better representation of the latent space with respect to the motion of vehicle and helps an autonomous agent to learn more efficiently how to drive a vehicle. Results show that our method requires fewer (96% less) total agents, (87.5% less) agents per generations, (70% less) generations and (90% less) rollouts than the original architecture while achieving the same accuracy of the original.

Submitted 26 April, 2019; originally announced April 2019.

arXiv:1904.08658 [pdf, other]

Batch Tournament Selection for Genetic Programming

Authors: Vinicius V. Melo, Danilo Vasconcellos Vargas, Wolfgang Banzhaf

Abstract: Lexicase selection achieves very good solution quality by introducing ordered test cases. However, the computational complexity of lexicase selection can prohibit its use in many applications. In this paper, we introduce Batch Tournament Selection (BTS), a hybrid of tournament and lexicase selection which is approximately one order of magnitude faster than lexicase selection while achieving a competitive quality of solutions. Tests on a number of regression datasets show that BTS compares well with lexicase selection in terms of mean absolute error while having a speed-up of up to 25 times. Surprisingly, BTS and lexicase selection have almost no difference in both diversity and performance. This reveals that batches and ordered test cases are completely different mechanisms which share the same general principle fostering the specialization of individuals. This work introduces an efficient algorithm that sheds light onto the main principles behind the success of lexicase, potentially opening up a new range of possibilities for algorithms to come.

Submitted 18 April, 2019; originally announced April 2019.

arXiv:1903.09304 [pdf, other]

Tackling Unit Commitment and Load Dispatch Problems Considering All Constraints with Evolutionary Computation

Authors: Danilo Vasconcellos Vargas, Junichi Murata, Hirotaka Takano

Abstract: Unit commitment and load dispatch problems are important and complex problems in power system operations that have being traditionally solved separately. In this paper, both problems are solved together without approximations or simplifications. In fact, the problem solved has a massive amount of grid-connected photovoltaic units, four pump-storage hydro plants as energy storage units and ten thermal power plants, each with its own set of operation requirements that need to be satisfied. To face such a complex constrained optimization problem an adaptive repair method is proposed. By including a given repair method itself as a parameter to be optimized, the proposed adaptive repair method avoid any bias in repair choices. Moreover, this results in a repair method that adapt to the problem and will improve together with the solution during optimization. Experiments are conducted revealing that the proposed method is capable of surpassing exact method solutions on a simplified version of the problem with approximations as well as solve the otherwise intractable complete problem without simplifications. Moreover, since the proposed approach can be applied to other problems in general and it may not be obvious how to choose the constraint handling for a certain constraint, a guideline is provided explaining the reasoning behind. Thus, this paper open further possibilities to deal with the ever changing types of generation units and other similarly complex operation/schedule optimization problems with many difficult constraints.

Submitted 5 March, 2019; originally announced March 2019.

arXiv:1902.06703 [pdf, other]

doi 10.1109/TNNLS.2016.2551748

Spectrum-Diverse Neuroevolution with Unified Neural Models

Authors: Danilo Vasconcellos Vargas, Junichi Murata

Abstract: Learning algorithms are being increasingly adopted in various applications. However, further expansion will require methods that work more automatically. To enable this level of automation, a more powerful solution representation is needed. However, by increasing the representation complexity a second problem arises. The search space becomes huge and therefore an associated scalable and efficient searching algorithm is also required. To solve both problems, first a powerful representation is proposed that unifies most of the neural networks features from the literature into one representation. Secondly, a new diversity preserving method called Spectrum Diversity is created based on the new concept of chromosome spectrum that creates a spectrum out of the characteristics and frequency of alleles in a chromosome. The combination of Spectrum Diversity with a unified neuron representation enables the algorithm to either surpass or equal NeuroEvolution of Augmenting Topologies (NEAT) on all of the five classes of problems tested. Ablation tests justifies the good results, showing the importance of added new features in the unified neuron representation. Part of the success is attributed to the novelty-focused evolution and good scalability with chromosome size provided by Spectrum Diversity. Thus, this study sheds light on a new representation and diversity preserving mechanism that should impact algorithms and applications to come. To download the code please access the following https://github.com/zweifel/Physis-Shard.

Submitted 6 January, 2019; originally announced February 2019.

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, Volume 28, Issue 8, 1759-1773, 2017

arXiv:1902.02947 [pdf, other]

Understanding the One-Pixel Attack: Propagation Maps and Locality Analysis

Authors: Danilo Vasconcellos Vargas, Jiawei Su

Abstract: Deep neural networks were shown to be vulnerable to single pixel modifications. However, the reason behind such phenomena has never been elucidated. Here, we propose Propagation Maps which show the influence of the perturbation in each layer of the network. Propagation Maps reveal that even in extremely deep networks such as Resnet, modification in one pixel easily propagates until the last layer. In fact, this initial local perturbation is also shown to spread becoming a global one and reaching absolute difference values that are close to the maximum value of the original feature maps in a given layer. Moreover, we do a locality analysis in which we demonstrate that nearby pixels of the perturbed one in the one-pixel attack tend to share the same vulnerability, revealing that the main vulnerability lies in neither neurons nor pixels but receptive fields. Hopefully, the analysis conducted in this work together with a new technique called propagation maps shall shed light into the inner workings of other adversarial samples and be the basis of new defense systems to come.

Submitted 8 February, 2019; originally announced February 2019.

arXiv:1901.07132 [pdf, other]

Universal Rules for Fooling Deep Neural Networks based Text Classification

Authors: Di Li, Danilo Vasconcellos Vargas, Sakurai Kouichi

Abstract: Recently, deep learning based natural language processing techniques are being extensively used to deal with spam mail, censorship evaluation in social networks, among others. However, there is only a couple of works evaluating the vulnerabilities of such deep neural networks. Here, we go beyond attacks to investigate, for the first time, universal rules, i.e., rules that are sample agnostic and therefore could turn any text sample in an adversarial one. In fact, the universal rules do not use any information from the method itself (no information from the method, gradient information or training dataset information is used), making them black-box universal attacks. In other words, the universal rules are sample and method agnostic. By proposing a coevolutionary optimization algorithm we show that it is possible to create universal rules that can automatically craft imperceptible adversarial samples (only less than five perturbations which are close to misspelling are inserted in the text sample). A comparison with a random search algorithm further justifies the strength of the method. Thus, universal rules for fooling networks are here shown to exist. Hopefully, the results from this work will impact the development of yet more sample and model agnostic attacks as well as their defenses, culminating in perhaps a new age for artificial intelligence.

Submitted 3 April, 2019; v1 submitted 21 January, 2019; originally announced January 2019.

arXiv:1901.00266 [pdf, other]

doi 10.1162/EVCO_a_00118

General Subpopulation Framework and Taming the Conflict Inside Populations

Authors: Danilo Vasconcellos Vargas, Junichi Murata, Hirotaka Takano, Alexandre Claudio Botazzo Delbem

Abstract: Structured evolutionary algorithms have been investigated for some time. However, they have been under-explored specially in the field of multi-objective optimization. Despite their good results, the use of complex dynamics and structures make their understanding and adoption rate low. Here, we propose the general subpopulation framework that has the capability of integrating optimization algorithms without restrictions as well as aid the design of structured algorithms. The proposed framework is capable of generalizing most of the structured evolutionary algorithms, such as cellular algorithms, island models, spatial predator-prey and restricted mating based algorithms under its formalization. Moreover, we propose two algorithms based on the general subpopulation framework, demonstrating that with the simple addition of a number of single-objective differential evolution algorithms for each objective the results improve greatly, even when the combined algorithms behave poorly when evaluated alone at the tests. Most importantly, the comparison between the subpopulation algorithms and their related panmictic algorithms suggests that the competition between different strategies inside one population can have deleterious consequences for an algorithm and reveal a strong benefit of using the subpopulation framework. The code for SAN, the proposed multi-objective algorithm which has the current best results in the hardest benchmark, is available at the following https://github.com/zweifel/zweifel

Submitted 2 January, 2019; originally announced January 2019.

Journal ref: Evolutionary computation 23 (1), 1-36, 2015

arXiv:1811.08226 [pdf, other]

Self Organizing Classifiers and Niched Fitness

Authors: Danilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata

Abstract: Learning classifier systems are adaptive learning systems which have been widely applied in a multitude of application domains. However, there are still some generalization problems unsolved. The hurdle is that fitness and niching pressures are difficult to balance. Here, a new algorithm called Self Organizing Classifiers is proposed which faces this problem from a different perspective. Instead of balancing the pressures, both pressures are separated and no balance is necessary. In fact, the proposed algorithm possesses a dynamical population structure that self-organizes itself to better project the input space into a map. The niched fitness concept is defined along with its dynamical population structure, both are indispensable for the understanding of the proposed method. Promising results are shown on two continuous multi-step problems. One of which is yet more challenging than previous problems of this class in the literature.

Submitted 20 November, 2018; originally announced November 2018.

Comments: arXiv admin note: text overlap with arXiv:1811.08225

Journal ref: Proceedings of the 15th annual conference on Genetic and evolutionary computation (GECCO 2013)

arXiv:1811.08225 [pdf, other]

Self Organizing Classifiers: First Steps in Structured Evolutionary Machine Learning

Authors: Danilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata

Abstract: Learning classifier systems (LCSs) are evolutionary machine learning algorithms, flexible enough to be applied to reinforcement, supervised and unsupervised learning problems with good performance. Recently, self organizing classifiers were proposed which are similar to LCSs but have the advantage that in its structured population no balance between niching and fitness pressure is necessary. However, more tests and analysis are required to verify its benefits. Here, a variation of the first algorithm is proposed which uses a parameterless self organizing map (SOM). This algorithm is applied in challenging problems such as big, noisy as well as dynamically changing continuous input-action mazes (growing and compressing mazes are included) with good performance. Moreover, a genetic operator is proposed which utilizes the topological information of the SOM's population structure, improving the results. Thus, the first steps in structured evolutionary machine learning are shown, nonetheless, the problems faced are more difficult than the state-of-art continuous input-action multi-step ones.

Submitted 20 November, 2018; originally announced November 2018.

Journal ref: Evolutionary Intelligence 6 (2), 57-72 (2013)

arXiv:1811.08214 [pdf, other]

Contingency Training

Authors: Danilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata

Abstract: When applied to high-dimensional datasets, feature selection algorithms might still leave dozens of irrelevant variables in the dataset. Therefore, even after feature selection has been applied, classifiers must be prepared to the presence of irrelevant variables. This paper investigates a new training method called Contingency Training which increases the accuracy as well as the robustness against irrelevant attributes. Contingency training is classifier independent. By subsampling and removing information from each sample, it creates a set of constraints. These constraints aid the method to automatically find proper importance weights of the dataset's features. Experiments are conducted with the contingency training applied to neural networks over traditional datasets as well as datasets with additional irrelevant variables. For all of the tests, contingency training surpassed the unmodified training on datasets with irrelevant variables and even outperformed slightly when only a few or no irrelevant variables were present.

Submitted 20 November, 2018; originally announced November 2018.

Journal ref: Proc. of SICE Annual Conference 2013

arXiv:1811.00912 [pdf]

Two-Layered Superposition of Broadcast/Multicast and Unicast Signals in Multiuser OFDMA Systems

Authors: David Vargas, Yong Jin Daniel Kim

Abstract: We study optimal delivery strategies of one common and $K$ independent messages from a source to multiple users in wireless environments. In particular, two-layered superposition of broadcast/multicast and unicast signals is considered in a downlink multiuser OFDMA system. In the literature and industry, the two-layer superposition is often considered as a pragmatic approach to make a compromise between the simple but suboptimal orthogonal multiplexing (OM) and the optimal but complex fully-layered non-orthogonal multiplexing. In this work, we show that only two-layers are necessary to achieve the maximum sum-rate when the common message has higher priority than the $K$ individual unicast messages, and OM cannot be sum-rate optimal in general. We develop an algorithm that finds the optimal power allocation over the two-layers and across the OFDMA radio resources in static channels and a class of fading channels. Two main use-cases are considered: i) Multicast and unicast multiplexing when $K$ users with uplink capabilities request both common and independent messages, and ii) broadcast and unicast multiplexing when the common message targets receive-only devices and $K$ users with uplink capabilities additionally request independent messages.
Finally, we develop a transceiver design for broadcast/multicast and unicast superposition transmission based on LTE-A-Pro physical layer and show with numerical evaluations in mobile environments with multipath propagation that the capacity improvements can be translated into significant practical performance gains compared to the orthogonal schemes in the 3GPP specifications. We also analyze the impact of real channel estimation and show that significant gains in terms of spectral efficiency or coverage area are still available even with estimation errors and imperfect interference cancellation for the two-layered superposition system.

Submitted 4 December, 2019; v1 submitted 2 November, 2018; originally announced November 2018.

arXiv:1809.07098 [pdf, other]

doi 10.1109/CEC.2015.7257254

Novelty-organizing team of classifiers in noisy and dynamic environments

Authors: Danilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata

Abstract: In the real world, the environment is constantly changing with the input variables under the effect of noise. However, few algorithms were shown to be able to work under those circumstances. Here, Novelty-Organizing Team of Classifiers (NOTC) is applied to the continuous action mountain car as well as two variations of it: a noisy mountain car and an unstable weather mountain car. These problems take respectively noise and change of problem dynamics into account. Moreover, NOTC is compared with NeuroEvolution of Augmenting Topologies (NEAT) in these problems, revealing a trade-off between the approaches. While NOTC achieves the best performance in all of the problems, NEAT needs less trials to converge. It is demonstrated that NOTC achieves better performance because of its division of the input space (creating easier problems). Unfortunately, this division of input space also requires a bit of time to bootstrap.

Submitted 19 September, 2018; originally announced September 2018.

Journal ref: 2015 IEEE Congress on Evolutionary Computation (CEC)

arXiv:1807.01662 [pdf]

Implementing SCRUM to develop a connected robot

Authors: Diego Armando Diaz Vargas, Rui Xue, Claude Baron, Philippe Esteban, Rob Vingerhoeds, Y Citlalih, Chao Liu

Abstract: Agile methods are receiving a growing interest from industry and these approaches are nowadays well accepted and deployed in software engineering. However, some issues remain to introduce agility in systems engineering. The objective of this paper is to show an agile management implementation in an educational project consisting in developing a connected mobile robot, and to evaluate the issues and benefits of adopting an agile approach. Among the most famous agile management methods, SCRUM has been chosen to lead this experiment. This paper first presents the project and how students traditionally manage it, then it describes how Scrum could be used instead. It evaluates the difficulties and interests to introduce agility in this project, and concludes on the ability of Scrum to design, test and progressively integrate the system, thus providing an operational prototype more quickly.

Submitted 3 July, 2018; originally announced July 2018.

Journal ref: 12th International Conference on Modeling, Optimization and SIMulation - MOSIM'18, Jun 2018, Toulouse, France. 2018

arXiv:1804.07062 [pdf, other]

Attacking Convolutional Neural Network using Differential Evolution

Authors: Jiawei Su, Danilo Vasconcellos Vargas, Kouichi Sakurai

Abstract: The output of Convolutional Neural Networks (CNN) has been shown to be discontinuous which can make the CNN image classifier vulnerable to small well-tuned artificial perturbations. That is, images modified by adding such perturbations (i.e., adversarial perturbations) that make little difference to human eyes, can completely alter the CNN classification results. In this paper, we propose a practical attack using differential evolution (DE) for generating effective adversarial perturbations. We comprehensively evaluate the effectiveness of different types of DEs for conducting the attack on different network structures. The proposed method is a black-box attack which only requires the miracle feedback of the target CNN systems. The results show that under strict constraints which simultaneously control the number of pixels changed and overall perturbation strength, attacking can achieve 72.29%, 78.24% and 61.28% non-targeted attack success rates, with 88.68%, 99.85% and 73.07% confidence on average, on three common types of CNNs. The attack only requires modifying 5 pixels with 20.44, 14.76 and 22.98 pixel values distortion. Thus, the result shows that the current DNNs are also vulnerable to such simpler black-box attacks even under very limited attack conditions.

Submitted 19 April, 2018; originally announced April 2018.

arXiv:1802.03714 [pdf, other]

Lightweight Classification of IoT Malware based on Image Recognition

Authors: Jiawei Su, Danilo Vasconcellos Vargas, Sanjiva Prasad, Daniele Sgandurra, Yaokai Feng, Kouichi Sakurai

Abstract: The Internet of Things (IoT) is an extension of the traditional Internet, which allows a very large number of smart devices, such as home appliances, network cameras, sensors and controllers to connect to one another to share information and improve user experiences. Current IoT devices are typically micro-computers for domain-specific computations rather than traditional function-specific embedded devices. Therefore, many existing attacks, targeted at traditional computers connected to the Internet, may also be directed at IoT devices. For example, DDoS attacks have become very common in IoT environments, as these environments currently lack basic security monitoring and protection mechanisms, as shown by the recent Mirai and Brickerbot IoT botnets. In this paper, we propose a novel light-weight approach for detecting DDoS malware in IoT environments. We firstly extract one-channel gray-scale images converted from binaries, and then utilize a lightweight convolutional neural network for classifying IoT malware families. The experimental results show that the proposed system can achieve 94.0% accuracy for the classification of goodware and DDoS malware, and 81.8% accuracy for the classification of goodware and two main malware families.

Submitted 11 February, 2018; originally announced February 2018.

arXiv:1710.08864 [pdf, other]

doi 10.1109/TEVC.2019.2890858

One pixel attack for fooling deep neural networks

Authors: Jiawei Su, Danilo Vasconcellos Vargas, Sakurai Kouichi

Abstract: Recent research has revealed that the output of Deep Neural Networks (DNN) can be easily altered by adding relatively small perturbations to the input vector. In this paper, we analyze an attack in an extremely limited scenario where only one pixel can be modified. For that we propose a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE). It requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE. The results show that 67.97% of the natural images in Kaggle CIFAR-10 test dataset and 16.04% of the ImageNet (ILSVRC 2012) test images can be perturbed to at least one target class by modifying just one pixel with 74.03% and 22.91% confidence on average. We also show the same vulnerability on the original CIFAR-10 dataset. Thus, the proposed attack explores a different take on adversarial machine learning in an extreme limited scenario, showing that current DNNs are also vulnerable to such low dimension attacks. Besides, we also illustrate an important application of DE (or broadly speaking, evolutionary computation) in the domain of adversarial machine learning: creating tools that can effectively generate low-cost adversarial attacks against neural networks for evaluating robustness.

Submitted 17 October, 2019; v1 submitted 24 October, 2017; originally announced October 2017.

Journal ref: IEEE Transactions on Evolutionary Computation, Vol. 23, Issue 5, pp. 828-841. Publisher: IEEE. 2019

Showing 1–41 of 41 results for author: Vargas, D