-
CS-Mixer: A Cross-Scale Vision MLP Model with Spatial-Channel Mixing
Authors:
Jonathan Cui,
David A. Araujo,
Suman Saha,
Md. Faisal Kabir
Abstract:
Despite their simpler information fusion designs compared with Vision Transformers and Convolutional Neural Networks, Vision MLP architectures have demonstrated strong performance and high data efficiency in recent research. However, existing works such as CycleMLP and Vision Permutator typically model spatial information in equal-size spatial regions and do not consider cross-scale spatial interactions. Further, their token mixers only model 1- or 2-axis correlations, avoiding 3-axis spatial-channel mixing due to its computational demands. We therefore propose CS-Mixer, a hierarchical Vision MLP that learns dynamic low-rank transformations for spatial-channel mixing through cross-scale local and global aggregation. The proposed methodology achieves competitive results on popular image recognition benchmarks without incurring substantially more compute. Our largest model, CS-Mixer-L, reaches 83.2% top-1 accuracy on ImageNet-1k with 13.7 GFLOPs and 94 M parameters.
Submitted 14 January, 2024; v1 submitted 25 August, 2023;
originally announced August 2023.
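The abstract above notes that 3-axis spatial-channel mixing is usually avoided because of its computational demands. As a rough illustration only (not the CS-Mixer architecture), the sketch below compares the weight count of a dense linear map over a flattened H×W×C feature tensor with that of a generic rank-r factorization. The function names, the example tensor size, and the rank are assumptions chosen for illustration.

```python
# Illustrative parameter counts for mixing all three axes (height, width, channel)
# of a feature tensor at once. This is NOT the CS-Mixer layer; it only shows why
# a dense 3-axis mixer is expensive and why a low-rank factorization is cheaper.

def dense_mixing_params(h: int, w: int, c: int) -> int:
    """Weights in a dense linear map over the flattened H*W*C tensor."""
    n = h * w * c
    return n * n


def low_rank_mixing_params(h: int, w: int, c: int, rank: int) -> int:
    """Weights in a rank-`rank` factorization U @ V, with U: n x rank, V: rank x n."""
    n = h * w * c
    return 2 * n * rank


if __name__ == "__main__":
    # Hypothetical feature-map size and rank, chosen only for the example.
    h, w, c, rank = 7, 7, 64, 4
    print("dense:   ", dense_mixing_params(h, w, c))
    print("low-rank:", low_rank_mixing_params(h, w, c, rank))
```

For these hypothetical sizes the dense map needs several hundred times more weights than the factorized one, which is the general motivation for low-rank designs.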
-
Perception of Personality Traits in Crowds of Virtual Humans
Authors:
Lucas Nardino,
Enzo Krzmienszki,
Vinícius Jurinic Cassol,
Diogo Schaffer,
Victor Flávio de Andrade Araujo,
Rodolfo Migon Favaretto,
Felipe Elsner,
Gabriel Fonseca Silva,
Soraia Raupp Musse
Abstract:
This paper proposes a perceptual visual analysis of the personality of virtual humans. Many studies have presented findings on how human beings perceive virtual humans with respect to their faces, body animation, motion in the virtual environment, and so on. We are interested in investigating how people perceive visual manifestations of virtual humans' personality traits when they are interactive and organized in groups. Many applications in games and movies can benefit from the findings of this perceptual analysis, with the main goal of providing more realistic characters and improving the users' experience. We conducted experiments with subjects, and the results indicate that, although the effect is very subtle, people perceive extraversion (the personality trait that we measured) in crowds of virtual humans more strongly when interacting with the virtual humans' behaviors than when merely observing through a spectator camera.
Submitted 6 October, 2022;
originally announced October 2022.
-
Evaluating data-flow coverage in spectrum-based fault localization
Authors:
Henrique Lemos Ribeiro,
Higor Amario de Souza,
Roberto Paulo de Andrioli Araujo,
Marcos Lordello Chaim,
Fabio Kon
Abstract:
Background: Debugging is a key task during the software development cycle. Spectrum-based Fault Localization (SFL) is a promising technique to improve and automate debugging. SFL techniques use control-flow spectra to pinpoint the most suspicious program elements. However, data-flow spectra provide more detailed information about the program execution, which may be useful for fault localization. Aims: We evaluate the effectiveness and efficiency of ten SFL ranking metrics using data-flow spectra. Method: We compare the performance of data- and control-flow spectra for SFL using 163 faults from 5 real-world open source programs, which contain from 468 to 4130 test cases. The data- and control-flow spectra types used in our evaluation are definition-use associations (DUAs) and lines, respectively. Results: Using data-flow spectra, up to 50% more faults are ranked in the top-15 positions compared to control-flow spectra. Also, most SFL ranking metrics are more effective with data-flow spectra when inspecting up to the top-40 positions. The execution cost of data-flow spectra is higher than that of control-flow spectra, taking from 22 seconds to less than 9 minutes. Data-flow has an average overhead of 353% for all programs, while the average overhead for control-flow is 102%. Conclusions: The results suggest that SFL techniques can benefit from using data-flow spectra to rank faults in better positions, which may lead developers to inspect less code to find bugs. The execution cost of gathering data-flow spectra is higher than for control-flow spectra, but it is not prohibitive. Moreover, data-flow spectra also provide information about suspicious variables for fault localization, which may improve the developers' performance using SFL.
Submitted 27 June, 2019;
originally announced June 2019.
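The spectrum-based ranking described in the abstract above can be sketched in a few lines. The abstract does not name its ten ranking metrics, so this sketch assumes Ochiai, a widely used SFL metric; the function names and the toy coverage data are hypothetical. Program elements are treated as opaque labels, so the same code would apply whether they stand for lines (control-flow) or definition-use associations (data-flow).

```python
from math import sqrt


def ochiai(ef: int, ep: int, total_failed: int) -> float:
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep)).

    ef / ep: number of failing / passing tests that execute the element.
    """
    denom = sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


def rank_elements(coverage, outcomes):
    """Rank program elements from most to least suspicious.

    coverage: dict mapping test name -> set of executed elements.
    outcomes: dict mapping test name -> True if the test passed.
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    elements = set().union(*coverage.values())
    scores = {}
    for elem in elements:
        ef = sum(1 for t, cov in coverage.items() if elem in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if elem in cov and outcomes[t])
        scores[elem] = ochiai(ef, ep, total_failed)
    # Highest score first; ties broken by element label for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))


if __name__ == "__main__":
    # Toy spectrum (hypothetical): element "c" is executed only by the failing test.
    coverage = {"t1": {"a", "b"}, "t2": {"a", "c"}, "t3": {"b"}}
    outcomes = {"t1": True, "t2": False, "t3": True}
    for elem, score in rank_elements(coverage, outcomes):
        print(elem, round(score, 3))
```

A developer would then inspect elements in ranked order, which is why placing faulty elements in the top positions matters.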
-
A Mid-level Video Representation based on Binary Descriptors: A Case Study for Pornography Detection
Authors:
Carlos Caetano,
Sandra Avila,
William Robson Schwartz,
Silvio Jamil F. Guimarães,
Arnaldo de A. Araújo
Abstract:
With the growing amount of inappropriate content on the Internet, such as pornography, the need arises to detect and filter such material. This is because such content is often prohibited in certain environments (e.g., schools and workplaces) or for certain audiences (e.g., children). In recent years, many works have focused on detecting pornographic images and videos based on visual content, particularly on the detection of skin color. Although these approaches provide good results, they generally have the disadvantage of a high false positive rate, since not all images with large areas of skin exposure are pornographic, such as images of people wearing swimsuits or images related to sports. Local feature based approaches with Bag-of-Words (BoW) models have been successfully applied to visual recognition tasks in the context of pornography detection. Even though existing methods provide promising results, they use local feature descriptors that require high computational processing time and yield high-dimensional vectors. In this work, we propose an approach for pornography detection based on local binary feature extraction and the BossaNova image representation, a BoW model extension that preserves the visual information more richly. Moreover, we propose two approaches for video description based on the combination of mid-level representations, namely BossaNova Video Descriptor (BNVD) and BoW Video Descriptor (BoW-VD). The proposed techniques are promising, achieving an accuracy of 92.40%, thus reducing the classification error by 16% over the current state-of-the-art local features approach on the Pornography dataset.
Submitted 12 May, 2016;
originally announced May 2016.