-
Read, look and detect: Bounding box annotation from image-caption pairs
Authors:
Eduardo Hugo Sanchez
Abstract:
Various methods have been proposed to detect objects while reducing the cost of data annotation. For instance, weakly supervised object detection (WSOD) methods rely only on image-level annotations during training. Unfortunately, data annotation remains expensive since annotators must provide the categories describing the content of each image and labeling is restricted to a fixed set of categories. In this paper, we propose a method to locate and label objects in an image by using a form of weaker supervision: image-caption pairs. By leveraging recent advances in vision-language (VL) models and self-supervised vision transformers (ViTs), our method is able to perform phrase grounding and object detection in a weakly supervised manner. Our experiments demonstrate the effectiveness of our approach by achieving a recall@1 of 47.51% in phrase grounding on Flickr30k Entities and establishing a new state of the art in object detection, with 21.1 mAP$_{50}$ and 10.5 mAP$_{50:95}$ on MS COCO, when relying exclusively on image-caption pairs.
Submitted 9 June, 2023;
originally announced June 2023.
-
Trust in Motion: Capturing Trust Ascendancy in Open-Source Projects using Hybrid AI
Authors:
Huascar Sanchez,
Briland Hitaj
Abstract:
Open-source is frequently described as a driver for unprecedented communication and collaboration, and the process works best when projects support teamwork. Yet, open-source cooperation processes in no way protect project contributors from considerations of trust, power, and influence. Indeed, achieving the level of trust necessary to contribute to a project and thus influence its direction is a constant process of change, and developers take many different routes over many communication channels to achieve it. We refer to this process of influence-seeking and trust-building as trust ascendancy.
This paper describes a methodology for understanding the notion of trust ascendancy and introduces the capabilities that are needed to localize trust ascendancy operations happening over open-source projects. Much of the prior work in understanding trust in open-source software development has focused on a static view of the problem using different forms of quantity measures. However, trust ascendancy is not static, but rather adapts to changes in the open-source ecosystem in response to new input. This paper is the first attempt to articulate and study these signals from a dynamic view of the problem. In that respect, we identify related work that may help illuminate research challenges, implementation tradeoffs, and complementary solutions. Our preliminary results show the effectiveness of our method at capturing the trust ascendancy developed by individuals involved in a well-documented 2020 social engineering attack. Our future plans highlight research challenges and encourage cross-disciplinary collaboration to create more automated, accurate, and efficient ways to model and then track trust ascendancy in open-source projects.
Submitted 10 October, 2022; v1 submitted 5 October, 2022;
originally announced October 2022.
-
K-textures, a self-supervised hard clustering deep learning algorithm for satellite image segmentation
Authors:
Fabien H. Wagner,
Ricardo Dalagnol,
Alber H. Sánchez,
Mayumi C. M. Hirye,
Samuel Favrichon,
Jake H. Lee,
Steffen Mauceri,
Yan Yang,
Sassan Saatchi
Abstract:
Self-supervised deep learning algorithms that, like the k-means algorithm, segment an image into a fixed number of hard labels while relying only on deep learning techniques are still lacking. Here, we introduce the k-textures algorithm, which provides self-supervised segmentation of a 4-band image (RGB-NIR) into $k$ classes. An example of its application to high-resolution Planet satellite imagery is given. Our algorithm shows that discrete search is feasible using convolutional neural networks (CNNs) and gradient descent. The model detects $k$ hard clustering classes, represented in the model as $k$ discrete binary masks and their associated $k$ independently generated textures, which combined form a simulation of the original image. The similarity loss is the mean squared error between the features of the original and the simulated image, both extracted from the penultimate convolutional block of the Keras 'imagenet' pretrained VGG-16 model and a custom feature extractor made with Planet data. The main advances of the k-textures model are as follows. First, the $k$ discrete binary masks are obtained inside the model using gradient descent, through a novel method based on a hard sigmoid activation function. Second, it provides hard clustering classes: each pixel has only one class. Finally, in comparison to k-means, where each pixel is considered independently, contextual information is also considered, and each class is associated not only with similar values in the color channels but also with a texture. Our approach is designed to ease the production of training samples for satellite image segmentation, and the k-textures architecture could be adapted to support different numbers of bands and more complex tasks, such as object self-segmentation. The model code and weights are available at https://doi.org/10.5281/zenodo.6359859
Submitted 27 May, 2022; v1 submitted 17 May, 2022;
originally announced May 2022.
-
DesCert: Design for Certification
Authors:
Natarajan Shankar,
Devesh Bhatt,
Michael Ernst,
Minyoung Kim,
Srivatsan Varadarajan,
Suzanne Millstein,
Jorge Navas,
Jason Biatek,
Huascar Sanchez,
Anitha Murugesan,
Hao Ren
Abstract:
The goal of the DARPA Automated Rapid Certification Of Software (ARCOS) program is to "automate the evaluation of software assurance evidence to enable certifiers to determine rapidly that system risk is acceptable." As part of this program, the DesCert project focuses on the assurance-driven development of new software. The DesCert team consists of SRI International, Honeywell Research, and the University of Washington. We have adopted a formal, tool-based approach to the construction of software artifacts that are supported by rigorous evidence. The DesCert workflow integrates evidence generation into a design process that goes from requirements capture and analysis to the decomposition of the high-level software requirements into architecture properties and software components with assertional contracts, and on to software that can be analyzed both dynamically and statically. The generated evidence is organized by means of an assurance ontology and integrated into the RACK knowledge base.
Submitted 28 March, 2022;
originally announced March 2022.
-
Leveraging Team Dynamics to Predict Open-source Software Projects' Susceptibility to Social Engineering Attacks
Authors:
Luiz Giovanini,
Daniela Oliveira,
Huascar Sanchez,
Deborah Shands
Abstract:
Open-source software (OSS) is a critical part of the software supply chain. Recent social engineering attacks against OSS development teams have enabled attackers to become code contributors and later inject malicious code or vulnerabilities into the project with the goal of compromising dependent software. The attackers have exploited interactions among development team members and the social dynamics of team behavior to enable their attacks. We introduce a security approach that leverages signatures and patterns of team dynamics to predict the susceptibility of a software development team to social engineering attacks that enable access to the OSS project code. The proposed approach is programming language-, platform-, and vulnerability-agnostic because it assesses the artifacts of OSS team interactions, rather than OSS code.
Submitted 2 July, 2021; v1 submitted 30 June, 2021;
originally announced June 2021.
-
Learning Disentangled Representations via Mutual Information Estimation
Authors:
Eduardo Hugo Sanchez,
Mathieu Serrurier,
Mathias Ortner
Abstract:
In this paper, we investigate the problem of learning disentangled representations. Given a pair of images sharing some attributes, we aim to create a low-dimensional representation which is split into two parts: a shared representation that captures the common information between the images and an exclusive representation that contains the specific information of each image. To address this issue, we propose a model based on mutual information estimation without relying on image reconstruction or image generation. Mutual information maximization is performed to capture the attributes of data in the shared and exclusive representations, while we minimize the mutual information between the shared and exclusive representations to enforce representation disentanglement. We show that these representations are useful to perform downstream tasks such as image classification and image retrieval based on the shared or exclusive component. Moreover, classification results show that our model outperforms the state-of-the-art model based on VAE/GAN approaches in representation disentanglement.
Submitted 9 December, 2019;
originally announced December 2019.
-
Four-Arm Manipulation via Feet Interfaces
Authors:
Jacob Hernandez Sanchez,
Walid Amanhoud,
Anaïs Haget,
Hannes Bleuler,
Aude Billard,
Mohamed Bouri
Abstract:
We seek to augment human manipulation by enabling humans to control two robotic arms in addition to their natural arms using their feet. Thereby, the hands are free to perform tasks of high dexterity, while the feet-controlled arms perform tasks requiring lower dexterity, such as supporting a load. The robotic arms are tele-operated through two foot interfaces that transmit translation and rotation to the end effector of the manipulator. Haptic feedback is provided for the human to perceive contact and change in load and to adapt the foot pressure accordingly.
Existing foot interfaces have been used primarily for single-foot control and are limited in range of motion and in the number of degrees of freedom they can control. This paper presents foot interfaces specifically made for bipedal control, with a workspace suitable for two-foot operation and five degrees of freedom each. This paper also presents a position-force teleoperation controller based on Impedance Control modulated through Dynamical Systems for trajectory generation. Finally, an initial validation of the platform is presented, whereby a user grasps an object with both feet and generates various disturbances while the object is supported by the feet.
Submitted 11 September, 2019;
originally announced September 2019.