-
einspace: Searching for Neural Architectures from Fundamental Operations
Authors:
Linus Ericsson,
Miguel Espinosa,
Chenhongyi Yang,
Antreas Antoniou,
Amos Storkey,
Shay B. Cohen,
Steven McDonagh,
Elliot J. Crowley
Abstract:
Neural architecture search (NAS) finds high performing networks for a given task. Yet the results of NAS are fairly prosaic; they did not e.g. create a shift from convolutional structures to transformers. This is not least because the search spaces in NAS often aren't diverse enough to include such transformations a priori. Instead, for NAS to provide greater potential for fundamental design shifts, we need a novel expressive search space design which is built from more fundamental operations. To this end, we introduce einspace, a search space based on a parameterised probabilistic context-free grammar. Our space is versatile, supporting architectures of various sizes and complexities, while also containing diverse network operations which allow it to model convolutions, attention components and more. It contains many existing competitive architectures, and provides flexibility for discovering new ones. Using this search space, we perform experiments to find novel architectures as well as improvements on existing ones on the diverse Unseen NAS datasets. We show that competitive architectures can be obtained by searching from scratch, and we consistently find large improvements when initialising the search with strong baselines. We believe that this work is an important advancement towards a transformative NAS paradigm where search space expressivity and strategic search initialisation play key roles.
Submitted 31 May, 2024;
originally announced May 2024.
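The einspace abstract above describes a search space defined by a probabilistic context-free grammar. As a rough illustration of how architectures can be sampled from such a grammar, here is a minimal sketch. The grammar, its symbols (`NET`, `BLOCK`) and its probabilities are hypothetical and are not the einspace grammar; they only show the sampling mechanism.

```python
import random

# Hypothetical toy grammar (NOT the einspace grammar): each nonterminal maps
# to a list of (probability, expansion) production rules.
GRAMMAR = {
    "NET": [(0.5, ["BLOCK"]), (0.5, ["BLOCK", "NET"])],
    "BLOCK": [(0.4, ["conv"]), (0.3, ["attention"]), (0.3, ["linear", "relu"])],
}


def sample(symbol, rng, depth=0, max_depth=8):
    """Recursively expand `symbol` into a flat list of terminal operations."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal symbol: emit it as-is
    rules = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest expansion so sampling ends.
        expansion = min((rhs for _, rhs in rules), key=len)
    else:
        weights = [p for p, _ in rules]
        choices = [rhs for _, rhs in rules]
        expansion = rng.choices(choices, weights=weights, k=1)[0]
    out = []
    for child in expansion:
        out.extend(sample(child, rng, depth + 1, max_depth))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample("NET", rng))  # a list of operation names
```

Changing the rule probabilities shifts the distribution over architecture depth and operation mix, which is one way a parameterised grammar can bias a search.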
-
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Authors:
Chenhongyi Yang,
Zehui Chen,
Miguel Espinosa,
Linus Ericsson,
Zhenyu Wang,
Jiaming Liu,
Elliot J. Crowley
Abstract:
We present PlainMamba: a simple non-hierarchical state space model (SSM) designed for general visual recognition. The recent Mamba model has shown how SSMs can be highly competitive with other architectures on sequential data and initial attempts have been made to apply it to images. In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information. Our architecture is designed to be easy to use and easy to scale, formed by stacking identical PlainMamba blocks, resulting in a model with constant width throughout all layers. The architecture is further simplified by removing the need for special tokens. We evaluate PlainMamba on a variety of visual recognition tasks including image classification, semantic segmentation, object detection, and instance segmentation. Our method achieves performance gains over previous non-hierarchical models and is competitive with hierarchical alternatives. For tasks requiring high-resolution inputs, in particular, PlainMamba requires much less computing while maintaining high performance. Code and models are available at https://github.com/ChenhongyiYang/PlainMamba
Submitted 26 March, 2024;
originally announced March 2024.
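The PlainMamba abstract above mentions a continuous 2D scanning process in which consecutive tokens in the scan sequence are spatially adjacent. The sketch below shows one simple ordering with that property, a serpentine (row-wise back-and-forth) scan, and contrasts it with a plain raster scan. The function names are hypothetical, and the scan pattern used in the paper may differ.

```python
def continuous_scan_order(height, width):
    """Return (row, col) positions in a serpentine order over a 2D grid.

    Even rows are read left to right and odd rows right to left, so each
    position is a grid neighbour of the one visited before it.
    """
    order = []
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        order.extend((r, c) for c in cols)
    return order


def is_spatially_continuous(order):
    """Check that every consecutive pair of positions is at Manhattan distance 1."""
    return all(
        abs(r1 - r2) + abs(c1 - c2) == 1
        for (r1, c1), (r2, c2) in zip(order, order[1:])
    )


if __name__ == "__main__":
    snake = continuous_scan_order(3, 4)
    raster = [(r, c) for r in range(3) for c in range(4)]
    print(is_spatially_continuous(snake))   # serpentine scan keeps adjacency
    print(is_spatially_continuous(raster))  # raster scan jumps at row ends
```

A raster scan breaks adjacency at the end of every row, where the sequence jumps back to the first column; the serpentine order avoids that jump.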
-
Mapping the Landscape of Independent Food Delivery Platforms in the United States
Authors:
Yuhan Liu,
Amna Liaqat,
Owen Xingjian Zhang,
Mariana Consuelo Fernández Espinosa,
Ankhitha Manjunatha,
Alexander Yang,
Orestis Papakyriakopoulos,
Andrés Monroy-Hernández
Abstract:
Beyond the well-known giants like Uber Eats and DoorDash, there are hundreds of independent food delivery platforms in the United States. However, little is known about the sociotechnical landscape of these "indie" platforms. In this paper, we analyzed these platforms to understand why they were created, how they operate, and what technologies they use. We collected data on 495 indie platforms and detailed survey responses from 29 platforms. We found that personalized, timely service is a central value of indie platforms, as is a sense of responsibility to the local community they serve. Indie platforms are motivated to provide fair rates for restaurants and couriers. These alternative business practices differentiate them from mainstream platforms. Though indie platforms have plans to expand, a lack of customizability in off-the-shelf software prevents independent platforms from personalizing services for their local communities. We show that these platforms are a widespread and longstanding fixture of the food delivery market. We illustrate the diversity of motivations and values to explain why a one-size-fits-all support is insufficient, and we discuss the siloing of technology that inhibits platforms' growth. Through these insights, we aim to promote future HCI research into the potential development of public-interest technologies for local food delivery.
Submitted 25 March, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Generate Your Own Scotland: Satellite Image Generation Conditioned on Maps
Authors:
Miguel Espinosa,
Elliot J. Crowley
Abstract:
Despite recent advancements in image generation, diffusion models still remain largely underexplored in Earth Observation. In this paper we show that state-of-the-art pretrained diffusion models can be conditioned on cartographic data to generate realistic satellite images. We provide two large datasets of paired OpenStreetMap images and satellite views over the region of Mainland Scotland and the Central Belt. We train a ControlNet model and qualitatively evaluate the results, demonstrating that both image quality and map fidelity are possible. Finally, we provide some insights on the opportunities and challenges of applying these models for remote sensing. Our model weights and code for creating the dataset are publicly available at https://github.com/miquel-espinosa/map-sat.
Submitted 31 August, 2023;
originally announced August 2023.
-
Prolog-based agnostic explanation module for structured pattern classification
Authors:
Gonzalo Nápoles,
Fabian Hoitsma,
Andreas Knoben,
Agnieszka Jastrzebska,
Maikel Leon Espinosa
Abstract:
This paper presents a Prolog-based reasoning module to generate counterfactual explanations given the predictions computed by a black-box classifier. The proposed symbolic reasoning module can also resolve what-if queries using the ground-truth labels instead of the predicted ones. Overall, our approach comprises four well-defined stages that can be applied to any structured pattern classification problem. Firstly, we pre-process the given dataset by imputing missing values and normalizing the numerical features. Secondly, we transform numerical features into symbolic ones using fuzzy clustering such that extracted fuzzy clusters are mapped to an ordered set of predefined symbols. Thirdly, we encode instances as a Prolog rule using the nominal values, the predefined symbols, the decision classes, and the confidence values. Fourthly, we compute the overall confidence of each Prolog rule using fuzzy-rough set theory to handle the uncertainty caused by transforming numerical quantities into symbols. This step comes with an additional theoretical contribution to a new similarity function to compare the previously defined Prolog rules involving confidence values. Finally, we implement a chatbot as a proxy between human beings and the Prolog-based reasoning module to resolve natural language queries and generate counterfactual explanations. During the numerical simulations using synthetic datasets, we study the performance of our system when using different fuzzy operators and similarity functions. Towards the end, we illustrate how our reasoning module works using different use cases.
Submitted 18 November, 2022; v1 submitted 23 December, 2021;
originally announced December 2021.
-
Recurrence-Aware Long-Term Cognitive Network for Explainable Pattern Classification
Authors:
Gonzalo Nápoles,
Yamisleydi Salgueiro,
Isel Grau,
Maikel Leon Espinosa
Abstract:
Machine learning solutions for pattern classification problems are nowadays widely deployed in society and industry. However, the lack of transparency and accountability of most accurate models often hinders their safe use. Thus, there is a clear need for developing explainable artificial intelligence mechanisms. There exist model-agnostic methods that summarize feature contributions, but their interpretability is limited to predictions made by black-box models. An open challenge is to develop models that have intrinsic interpretability and produce their own explanations, even for classes of models that are traditionally considered black boxes like (recurrent) neural networks. In this paper, we propose a Long-Term Cognitive Network for interpretable pattern classification of structured data. Our method brings its own mechanism for providing explanations by quantifying the relevance of each feature in the decision process. For supporting the interpretability without affecting the performance, the model incorporates more flexibility through a quasi-nonlinear reasoning rule that allows controlling nonlinearity. Besides, we propose a recurrence-aware decision model that evades the issues posed by the unique fixed point while introducing a deterministic learning algorithm to compute the tunable parameters. The simulations show that our interpretable model obtains competitive results when compared to state-of-the-art white and black-box models.
Submitted 23 December, 2021; v1 submitted 7 July, 2021;
originally announced July 2021.
-
HiveMind: A Scalable and Serverless Coordination Control Platform for UAV Swarms
Authors:
Justin Hu,
Ariana Bruno,
Brian Ritchken,
Brendon Jackson,
Mateo Espinosa,
Aditya Shah,
Christina Delimitrou
Abstract:
Swarms of autonomous devices are increasing in ubiquity and size. There are two main trains of thought for controlling devices in such swarms: centralized and distributed control. Centralized platforms achieve higher output quality but result in high network traffic and limited scalability, while decentralized systems are more scalable, but less sophisticated.
In this work we present HiveMind, a centralized coordination control platform for IoT swarms that is both scalable and performant. HiveMind leverages a centralized cluster for all resource-intensive computation, deferring lightweight and time-critical operations, such as obstacle avoidance, to the edge devices to reduce network traffic. HiveMind employs an event-driven serverless framework to run tasks on the cluster, guarantees fault tolerance both in the edge devices and serverless functions, and handles straggler tasks and underperforming devices. We evaluate HiveMind on a swarm of 16 programmable drones on two scenarios: searching for given items, and counting unique people in an area. We show that HiveMind achieves better performance and battery efficiency compared to fully centralized and fully decentralized platforms, while also handling load imbalances and failures gracefully, and allowing edge devices to leverage the cluster to collectively improve their output quality.
Submitted 4 February, 2020;
originally announced February 2020.
-
An Open-Source Benchmark Suite for Cloud and IoT Microservices
Authors:
Yu Gan,
Yanqi Zhang,
Dailun Cheng,
Ankitha Shetty,
Priyal Rathi,
Nayan Katarki,
Ariana Bruno,
Justin Hu,
Brian Ritchken,
Brendon Jackson,
Kelvin Hu,
Meghna Pancholi,
Yuan He,
Brett Clancy,
Chris Colen,
Fukang Wen,
Catherine Leung,
Siyuan Wang,
Leon Zaruvinsky,
Mateo Espinosa,
Rick Lin,
Zhongling Liu,
Jake Padilla,
Christina Delimitrou
Abstract:
Cloud services have recently started undergoing a major shift from monolithic applications, to graphs of hundreds of loosely-coupled microservices. Microservices fundamentally change a lot of assumptions current cloud systems are designed with, and present both opportunities and challenges when optimizing for quality of service (QoS) and utilization. In this paper we explore the implications microservices have across the cloud system stack. We first present DeathStarBench, a novel, open-source benchmark suite built with microservices that is representative of large end-to-end services, modular and extensible. DeathStarBench includes a social network, a media service, an e-commerce site, a banking system, and IoT applications for coordination control of UAV swarms. We then use DeathStarBench to study the architectural characteristics of microservices, their implications in networking and operating systems, their challenges with respect to cluster management, and their trade-offs in terms of application design and programming frameworks. Finally, we explore the tail at scale effects of microservices in real deployments with hundreds of users, and highlight the increased pressure they put on performance predictability.
Submitted 27 May, 2019;
originally announced May 2019.
-
Out of Site: Empowering a New Approach to Online Boycotts
Authors:
H. Li,
B. Alarcon,
S. M. Espinosa,
B. Hecht
Abstract:
GrabYourWallet, #boycottNRA and other online boycott campaigns have attracted substantial public interest in recent months. However, a number of significant challenges are preventing online boycotts from reaching their potential. In particular, complex webs of brands and subsidiaries can make it difficult for participants to conform to the goals of a boycott. Similarly, participants and organizers have limited visibility into a boycott's progress. This affects their ability to use sociotechnical innovations from social computing to incentivize participation. To address these challenges, this paper makes a system contribution: a new boycott tool called Out of Site. Out of Site uses lightweight automation to remove obstacles to successful online boycotts. We describe the design challenges associated with Out of Site and report results from two phases of deployment with the GrabYourWallet and Stop Animal Testing boycott communities. Our findings highlight the potential of boycott-assisting technologies and inform the design of this new class of technologies. Finally, like is the case for many systems in social computing, while we designed Out of Site for pro-social uses, there are a number of easily predictable ways in which the system can be leveraged for anti-social purposes (e.g. exacerbating filter bubble issues, empowering boycotts of businesses owned by racial, ethnic, and religious minorities). As such, we developed for this project a new, very straightforward design approach that treats preventing these anti-social uses as a top-tier design concern. This approach stands in contrast to the status quo of ignoring potential anti-social uses and/or considering them to be a secondary design priority. We discuss how our simple approach may help other research projects reduce their potential negative impacts with minimal burden.
Submitted 2 April, 2019;
originally announced April 2019.
-
Modeling Dengue Vector Population Using Remotely Sensed Data and Machine Learning
Authors:
J. M. Scavuzzo,
F. Trucco,
M. Espinosa,
C. B. Tauro,
M. Abril,
C. M. Scavuzzo,
A. C. Frery
Abstract:
Mosquitoes are vectors of many human diseases. In particular, Aedes ægypti (Linnaeus) is the main vector for Chikungunya, Dengue, and Zika viruses in Latin America and it represents a global threat. Public health policies that aim at combating this vector require dependable and timely information, which is usually expensive to obtain with field campaigns. For this reason, several efforts have been made to use remote sensing due to its reduced cost. The present work includes the temporal modeling of the oviposition activity (measured weekly on 50 ovitraps in a north Argentinean city) of Aedes ægypti (Linnaeus), based on time series of data extracted from operational earth observation satellite images. We use NDVI, NDWI, LST night, LST day and TRMM-GPM rain from 2012 to 2016 as predictive variables. In contrast to previous works which use linear models, we employ Machine Learning techniques using completely accessible open source toolkits. These models have the advantages of being non-parametric and capable of describing nonlinear relationships between variables. Specifically, in addition to two linear approaches, we assess a Support Vector Machine, an Artificial Neural Network, K-nearest neighbors and a Decision Tree Regressor. Considerations are made on parameter tuning and the validation and training approach. The results are compared to linear models used in previous works with similar data sets for generating temporal predictive models. These new tools perform better than linear approaches; in particular, Nearest Neighbor Regression (KNNR) performs the best. These results provide better alternatives to be implemented operatively on the Argentine geospatial Risk system that has been running since 2012.
Submitted 4 May, 2018;
originally announced May 2018.
-
To Centralize or Not to Centralize: A Tale of Swarm Coordination
Authors:
Justin Hu,
Ariana Bruno,
Drew Zagieboylo,
Mark Zhao,
Brian Ritchken,
Brendon Jackson,
Joo Yeon Chae,
Francois Mertil,
Mateo Espinosa,
Christina Delimitrou
Abstract:
Large swarms of autonomous devices are increasing in size and importance. When it comes to controlling the devices of large-scale swarms there are two main lines of thought: centralized control, where all decisions - and often compute - happen in a centralized back-end cloud system, and distributed control, where edge devices are responsible for selecting and executing tasks with minimal or zero help from a centralized entity. In this work we aim to quantify the trade-offs between the two approaches with respect to task assignment quality, latency, and reliability. We do so first on a local swarm of 12 programmable drones with a 10-server cluster as the backend cloud, and then using a validated simulator to study the tail at scale effects of swarm coordination control. We conclude that although centralized control almost always outperforms distributed in the quality of its decisions, it faces significant scalability limitations, and we provide a list of system challenges that need to be addressed for centralized control to scale.
Submitted 4 May, 2018;
originally announced May 2018.