Search | arXiv e-print repository

Evolutionary Computation and Explainable AI: A Roadmap to Transparent Intelligent Systems

Authors: Ryan Zhou, Jaume Bacardit, Alexander Brownlee, Stefano Cagnoni, Martin Fyvie, Giovanni Iacca, John McCall, Niki van Stein, David Walker, Ting Hu

Abstract: AI methods are finding an increasing number of applications, but their often black-box nature has raised concerns about accountability and trust. The field of explainable artificial intelligence (XAI) has emerged in response to the need for human understanding of AI models. Evolutionary computation (EC), as a family of powerful optimization and learning tools, has significant potential to contribute to XAI. In this paper, we provide an introduction to XAI and review various techniques in current use for explaining machine learning (ML) models. We then focus on how EC can be used in XAI, and review some XAI approaches which incorporate EC techniques. Additionally, we discuss the application of XAI principles within EC itself, examining how these principles can shed some light on the behavior and outcomes of EC algorithms in general, on the (automatic) configuration of these algorithms, and on the underlying problem landscapes that these algorithms optimize. Finally, we discuss some open challenges in XAI and opportunities for future research in this field using EC. Our aim is to demonstrate that EC is well-suited for addressing current problems in explainability and to encourage further exploration of these methods to contribute to the development of more transparent and trustworthy ML models and EC algorithms.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 29 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2306.14786

arXiv:2405.19519 [pdf, other]

Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data

Authors: Sudeshna Das, Yao Ge, Yuting Guo, Swati Rajwal, JaMor Hairston, Jeanne Powell, Drew Walker, Snigdha Peddireddy, Sahithi Lakamana, Selen Bozkurt, Matthew Reyna, Reza Sameni, Yunyu Xiao, Sangmi Kim, Rasheeta Chandler, Natalie Hernandez, Danielle Mowery, Rachel Wightman, Jennifer Love, Anthony Spadaro, Jeanmarie Perrone, Abeed Sarker

Abstract: Retrieval augmented generation (RAG) provides the capability to constrain generative model outputs, and mitigate the possibility of hallucination, by providing relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, thus limiting the volume of knowledge from which to generate an answer. We propose a two-layer RAG framework for query-focused answer generation and evaluate a proof-of-concept for this framework in the context of query-focused summary generation from social media forums, focusing on emerging drug-related information. The evaluations demonstrate the effectiveness of the two-layer framework in resource-constrained settings, enabling researchers to obtain near real-time data from users.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.05204 [pdf]

CARE-SD: Classifier-based analysis for recognizing and eliminating stigmatizing and doubt marker labels in electronic health records: model development and validation

Authors: Drew Walker, Annie Thorne, Sudeshna Das, Jennifer Love, Hannah LF Cooper, Melvin Livingston III, Abeed Sarker

Abstract: Objective: To detect and classify features of stigmatizing and biased language in intensive care electronic health records (EHRs) using natural language processing techniques. Materials and Methods: We first created a lexicon and regular expression lists from literature-driven stem words for linguistic features of stigmatizing patient labels, doubt markers, and scare quotes within EHRs. The lexicon was further extended using Word2Vec and GPT 3.5, and refined through human evaluation. These lexicons were used to search for matches across 18 million sentences from the de-identified Medical Information Mart for Intensive Care-III (MIMIC-III) dataset. For each linguistic bias feature, 1000 sentence matches were sampled, labeled by expert clinical and public health annotators, and used to train supervised learning classifiers. Results: Lexicon development from expanded literature stem-word lists resulted in a doubt marker lexicon containing 58 expressions, and a stigmatizing labels lexicon containing 127 expressions. Classifiers for doubt markers and stigmatizing labels had the highest performance, with macro F1-scores of .84 and .79, positive-label recall and precision values ranging from .71 to .86, and accuracies aligning closely with human annotator agreement (.87). Discussion: This study demonstrated the feasibility of supervised classifiers in automatically identifying stigmatizing labels and doubt markers in medical text, and identified trends in stigmatizing language use in an EHR setting. Additional labeled data may help improve lower scare quote model performance. Conclusions: Classifiers developed in this study showed high model performance and can be applied to identify patterns and target interventions to reduce stigmatizing labels and doubt markers in healthcare systems.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 28 pages, 3 figures, 4 tables. 5 Appendices

arXiv:2403.17277 [pdf, other]

Relational Network Verification

Authors: Xieyang Xu, Yifei Yuan, Zachary Kincaid, Arvind Krishnamurthy, Ratul Mahajan, David Walker, Ennan Zhai

Abstract: Relational network verification is a new approach to validating network changes. In contrast to traditional network verification, which analyzes specifications for a single network snapshot, relational network verification analyzes specifications concerning two network snapshots (e.g., pre- and post-change snapshots) and captures their similarities and differences. Relational change specifications are compact and precise because they specify the flows or paths that change between snapshots and then simply mandate that other behaviors of the network "stay the same", without enumerating them. To achieve similar guarantees, single-snapshot specifications need to enumerate all flow and path behaviors that are not expected to change, so we can check that nothing has accidentally changed. Thus, precise single-snapshot specifications are proportional to network size, which makes them impractical to generate for many real-world networks. To demonstrate the value of relational reasoning, we develop a high-level relational specification language and a tool called Rela to validate network changes. Rela first compiles input specifications and network snapshot representations to finite state transducers. It then checks compliance using decision procedures for automaton equivalence. Our experiments using data on complex changes to a global backbone (with over 10^3 routers) find that Rela specifications need fewer than 10 terms for 93% of them and it validates 80% of them within 20 minutes.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.07124 [pdf, other]

Stochastic gradient descent-based inference for dynamic network models with attractors

Authors: Hancong Pan, Xiaojing Zhu, Cantay Caliskan, Dino P. Christenson, Konstantinos Spiliopoulos, Dylan Walker, Eric D. Kolaczyk

Abstract: In Coevolving Latent Space Networks with Attractors (CLSNA) models, nodes in a latent space represent social actors, and edges indicate their dynamic interactions. Attractors are added at the latent level to capture the notion of attractive and repulsive forces between nodes, borrowing from dynamical systems theory. However, CLSNA's reliance on MCMC estimation makes scaling difficult, and the requirement for nodes to be present throughout the study period limits practical applications. We address these issues by (i) introducing a stochastic gradient descent (SGD) parameter estimation method, (ii) developing a novel approach for uncertainty quantification using SGD, and (iii) extending the model to allow nodes to join and leave over time. Simulation results show that our extensions result in little loss of accuracy compared to MCMC, but can scale to much larger networks. We apply our approach to the longitudinal social networks of members of US Congress on the social media platform X. Accounting for node dynamics overcomes selection bias in the network and uncovers uniquely and increasingly repulsive forces within the Republican Party.

Submitted 20 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2402.11155 [pdf, other]

Automated Optimization of Parameterized Data-Plane Programs with Parasol

Authors: Mary Hogan, Devon Loehr, John Sonchack, Shir Landau Feibish, Jennifer Rexford, David Walker

Abstract: Programmable data planes allow for sophisticated applications that give operators the power to customize the functionality of their networks. Deploying these applications, however, often requires tedious and burdensome optimization of their layout and design, in which programmers must manually write, compile, and test an implementation, adjust the design, and repeat. In this paper we present Parasol, a framework that allows programmers to define general, parameterized network algorithms and automatically optimize their various parameters. The parameters of a Parasol program can represent a wide variety of implementation decisions, and may be optimized for arbitrary, high-level objectives defined by the programmer. Furthermore, optimization may be tailored to particular environments by providing a representative sample of traffic. We show how we implement the Parasol framework, which consists of a sketching language for writing parameterized programs, and a simulation-based optimizer for testing different parameter settings. We evaluate Parasol by implementing a suite of ten data-plane applications, and find that Parasol produces a solution with comparable performance to hand-optimized P4 code within a two-hour time budget.

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2309.15199 [pdf, other]

Generalised 3D Morton and Hilbert Orderings

Authors: David Walker

Abstract: This document describes algorithms for generating general Morton and Hilbert orderings for three-dimensional data volumes.

Submitted 26 September, 2023; originally announced September 2023.

Report number: CUPECS-2023-21
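
For readers unfamiliar with the term, a Morton (Z-order) index is formed by interleaving the bits of the coordinates. The following is a minimal sketch of the textbook 3D case only, assuming a cubic, power-of-two volume; it is not the generalised algorithm from the report above, and the function names and the `bits` parameter are illustrative choices, not taken from the paper.

```python
def morton_encode(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y and z into one Morton index.

    Bit i of x lands at position 3*i, bit i of y at 3*i + 1,
    and bit i of z at 3*i + 2 (an assumed, conventional bit order).
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def morton_decode(code: int, bits: int = 10) -> tuple[int, int, int]:
    """Invert morton_encode by pulling every third bit back out."""
    x = y = z = 0
    for i in range(bits):
        x |= ((code >> (3 * i)) & 1) << i
        y |= ((code >> (3 * i + 1)) & 1) << i
        z |= ((code >> (3 * i + 2)) & 1) << i
    return x, y, z
```

Sorting the cells of a volume by `morton_encode(x, y, z)` yields the Z-order traversal; `morton_decode` recovers the coordinates from an index.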

arXiv:2308.12329 [pdf, other]

Saggitarius: A DSL for Specifying Grammatical Domains

Authors: Anders Miltner, Devon Loehr, Arnold Mong, Kathleen Fisher, David Walker

Abstract: Common data types like dates, addresses, phone numbers and tables can have multiple textual representations, and many heavily-used languages, such as SQL, come in several dialects. These variations can cause data to be misinterpreted, leading to silent data corruption, failure of data processing systems, or even security vulnerabilities. Saggitarius is a new language and system designed to help programmers reason about the format of data, by describing grammatical domains -- that is, sets of context-free grammars that describe the many possible representations of a datatype. We describe the design of Saggitarius via example and provide a relational semantics. We show how Saggitarius may be used to analyze a data set: given example data, it uses an algorithm based on semi-ring parsing and MaxSAT to infer which grammar in a given domain best matches that data. We evaluate the effectiveness of the algorithm on a benchmark suite of 110 example problems, and we demonstrate that our system typically returns a satisfying grammar within a few seconds with only a small number of examples. We also delve deeper into a more extensive case study on using Saggitarius for CSV dialect detection. Despite being general-purpose, we find that Saggitarius offers comparable results to hand-tuned, specialized tools; in the case of CSV, it infers grammars for 84% of benchmarks within 60 seconds, and has comparable accuracy to custom-built dialect detection tools.

Submitted 23 August, 2023; originally announced August 2023.

Comments: OOPSLA 2023

arXiv:2308.05673 [pdf, other]

Algorithms for Encoding and Decoding 3D Hilbert Orderings

Authors: David Walker

Abstract: This paper presents algorithms and pseudocode for encoding and decoding 3D Hilbert orderings.

Submitted 25 September, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

Report number: CUPECS-2023-20

arXiv:2307.10675 [pdf, other]

Massively parallel quantum chemistry: PFAS on over 1 million cloud vCPUs

Authors: Alan E. Rask, Lee Huntington, SungYeon Kim, David Walker, Andrew Wildman, Rodrigo Wang, Nicole Hazel, Alan Judi, James T. Pegg, Punit K. Jha, Zara Mayimfor, Carl Dukatz, Hassan Naseri, Ilan Gleiser, Maxime R. Hugues, Paul M. Zimmerman, Arman Zaribafiyan, Rudi Plesch, Takeshi Yamazaki

Abstract: Accurate solutions to the electronic Schrödinger equation can provide valuable insight into electron interactions within molecular systems, accelerating the molecular design and discovery processes in many different applications. However, the availability of such accurate solutions is limited to small molecular systems due to both the extremely high computational complexity and the challenge of operating and executing these workloads on high-performance compute clusters. This work presents a massively scalable cloud-based quantum chemistry platform by implementing a highly parallelizable quantum chemistry method that provides a polynomial-scaling approximation to full configuration interaction (FCI). Our platform orchestrates more than one million virtual CPUs on the cloud to analyze the bond-breaking behaviour of carbon-fluoride bonds of per- and polyfluoroalkyl substances (PFAS) with near-exact accuracy within the chosen basis set. This is the first quantum chemistry calculation utilizing more than one million virtual CPUs on the cloud and is the most accurate electronic structure computation of PFAS bond breaking to date.

Submitted 20 July, 2023; originally announced July 2023.

arXiv:2307.07828 [pdf, other]

The Impact of Space-Filling Curves on Data Movement in Parallel Systems

Authors: David Walker, Anthony Skjellum

Abstract: Modern computer systems are characterized by deep memory hierarchies, composed of main memory, multiple layers of cache, and other specialized types of memory. In parallel and distributed systems, additional memory layers are added to this hierarchy. Achieving good performance for computational science applications, in terms of execution time, depends on the efficient use of this diverse and hierarchical memory. This paper revisits the use of space-filling curves to specify the ordering in memory of data structures used in representative scientific applications executing on parallel machines containing clusters of multicore CPUs with attached GPUs. This work examines the hypothesis that space-filling curves, such as Hilbert and Morton ordering, can improve data locality and hence result in more efficient data movement than row or column-based orderings. First, performance results are presented that show for what application parameterizations and machine characteristics this is the case, and are interpreted in terms of how an application interacts with the computer hardware and low-level software. This research particularly focuses on the use of stencil-based applications that form the basis of many scientific computations. Second, how space-filling curves impact data sharing in nearest-neighbour and stencil-based codes is considered.

Submitted 15 July, 2023; originally announced July 2023.

Report number: CUPECS-2023-19 ACM Class: D.1.3; E.2

arXiv:2306.05562 [pdf, other]

AircraftVerse: A Large-Scale Multimodal Dataset of Aerial Vehicle Designs

Authors: Adam D. Cobb, Anirban Roy, Daniel Elenius, F. Michael Heim, Brian Swenson, Sydney Whittington, James D. Walker, Theodore Bapty, Joseph Hite, Karthik Ramani, Christopher McComb, Susmit Jha

Abstract: We present AircraftVerse, a publicly available aerial vehicle design dataset. Aircraft design encompasses different physics domains and, hence, multiple modalities of representation. The evaluation of these cyber-physical system (CPS) designs requires the use of scientific analytical and simulation models ranging from computer-aided design tools for structural and manufacturing analysis, computational fluid dynamics tools for drag and lift computation, battery models for energy estimation, and simulation models for flight control and dynamics. AircraftVerse contains 27,714 diverse air vehicle designs - the largest corpus of engineering designs with this level of complexity. Each design comprises the following artifacts: a symbolic design tree describing topology, propulsion subsystem, battery subsystem, and other design details; a STandard for the Exchange of Product (STEP) model data; a 3D CAD design using a stereolithography (STL) file format; a 3D point cloud for the shape of the design; and evaluation results from high fidelity state-of-the-art physics models that characterize performance metrics such as maximum flight distance and hover-time. We also present baseline surrogate models that use different modalities of design representation to predict design performance metrics, which we provide as part of our dataset release. Finally, we discuss the potential impact of this dataset on the use of learning in aircraft design and, more generally, in CPS. AircraftVerse is accompanied by a data card, and it is released under Creative Commons Attribution-ShareAlike (CC BY-SA) license. The dataset is hosted at https://zenodo.org/record/6525446, baseline models and code at https://github.com/SRI-CSL/AircraftVerse, and the dataset description at https://aircraftverse.onrender.com/.

Submitted 8 June, 2023; originally announced June 2023.

Comments: The dataset is hosted at https://zenodo.org/record/6525446, baseline models and code at https://github.com/SRI-CSL/AircraftVerse, and the dataset description at https://aircraftverse.onrender.com/

arXiv:2305.12029 [pdf, other]

MultiTurnCleanup: A Benchmark for Multi-Turn Spoken Conversational Transcript Cleanup

Authors: Hua Shen, Vicky Zayats, Johann C. Rocholl, Daniel D. Walker, Dirk Padfield

Abstract: Current disfluency detection models focus on individual utterances each from a single speaker. However, numerous discontinuity phenomena in spoken conversational transcripts occur across multiple turns, hampering human readability and the performance of downstream NLP tasks. This study addresses these phenomena by proposing an innovative Multi-Turn Cleanup task for spoken conversational transcripts and collecting a new dataset, MultiTurnCleanup. We design a data labeling schema to collect the high-quality dataset and provide extensive data analysis. Furthermore, we leverage two modeling approaches for experimental evaluation as benchmarks for future research.

Submitted 27 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: EMNLP 2023 main conference. Dataset: https://github.com/huashen218/MultiTurnCleanup

arXiv:2302.00239 [pdf, other]

Filtering Context Mitigates Scarcity and Selection Bias in Political Ideology Prediction

Authors: Chen Chen, Dylan Walker, Venkatesh Saligrama

Abstract: We propose a novel supervised learning approach for political ideology prediction (PIP) that is capable of predicting out-of-distribution inputs. This problem is motivated by the fact that manual data-labeling is expensive, while self-reported labels are often scarce and exhibit significant selection bias. We propose a novel statistical model that decomposes the document embeddings into a linear superposition of two vectors: a latent neutral context vector independent of ideology, and a latent position vector aligned with ideology. We train an end-to-end model that has intermediate contextual and positional vectors as outputs. At deployment time, our model predicts labels for input documents by exclusively leveraging the predicted positional vectors. On two benchmark datasets we show that our model is capable of outputting predictions even when trained with as little as 5% biased data, and is significantly more accurate than the state-of-the-art. Through crowd-sourcing we validate the neutrality of contextual vectors, and show that context filtering results in ideological concentration, allowing for prediction on out-of-distribution examples.

Submitted 31 January, 2023; originally announced February 2023.

arXiv:2301.13862 [pdf, other]

Salient Conditional Diffusion for Defending Against Backdoor Attacks

Authors: Brandon B. May, N. Joseph Tatro, Dylan Walker, Piyush Kumar, Nathan Shnidman

Abstract: We propose a novel algorithm, Salient Conditional Diffusion (Sancdifi), a state-of-the-art defense against backdoor attacks. Sancdifi uses a denoising diffusion probabilistic model (DDPM) to degrade an image with noise and then recover said image using the learned reverse diffusion. Critically, we compute saliency map-based masks to condition our diffusion, allowing for stronger diffusion on the most salient pixels by the DDPM. As a result, Sancdifi is highly effective at diffusing out triggers in data poisoned by backdoor attacks. At the same time, it reliably recovers salient features when applied to clean data. This performance is achieved without requiring access to the model parameters of the Trojan network, meaning Sancdifi operates as a black-box defense.

Submitted 19 May, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: 14 pages, 5 figures. Edit: Added new baselines

ACM Class: I.2

arXiv:2212.04439 [pdf, other]

Technical Report: Match-reference regular expressions and lenses

Authors: Jeanne-Marie Musca, Anders Miltner, Kathleen Fisher, David Walker

Abstract: A lens is a single program that specifies two data transformations at once: one transformation converts data from source format to target format and a second transformation inverts the process. Over the past decade, researchers have developed many different kinds of lenses with different properties. One class of such languages operates over regular languages. In other words, these lenses convert strings drawn from one regular language to strings drawn from another regular language (and back again). In this paper, we define a more powerful language of lenses, which we call match-reference lenses, that is capable of translating between non-regular formats that contain repeated substrings, which is a primitive form of dependency. To define the non-regular formats themselves, we develop a new language, match-reference regular expressions, which are regular expressions that can bind variables to substrings and use those substrings repeatedly. These match-reference regular expressions are closely related to the familiar "back-references" that can be found in traditional regular expression packages, but are redesigned to adhere to conventional programming language lexical scoping conventions and to interact smoothly with lens language infrastructure. We define the semantics of match-reference regular expressions and match-reference lenses. We also define a new kind of automaton, the match-reference regex automaton system (MRRAS), for deciding string membership in the language match-reference regular expressions. We illustrate our definitions with a variety of examples.

Submitted 8 December, 2022; originally announced December 2022.
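
The "back-references" mentioned in the abstract above can be illustrated with Python's standard `re` module. This is only a sketch of the familiar feature the abstract compares against, not of the paper's match-reference regular expressions or lenses; the tag-matching pattern and the helper name are made up for illustration.

```python
import re

# (\w+) captures the opening tag name; \1 requires the same text to
# reappear in the closing tag. A repeated substring like this is what
# takes the format outside the regular languages.
TAG_PAIR = re.compile(r"<(\w+)>.*?</\1>")


def is_balanced_tag(text: str) -> bool:
    """Return True if text is one element whose open and close tag names match."""
    return TAG_PAIR.fullmatch(text) is not None
```

For example, `is_balanced_tag("<b>hi</b>")` holds, whereas `is_balanced_tag("<b>hi</i>")` does not, because the closing name differs from the captured one.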

arXiv:2209.12870 [pdf, other]

Test Coverage for Network Configurations

Authors: Xieyang Xu, Weixin Deng, Ryan Beckett, Ratul Mahajan, David Walker

Abstract: We develop NetCov, the first tool to reveal which network configuration lines are being tested by a suite of network tests. It helps network engineers improve test suites and thus increase network reliability. A key challenge in its development is that many network tests test the data plane instead of testing the configurations (control plane) directly. We must be able to efficiently infer which configuration elements contribute to tested data plane elements, even when such contributions are non-local (on remote devices) or non-deterministic. NetCov uses an information flow graph based model that precisely captures various forms of contributions and a scalable method to lazily infer contributions. Using it, we show that an existing test suite for Internet2 (a nation-wide backbone network in the USA) covers only 26% of the configuration lines. The feedback from NetCov makes it easy to define new tests that improve coverage. For Internet2, adding just three such tests covers an additional 17% of the lines.

Submitted 26 September, 2022; originally announced September 2022.

arXiv:2206.02100 [pdf, other]

ACORN: Network Control Plane Abstraction using Route Nondeterminism

Authors: Divya Raghunathan, Ryan Beckett, Aarti Gupta, David Walker

Abstract: Networks are hard to configure correctly, and misconfigurations occur frequently, leading to outages or security breaches. Formal verification techniques have been applied to guarantee the correctness of network configurations, thereby improving network reliability. This work addresses verification of distributed network control planes, with two distinct contributions to improve the scalability of formal verification. Our first contribution is a hierarchy of abstractions of varying precision which introduce nondeterminism into the route selection procedure that routers use to select the best available route. We prove the soundness of these abstractions and show their benefits. Our second contribution is a novel SMT encoding which uses symbolic graphs to encode all possible stable routing trees that are compliant with the given network control plane configurations. We have implemented our abstractions and SMT encodings in a prototype tool called ACORN. Our evaluations show that our abstractions can provide significant relative speedups (up to 323x) in performance, and ACORN can scale up to approximately 37,000 routers (organized in FatTree topologies, with synthesized shortest-path routing and valley-free policies) for verifying reachability. This far exceeds the performance of existing tools for control plane verification.

Submitted 5 June, 2022; originally announced June 2022.

Comments: 23 pages, 10 figures

arXiv:2205.00620 [pdf, other]

Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection

Authors: Angelica Chen, Vicky Zayats, Daniel D. Walker, Dirk Padfield

Abstract: In modern interactive speech-based systems, speech is consumed and transcribed incrementally prior to having disfluencies removed. This post-processing step is crucial for producing clean transcripts and high performance on downstream tasks (e.g. machine translation). However, most current state-of-the-art NLP models such as the Transformer operate non-incrementally, potentially causing unacceptable delays. We propose a streaming BERT-based sequence tagging model that, combined with a novel training objective, is capable of detecting disfluencies in real-time while balancing accuracy and latency. This is accomplished by training the model to decide whether to immediately output a prediction for the current input or to wait for further context. Essentially, the model learns to dynamically size its lookahead window. Our results demonstrate that our model produces comparably accurate predictions and does so sooner than our baselines, with lower flicker. Furthermore, the model attains state-of-the-art latency and stability scores when compared with recent work on incremental disfluency detection.

Submitted 1 May, 2022; originally announced May 2022.

Comments: To be published at NAACL 2022

arXiv:2204.10303 [pdf, other]

doi 10.1145/3591222

Modular Control Plane Verification via Temporal Invariants

Authors: Timothy Alberdingk Thijm, Ryan Beckett, Aarti Gupta, David Walker

Abstract: Monolithic control plane verification cannot scale to hyperscale network architectures with tens of thousands of nodes, heterogeneous network policies and thousands of network changes a day. Instead, modular verification offers improved scalability, reasoning over diverse behaviors, and robustness following policy updates. We introduce Timepiece, a new modular control plane verification system. While one class of verifiers, starting with Minesweeper, was based on analysis of stable paths, we show that such models, when deployed naively for modular verification, are unsound. To rectify the situation, we adopt a routing model based around a logical notion of time and develop a sound, expressive, and scalable verification engine. Our system requires that a user specify interfaces between module components. We develop methods for defining these interfaces using predicates inspired by temporal logic, and show how to use those interfaces to verify a range of network-wide properties such as reachability or access control. Verifying a prefix-filtering policy using a non-modular verification engine times out on an 80-node fattree network after 2 hours. However, Timepiece verifies a 2,000-node fattree in 2.37 minutes on a 96-core virtual machine. Modular verification of individual routers is embarrassingly parallel and completes in seconds, which allows verification to scale beyond non-modular engines, while still allowing the full power of SMT-based symbolic reasoning.

Submitted 8 April, 2023; v1 submitted 21 April, 2022; originally announced April 2022.

Comments: 27 pages (22 pages body, ~3 pages references, ~1 page proofs), 14 figures, accepted to PLDI 2023

ACM Class: C.2.2; D.2.4; F.3.1

arXiv:2202.06098 [pdf, other]

doi 10.1109/ICNP55882.2022.9940333

Kirigami, the Verifiable Art of Network Cutting

Authors: Tim Alberdingk Thijm, Ryan Beckett, Aarti Gupta, David Walker

Abstract: We introduce a modular verification approach to network control plane verification, where we cut a network into smaller fragments to improve the scalability of SMT solving. Users provide an annotated cut which describes how to generate these fragments from the monolithic network, and we verify each fragment independently, using the annotations to define assumptions and guarantees over fragments akin to assume-guarantee reasoning. We prove this modular network verification procedure is sound and complete with respect to verification over the monolithic network. We implement this procedure as Kirigami, an extension of NV - a network verification language and tool - and evaluate it on industrial topologies with synthesized policies. We observe a 2-8x improvement in end-to-end NV verification time, with SMT solve time improving by up to 6 orders of magnitude.

Submitted 12 February, 2022; originally announced February 2022.

Comments: 30 pages, 9 figures, submitted to CAV 2022

ACM Class: C.2.2; D.2.4; F.3.1

arXiv:2107.02244 [pdf, other]

doi 10.1145/3452296.3472903

Lucid: A Language for Control in the Data Plane

Authors: John Sonchack, Devon Loehr, Jennifer Rexford, David Walker

Abstract: Programmable switch hardware makes it possible to move fine-grained control logic inside the network data plane, improving performance for a wide range of applications. However, applications with integrated control are inherently hard to write in existing data-plane programming languages such as P4. This paper presents Lucid, a language that raises the level of abstraction for putting control functionality in the data plane. Lucid introduces abstractions that make it easy to write sophisticated data-plane applications with interleaved packet-handling and control logic, specialized type and syntax systems that prevent programmer bugs related to data-plane state, and an open-sourced compiler that translates Lucid programs into P4 optimized for the Intel Tofino. These features make Lucid general and easy to use, as we demonstrate by writing a suite of ten different data-plane applications in Lucid. Working prototypes take well under an hour to write, even for a programmer without prior Tofino experience, have around 10x fewer lines of code compared to P4, and compile efficiently to real hardware. In a stateful firewall written in Lucid, we find that moving control from a switch's CPU to its data-plane processor using Lucid reduces the latency of performance-sensitive operations by over 300X.

Submitted 5 July, 2021; originally announced July 2021.

Comments: 12 pages plus 5 pages references/appendix. 17 figures. To appear in SIGCOMM 2021

ACM Class: C.2.1

arXiv:2104.10769 [pdf, ps, other]

Disfluency Detection with Unlabeled Data and Small BERT Models

Authors: Johann C. Rocholl, Vicky Zayats, Daniel D. Walker, Noah B. Murad, Aaron Schneider, Daniel J. Liebling

Abstract: Disfluency detection models now approach high accuracy on English text. However, little exploration has been done in improving the size and inference time of the model. At the same time, automatic speech recognition (ASR) models are moving from server-side inference to local, on-device inference. Supporting models in the transcription pipeline (like disfluency detection) must follow suit. In this work we concentrate on the disfluency detection task, focusing on small, fast, on-device models based on the BERT architecture. We demonstrate it is possible to train disfluency detection models as small as 1.3 MiB, while retaining high performance. We build on previous work that showed the benefit of data augmentation approaches such as self-training. Then, we evaluate the effect of domain mismatch between conversational and written text on model performance. We find that domain adaptation and data augmentation strategies have a more pronounced effect on these smaller models, as compared to conventional BERT models.

Submitted 27 July, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

Comments: INTERSPEECH 2021

arXiv:2010.11473 [pdf, other]

A Novel Variable Stiffness Soft Robotic Gripper

Authors: Dimuthu D. Arachchige, Yue Chen, Ian D. Walker, Isuru S. Godage

Abstract: We propose a novel tri-fingered soft robotic gripper with decoupled stiffness and shape control capability for performing adaptive grasping with minimum system complexity. The proposed soft fingers adaptively conform to object shapes facilitating the handling of objects of different types, shapes, and sizes. Each soft gripper finger has an inextensible articulable backbone and is actuated by pneumatic muscles. We derive a kinematic model of the gripper and use an empirical approach to map input pressures to stiffness and bending deformation of fingers. We use these mappings to achieve decoupled stiffness and shape control. We conduct tests to quantify the ability to hold objects as the gripper changes orientation, the ability to maintain the grasping status as the gripper moves, and the amount of force required to release the object from the gripped fingers, respectively. The results validate the proposed gripper's performance and show how stiffness control can improve the grasping quality.

Submitted 22 October, 2020; originally announced October 2020.

Comments: This paper has been submitted to IEEE International Conference on Robotics and Automation 2021

arXiv:2006.12309 [pdf, other]

Visualising Evolution History in Multi- and Many-Objective Optimisation

Authors: Mathew Walter, David Walker, Matthew Craven

Abstract: Evolutionary algorithms are widely used to solve optimisation problems. However, challenges of transparency arise in both visualising the processes of an optimiser operating through a problem and understanding the problem features produced from many-objective problems, where comprehending four or more spatial dimensions is difficult. This work considers the visualisation of a population as an optimisation process executes. We have adapted an existing visualisation technique to multi- and many-objective problem data, enabling a user to visualise the EA processes and identify specific problem characteristics and thus providing a greater understanding of the problem landscape. This is particularly valuable if the problem landscape is unknown, contains unknown features or is a many-objective problem. We have shown how using this framework is effective on a suite of multi- and many-objective benchmark test problems, optimising them with NSGA-II and NSGA-III.

Submitted 22 June, 2020; originally announced June 2020.

arXiv:2003.12106 [pdf, other]

Data-Driven Inference of Representation Invariants

Authors: Anders Miltner, Saswat Padhi, Todd Millstein, David Walker

Abstract: A representation invariant is a property that holds of all values of abstract type produced by a module. Representation invariants play important roles in software engineering and program verification. In this paper, we develop a counterexample-driven algorithm for inferring a representation invariant that is sufficient to imply a desired specification for a module. The key novelty is a type-directed notion of visible inductiveness, which ensures that the algorithm makes progress toward its goal as it alternates between weakening and strengthening candidate invariants. The algorithm is parameterized by an example-based synthesis engine and a verifier, and we prove that it is sound and complete for first-order modules over finite types, assuming that the synthesizer and verifier are as well. We implement these ideas in a tool called Hanoi, which synthesizes representation invariants for recursive data types. Hanoi not only handles invariants for first-order code, but higher-order code as well. In its back end, Hanoi uses an enumerative synthesizer called Myth and an enumerative testing tool as a verifier. Because Hanoi uses testing for verification, it is not sound, though our empirical evaluation shows that it is successful on the benchmarks we investigated.

Submitted 26 March, 2020; originally announced March 2020.

Comments: 18 Pages, Full version of PLDI 2020 paper

arXiv:1902.00849 [pdf, other]

Contra: A Programmable System for Performance-aware Routing

Authors: Kuo-Feng Hsu, Ryan Beckett, Ang Chen, Jennifer Rexford, Praveen Tammana, David Walker

Abstract: We present Contra, a system for performance-aware routing that can adapt to traffic changes at hardware speeds. While existing work has developed point solutions for performance-aware routing on a fixed topology (e.g., a Fattree) with a fixed routing policy (e.g., use least utilized paths), Contra can be configured to operate seamlessly over any network topology and a wide variety of sophisticated routing policies. Users of Contra write network-wide policies that rank network paths given their current performance. A compiler then analyzes such policies in conjunction with the network topology and decomposes them into switch-local P4 programs, which collectively implement a new, specialized distance-vector protocol. This protocol generates compact probes that traverse the network, gathering path metrics to optimize for the user policy dynamically. Switches respond to changing network conditions at hardware speeds by routing flowlets along the best policy-compliant paths. Our experiments show that Contra scales to large networks, and that in terms of flow completion times, it is competitive with hand-crafted systems that have been customized for specific topologies and policies.

Submitted 3 February, 2019; originally announced February 2019.

arXiv:1901.01479 [pdf, ps, other]

Center of Gravity-based Approach for Modeling Dynamics of Multisection Continuum Arms

Authors: Isuru S. Godage, Robert J. Webster III, Ian D. Walker

Abstract: Multisection continuum arms offer complementary characteristics to those of traditional rigid-bodied robots. Inspired by biological appendages, such as elephant trunks and octopus arms, these robots trade rigidity for compliance, accuracy for safety, and therefore exhibit strong potential for applications in human-occupied spaces. Prior work has demonstrated their superiority in operation in congested spaces and manipulation of irregularly-shaped objects. However, they are yet to be widely applied outside laboratory spaces. One key reason is that, due to compliance, they are difficult to control. Sophisticated and numerically efficient dynamic models are a necessity to implement dynamic control. In this paper, we propose a novel, numerically stable, center of gravity-based dynamic model for variable-length multisection continuum arms. The model can accommodate continuum robots having any number of sections with varying physical dimensions. The dynamic algorithm is of $O(n^2)$ complexity, runs at 9.5 kHz, simulates 6-8 times faster than real-time for a three-section continuum robot, and therefore is ideally suited for real-time control implementations. The model accuracy is validated numerically against an integral-dynamic model proposed by the authors and experimentally for a three-section, pneumatically actuated variable-length multisection continuum arm. This is the first sub real-time dynamic model based on a smooth continuous deformation model for variable-length multisection continuum arms.

Submitted 5 January, 2019; originally announced January 2019.

Comments: Submitted to IEEE Transactions on Robotics

arXiv:1811.04991 [pdf, other]

Dynamic Control of Pneumatic Muscle Actuators

Authors: Isuru S. Godage, Yue Chen, Ian D. Walker

Abstract: Pneumatic muscle actuators (PMA) are easy-to-fabricate, lightweight, compliant, and have high power-to-weight ratio, thus making them the ideal actuation choice for many soft and continuum robots. But so far, limited work has been carried out in dynamic control of PMAs. One reason is that PMAs are highly hysteretic. Coupled with their high compliance and response lag, PMAs are challenging to control, particularly when subjected to external loads. The hysteresis models proposed to-date rely on many physical and mechanical parameters that are difficult to measure reliably and therefore of limited use for implementing dynamic control. In this work, we employ a Bouc-Wen hysteresis modeling approach to account for the hysteresis of PMAs and use the model for implementing dynamic control. The controller is then compared to PID feedback control for a number of dynamic position tracking tests. The dynamic control based on the Bouc-Wen hysteresis model shows significantly better tracking performance. This work lays the foundation towards implementing dynamic control for PMA-powered high degrees of freedom soft and continuum robots.

Submitted 12 November, 2018; originally announced November 2018.

Comments: 4 pages, 5 figures. Submitted to Soft Robotic Modeling and Control: Bringing Together Articulated Soft Robots and Soft-Bodied Robots workshop, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018

arXiv:1810.11527 [pdf, other]

Synthesizing Symmetric Lenses

Authors: Anders Miltner, Solomon Maina, Kathleen Fisher, Benjamin C. Pierce, David Walker, Steve Zdancewic

Abstract: Lenses are programs that can be run both "front to back" and "back to front," allowing updates to either their source or their target data to be transferred in both directions. Lenses have been extensively studied, extended, and applied. Recent work has demonstrated how techniques from type-directed program synthesis can be used to efficiently synthesize a simple class of lenses---bijective lenses over string data---given a pair of types (regular expressions) and examples. We extend this synthesis algorithm to a broader class of lenses, called simple symmetric lenses, including all bijective lenses, all of the popular category of "asymmetric" lenses, and a subset of the "symmetric lenses" proposed by Hofmann et al. Intuitively, simple symmetric lenses allow some information to be present on one side but not the other and vice versa. They are of independent theoretical interest, being the largest class of symmetric lenses that do not use persistent internal state. Synthesizing simple symmetric lenses is more challenging than synthesizing bijective lenses: Since some of the information on each side can be "disconnected" from the other side, there will typically be many lenses that agree with a given example. To guide the search process, we use stochastic regular expressions and information theory to estimate the amount of information propagated by a candidate lens, preferring lenses that propagate more information, as well as user annotations marking parts of the source and target formats as either irrelevant or essential.
We describe an implementation of simple symmetric lenses and our synthesis procedure as extensions to the Boomerang language. We evaluate its performance on 48 benchmark examples drawn from Flash Fill, Augeas, and the bidirectional programming literature. Our implementation can synthesize each of these lenses in under 30 seconds.

Submitted 25 June, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

Comments: ICFP 2019

arXiv:1806.08744 [pdf, other]

Control Plane Compression

Authors: Ryan Beckett, Aarti Gupta, Ratul Mahajan, David Walker

Abstract: We develop an algorithm capable of compressing large networks into smaller ones with similar control plane behavior: For every stable routing solution in the large, original network, there exists a corresponding solution in the compressed network, and vice versa. Our compression algorithm preserves a wide variety of network properties including reachability, loop freedom, and path length. Consequently, operators may speed up network analysis, based on simulation, emulation, or verification, by analyzing only the compressed network. Our approach is based on a new theory of control plane equivalence. We implement these ideas in a tool called Bonsai and apply it to real and synthetic networks. Bonsai can shrink real networks by over a factor of 5 and speed up analysis by several orders of magnitude.

Submitted 22 June, 2018; originally announced June 2018.

Comments: Extended version of the paper appearing in ACM SIGCOMM 2018

arXiv:1710.03248 [pdf, ps, other]

Synthesizing Bijective Lenses

Authors: Anders Miltner, Kathleen Fisher, Benjamin C. Pierce, David Walker, Steve Zdancewic

Abstract: Bidirectional transformations between different data representations occur frequently in modern software systems. They appear as serializers and deserializers, as database views and view updaters, and more. Manually building bidirectional transformations---by writing two separate functions that are intended to be inverses---is tedious and error prone. A better approach is to use a domain-specific language in which both directions can be written as a single expression. However, these domain-specific languages can be difficult to program in, requiring programmers to manage fiddly details while working in a complex type system. To solve this, we present Optician, a tool for type-directed synthesis of bijective string transformers. The inputs to Optician are two ordinary regular expressions representing two data formats and a few concrete examples for disambiguation. The output is a well-typed program in Boomerang (a bidirectional language based on the theory of lenses). The main technical challenge involves navigating the vast program search space efficiently enough. Unlike most prior work on type-directed synthesis, our system operates in the context of a language with a rich equivalence relation on types (the theory of regular expressions). We synthesize terms of an equivalent language and convert those generated terms into our lens language. We prove the correctness of our synthesis algorithm.
We also demonstrate empirically that our new language changes the synthesis problem from one that admits intractable solutions to one that admits highly efficient solutions. We evaluate Optician on a benchmark suite of 39 examples including both microbenchmarks and realistic examples derived from other data management systems including Flash Fill, a tool for synthesizing string transformations in spreadsheets, and Augeas, a tool for bidirectional processing of Linux system configuration files.

Submitted 9 October, 2017; originally announced October 2017.

Comments: 127 Pages, Extended Version with Appendix

arXiv:1701.08310 [pdf, other]

A Systematic Literature Review on Intertemporal Choice in Software Engineering - Protocol and Results

Authors: Christoph Becker, Dawn Walker, Curtis McCord

Abstract: When making choices in software projects, engineers and other stakeholders engage in decision making that involves uncertain future outcomes. Research in psychology, behavioral economics and neuroscience has questioned many of the classical assumptions of how such decisions are made. This literature review aims to characterize the assumptions that underpin the study of these decisions in Software Engineering. We identify empirical research on this subject and analyze how the role of time has been characterized in the study of decision making in SE. The literature review aims to support the development of descriptive frameworks for empirical studies of intertemporal decision making in practice.

Submitted 28 January, 2017; originally announced January 2017.

arXiv:1512.00822 [pdf, other]

SNAP: Stateful Network-Wide Abstractions for Packet Processing

Authors: Mina Tahmasbi Arashloo, Yaron Koral, Michael Greenberg, Jennifer Rexford, David Walker

Abstract: Early programming languages for software-defined networking (SDN) were built on top of the simple match-action paradigm offered by OpenFlow 1.0. However, emerging hardware and software switches offer much more sophisticated support for persistent state in the data plane, without involving a central controller. Nevertheless, managing stateful, distributed systems efficiently and correctly is known to be one of the most challenging programming problems. To simplify this new SDN problem, we introduce SNAP. SNAP offers a simpler "centralized" stateful programming model, by allowing programmers to develop programs on top of one big switch rather than many. These programs may contain reads and writes to global, persistent arrays, and as a result, programmers can implement a broad range of applications, from stateful firewalls to fine-grained traffic monitoring. The SNAP compiler relieves programmers of having to worry about how to distribute, place, and optimize access to these stateful arrays by doing it all for them. More specifically, the compiler discovers read/write dependencies between arrays and translates one-big-switch programs into an efficient internal representation based on a novel variant of binary decision diagrams. This internal representation is used to construct a mixed-integer linear program, which jointly optimizes the placement of state and the routing of traffic across the underlying physical topology.
We have implemented a prototype compiler and applied it to about 20 SNAP programs over various topologies to demonstrate our techniques' scalability.

Submitted 4 July, 2016; v1 submitted 2 December, 2015; originally announced December 2015.

arXiv:1407.0330 [pdf]

Inferring Social Structure and Dominance Relationships Between Rhesus macaques using RFID Tracking Data

Authors: Hanuma Teja Maddali, Michael Novitzky, Brian Hrolenok, Daniel Walker, Tucker Balch, Kim Wallen

Abstract: In this paper we address the problem of inferring social structure and dominance relationships in a group of rhesus macaques (a species of monkey) using only position data captured using RFID tags. Automatic inference of the social structure in an animal group enables a number of important capabilities, including: 1) A verifiable measure of how the social structure is affected by an intervention such as a change in the environment, or the introduction of another animal, and 2) A potentially significant reduction in person hours normally used for assessing these changes. Social structure in a group is an important indicator of its members' relative level of access to resources and has interesting implications for an individual's health and learning in groups. Two main quantitative criteria are assessed in order to infer the social structure: time spent close to conspecifics, and displacements. An interaction matrix is used to represent the total duration of events detected as grooming behavior between any two monkeys. This forms an undirected tie-strength (closeness of relationships) graph. A directed graph of hierarchy is constructed by using the well-cited assumption of a linear hierarchy for rhesus macaques. Events that contribute to the adjacency matrix for this graph are withdrawals or displacements where a lower ranked monkey moves away from a higher ranked monkey. Displacements are one of the observable behaviors that can act as a strong indication of tie-strength and dominance.
To quantify the directedness of interaction during these events we construct histograms of the dot products of motion orientation and relative position. This gives us a measure of how much time a monkey spends in moving towards or away from other group members.

Submitted 30 June, 2014; originally announced July 2014.

Report number: ci-2014/100

arXiv:1312.1719 [pdf, other]

Programming Protocol-Independent Packet Processors

Authors: Pat Bosshart, Dan Daly, Martin Izzard, Nick McKeown, Jennifer Rexford, Cole Schlesinger, Dan Talayco, Amin Vahdat, George Varghese, David Walker

Abstract: P4 is a high-level language for programming protocol-independent packet processors. P4 works in conjunction with SDN control protocols like OpenFlow. In its current form, OpenFlow explicitly specifies protocol headers on which it operates. This set has grown from 12 to 41 fields in a few years, increasing the complexity of the specification while still not providing the flexibility to add new headers. In this paper we propose P4 as a strawman proposal for how OpenFlow should evolve in the future. We have three goals: (1) Reconfigurability in the field: Programmers should be able to change the way switches process packets once they are deployed. (2) Protocol independence: Switches should not be tied to any specific network protocols. (3) Target independence: Programmers should be able to describe packet-processing functionality independently of the specifics of the underlying hardware. As an example, we describe how to use P4 to configure a switch to add a new hierarchical label.

Submitted 15 May, 2014; v1 submitted 5 December, 2013; originally announced December 2013.

arXiv:1310.4168 [pdf, other]

A Mobile Robotic Personal Nightstand with Integrated Perceptual Processes

Authors: Vidya N. Murali, Anthony L. Threatt, Joe Manganelli, Paul M. Yanik, Sumod K. Mohan, Akshay A. Apte, Raghavendran Ramachandran, Linnea Smolentzov, Johnell Brooks, Ian D. Walker, Keith E. Green

Abstract: We present an intelligent interactive nightstand mounted on a mobile robot, to aid the elderly in their homes using physical, tactile and visual percepts. We show the integration of three different sensing modalities for controlling the navigation of a robot mounted nightstand within the constrained environment of a general purpose living room housing a single aging individual in need of assistance and monitoring. A camera mounted on the ceiling of the room, gives a top-down view of the obstacles, the person and the nightstand. Pressure sensors mounted beneath the bed-stand of the individual provide physical perception of the person's state. A proximity IR sensor on the nightstand acts as a tactile interface along with a Wii Nunchuck (Nintendo) to control mundane operations on the nightstand. Intelligence from these three modalities are combined to enable path planning for the nightstand to approach the individual. With growing emphasis on assistive technology for the aging individuals who are increasingly electing to stay in their homes, we show how ubiquitous intelligence can be brought inside homes to help monitor and provide care to an individual. Our approach goes one step towards achieving pervasive intelligence by seamlessly integrating different sensors embedded in the fabric of the environment.

Submitted 12 October, 2013; originally announced October 2013.

Comments: Submitted to AAAI 2010, IROS 2011

arXiv:physics/0612122 [pdf, ps, other]

doi 10.1088/1742-5468/2007/06/P06010

Ranking Scientific Publications Using a Simple Model of Network Traffic

Authors: Dylan Walker, Huafeng Xie, Koon-Kiu Yan, Sergei Maslov

Abstract: To account for strong aging characteristics of citation networks, we modify Google's PageRank algorithm by initially distributing random surfers exponentially with age, in favor of more recent publications. The output of this algorithm, which we call CiteRank, is interpreted as approximate traffic to individual publications in a simple model of how researchers find new information. We develop an analytical understanding of traffic flow in terms of an RPA-like model and optimize parameters of our algorithm to achieve the best performance. The results are compared for two rather different citation networks: all American Physical Society publications and the set of high-energy physics theory (hep-th) preprints. Despite major differences between these two networks, we find that their optimal parameters for the CiteRank algorithm are remarkably similar.

Submitted 13 December, 2006; originally announced December 2006.

Comments: 4 pages, 3 figures

Journal ref: J.Stat.Mech.0706:P06010,2007

Showing 1–38 of 38 results for author: Walker, D