Search | arXiv e-print repository

Utilizing Weak-to-Strong Consistency for Semi-Supervised Glomeruli Segmentation

Authors: Irina Zhang, Jim Denholm, Azam Hamidinekoo, Oskar Ålund, Christopher Bagnall, Joana Palés Huix, Michal Sulikowski, Ortensia Vito, Arthur Lewis, Robert Unwin, Magnus Soderberg, Nikolay Burlutskiy, Talha Qaiser

Abstract: Accurate segmentation of glomerulus instances attains high clinical significance in the automated analysis of renal biopsies to aid in diagnosing and monitoring kidney disease. Analyzing real-world histopathology images often encompasses inter-observer variability and requires a labor-intensive process of data annotation. Therefore, conventional supervised learning approaches generally achieve sub-optimal performance when applied to external datasets. Considering these challenges, we present a semi-supervised learning approach for glomeruli segmentation based on the weak-to-strong consistency framework validated on multiple real-world datasets. Our experimental results on 3 independent datasets indicate superior performance of our approach as compared with existing supervised baseline models such as U-Net and SegFormer.

Submitted 30 May, 2024; originally announced June 2024.

Comments: accepted to MIDL'24

arXiv:2406.08583 [pdf, other]

Defining a Reference Architecture for Edge Systems in Highly-Uncertain Environments

Authors: Kevin Pitstick, Marc Novakouski, Grace A. Lewis, Ipek Ozkaya

Abstract: Increasing rate of progress in hardware and artificial intelligence (AI) solutions is enabling a range of software systems to be deployed closer to their users, increasing application of edge software system paradigms. Edge systems support scenarios in which computation is placed closer to where data is generated and needed, and provide benefits such as reduced latency, bandwidth optimization, and higher resiliency and availability. Users who operate in highly-uncertain and resource-constrained environments, such as first responders, law enforcement, and soldiers, can greatly benefit from edge systems to support timelier decision making. Unfortunately, understanding how different architecture approaches for edge systems impact priority quality concerns is largely neglected by industry and research, yet crucial for national and local safety, optimal resource utilization, and timely decision making. Much of industry is focused on the hardware and networking aspects of edge systems, with very little attention to the software that enables edge capabilities. This paper presents our work to fill this gap, defining a reference architecture for edge systems in highly-uncertain environments, and showing examples of how it has been implemented in practice.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Paper accepted and presented at ESA 2024, the 1st Workshop on Edge Software Architectures, co-located with ICSA 2024, the 21st International Conference on Software Architecture

arXiv:2406.08575 [pdf, ps, other]

Using Quality Attribute Scenarios for ML Model Test Case Generation

Authors: Rachel Brower-Sinning, Grace A. Lewis, Sebastían Echeverría, Ipek Ozkaya

Abstract: Testing of machine learning (ML) models is a known challenge identified by researchers and practitioners alike. Unfortunately, current practice for ML model testing prioritizes testing for model performance, while often neglecting the requirements and constraints of the ML-enabled system that integrates the model. This limited view of testing leads to failures during integration, deployment, and operations, contributing to the difficulties of moving models from development to production. This paper presents an approach based on quality attribute (QA) scenarios to elicit and define system- and model-relevant test cases for ML models. The QA-based approach described in this paper has been integrated into MLTE, a process and tool to support ML model test and evaluation. Feedback from users of MLTE highlights its effectiveness in testing beyond model performance and identifying failures early in the development process.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Paper accepted and presented in SAML 2024, the 3rd International Workshop on Software Architecture and Machine Learning, co-located with ICSA 2024, the 21st IEEE International Conference on Software Architecture

arXiv:2404.08893 [pdf, other]

Early detection of disease outbreaks and non-outbreaks using incidence data

Authors: Shan Gao, Amit K. Chakraborty, Russell Greiner, Mark A. Lewis, Hao Wang

Abstract: Forecasting the occurrence and absence of novel disease outbreaks is essential for disease management. Here, we develop a general model, with no real-world training data, that accurately forecasts outbreaks and non-outbreaks. We propose a novel framework, using a feature-based time series classification method to forecast outbreaks and non-outbreaks. We tested our methods on synthetic data from a Susceptible-Infected-Recovered model for slowly changing, noisy disease dynamics. Outbreak sequences give a transcritical bifurcation within a specified future time window, whereas non-outbreak (null bifurcation) sequences do not. We identified incipient differences in time series of infectives leading to future outbreaks and non-outbreaks. These differences are reflected in 22 statistical features and 5 early warning signal indicators. Classifier performance, given by the area under the receiver-operating curve, ranged from 0.99 for large expanding windows of training data to 0.7 for small rolling windows. Real-world performances of classifiers were tested on two empirical datasets, COVID-19 data from Singapore and SARS data from Hong Kong, with two classifiers exhibiting high accuracy. In summary, we showed that there are statistical features that distinguish outbreak and non-outbreak sequences long before outbreaks occur. We could detect these differences in synthetic and real-world data sets, well before potential outbreaks occur.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2403.16233 [pdf, other]

An early warning indicator trained on stochastic disease-spreading models with different noises

Authors: Amit K. Chakraborty, Shan Gao, Reza Miry, Pouria Ramazi, Russell Greiner, Mark A. Lewis, Hao Wang

Abstract: The timely detection of disease outbreaks through reliable early warning signals (EWSs) is indispensable for effective public health mitigation strategies. Nevertheless, the intricate dynamics of real-world disease spread, often influenced by diverse sources of noise and limited data in the early stages of outbreaks, pose a significant challenge in developing reliable EWSs, as the performance of existing indicators varies with extrinsic and intrinsic noises. Here, we address the challenge of modeling disease when the measurements are corrupted by additive white noise, multiplicative environmental noise, and demographic noise into a standard epidemic mathematical model. To navigate the complexities introduced by these noise sources, we employ a deep learning algorithm that provides EWS in infectious disease outbreak by training on noise-induced disease-spreading models. The indicator's effectiveness is demonstrated through its application to real-world COVID-19 cases in Edmonton and simulated time series derived from diverse disease spread models affected by noise. Notably, the indicator captures an impending transition in a time series of disease outbreaks and outperforms existing indicators. This study contributes to advancing early warning capabilities by addressing the intricate dynamics inherent in real-world disease spread, presenting a promising avenue for enhancing public health preparedness and response efforts.

Submitted 24 March, 2024; originally announced March 2024.

arXiv:2403.15749 [pdf, other]

Horoballs and the subgradient method

Authors: Adrian S. Lewis, Genaro Lopez-Acedo, Adriana Nicolae

Abstract: To explore convex optimization on Hadamard spaces, we consider an iteration in the style of a subgradient algorithm. Traditionally, such methods assume that the underlying spaces are manifolds and that the objectives are geodesically convex: the methods are described using tangent spaces and exponential maps. By contrast, our iteration applies in a general Hadamard space, is framed in the underlying space itself, and relies instead on horospherical convexity of the objective level sets. For this restricted class of objectives, we prove a complexity result of the usual form. Notably, the complexity does not depend on a lower bound on the space curvature. We illustrate our subgradient algorithm on the minimal enclosing ball problem in Hadamard spaces.

Submitted 2 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

MSC Class: 90C48; 65Y20; 49M29

ACM Class: G.1.6

arXiv:2402.06678 [pdf, other]

Can machine learning predict citizen-reported angler behavior?

Authors: Julia S. Schmid, Sean Simmons, Mark A. Lewis, Mark S. Poesch, Pouria Ramazi

Abstract: Prediction of angler behaviors, such as catch rates and angler pressure, is essential to maintaining fish populations and ensuring angler satisfaction. Angler behavior can partly be tracked by online platforms and mobile phone applications that provide fishing activities reported by recreational anglers. Moreover, angler behavior is known to be driven by local site attributes. Here, the prediction of citizen-reported angler behavior was investigated by machine-learning methods using auxiliary data on the environment, socioeconomics, fisheries management objectives, and events at a freshwater body. The goal was to determine whether auxiliary data alone could predict the reported behavior. Different spatial and temporal extents and temporal resolutions were considered. Accuracy scores averaged 88% for monthly predictions at single water bodies and 86% for spatial predictions on a day in a specific region across Canada. At other resolutions and scales, the models only achieved low prediction accuracy of around 60%. The study represents a first attempt at predicting angler behavior in time and space at a large scale and establishes a foundation for potential future expansions in various directions.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 36 pages, 10 figures, 4 tables (including supplementary information)

arXiv:2310.09668 [pdf, other]

Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs

Authors: Chenyang Yang, Rishabh Rustogi, Rachel Brower-Sinning, Grace A. Lewis, Christian Kästner, Tongshuang Wu

Abstract: Current model testing work has mostly focused on creating test cases. Identifying what to test is a step that is largely ignored and poorly supported. We propose Weaver, an interactive tool that supports requirements elicitation for guiding model testing. Weaver uses large language models to generate knowledge bases and recommends concepts from them interactively, allowing testers to elicit requirements for further testing. Weaver provides rich external knowledge to testers and encourages testers to systematically explore diverse concepts beyond their own biases. In a user study, we show that both NLP experts and non-experts identified more, as well as more diverse concepts worth testing when using Weaver. Collectively, they found more than 200 failing test cases for stance detection with zero-shot ChatGPT. Our case studies further show that Weaver can help practitioners test models in real-world settings, where developers define more nuanced application scenarios (e.g., code understanding and transcript summarization) using LLMs.

Submitted 14 October, 2023; originally announced October 2023.

arXiv:2307.16081 [pdf, other]

Roll Up Your Sleeves: Working with a Collaborative and Engaging Task-Oriented Dialogue System

Authors: Lingbo Mo, Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Sunit Singh, Samuel Stevens, Chang-You Tai, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun

Abstract: We introduce TacoBot, a user-centered task-oriented digital assistant designed to guide users through complex real-world tasks with multiple steps. Covering a wide range of cooking and how-to tasks, we aim to deliver a collaborative and engaging dialogue experience. Equipped with language understanding, dialogue management, and response generation components supported by a robust search engine, TacoBot ensures efficient task assistance. To enhance the dialogue experience, we explore a series of data augmentation strategies using LLMs to train advanced neural models continuously. TacoBot builds upon our successful participation in the inaugural Alexa Prize TaskBot Challenge, where our team secured third place among ten competing teams. We offer TacoBot as an open-source framework that serves as a practical example for deploying task-oriented dialogue systems.

Submitted 29 July, 2023; originally announced July 2023.

arXiv:2303.12095 [pdf, other]

Interpretable histopathology-based prediction of disease relevant features in Inflammatory Bowel Disease biopsies using weakly-supervised deep learning

Authors: Ricardo Mokhtari, Azam Hamidinekoo, Daniel Sutton, Arthur Lewis, Bastian Angermann, Ulf Gehrmann, Pal Lundin, Hibret Adissu, Junmei Cairns, Jessica Neisen, Emon Khan, Daniel Marks, Nia Khachapuridze, Talha Qaiser, Nikolay Burlutskiy

Abstract: Crohn's Disease (CD) and Ulcerative Colitis (UC) are the two main Inflammatory Bowel Disease (IBD) types. We developed deep learning models to identify histological disease features for both CD and UC using only endoscopic labels. We explored fine-tuning and end-to-end training of two state-of-the-art self-supervised models for predicting three different endoscopic categories (i) CD vs UC (AUC=0.87), (ii) normal vs lesional (AUC=0.81), (iii) low vs high disease severity score (AUC=0.80). We produced visual attention maps to interpret what the models learned and validated them with the support of a pathologist, where we observed a strong association between the models' predictions and histopathological inflammatory features of the disease. Additionally, we identified several cases where the model incorrectly predicted normal samples as lesional but were correct on the microscopic level when reviewed by the pathologist. This tendency of histological presentation to be more severe than endoscopic presentation was previously published in the literature. In parallel, we utilised a model trained on the Colon Nuclei Identification and Counting (CoNIC) dataset to predict and explore 6 cell populations. We observed correlation between areas enriched with the predicted immune cells in biopsies and the pathologist's feedback on the attention maps. Finally, we identified several cell level features indicative of disease severity in CD and UC. These models can enhance our understanding about the pathology behind IBD and can shape our strategies for patient stratification in clinical trials.

Submitted 16 May, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Accepted to the Medical Imaging with Deep Learning (MIDL'23)

arXiv:2303.10838 [pdf, other]

Deceptive Reinforcement Learning in Model-Free Domains

Authors: Alan Lewis, Tim Miller

Abstract: This paper investigates deceptive reinforcement learning for privacy preservation in model-free and continuous action space domains. In reinforcement learning, the reward function defines the agent's objective. In adversarial scenarios, an agent may need to both maximise rewards and keep its reward function private from observers. Recent research presented the ambiguity model (AM), which selects actions that are ambiguous over a set of possible reward functions, via pre-trained $Q$-functions. Despite promising results in model-based domains, our investigation shows that AM is ineffective in model-free domains due to misdirected state space exploration. It is also inefficient to train and inapplicable in continuous action space domains. We propose the deceptive exploration ambiguity model (DEAM), which learns using the deceptive policy during training, leading to targeted exploration of the state space. DEAM is also applicable in continuous action spaces. We evaluate DEAM in discrete and continuous action space path planning environments. DEAM achieves similar performance to an optimal model-based version of AM and outperforms a model-free version of AM in terms of path cost, deceptiveness and training efficiency. These results extend to the continuous domain.

Submitted 19 March, 2023; originally announced March 2023.

Comments: 8 pages, 1 reference page, 4 appendix pages, Accepted into International Conference on Automated Planning and Scheduling (ICAPS) 2023

arXiv:2303.01998 [pdf, other]

MLTEing Models: Negotiating, Evaluating, and Documenting Model and System Qualities

Authors: Katherine R. Maffey, Kyle Dotterrer, Jennifer Niemann, Iain Cruickshank, Grace A. Lewis, Christian Kästner

Abstract: Many organizations seek to ensure that machine learning (ML) and artificial intelligence (AI) systems work as intended in production but currently do not have a cohesive methodology in place to do so. To fill this gap, we propose MLTE (Machine Learning Test and Evaluation, colloquially referred to as "melt"), a framework and implementation to evaluate ML models and systems. The framework compiles state-of-the-art evaluation techniques into an organizational process for interdisciplinary teams, including model developers, software engineers, system owners, and other stakeholders. MLTE tooling supports this process by providing a domain-specific language that teams can use to express model requirements, an infrastructure to define, generate, and collect ML evaluation metrics, and the means to communicate results.

Submitted 3 March, 2023; originally announced March 2023.

Comments: Accepted to the NIER Track of the 45th International Conference on Software Engineering (ICSE 2023)

arXiv:2302.04999 [pdf, other]

Ablation Study on Features in Learning-based Joints Calibration of Cable-driven Surgical Robots

Authors: Haonan Peng, Andrew Lewis, Blake Hannaford

Abstract: With worldwide implementation, millions of surgeries are assisted by surgical robots. The cable-drive mechanism on many surgical robots allows flexible, light, and compact arms and tools. However, the slack and stretch of the cables and the backlash of the gears introduce inevitable errors from motor poses to joint poses, and thus forwarded to the pose and orientation of the end-effector. In this paper, a learning-based calibration using a deep neural network is proposed, which reduces the unloaded pose RMSE of joints 1, 2, 3 to 0.3003 deg, 0.2888 deg, 0.1565 mm, and loaded pose RMSE of joints 1, 2, 3 to 0.4456 deg, 0.3052 deg, 0.1900 mm, respectively. Then, removal ablation and inaccurate ablation are performed to study which features of the DNN model contribute to the calibration accuracy. The results suggest that raw joint poses and motor torques are the most important features. For joint poses, the removal ablation shows that DNN model can derive this information from end-effector pose and orientation. For motor torques, the direction is much more important than amplitude.

Submitted 14 February, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

arXiv:2302.02005 [pdf, other]

DeepAstroUDA: Semi-Supervised Universal Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection

Authors: A. Ćiprijanović, A. Lewis, K. Pedro, S. Madireddy, B. Nord, G. N. Perdue, S. M. Wild

Abstract: Artificial intelligence methods show great promise in increasing the quality and speed of work with large astronomical datasets, but the high complexity of these methods leads to the extraction of dataset-specific, non-robust features. Therefore, such methods do not generalize well across multiple datasets. We present a universal domain adaptation method, \textit{DeepAstroUDA}, as an approach to overcome this challenge. This algorithm performs semi-supervised domain adaptation and can be applied to datasets with different data distributions and class overlaps. Non-overlapping classes can be present in any of the two datasets (the labeled source domain, or the unlabeled target domain), and the method can even be used in the presence of unknown classes. We apply our method to three examples of galaxy morphology classification tasks of different complexities ($3$-class and $10$-class problems), with anomaly detection: 1) datasets created after different numbers of observing years from a single survey (LSST mock data of $1$ and $10$ years of observations); 2) data from different surveys (SDSS and DECaLS); and 3) data from observing fields with different depths within one survey (wide field and Stripe 82 deep field of SDSS). For the first time, we demonstrate the successful use of domain adaptation between very discrepant observational datasets. \textit{DeepAstroUDA} is capable of bridging the gap between two astronomical surveys, increasing classification accuracy in both domains (up to $40\%$ on the unlabeled data), and making model performance consistent across datasets. Furthermore, our method also performs well as an anomaly detection algorithm and successfully clusters unknown class samples even in the unlabeled target dataset.

Submitted 22 March, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: Accepted in Machine Learning Science and Technology (MLST); 24 pages, 14 figures

Report number: FERMILAB-PUB-23-034-CSAID

arXiv:2301.09815 [pdf, ps, other]

Mixed Effects Random Forests for Personalised Predictions of Clinical Depression Severity

Authors: Robert A. Lewis, Asma Ghandeharioun, Szymon Fedor, Paola Pedrelli, Rosalind Picard, David Mischoulon

Abstract: This work demonstrates how mixed effects random forests enable accurate predictions of depression severity using multimodal physiological and digital activity data collected from an 8-week study involving 31 patients with major depressive disorder. We show that mixed effects random forests outperform standard random forests and personal average baselines when predicting clinical Hamilton Depression Rating Scale scores (HDRS_17). Compared to the latter baseline, accuracy is significantly improved for each patient by an average of 0.199-0.276 in terms of mean absolute error (p<0.05). This is noteworthy as these simple baselines frequently outperform machine learning methods in mental health prediction tasks. We suggest that this improved performance results from the ability of the mixed effects random forest to personalise model parameters to individuals in the dataset. However, we find that these improvements pertain exclusively to scenarios where labelled patient data are available to the model at training time. Investigating methods that improve accuracy when generalising to new patients is left as important future work.

Submitted 23 January, 2023; originally announced January 2023.

Comments: 9 pages

arXiv:2211.06409 [pdf, other]

Capabilities for Better ML Engineering

Authors: Chenyang Yang, Rachel Brower-Sinning, Grace A. Lewis, Christian Kästner, Tongshuang Wu

Abstract: In spite of machine learning's rapid growth, its engineering support is scattered in many forms, and tends to favor certain engineering stages, stakeholders, and evaluation preferences. We envision a capability-based framework, which uses fine-grained specifications for ML model behaviors to unite existing efforts towards better ML engineering. We use concrete scenarios (model design, debugging, and maintenance) to articulate capabilities' broad applications across various different dimensions, and their impact on building safer, more generalizable and more trustworthy models that reflect human needs. Through preliminary experiments, we show capabilities' potential for reflecting model generalizability, which can provide guidance for ML engineering process. We discuss challenges and opportunities for capabilities' integration into ML engineering.

Submitted 10 February, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

arXiv:2211.00677 [pdf, other]

Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection

Authors: Aleksandra Ćiprijanović, Ashia Lewis, Kevin Pedro, Sandeep Madireddy, Brian Nord, Gabriel N. Perdue, Stefan M. Wild

Abstract: In the era of big astronomical surveys, our ability to leverage artificial intelligence algorithms simultaneously for multiple datasets will open new avenues for scientific discovery. Unfortunately, simply training a deep neural network on images from one data domain often leads to very poor performance on any other dataset. Here we develop a Universal Domain Adaptation method DeepAstroUDA, capable of performing semi-supervised domain alignment that can be applied to datasets with different types of class overlap. Extra classes can be present in any of the two datasets, and the method can even be used in the presence of unknown classes. For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets (from SDSS and DECaLS). We show that our method is capable of bridging the gap between two astronomical surveys, and also performs well for anomaly detection and clustering of unknown data in the unlabeled dataset. We apply our model to two examples of galaxy morphology classification tasks with anomaly detection: 1) classifying spiral and elliptical galaxies with detection of merging galaxies (three classes including one unknown anomaly class); 2) a more granular problem where the classes describe more detailed morphological properties of galaxies, with the detection of gravitational lenses (ten classes including one unknown anomaly class).

Submitted 11 November, 2022; v1 submitted 1 November, 2022; originally announced November 2022.

Comments: 3 figures, 1 table; accepted to Machine Learning and the Physical Sciences - Workshop at the 36th conference on Neural Information Processing Systems (NeurIPS)

Report number: FERMILAB-CONF-22-791-SCD

arXiv:2209.03345 [pdf, other]

Data Leakage in Notebooks: Static Detection and Better Processes

Authors: Chenyang Yang, Rachel A Brower-Sinning, Grace A. Lewis, Christian Kästner

Abstract: Data science pipelines to train and evaluate models with machine learning may contain bugs just like any other code. Leakage between training and test data can lead to overestimating the model's accuracy during offline evaluations, possibly leading to deployment of low-quality models in production. Such leakage can happen easily by mistake or by following poor practices, but may be tedious and challenging to detect manually. We develop a static analysis approach to detect common forms of data leakage in data science code. Our evaluation shows that our analysis accurately detects data leakage and that such leakage is pervasive among over 100,000 analyzed public notebooks. We discuss how our static analysis approach can help both practitioners and educators, and how leakage prevention can be designed into the development process.

Submitted 7 September, 2022; originally announced September 2022.

arXiv:2208.10937 [pdf, other]

Improving Computed Tomography (CT) Reconstruction via 3D Shape Induction

Authors: Elena Sizikova, Xu Cao, Ashia Lewis, Kenny Moise, Megan Coffee

Abstract: Chest computed tomography (CT) imaging adds valuable insight in the diagnosis and management of pulmonary infectious diseases, like tuberculosis (TB). However, due to the cost and resource limitations, only X-ray images may be available for initial diagnosis or follow up comparison imaging during treatment. Due to their projective nature, X-ray images may be more difficult to interpret by clinicians. The lack of publicly available paired X-ray and CT image datasets makes it challenging to train a 3D reconstruction model. In addition, chest X-ray radiology may rely on different device modalities with varying image quality and there may be variation in underlying population disease spectrum that creates diversity in inputs. We propose shape induction, that is, learning the shape of 3D CT from X-ray without CT supervision, as a novel technique to incorporate realistic X-ray distributions during training of a reconstruction model. Our experiments demonstrate that this process improves both the perceptual quality of generated CT and the accuracy of down-stream classification of pulmonary infectious diseases.

Submitted 14 November, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 11 pages

arXiv:2207.05223 [pdf, other]

Bootstrapping a User-Centered Task-Oriented Dialogue System

Authors: Shijie Chen, Ziru Chen, Xiang Deng, Ashley Lewis, Lingbo Mo, Samuel Stevens, Zhen Wang, Xiang Yue, Tianshu Zhang, Yu Su, Huan Sun

Abstract: We present TacoBot, a task-oriented dialogue system built for the inaugural Alexa Prize TaskBot Challenge, which assists users in completing multi-step cooking and home improvement tasks. TacoBot is designed with a user-centered principle and aspires to deliver a collaborative and accessible dialogue experience. Towards that end, it is equipped with accurate language understanding, flexible dialogue management, and engaging response generation. Furthermore, TacoBot is backed by a strong search engine and an automated end-to-end test suite. In bootstrapping the development of TacoBot, we explore a series of data augmentation strategies to train advanced neural language processing models and continuously improve the dialogue experience with collected real conversations. At the end of the semifinals, TacoBot achieved an average rating of 3.55/5.0.

Submitted 21 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: Published in 1st Proceedings of Alexa Prize TaskBot (Alexa Prize 2021). TacoBot won 3rd place in the challenge. See project website https://sunlab-osu.github.io/tacobot/ for details

arXiv:2112.13260 [pdf, other]

Utilizing gradient approximations to optimize data selection protocols for tumor growth model calibration

Authors: Allison L. Lewis, Kathleen M. Storey, Heyrim Cho, Anna C. Zittle

Abstract: The use of mathematical models to make predictions about tumor growth and response to treatment has become increasingly more prevalent in the clinical setting. The level of complexity within these models ranges broadly, and the calibration of more complex models correspondingly requires more detailed clinical data. This raises questions about how much data should be collected and when, in order to minimize the total amount of data used and the time until a model can be calibrated accurately. To address these questions, we propose a Bayesian information-theoretic procedure, using a gradient-based score function to determine the optimal data collection times for model calibration. The novel score function introduced in this work eliminates the need for a weight parameter used in a previous study's score function, while still yielding accurate and efficient model calibration using even fewer scans on a sample set of synthetic data, simulating tumors of varying levels of radiosensitivity. We also conduct a robust analysis of the calibration accuracy and certainty, using both error and uncertainty metrics. Unlike the error analysis of the previous study, the inclusion of uncertainty analysis in this work, as a means for deciding when the algorithm can be terminated, provides a more realistic option for clinical decision-making, since it does not rely on data that will be collected later in time.

Submitted 25 December, 2021; originally announced December 2021.

Comments: 27 pages, 10 figures

arXiv:2112.09017 [pdf, other]

doi 10.1073/pnas.2122762119

Large Scale Distributed Linear Algebra With Tensor Processing Units

Authors: Adam G. M. Lewis, Jackson Beall, Martin Ganahl, Markus Hauru, Shrestha Basu Mallick, Guifre Vidal

Abstract: We have repurposed Google Tensor Processing Units (TPUs), application-specific chips developed for machine learning, into large-scale dense linear algebra supercomputers. The TPUs' fast inter-core interconnects (ICI)s, physically two-dimensional network topology, and high-bandwidth memory (HBM) permit distributed matrix multiplication algorithms to rapidly become computationally bound. In this regime, the matrix-multiply units (MXU)s dominate the runtime, yielding impressive scaling, performance, and raw size: operating in float32 precision, a full 2048-core pod of third generation TPUs can multiply two matrices with linear size $N = 2^{20} = 1\,048\,576$ in about 2 minutes. Via curated algorithms emphasizing large, single-core matrix multiplications, other tasks in dense linear algebra can similarly scale. As examples, we present (i) QR decomposition; (ii) resolution of linear systems; and (iii) the computation of matrix functions by polynomial iteration, demonstrated by the matrix polar factorization.

Submitted 16 December, 2021; originally announced December 2021.

Comments: 12 pages, 8 figures

arXiv:2111.15645 [pdf, other]

doi 10.1137/21M1468450

Survey Descent: A Multipoint Generalization of Gradient Descent for Nonsmooth Optimization

Authors: X. Y. Han, Adrian S. Lewis

Abstract: For strongly convex objectives that are smooth, the classical theory of gradient descent ensures linear convergence relative to the number of gradient evaluations. An analogous nonsmooth theory is challenging. Even when the objective is smooth at every iterate, the corresponding local models are unstable and the number of cutting planes invoked by traditional remedies is difficult to bound, leading to convergence guarantees that are sublinear relative to the cumulative number of gradient evaluations. We instead propose a multipoint generalization of the gradient descent iteration for local optimization. While designed with general objectives in mind, we are motivated by a "max-of-smooth" model that captures the subdifferential dimension at optimality. We prove linear convergence when the objective is itself max-of-smooth, and experiments suggest a more general phenomenon.

Submitted 27 September, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

Comments: Accepted to SIAM Journal on Optimization (SIOPT)

MSC Class: 90C25; 65K05; 49M37

Journal ref: SIAM Journal on Optimization, 33(1), 36-62

arXiv:2110.08345 [pdf, other]

Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction

Authors: Lingbo Mo, Ashley Lewis, Huan Sun, Michael White

Abstract: Existing studies on semantic parsing focus primarily on mapping a natural-language utterance to a corresponding logical form in one turn. However, because natural language can contain a great deal of ambiguity and variability, this is a difficult challenge. In this work, we investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language and enables the user to make corrections through natural-language feedback for individual steps. We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework, aiming to increase the transparency of the parsing process and help the user appropriately trust the final answer. To do so, we construct INSPIRED, a crowdsourced dialogue dataset derived from the ComplexWebQuestions dataset. Our experiments show that the interactive framework with human feedback has the potential to greatly improve overall parse accuracy. Furthermore, we develop a pipeline for dialogue simulation to evaluate our framework w.r.t. a variety of state-of-the-art KBQA models without involving further crowdsourcing effort. The results demonstrate that our interactive semantic parsing framework promises to be effective across such models.

Submitted 27 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

Comments: Accepted by Findings of ACL 2022

arXiv:2109.11480 [pdf, other]

Improving Tuberculosis (TB) Prediction using Synthetically Generated Computed Tomography (CT) Images

Authors: Ashia Lewis, Evanjelin Mahmoodi, Yuyue Zhou, Megan Coffee, Elena Sizikova

Abstract: The evaluation of infectious disease processes on radiologic images is an important and challenging task in medical image analysis. Pulmonary infections can often be best imaged and evaluated through computed tomography (CT) scans, which are often not available in low-resource environments and difficult to obtain for critically ill patients. On the other hand, X-ray, a different type of imaging procedure, is inexpensive, often available at the bedside and more widely available, but offers a simpler, two dimensional image. We show that by relying on a model that learns to generate CT images from X-rays synthetically, we can improve the automatic disease classification accuracy and provide clinicians with a different look at the pulmonary disease process. Specifically, we investigate Tuberculosis (TB), a deadly bacterial infectious disease that predominantly affects the lungs, but also other organ systems. We show that relying on synthetically generated CT improves TB identification by 7.50% and distinguishes TB properties up to 12.16% better than the X-ray baseline.

Submitted 23 September, 2021; originally announced September 2021.

Comments: Accepted to International Conference on Computer Vision (ICCV) 2021 Computer Vision for Automated Medical Diagnosis (CVAMD) Workshop

arXiv:2107.05748 [pdf, other]

Evaluation of an Inflated Beam Model Applied to Everted Tubes

Authors: Joel Hwee, Andrew Lewis, Allison Raines, Blake Hannaford

Abstract: Everted tubes have often been modeled as inflated beams to determine transverse and axial buckling conditions. This paper seeks to validate the assumption that an everted tube can be modeled in this way. The tip deflections of everted and uneverted beams under transverse cantilever loads are compared with a tip deflection model that was first developed for aerospace applications. LDPE and silicone coated nylon beams were tested; everted and uneverted beams showed similar tip deflection. The literature model best fit the tip deflection of LDPE tubes with an average tip deflection error of 6 mm, while the nylon tubes had an average tip deflection error of 16.4 mm. Everted beams of both materials buckled at 83% of the theoretical buckling condition while straight beams collapsed at 109% of the theoretical buckling condition. The curvature of everted beams was estimated from a tip load and a known displacement showing relative errors of 14.2% and 17.3% for LDPE and nylon beams respectively. This paper shows a numerical method for determining inflated beam deflection. It also provides an iterative method for computing static tip pose and applied wall forces in a known environment.

Submitted 12 July, 2021; originally announced July 2021.

arXiv:2103.14101 [pdf, other]

Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems

Authors: Grace A. Lewis, Stephany Bellomo, Ipek Ozkaya

Abstract: Increasing availability of machine learning (ML) frameworks and tools, as well as their promise to improve solutions to data-driven decision problems, has resulted in popularity of using ML techniques in software systems. However, end-to-end development of ML-enabled systems, as well as their seamless deployment and operations, remain a challenge. One reason is that development and deployment of ML-enabled systems involves three distinct workflows, perspectives, and roles, which include data science, software engineering, and operations. These three distinct perspectives, when misaligned due to incorrect assumptions, cause ML mismatches which can result in failed systems. We conducted an interview and survey study where we collected and validated common types of mismatches that occur in end-to-end development of ML-enabled systems. Our analysis shows that how each role prioritizes the importance of relevant mismatches varies, potentially contributing to these mismatched assumptions. In addition, the mismatch categories we identified can be specified as machine readable descriptors contributing to improved ML-enabled system development. In this paper, we report our findings and their implications for improving end-to-end ML-enabled system development.

Submitted 25 March, 2021; originally announced March 2021.

Comments: 1st Workshop on AI Engineering: Software Engineering for AI (WAIN 2021) held at the 2021 IEEE/ACM 43rd International Conference on Software Engineering

arXiv:2102.13477 [pdf, other]

B-ETS: A Trusted Blockchain-based Emissions Trading System for Vehicle-to-Vehicle Networks

Authors: Lam Duc Nguyen, Amari N. Lewis, Israel Leyva-Mayorga, Amelia Regan, Petar Popovski

Abstract: Urban areas are negatively impacted by Carbon Dioxide (CO2) and Nitrogen Oxide (NOx) emissions. In order to achieve a cost-effective reduction of greenhouse gas emissions and to combat climate change, the European Union (EU) introduced an Emissions Trading System (ETS) where organizations can buy or receive emission allowances as needed. The current ETS is a centralized one, consisting of a set of complex rules. It is currently administered at the organizational level and is used for fixed-point sources of pollution such as factories, power plants, and refineries. However, the current ETS cannot efficiently cope with vehicle mobility, even though vehicles are one of the primary sources of CO2 and NOx emissions. In this study, we propose a new distributed Blockchain-based emissions allowance trading system called B-ETS. This system enables transparent and trustworthy data exchange as well as trading of allowances among vehicles, relying on vehicle-to-vehicle communication. In addition, we introduce an economic incentive-based mechanism that appeals to individual drivers and leads them to modify their driving behavior in order to reduce emissions. The efficiency of the proposed system is studied through extensive simulations, showing how increased vehicle connectivity can lead to a reduction of the emissions generated from those vehicles. We demonstrate that our method can be used for full life-cycle monitoring and fuel economy reporting. This leads us to conjecture that the proposed system could lead to important behavioral changes among the drivers.

Submitted 28 February, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

Comments: Accepted at the 7th International Conference on Vehicle Technology and Intelligent Transport Systems (VEHITS) 2021

Journal ref: 7th International Conference on Vehicle Technology and Intelligent Transport Systems 2021

arXiv:2101.00699 [pdf, ps, other]

The structure of conservative gradient fields

Authors: Adrian Lewis, Tonghua Tian

Abstract: The classical Clarke subdifferential alone is inadequate for understanding automatic differentiation in nonsmooth contexts. Instead, we can sometimes rely on enlarged generalized gradients called "conservative fields", defined through the natural path-wise chain rule: one application is the convergence analysis of gradient-based deep learning algorithms. In the semi-algebraic case, we show that all conservative fields are in fact just Clarke subdifferentials plus normals of manifolds in underlying Whitney stratifications.

Submitted 3 January, 2021; originally announced January 2021.

MSC Class: 49J53; 90C56; 65K10; 68T07; 14P10 ACM Class: G.1.6; I.2.6

arXiv:2012.04063 [pdf]

Cost-effective Machine Learning Inference Offload for Edge Computing

Authors: Christian Makaya, Amalendu Iyer, Jonathan Salfity, Madhu Athreya, M Anthony Lewis

Abstract: Computing at the edge is increasingly important since a massive amount of data is generated. This poses challenges in transporting all that data to the remote data centers and cloud, where they can be processed and analyzed. On the other hand, harnessing the edge data is essential for offering data-driven and machine learning-based applications, if the challenges, such as device capabilities, connectivity, and heterogeneity, can be mitigated. Machine learning applications are very compute-intensive and require processing of large amounts of data. However, edge devices are often resource-constrained in terms of compute resources, power, storage, and network connectivity. This limits their potential to efficiently and accurately run state-of-the-art deep neural network (DNN) models, which are becoming larger and more complex. This paper proposes a novel offloading mechanism by leveraging installed-base on-premises (edge) computational resources. The proposed mechanism allows the edge devices to offload heavy and compute-intensive workloads to edge nodes instead of using remote cloud. Our offloading mechanism has been prototyped and tested with state-of-the-art person and object detection DNN models for mobile robots and video surveillance applications. The performance shows a significant gain compared to cloud-based offloading strategies in terms of accuracy and latency.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2010.06003 [pdf, other]

Modeling and Analysis of Data Trading on Blockchain-based Market in IoT Networks

Authors: Lam Duc Nguyen, Israel Leyva-Mayorga, Amari N. Lewis, Petar Popovski

Abstract: Mobile devices with embedded sensors for data collection and environmental sensing create a basis for a cost-effective approach for data trading. For example, these data can be related to pollution and gas emissions, which can be used to check the compliance with national and international regulations. The current approach for IoT data trading relies on a centralized third-party entity to negotiate between data consumers and data providers, which is inefficient and insecure on a large scale. In comparison, a decentralized approach based on distributed ledger technologies (DLT) enables data trading while ensuring trust, security, and privacy. However, due to the lack of understanding of the communication efficiency between sellers and buyers, there is still a significant gap in benchmarking the data trading protocols in IoT environments. Motivated by this knowledge gap, we introduce a model for DLT-based IoT data trading over the Narrowband Internet of Things (NB-IoT) system, intended to support massive environmental sensing. We characterize the communication efficiency of three basic DLT-based IoT data trading protocols via NB-IoT connectivity in terms of latency and energy consumption. The model and analyses of these protocols provide a benchmark for IoT data trading applications.

Submitted 20 December, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

Comments: 10 pages, 8 figures, Accepted at IEEE Internet of Things Journal

arXiv:2009.02620 [pdf, other]

Bayesian information-theoretic calibration of patient-specific radiotherapy sensitivity parameters for informing effective scanning protocols in cancer

Authors: Heyrim Cho, Allison L. Lewis, Kathleen M. Storey

Abstract: With new advancements in technology, it is now possible to collect data for a variety of different metrics describing tumor growth, including tumor volume, composition, and vascularity, among others. For any proposed model of tumor growth and treatment, we observe large variability among individual patients' parameter values, particularly those relating to treatment response; thus, exploiting the use of these various metrics for model calibration can be helpful to infer such patient-specific parameters both accurately and early, so that treatment protocols can be adjusted mid-course for maximum efficacy. However, taking measurements can be costly and invasive, limiting clinicians to a sparse collection schedule. As such, the determination of optimal times and metrics for which to collect data in order to best inform proper treatment protocols could be of great assistance to clinicians. In this investigation, we employ a Bayesian information-theoretic calibration protocol for experimental design in order to identify the optimal times at which to collect data for informing treatment parameters. Within this procedure, data collection times are chosen sequentially to maximize the reduction in parameter uncertainty with each added measurement, ensuring that a budget of $n$ high-fidelity experimental measurements results in maximum information gain about the low-fidelity model parameter values. In addition to investigating the optimal temporal pattern for data collection, we also develop a framework for deciding which metrics should be utilized at each data collection point. We illustrate this framework with a variety of toy examples, each utilizing a radiotherapy treatment regimen. For each scenario, we analyze the dependence of the predictive power of the low-fidelity model upon the measurement budget.

Submitted 5 September, 2020; originally announced September 2020.

arXiv:2006.12628 [pdf]

Paratransit Agency Responses to the Adoption of Sub-contracted Services Using Secure Technologies

Authors: Amari N. Lewis, Amelia C. Regan

Abstract: Transportation agencies across the United States have the responsibility of providing transportation services for all travelers. Paratransit services which are designed to meet the needs of disabled travelers have been available to a certain extent for decades, but under the Americans with Disabilities Act mandate of 1990, uniform requirements were adopted across U.S. agencies. Most of these paratransit operators offer services which must be scheduled at least a day in advance. And, provision of these services by accessible busses is generally very expensive. Therefore, many agencies are considering sub-contracting some services to approved ride-hailing or taxi services. The purpose of this work is to examine the opinions of various public agencies with respect to the adoption of sub-contracted services through the use of secure technologies. Our research provides insight into the future of these partnerships. Agencies expressed interest in the use of privacy preserving secure technologies as well as a strong desire for better software solutions for paratransit passengers and operators. The on-line survey received thirty responses for a completion rate of 19.1%. Our primary findings are that a major concern of agencies for this sort of arrangement is the lack of Wheelchair Accessible Vehicles offered by taxis and TNCs and about 36% of the surveyed agencies have not considered such partnerships.

Submitted 22 June, 2020; originally announced June 2020.

Comments: 14 pages, 7 figures

arXiv:2004.11479 [pdf]

Pill Identification using a Mobile Phone App for Assessing Medication Adherence and Post-Market Drug Surveillance

Authors: David Prokop, Joseph Babigumira, Ashleigh Lewis

Abstract: Objectives: Medication non-adherence is an important factor in clinical practice and research methodology. There have been many methods of measuring adherence yet no recognized standard for adherence. Here we conduct a software study of the usefulness and efficacy of a mobile phone app to measure medication adherence using photographs taken by a phone app of medications and self-reported health measures. Results: The participants were asked by the app 'would help to keep track of your medication', their response indicated 92.9% felt the app 'would you use this app every day' to improve their medication adherence. The subjects were also asked by the app if they 'would photograph their pills on a daily basis'. Subject responses indicated 63% would use the app on a daily basis. By using the data collected, we determined that subjects who used the app on daily basis were more likely to adhere to the prescribed regimen. Conclusions: Pill photographs are a useful measure of adherence, allowing more accurate time measures and more frequent adherence assessment. Given the ubiquity of mobile telephone use, and the relative ease of this adherence measurement method, we believe it is a useful and cost-effective approach. However we feel the 'manual' nature of using the phone for taking a photograph of a pill has individual variability and an 'automatic' method is needed to reduce data inconsistency.

Submitted 23 April, 2020; originally announced April 2020.

Comments: 12 pages, 1 photo, 6 tables, 3 charts, 1 figure

arXiv:1912.10564 [pdf, other]

Teaching Responsible Data Science: Charting New Pedagogical Territory

Authors: Julia Stoyanovich, Armanda Lewis

Abstract: Although numerous ethics courses are available, with many focusing specifically on technology and computer ethics, pedagogical approaches employed in these courses rely exclusively on texts rather than on software development or data analysis. Technical students often consider these courses unimportant and a distraction from the "real" material. To develop instructional materials and methodologies that are thoughtful and engaging, we must strive for balance: between texts and coding, between critique and solution, and between cutting-edge research and practical applicability. Finding such balance is particularly difficult in the nascent field of responsible data science (RDS), where we are only starting to understand how to interface between the intrinsically different methodologies of engineering and social sciences. In this paper we recount a recent experience in developing and teaching an RDS course to graduate and advanced undergraduate students in data science. We then dive into an area that is critically important to RDS -- transparency and interpretability of machine-assisted decision-making, and tie this area to the needs of emerging RDS curricula. Recounting our own experience, and leveraging literature on pedagogical methods in data science and beyond, we propose the notion of an "object-to-interpret-with". We link this notion to "nutritional labels" -- a family of interpretability tools that are gaining popularity in RDS research and practice.
With this work we aim to contribute to the nascent area of RDS education, and to inspire others in the community to come together to develop a deeper theoretical understanding of the pedagogical needs of RDS, and contribute concrete educational materials and methodologies that others can use. All course materials are publicly available at https://dataresponsibly.github.io/courses.

Submitted 22 December, 2019; originally announced December 2019.

arXiv:1911.13282 [pdf, other]

Quantum Computation with Machine-Learning-Controlled Quantum Stuff

Authors: Lucien Hardy, Adam G. M. Lewis

Abstract: We describe how one may go about performing quantum computation with arbitrary "quantum stuff", as long as it has some basic physical properties. Imagine a long strip of stuff, equipped with regularly spaced wires to provide input settings and to read off outcomes. After showing how the corresponding map from settings to outcomes can be construed as a quantum circuit, we provide a machine learning algorithm to tomographically "learn" which settings implement the members of a universal gate set. At optimum, arbitrary quantum gates, and thus arbitrary quantum programs, can be implemented using the stuff.

Submitted 29 November, 2019; originally announced November 2019.

Comments: 13 pages, 3 figures, 1 table, 3 algorithms

arXiv:1911.07721 [pdf, other]

Program synthesis performance constrained by non-linear spatial relations in Synthetic Visual Reasoning Test

Authors: Lu Yihe, Scott C. Lowe, Penelope A. Lewis, Mark C. W. van Rossum

Abstract: Despite remarkable advances in automated visual recognition by machines, some visual tasks remain challenging for machines. Fleuret et al. (2011) introduced the Synthetic Visual Reasoning Test (SVRT) to highlight this point, which required classification of images consisting of randomly generated shapes based on hidden abstract rules using only a few examples. Ellis et al. (2015) demonstrated that a program synthesis approach could solve some of the SVRT problems with unsupervised, few-shot learning, whereas they remained challenging for several convolutional neural networks trained with thousands of examples. Here we re-considered the human and machine experiments, because they followed different protocols and yielded different statistics. We thus proposed a quantitative reinterpretation of the data between the protocols, so that we could make fair comparison between human and machine performance. We improved the program synthesis classifier by correcting the image parsings, and compared the results to the performance of other machine agents and human subjects. We grouped the SVRT problems into different types by the two aspects of the core characteristics for classification: shape specification and location relation. We found that the program synthesis classifier could not solve problems involving shape distances, because it relied on symbolic computation which scales poorly with input dimension and adding distances into such computation would increase the dimension combinatorially with the number of shapes in an image.
Therefore, although the program synthesis classifier is capable of abstract reasoning, its performance is highly constrained by the accessible information in image parsings.

Submitted 19 November, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1910.06136 [pdf, other]

Component Mismatches Are a Critical Bottleneck to Fielding AI-Enabled Systems in the Public Sector

Authors: Grace A. Lewis, Stephany Bellomo, April Galyardt

Abstract: The use of machine learning or artificial intelligence (ML/AI) holds substantial potential toward improving many functions and needs of the public sector. In practice however, integrating ML/AI components into public sector applications is severely limited not only by the fragility of these components and their algorithms, but also because of mismatches between components of ML-enabled systems. For example, if an ML model is trained on data that is different from data in the operational environment, field performance of the ML component will be dramatically reduced. Separate from software engineering considerations, the expertise needed to field an ML/AI component within a system frequently comes from outside software engineering. As a result, assumptions and even descriptive language used by practitioners from these different disciplines can exacerbate other challenges to integrating ML/AI components into larger systems. We are investigating classes of mismatches in ML/AI systems integration, to identify the implicit assumptions made by practitioners in different fields (data scientists, software engineers, operations staff) and find ways to communicate the appropriate information explicitly. We will discuss a few categories of mismatch, and provide examples from each class. To enable ML/AI components to be fielded in a meaningful way, we will need to understand the mismatches that exist and develop practices to mitigate the impacts of these mismatches.

Submitted 14 October, 2019; originally announced October 2019.

Comments: Presented at AAAI FSS-19: Artificial Intelligence in Government and Public Sector, Arlington, Virginia, USA

arXiv:1804.10120 [pdf, other]

Automatic generation of CUDA code performing tensor manipulations using C++ expression templates

Authors: Adam G. M. Lewis, Harald P. Pfeiffer

Abstract: We present a C++ library, TLoops, which uses a hierarchy of expression templates to represent operations upon tensorial quantities in single lines of C++ code that resemble analytic equations. These expressions may be run as-is, but may also be used to emit equivalent low-level C or CUDA code, which either performs the operations more quickly on the CPU, or allows them to be rapidly ported to run on NVIDIA GPUs. We detail the expression template and C++-class hierarchy that represents the expressions and which makes automatic code-generation possible. We then present benchmarks of the expression-template code, the automatically generated C code, and the automatically generated CUDA code running on several generations of NVIDIA GPU.

Submitted 24 April, 2018; originally announced April 2018.

Comments: 46 pages, 5 figures

arXiv:1609.04667 [pdf]

War-Algorithm Accountability

Authors: Dustin A. Lewis, Gabriella Blum, Naz K. Modirzadeh

Abstract: In this briefing report, we introduce a new concept (war algorithms) that elevates algorithmically-derived choices and decisions to a, and perhaps the, central concern regarding technical autonomy in war. We thereby aim to shed light on and recast the discussion regarding autonomous weapon systems. We define war algorithm as any algorithm that is expressed in computer code, that is effectuated through a constructed system, and that is capable of operating in relation to armed conflict. In introducing this concept, our foundational technological concern is the capability of a constructed system, without further human intervention, to help make and effectuate a decision or choice of a war algorithm. Distilled, the two core ingredients are an algorithm expressed in computer code and a suitably capable constructed system. Through that lens, we link international law and related accountability architectures to relevant technologies. We sketch a three-part (non-exhaustive) approach that highlights traditional and unconventional accountability avenues. We focus largely on international law because it is the only normative regime that purports, in key respects but with important caveats, to be both universal and uniform. By not limiting our inquiry only to weapon systems, we take an expansive view, showing how the broad concept of war algorithms might be susceptible to regulation, and how those algorithms might already fit within the existing regulatory system established by international law.

Submitted 12 September, 2016; originally announced September 2016.

arXiv:1509.00309 [pdf, other]

doi 10.1145/2833179.2833186

Scalable Task-Based Algorithm for Multiplication of Block-Rank-Sparse Matrices

Authors: Justus A. Calvin, Cannada A. Lewis, Edward F. Valeev

Abstract: A task-based formulation of Scalable Universal Matrix Multiplication Algorithm (SUMMA), a popular algorithm for matrix multiplication (MM), is applied to the multiplication of hierarchy-free, rank-structured matrices that appear in the domain of quantum chemistry (QC). The novel features of our formulation are: (1) concurrent scheduling of multiple SUMMA iterations, and (2) fine-grained task-based composition. These features make it tolerant of the load imbalance due to the irregular matrix structure and eliminate all artifactual sources of global synchronization. Scalability of iterative computation of square-root inverse of block-rank-sparse QC matrices is demonstrated; for full-rank (dense) matrices the performance of our SUMMA formulation usually exceeds that of the state-of-the-art dense MM implementations (ScaLAPACK and Cyclops Tensor Framework).

Submitted 9 October, 2015; v1 submitted 1 September, 2015; originally announced September 2015.

Comments: 8 pages, 6 figures, accepted to IA3 2015. arXiv admin note: text overlap with arXiv:1504.05046

arXiv:1504.07135 [pdf]

doi 10.1007/978-3-319-24255-2_16

Systems-theoretic Safety Assessment of Robotic Telesurgical Systems

Authors: Homa Alemzadeh, Daniel Chen, Andrew Lewis, Zbigniew Kalbarczyk, Jaishankar Raman, Nancy Leveson, Ravishankar K. Iyer

Abstract: Robotic telesurgical systems are one of the most complex medical cyber-physical systems on the market, and have been used in over 1.75 million procedures during the last decade. Despite significant improvements in design of robotic surgical systems through the years, there have been ongoing occurrences of safety incidents during procedures that negatively impact patients. This paper presents an approach for systems-theoretic safety assessment of robotic telesurgical systems using software-implemented fault-injection. We used a systems-theoretic hazard analysis technique (STPA) to identify the potential safety hazard scenarios and their contributing causes in the RAVEN II robot, an open-source robotic surgical platform. We integrated the robot control software with a software-implemented fault-injection engine which measures the resilience of the system to the identified safety hazard scenarios by automatically inserting faults into different parts of the robot control software. Representative hazard scenarios from real robotic surgery incidents reported to the U.S. Food and Drug Administration (FDA) MAUDE database were used to demonstrate the feasibility of the proposed approach for safety-based design of robotic telesurgical systems.

Submitted 8 July, 2015; v1 submitted 27 April, 2015; originally announced April 2015.

Comments: Revised based on reviewers' feedback. To appear in the International Conference on Computer Safety, Reliability, and Security (SAFECOMP) 2015

arXiv:0909.2368 [pdf]

Web Single Sign-On Authentication using SAML

Authors: Kelly D. Lewis, James E. Lewis

Abstract: Companies have increasingly turned to application service providers (ASPs) or Software as a Service (SaaS) vendors to offer specialized web-based services that will cut costs and provide specific and focused applications to users. The complexity of designing, installing, configuring, deploying, and supporting the system with internal resources can be eliminated with this type of methodology, providing great benefit to organizations. However, these models can present an authentication problem for corporations with a large number of external service providers. This paper describes the implementation of Security Assertion Markup Language (SAML) and its capabilities to provide secure single sign-on (SSO) solutions for externally hosted applications.

Submitted 12 September, 2009; originally announced September 2009.

Comments: International Journal of Computer Science Issues (IJCSI), Volume 1, pp41-48, August 2009

Journal ref: K. D. LEWIS and J. E. LEWIS, "Web Single Sign-On Authentication using SAML", International Journal of Computer Science Issues (IJCSI), Volume 1, pp41-48, August 2009

arXiv:math/0204068 [pdf, ps, other]

Computational problems for vector-valued quadratic forms

Authors: Francesco Bullo, Jorge Cortes, Andrew D. Lewis, Sonia Martinez

Abstract: Given two real vector spaces $U$ and $V$, and a symmetric bilinear map $B: U\times U\to V$, let $Q_B$ be its associated quadratic map. The problems we consider are as follows: (i) are there necessary and sufficient conditions, checkable in polynomial-time, for determining when $Q_B$ is surjective?; (ii) if $Q_B$ is surjective, given $v\in V$ is there a polynomial-time algorithm for finding a point $u\in Q_B^{-1}(v)$?; (iii) are there necessary and sufficient conditions, checkable in polynomial-time, for determining when $B$ is indefinite? We present an alternative formulation of the problem of determining the image of a vector-valued quadratic form in terms of the unprojectivised Veronese surface. The relation of these questions with several interesting problems in Control Theory is illustrated.

Submitted 5 April, 2002; originally announced April 2002.

Comments: 6 pages, no figures, submitted to Workshop on Open Problems in Mathematical Systems and Control Theory

MSC Class: 11Exx; 14Pxx; 14Q99; 15A63

Showing 1–44 of 44 results for author: Lewis, A