-
Safe AI for health and beyond -- Monitoring to transform a health service
Authors:
Mahed Abroshan,
Michael Burkhart,
Oscar Giles,
Sam Greenbury,
Zoe Kourtzi,
Jack Roberts,
Mihaela van der Schaar,
Jannetta S Steyn,
Alan Wilson,
May Yong
Abstract:
Machine learning techniques are effective for building predictive models because they identify patterns in large datasets. Development of a model for complex real-life problems often stops at the point of publication, proof of concept or when made accessible through some mode of deployment. However, a model in the medical domain risks becoming obsolete as patient demographics, systems and clinical practices change. The maintenance and monitoring of predictive model performance post-publication is crucial to enable their safe and effective long-term use. We will assess the infrastructure required to monitor the outputs of a machine learning algorithm, and present two scenarios with examples of monitoring and updates of models, firstly on a breast cancer prognosis model trained on public longitudinal data, and secondly on a neurodegenerative stratification algorithm that is currently being developed and tested in clinic.
Submitted 6 June, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
Non-Imaging Medical Data Synthesis for Trustworthy AI: A Comprehensive Survey
Authors:
Xiaodan Xing,
Huanjun Wu,
Lichao Wang,
Iain Stenson,
May Yong,
Javier Del Ser,
Simon Walsh,
Guang Yang
Abstract:
Data quality is the key factor for the development of trustworthy AI in healthcare. A large volume of curated datasets with controlled confounding factors can help improve the accuracy, robustness and privacy of downstream AI algorithms. However, access to good quality datasets is limited by the technical difficulty of data acquisition, and large-scale sharing of healthcare data is hindered by strict ethical restrictions. Data synthesis algorithms, which generate data with a distribution similar to that of real clinical data, can serve as a potential solution to address the scarcity of good quality data during the development of trustworthy AI. However, state-of-the-art data synthesis algorithms, especially deep learning algorithms, focus more on imaging data while neglecting the synthesis of non-imaging healthcare data, including clinical measurements, medical signals and waveforms, and electronic healthcare records (EHRs). Thus, in this paper, we will review the synthesis algorithms, particularly for non-imaging medical data, with the aim of providing trustworthy AI in this domain. This tutorial-styled review paper will provide comprehensive descriptions of non-imaging medical data synthesis on aspects including algorithms, evaluations, limitations and future research directions.
Submitted 17 September, 2022;
originally announced September 2022.
-
Lux: Always-on Visualization Recommendations for Exploratory Dataframe Workflows
Authors:
Doris Jung-Lin Lee,
Dixin Tang,
Kunal Agarwal,
Thyne Boonmark,
Caitlyn Chen,
Jake Kang,
Ujjaini Mukhopadhyay,
Jerry Song,
Micah Yong,
Marti A. Hearst,
Aditya G. Parameswaran
Abstract:
Exploratory data science largely happens in computational notebooks with dataframe APIs, such as pandas, that support flexible means to transform, clean, and analyze data. Yet, visually exploring data in dataframes remains tedious, requiring substantial programming effort for visualization and mental effort to determine what analysis to perform next. We propose Lux, an always-on framework for accelerating visual insight discovery in dataframe workflows. When users print a dataframe in their notebooks, Lux recommends visualizations to provide a quick overview of the patterns and trends and suggests promising analysis directions. Lux features a high-level language for generating visualizations on demand to encourage rapid visual experimentation with data. We demonstrate that through the use of a careful design and three system optimizations, Lux adds no more than two seconds of overhead on top of pandas for over 98% of datasets in the UCI repository. We evaluate Lux in terms of usability via a controlled first-use study and interviews with early adopters, finding that Lux helps fulfill the needs of data scientists for visualization support within their dataframe workflows. Lux has already been embraced by data science practitioners, with over 3.1k stars on GitHub.
Submitted 22 December, 2021; v1 submitted 30 April, 2021;
originally announced May 2021.
-
Urban Heat Islands: Beating the Heat with Multi-Modal Spatial Analysis
Authors:
Marcus Yong,
Kwan Hui Lim
Abstract:
In today's highly urbanized environment, the Urban Heat Island (UHI) phenomenon, in which surface temperatures in urbanized areas are found to be much higher than in surrounding rural areas, is increasingly prevalent. Excessive levels of heat stress lead to problems at various levels, ranging from the individual to the world. At the individual level, UHI could lead to the human body being unable to cope and breaking down in terms of core functions. At the world level, UHI potentially contributes to global warming and adversely affects the environment. Using a multi-modal dataset comprising remote sensory imagery, geo-spatial data and population data, we propose a framework for investigating how UHI is affected by a city's urban form characteristics through the use of statistical modelling. Using Singapore as a case study, we demonstrate the usefulness of this framework and discuss our main findings in understanding the effects of UHI and urban form characteristics.
Submitted 5 December, 2020;
originally announced December 2020.
-
User-Centered Programming Language Design: A Course-Based Case Study
Authors:
Michael Coblenz,
Ariel Davis,
Megan Hofmann,
Vivian Huang,
Siyue **,
Max Krieger,
Kyle Liang,
Brian Wei,
Mengchen Sam Yong,
Jonathan Aldrich
Abstract:
Recently, user-centered methods have been proposed to improve the design of programming languages. In order to explore what benefits these methods might have for novice programming language designers, we taught a collection of user-centered programming language design methods to a group of eight students. We observed that natural programming and usability studies helped the students refine their language designs and identify opportunities for improvement, even in the short duration of a course project.
Submitted 15 November, 2020;
originally announced November 2020.
-
Efficiently Embedding Dynamic Knowledge Graphs
Authors:
Tianxing Wu,
Arijit Khan,
Melvin Yong,
Guilin Qi,
Meng Wang
Abstract:
Knowledge graph (KG) embedding encodes the entities and relations from a KG into low-dimensional vector spaces to support various applications such as KG completion, question answering, and recommender systems. In the real world, knowledge graphs (KGs) are dynamic and evolve over time with the addition or deletion of triples. However, most existing models focus on embedding static KGs while neglecting dynamics. To adapt to the changes in a KG, these models need to be retrained on the whole KG at a high time cost. In this paper, to tackle the aforementioned problem, we propose a new context-aware Dynamic Knowledge Graph Embedding (DKGE) method which supports embedding learning in an online fashion. DKGE introduces two different representations (i.e., knowledge embedding and contextual element embedding) for each entity and each relation, in the joint modeling of entities and relations as well as their contexts, by employing two attentive graph convolutional networks, a gate strategy, and translation operations. This effectively helps limit the impact of a KG update to certain regions rather than the entire graph, so that DKGE can rapidly acquire the updated KG embedding by a proposed online learning algorithm. Furthermore, DKGE can also learn KG embedding from scratch. Experiments on the tasks of link prediction and question answering in a dynamic environment demonstrate the effectiveness and efficiency of DKGE.
Submitted 31 May, 2022; v1 submitted 15 October, 2019;
originally announced October 2019.
-
MediaPipe: A Framework for Building Perception Pipelines
Authors:
Camillo Lugaresi,
Jiuqiang Tang,
Hadon Nash,
Chris McClanahan,
Esha Uboweja,
Michael Hays,
Fan Zhang,
Chuo-Ling Chang,
Ming Guang Yong,
Juhyun Lee,
Wan-Teh Chang,
Wei Hua,
Manfred Georg,
Matthias Grundmann
Abstract:
Building applications that perceive the world around them is challenging. A developer needs to (a) select and develop corresponding machine learning algorithms and models, (b) build a series of prototypes and demos, (c) balance resource consumption against the quality of the solutions, and finally (d) identify and mitigate problematic cases. The MediaPipe framework addresses all of these challenges. A developer can use MediaPipe to build prototypes by combining existing perception components, to advance them to polished cross-platform applications and measure system performance and resource consumption on target platforms. We show that these features enable a developer to focus on the algorithm or model development and use MediaPipe as an environment for iteratively improving their application with results reproducible across different devices and platforms. MediaPipe will be open-sourced at https://github.com/google/mediapipe.
Submitted 14 June, 2019;
originally announced June 2019.
-
Emotion-Recognition Using Smart Watch Sensor Data: Mixed-Design Study
Authors:
Juan C. Quiroz,
Elena Geangu,
Min Hooi Yong
Abstract:
This study investigates the use of movement sensor data from a smart watch to infer an individual's emotional state. We present our findings on a user study with 50 participants. The experimental design is a mixed-design study: within-subjects (emotions: happy, sad, neutral) and between-subjects (stimulus type: audio-visual "movie clips", audio "music clips"). Each participant experienced both emotions in a single stimulus type. All participants walked 250 m while wearing a smart watch on one wrist and a heart rate monitor strap on their chest. They also had to answer a short questionnaire (20 items; PANAS) before and after experiencing each emotion. The heart rate monitor served as supplementary information to our data. We performed time-series analysis on the data from the smart watch and a t-test on the questionnaire items to measure the change in emotional state. The heart rate data was analyzed using one-way ANOVA. We extracted features from the time-series using sliding windows and used the features to train and validate classifiers that determined an individual's emotion. Participants reported feeling less negative affect after watching sad videos or after listening to the sad music, P < .006. For the task of emotion recognition using classifiers, our results show that the personal models outperformed personal baselines, and achieved median accuracies higher than 78% for all conditions of the design study for the binary classification of happiness vs sadness. Our findings show that we are able to detect the changes in emotional state with data obtained from the smart watch as well as behavioral responses.
Submitted 6 January, 2019; v1 submitted 22 June, 2018;
originally announced June 2018.
-
Emotion-Recognition Using Smart Watch Accelerometer Data: Preliminary Findings
Authors:
Juan C. Quiroz,
Min Hooi Yong,
Elena Geangu
Abstract:
This study investigates the use of accelerometer data from a smart watch to infer an individual's emotional state. We present our preliminary findings on a user study with 50 participants. Participants were primed either with audio-visual (movie clips) or audio (classical music) to elicit emotional responses. Participants then walked while wearing a smart watch on one wrist and a heart rate strap on their chest. Our hypothesis is that the accelerometer signal will exhibit different patterns for participants in response to different emotion priming. We divided the accelerometer data using sliding windows, extracted features from each window, and used the features to train supervised machine learning algorithms to infer an individual's emotion from their walking pattern. Our discussion includes a description of the methodology, data collected, and early results.
Submitted 26 September, 2017;
originally announced September 2017.