
Safe AI for health and beyond -- Monitoring to transform a health service

Authors: Mahed Abroshan, Michael Burkhart, Oscar Giles, Sam Greenbury, Zoe Kourtzi, Jack Roberts, Mihaela van der Schaar, Jannetta S Steyn, Alan Wilson, May Yong

Abstract: Machine learning techniques are effective for building predictive models because they identify patterns in large datasets. Development of a model for complex real-life problems often stop at the point of publication, proof of concept or when made accessible through some mode of deployment. However, a model in the medical domain risks becoming obsolete as patient demographics, systems and clinical practices change. The maintenance and monitoring of predictive model performance post-publication is crucial to enable their safe and effective long-term use. We will assess the infrastructure required to monitor the outputs of a machine learning algorithm, and present two scenarios with examples of monitoring and updates of models, firstly on a breast cancer prognosis model trained on public longitudinal data, and secondly on a neurodegenerative stratification algorithm that is currently being developed and tested in clinic.

Submitted 6 June, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

Comments: 12 pages, 3 figures

ACM Class: I.2.1

arXiv:2209.09239 [pdf, other]

Non-Imaging Medical Data Synthesis for Trustworthy AI: A Comprehensive Survey

Authors: Xiaodan Xing, Huanjun Wu, Lichao Wang, Iain Stenson, May Yong, Javier Del Ser, Simon Walsh, Guang Yang

Abstract: Data quality is the key factor for the development of trustworthy AI in healthcare. A large volume of curated datasets with controlled confounding factors can help improve the accuracy, robustness and privacy of downstream AI algorithms. However, access to good quality datasets is limited by the technical difficulty of data acquisition and large-scale sharing of healthcare data is hindered by strict ethical restrictions. Data synthesis algorithms, which generate data with a similar distribution as real clinical data, can serve as a potential solution to address the scarcity of good quality data during the development of trustworthy AI. However, state-of-the-art data synthesis algorithms, especially deep learning algorithms, focus more on imaging data while neglecting the synthesis of non-imaging healthcare data, including clinical measurements, medical signals and waveforms, and electronic healthcare records (EHRs). Thus, in this paper, we will review the synthesis algorithms, particularly for non-imaging medical data, with the aim of providing trustworthy AI in this domain. This tutorial-styled review paper will provide comprehensive descriptions of non-imaging medical data synthesis on aspects including algorithms, evaluations, limitations and future research directions.

Submitted 17 September, 2022; originally announced September 2022.

Comments: 35 pages, Submitted to ACM Computing Surveys

arXiv:2105.00121 [pdf, other]

Lux: Always-on Visualization Recommendations for Exploratory Dataframe Workflows

Authors: Doris Jung-Lin Lee, Dixin Tang, Kunal Agarwal, Thyne Boonmark, Caitlyn Chen, Jake Kang, Ujjaini Mukhopadhyay, Jerry Song, Micah Yong, Marti A. Hearst, Aditya G. Parameswaran

Abstract: Exploratory data science largely happens in computational notebooks with dataframe APIs, such as pandas, that support flexible means to transform, clean, and analyze data. Yet, visually exploring data in dataframes remains tedious, requiring substantial programming effort for visualization and mental effort to determine what analysis to perform next. We propose Lux, an always-on framework for accelerating visual insight discovery in dataframe workflows. When users print a dataframe in their notebooks, Lux recommends visualizations to provide a quick overview of the patterns and trends and suggests promising analysis directions. Lux features a high level language for generating visualizations on demand to encourage rapid visual experimentation with data. We demonstrate that through the use of a careful design and three system optimizations, Lux adds no more than two seconds of overhead on top of pandas for over 98% of datasets in the UCI repository. We evaluate Lux in terms of usability via a controlled first-use study and interviews with early adopters, finding that Lux helps fulfill the needs of data scientists for visualization support within their dataframe workflows. Lux has already been embraced by data science practitioners, with over 3.1k stars on Github.

Submitted 22 December, 2021; v1 submitted 30 April, 2021; originally announced May 2021.

arXiv:2012.03049 [pdf, other]

Urban Heat Islands: Beating the Heat with Multi-Modal Spatial Analysis

Authors: Marcus Yong, Kwan Hui Lim

Abstract: In today's highly urbanized environment, the Urban Heat Island (UHI) phenomenon is increasingly prevalent where surface temperatures in urbanized areas are found to be much higher than surrounding rural areas. Excessive levels of heat stress leads to problems at various levels, ranging from the individual to the world. At the individual level, UHI could lead to the human body being unable to cope and break-down in terms of core functions. At the world level, UHI potentially contributes to global warming and adversely affects the environment. Using a multi-modal dataset comprising remote sensory imagery, geo-spatial data and population data, we proposed a framework for investigating how UHI is affected by a city's urban form characteristics through the use of statistical modelling. Using Singapore as a case study, we demonstrate the usefulness of this framework and discuss our main findings in understanding the effects of UHI and urban form characteristics.

Submitted 5 December, 2020; originally announced December 2020.

Comments: Accepted at the 2020 IEEE International Conference on Big Data (BigData'20)

arXiv:2011.07565 [pdf, ps, other]

User-Centered Programming Language Design: A Course-Based Case Study

Authors: Michael Coblenz, Ariel Davis, Megan Hofmann, Vivian Huang, Siyue Jin, Max Krieger, Kyle Liang, Brian Wei, Mengchen Sam Yong, Jonathan Aldrich

Abstract: Recently, user-centered methods have been proposed to improve the design of programming languages. In order to explore what benefits these methods might have for novice programming language designers, we taught a collection of user-centered programming language design methods to a group of eight students. We observed that natural programming and usability studies helped the students refine their language designs and identify opportunities for improvement, even in the short duration of a course project.

Submitted 15 November, 2020; originally announced November 2020.

Comments: 7 pages. Presented at HATRA 2020 (https://2020.splashcon.org/home/hatra-2020)

ACM Class: D.2; D.3

arXiv:1910.06708 [pdf, other]

doi 10.1016/j.knosys.2022.109124

Efficiently Embedding Dynamic Knowledge Graphs

Authors: Tianxing Wu, Arijit Khan, Melvin Yong, Guilin Qi, Meng Wang

Abstract: Knowledge graph (KG) embedding encodes the entities and relations from a KG into low-dimensional vector spaces to support various applications such as KG completion, question answering, and recommender systems. In real world, knowledge graphs (KGs) are dynamic and evolve over time with addition or deletion of triples. However, most existing models focus on embedding static KGs while neglecting dynamics. To adapt to the changes in a KG, these models need to be retrained on the whole KG with a high time cost. In this paper, to tackle the aforementioned problem, we propose a new context-aware Dynamic Knowledge Graph Embedding (DKGE) method which supports the embedding learning in an online fashion. DKGE introduces two different representations (i.e., knowledge embedding and contextual element embedding) for each entity and each relation, in the joint modeling of entities and relations as well as their contexts, by employing two attentive graph convolutional networks, a gate strategy, and translation operations. This effectively helps limit the impacts of a KG update in certain regions, not in the entire graph, so that DKGE can rapidly acquire the updated KG embedding by a proposed online learning algorithm. Furthermore, DKGE can also learn KG embedding from scratch. Experiments on the tasks of link prediction and question answering in a dynamic environment demonstrate the effectiveness and efficiency of DKGE.

Submitted 31 May, 2022; v1 submitted 15 October, 2019; originally announced October 2019.

Comments: 46 pages

arXiv:1906.08172 [pdf, other]

MediaPipe: A Framework for Building Perception Pipelines

Authors: Camillo Lugaresi, Jiuqiang Tang, Hadon Nash, Chris McClanahan, Esha Uboweja, Michael Hays, Fan Zhang, Chuo-Ling Chang, Ming Guang Yong, Juhyun Lee, Wan-Teh Chang, Wei Hua, Manfred Georg, Matthias Grundmann

Abstract: Building applications that perceive the world around them is challenging. A developer needs to (a) select and develop corresponding machine learning algorithms and models, (b) build a series of prototypes and demos, (c) balance resource consumption against the quality of the solutions, and finally (d) identify and mitigate problematic cases. The MediaPipe framework addresses all of these challenges. A developer can use MediaPipe to build prototypes by combining existing perception components, to advance them to polished cross-platform applications and measure system performance and resource consumption on target platforms. We show that these features enable a developer to focus on the algorithm or model development and use MediaPipe as an environment for iteratively improving their application with results reproducible across different devices and platforms. MediaPipe will be open-sourced at https://github.com/google/mediapipe.

Submitted 14 June, 2019; originally announced June 2019.

arXiv:1806.08518 [pdf]

doi 10.2196/10153

Emotion-Recognition Using Smart Watch Sensor Data: Mixed-Design Study

Authors: Juan C. Quiroz, Elena Geangu, Min Hooi Yong

Abstract: This study investigates the use of movement sensor data from a smart watch to infer an individual's emotional state. We present our findings on a user study with 50 participants. The experimental design is a mixed-design study; within-subjects (emotions; happy, sad, neutral) and between-subjects (stimulus type: audio-visual "movie clips", audio "music clips"). Each participant experienced both emotions in a single stimulus type. All participants walked 250m while wearing a smart watch on one wrist and a heart rate monitor strap on their chest. They also had to answer a short questionnaire (20 items; PANAS) before and after experiencing each emotion. The heart rate monitor served as supplementary information to our data. We performed time-series analysis on the data from the smart watch and a t-test on the questionnaire items to measure the change in emotional state. The heart rate data was analyzed using one-way ANOVA. We extracted features from the time-series using sliding windows and used the features to train and validate classifiers that determined an individual's emotion. Participants reported feeling less negative affect after watching sad videos or after listening to the sad music, P < .006. For the task of emotion recognition using classifiers, our results show that the personal models outperformed personal baselines, and achieved median accuracies higher than 78% for all conditions of the design study for the binary classification of happiness vs sadness. Our findings show that we are able to detect the changes in emotional state with data obtained from the smartwatch as well as behavioral responses.

Submitted 6 January, 2019; v1 submitted 22 June, 2018; originally announced June 2018.

Comments: Published in JMIR Mental Health

Journal ref: JMIR Ment Health (2018);5(3):e10153

arXiv:1709.09148 [pdf]

doi 10.1145/3123024.3125614

Emotion-Recognition Using Smart Watch Accelerometer Data: Preliminary Findings

Authors: Juan C. Quiroz, Min Hooi Yong, Elena Geangu

Abstract: This study investigates the use of accelerometer data from a smart watch to infer an individual's emotional state. We present our preliminary findings on a user study with 50 participants. Participants were primed either with audio-visual (movie clips) or audio (classical music) to elicit emotional responses. Participants then walked while wearing a smart watch on one wrist and a heart rate strap on their chest. Our hypothesis is that the accelerometer signal will exhibit different patterns for participants in response to different emotion priming. We divided the accelerometer data using sliding windows, extracted features from each window, and used the features to train supervised machine learning algorithms to infer an individual's emotion from their walking pattern. Our discussion includes a description of the methodology, data collected, and early results.

Submitted 26 September, 2017; originally announced September 2017.

Comments: Mental Health and Well-being: Sensing and Intervention, UBICOMP 2017 Workshop

Showing 1–9 of 9 results for author: Yong, M