-
myAURA: Personalized health library for epilepsy management via knowledge graph sparsification and visualization
Authors:
Rion Brattig Correia,
Jordan C. Rozum,
Leonard Cross,
Jack Felag,
Michael Gallant,
Ziqi Guo,
Bruce W. Herr II,
Aehong Min,
Deborah Stungis Rocha,
Xuan Wang,
Katy Börner,
Wendy Miller,
Luis M. Rocha
Abstract:
Objective: We report the development of the patient-centered myAURA application and suite of methods designed to aid epilepsy patients, caregivers, and researchers in making decisions about care and self-management.
Materials and Methods: myAURA rests on the federation of an unprecedented collection of heterogeneous data resources relevant to epilepsy, such as biomedical databases, social media, and electronic health records. A generalizable, open-source methodology was developed to compute a multi-layer knowledge graph linking all this heterogeneous data via the terms of a human-centered biomedical dictionary.
Results: The power of the approach is first exemplified in the study of the drug-drug interaction phenomenon. Furthermore, we employ a novel network sparsification methodology using the metric backbone of weighted graphs, which reveals the most important edges for inference, recommendation, and visualization, such as pharmacology factors patients discuss on social media. The network sparsification approach also allows us to extract focused digital cohorts from social media whose discourse is more relevant to epilepsy or other biomedical problems. Finally, we present our patient-centered design and pilot-testing of myAURA, including its user interface, based on focus groups and other stakeholder input.
Discussion: The ability to search and explore myAURA's heterogeneous data sources via a sparsified multi-layer knowledge graph, as well as the combination of those layers in a single map, are useful features for integrating relevant information for epilepsy.
Conclusion: Our stakeholder-driven, scalable approach to integrating traditional and non-traditional data sources enables biomedical discovery and data-powered patient self-management in epilepsy, and is generalizable to other chronic conditions.
Submitted 10 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
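The metric backbone named in the Results above can be illustrated with a small sketch. It assumes the usual definition (an edge belongs to the backbone when its direct distance equals the shortest-path distance between its endpoints) and assumes edge weights are already distances rather than proximities. The function names and the toy graph are hypothetical and are not taken from the myAURA code base.

```python
import heapq


def shortest_distances(adj, source):
    """Dijkstra shortest-path distances from source over a dict-of-dicts graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def metric_backbone(edges):
    """Keep the edges whose weight equals the shortest-path distance
    between their endpoints (assumed definition; weights are distances)."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    all_dist = {u: shortest_distances(adj, u) for u in adj}
    # Small tolerance guards against floating-point rounding.
    return [(u, v, w) for u, v, w in edges if w <= all_dist[u][v] + 1e-12]


# Toy undirected graph: the direct a-c edge (3.0) is longer than a-b-c (2.0),
# so it is redundant for shortest paths and is dropped.
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 3.0), ("c", "d", 2.0)]
backbone = metric_backbone(edges)
```

Removing the non-backbone edges leaves every shortest-path distance unchanged, which is why such a sparsified graph can still support inference and visualization.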
-
Mapping the co-evolution of artificial intelligence, robotics, and the internet of things over 20 years (1998-2017)
Authors:
Katy Börner,
Olga Scrivner,
Leonard E. Cross,
Michael Gallant,
Shutian Ma,
Adam S. Martin,
Elizabeth Record,
Haici Yang,
Jonathan M. Dilger
Abstract:
Understanding the emergence, co-evolution, and convergence of science and technology (S&T) areas offers competitive intelligence for researchers, managers, policy makers, and others. The resulting data-driven decision support helps set proper research and development (R&D) priorities; develop future S&T investment strategies; monitor key authors, organizations, or countries; perform effective research program assessment; and implement cutting-edge education/training efforts. This paper presents new funding, publication, and scholarly network metrics and visualizations that were validated via expert surveys. The metrics and visualizations exemplify the emergence and convergence of three areas of strategic interest: artificial intelligence (AI), robotics, and internet of things (IoT) over the last 20 years (1998-2017). For 32,716 publications and 4,497 NSF awards, we identify their conceptual space (using the UCSD map of science), geospatial network, and co-evolution landscape. The findings demonstrate how the transition of knowledge (through cross-discipline publications and citations) and the emergence of new concepts (through term bursting) create a tangible potential for interdisciplinary research and new disciplines.
Submitted 3 June, 2020;
originally announced June 2020.
-
Xu: An Automated Query Expansion and Optimization Tool
Authors:
Morgan Gallant,
Haruna Isah,
Farhana Zulkernine,
Shahzad Khan
Abstract:
The exponential growth of information on the Internet poses a major challenge for information retrieval systems in generating relevant results. Novel approaches are required to reformat or expand user queries to generate a satisfactory response and increase recall and precision. Query expansion (QE) is a technique to broaden users' queries by introducing additional tokens or phrases based on some semantic similarity metrics. The tradeoff is the added computational complexity to find semantically similar words and a possible increase in noise in information retrieval. Despite several research efforts on this topic, QE has not yet been explored enough, and more work is needed on similarity matching and composition of query terms with the objective of retrieving a small set of the most appropriate responses. QE should be scalable, fast, and robust in handling complex queries with a good response time and noise ceiling. In this paper, we propose Xu, an automated QE technique that uses high-dimensional clustering of word vectors and the Datamuse API, an open source query engine, to find semantically similar words. We implemented Xu as a command line tool and evaluated its performance using datasets containing news articles and human-generated QEs. The evaluation results show that Xu outperformed Datamuse, achieving about 88% accuracy with reference to the human-generated QE.
Submitted 8 May, 2019; v1 submitted 28 August, 2018;
originally announced August 2018.
-
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale
Authors:
Scott Emmons,
Stephen Kobourov,
Mike Gallant,
Katy Börner
Abstract:
Notions of community quality underlie network clustering. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms -- Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes.
We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on information recovery metrics. Our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information.
Smart local moving is the best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it absolutely superior. Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
Submitted 3 August, 2016; v1 submitted 18 May, 2016;
originally announced May 2016.
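Two of the stand-alone quality metrics named in the abstract above, modularity and conductance, can be computed by hand on a toy graph. The sketch below uses the standard textbook definitions for unweighted, undirected graphs. It is an illustration only, not the evaluation code used in the paper, and the function names and example graph are hypothetical.

```python
def modularity(edges, communities):
    """Newman modularity: sum over communities of
    (internal edges / m) - (total degree / 2m)^2."""
    m = len(edges)
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    internal = [0] * len(communities)
    degree = [0] * len(communities)
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(
        internal[i] / m - (degree[i] / (2 * m)) ** 2
        for i in range(len(communities))
    )


def conductance(edges, cluster):
    """Cut size divided by the smaller of the two volumes (sums of degrees)."""
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for node in (u, v):
            if node in cluster:
                vol_in += 1
            else:
                vol_out += 1
        if (u in cluster) != (v in cluster):
            cut += 1
    return cut / min(vol_in, vol_out)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
communities = [{0, 1, 2}, {3, 4, 5}]

q = modularity(edges, communities)    # 5/14, about 0.357
phi = conductance(edges, {0, 1, 2})   # 1/7, about 0.143
```

Under these definitions a lower conductance means a better-separated cluster, whereas a higher modularity means a stronger community structure, so the two scores point in opposite directions.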