Search | arXiv e-print repository

Developing patient-driven artificial intelligence based on personal rankings of care decision making steps

Abstract: We propose and experimentally motivate a new methodology to support decision-making processes in healthcare with artificial intelligence based on personal rankings of care decision making steps that can be identified with our methodology, questionnaire data and its statistical patterns. Our longitudinal quantitative cross-sectional three-stage study gathered self-ratings for 437 expression statements concerning healthcare situations on Likert scales in respect to "the need for help", "the advancement of health", "the hopefulness", "the indication of compassion" and "the health condition", and 45 answers about the person's demographics, health and wellbeing, also the duration of giving answers. Online respondents between 1 June 2020 and 29 June 2021 were recruited from Finnish patient and disabled people's organizations, other health-related organizations and professionals, and educational institutions (n=1075). With Kruskal-Wallis test, Wilcoxon rank-sum test (i.e., Mann-Whitney U test), Wilcoxon rank-sum pairwise test, Welch's t test and one-way analysis of variance (ANOVA) between groups test we identified statistically significant differences of ratings and their durations for each expression statement in respect to respondent groupings based on the answer values of each background question. Frequencies of the later reordering of rating rankings showed dependencies with ratings given earlier in respect to various interpretation task entities, interpretation dimensions and respondent groupings.
Our methodology, questionnaire data and its statistical patterns enable analyzing with self-rated expression statements the representations of decision making steps in healthcare situations and their chaining, agglomeration and branching in knowledge entities of personalized care paths. Our results support building artificial intelligence solutions to address the patient's needs concerning care.

Submitted 15 May, 2022; originally announced May 2022.

Comments: Corresponding author: Lauri Lahti (email: [email protected]). This research article manuscript version was completed on 11 May 2022 and it was self-archived on the open-access arXiv repository (https://arxiv.longhoe.net) on 11 May 2022. This research article (104 pages, 17 tables and 17 figures) is supplemented with seven supplementing documents (2781 pages)

arXiv:2012.13626 [pdf]

Detecting the patient's need for help with machine learning

Authors: Lauri Lahti

Abstract: Developing machine learning models to support health analytics requires increased understanding about statistical properties of self-rated expression statements. We analyzed self-rated expression statements concerning the coronavirus COVID-19 epidemic to identify statistically significant differences between groups of respondents and to detect the patient's need for help with machine learning. Our quantitative study gathered the "need for help" ratings for twenty health-related expression statements concerning the coronavirus epidemic on an 11-point Likert scale, and nine answers about the person's health and wellbeing, sex and age. Online respondents between 30 May and 3 August 2020 were recruited from Finnish patient and disabled people's organizations, other health-related organizations and professionals, and educational institutions (n=673). We analyzed rating differences and dependencies with Kendall rank-correlation and cosine similarity measures and tests of Wilcoxon rank-sum, Kruskal-Wallis and one-way analysis of variance (ANOVA) between groups, and carried out machine learning experiments with a basic implementation of a convolutional neural network algorithm. We found statistically significant correlations and high cosine similarity values between various health-related expression statement pairs concerning the "need for help" ratings and a background question pair.
We also identified statistically significant rating differences for several health-related expression statements in respect to groupings based on the answer values of background questions, such as the ratings of suspecting to have the coronavirus infection and having it depending on the estimated health condition, quality of life and sex. Our experiments with a convolutional neural network algorithm showed the applicability of machine learning to support detecting the need for help in the patient's expressions.

Submitted 24 March, 2021; v1 submitted 25 December, 2020; originally announced December 2020.

Comments: Corresponding author: Lauri Lahti (email: [email protected]). The first manuscript version of this research article was self-archived (arXiv:2012.13626) on 24 December 2020 and this second manuscript version on 23 March 2021. Changes include extensions, clarifications and corrections. This article contains 26 pages, 7 tables and 5 figures. Supplemented with Appendix A (12 pages)

arXiv:1710.04158 [pdf]

Development of computational models for emotional diary text analysis to support maternal care

Authors: Lauri Lahti, Henni Tenhunen, Seppo Heinonen, Minna Helkavaara, Maritta Pöyhönen-Alho, Paulus Torkki

Abstract: We propose new computational models for analyzing self-reported emotional diary texts of pregnant women to support maternal care. We gathered affective ratings outside clinical setting and developed new models to facilitate interpretation and communication of affective expressions between persons representing different affective ratings. Relying on constructed emotion theory, models of dimensional emotion categories and affective ratings of Self Assessment Manikin, we demonstrate our new proposal to analyze linguistic data with computational models exploiting vector space and clustering methods. 35 persons having Finnish as a native language provided affective ratings for 195 emotional adjectives and 16 pregnancy-related nouns in Finnish in dimensions of pleasure, arousal and dominance. We developed new models to represent dependencies and differences of affective ratings between various population subgroup categorizations, including "women without children", "women with children" and "men without children" that we consider important population segments to be addressed in maternal care. Our affective ratings showed significant correlations between pleasure and dominance (like Warriner et al., 2013) and with previous data collections (Söderholm et al., 2013; Eilola & Havelka, 2010; Warriner et al., 2013). Our affective ratings had significant effects on categorizations based on gender, gender-parental role and the time of the day and duration of giving ratings.
Our results indicate accordance with significant affectivity differences of gender and age (Warriner et al., 2013) and motherhood (Rosebrock et al., 2015). Our proposed models aim to support health-related communication. Our results suggest gathering next the affective ratings of patients of maternal care in a real clinical setting.

Submitted 17 October, 2017; v1 submitted 9 October, 2017; originally announced October 2017.

Comments: This article manuscript version was completed on 17 October 2017, and it was self-archived on the arXiv repository (https://arxiv.longhoe.net/) on 17 October 2017. It contains 56 pages (31 for the article and 25 for Appendices A-F). The article contains 7 tables and 2 figures. Updated: clarity, previous research and statistical results (e.g. significance of correlations, main effects and differences)

arXiv:1212.5932 [pdf, ps, other]

doi 10.1093/nar/gkt229

Fully scalable online-preprocessing algorithm for short oligonucleotide microarray atlases

Authors: Leo Lahti, Aurora Torrente, Laura L. Elo, Alvis Brazma, Johan Rung

Abstract: Accumulation of standardized data collections is opening up novel opportunities for holistic characterization of genome function. The limited scalability of current preprocessing techniques has, however, formed a bottleneck for full utilization of contemporary microarray collections. While short oligonucleotide arrays constitute a major source of genome-wide profiling data, scalable probe-level preprocessing algorithms have been available only for few measurement platforms based on pre-calculated model parameters from restricted reference training sets. To overcome these key limitations, we introduce a fully scalable online-learning algorithm that provides tools to process large microarray atlases including tens of thousands of arrays. Unlike the alternatives, the proposed algorithm scales up in linear time with respect to sample size and is readily applicable to all short oligonucleotide platforms. This is the only available preprocessing algorithm that can learn probe-level parameters based on sequential hyperparameter updates at small, consecutive batches of data, thus circumventing the extensive memory requirements of the standard approaches and opening up novel opportunities to take full advantage of contemporary microarray data collections. Moreover, using the most comprehensive data collections to estimate probe-level effects can assist in pinpointing individual probes affected by various biases and provide new tools to guide array design and quality control.
The implementation is freely available in R/Bioconductor at http://www.bioconductor.org/packages/devel/bioc/html/RPA.html

Submitted 27 December, 2012; v1 submitted 24 December, 2012; originally announced December 2012.

Comments: 20 pages, 3 figures, 1 supplementary PDF

Journal ref: Leo Lahti, Aurora Torrente, Laura L. Elo, Alvis Brazma, Johan Rung. A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases. Nucleic Acids Research, Online April 5, 2013

arXiv:1202.0501 [pdf, ps, other]

doi 10.1093/bioinformatics/btq500

Global modeling of transcriptional responses in interaction networks

Authors: Leo Lahti, Juha E. A. Knuuttila, Samuel Kaski

Abstract: Motivation: Cell-biological processes are regulated through a complex network of interactions between genes and their products. The processes, their activating conditions, and the associated transcriptional responses are often unknown. Organism-wide modeling of network activation can reveal unique and shared mechanisms between physiological conditions, and potentially as yet unknown processes. We introduce a novel approach for organism-wide discovery and analysis of transcriptional responses in interaction networks. The method searches for local, connected regions in a network that exhibit coordinated transcriptional response in a subset of conditions. Known interactions between genes are used to limit the search space and to guide the analysis. Validation on a human pathway network reveals physiologically coherent responses, functional relatedness between physiological conditions, and coordinated, context-specific regulation of the genes. Availability: Implementation is freely available in R and Matlab at http://netpro.r-forge.r-project.org

Submitted 2 February, 2012; originally announced February 2012.

Comments: 19 pages, 13 figures

Journal ref: Global modeling of transcriptional responses in interaction networks. Leo Lahti, Juha E.A. Knuuttila, and Samuel Kaski. Bioinformatics 26(21):2713-2720, 2010

arXiv:1111.4639 [pdf]

doi 10.1093/bib/bbs005

Cancer gene prioritization by integrative analysis of mRNA expression and DNA copy number data: a comparative review

Authors: Leo Lahti, Martin Schäfer, Hans-Ulrich Klein, Silvio Bicciato, Martin Dugas

Abstract: A variety of genome-wide profiling techniques are available to probe complementary aspects of genome structure and function. Integrative analysis of heterogeneous data sources can reveal higher-level interactions that cannot be detected based on individual observations. A standard integration task in cancer studies is to identify altered genomic regions that induce changes in the expression of the associated genes based on joint analysis of genome-wide gene expression and copy number profiling measurements. In this review, we provide a comparison among various modeling procedures for integrating genome-wide profiling data of gene copy number and transcriptional alterations and highlight common approaches to genomic data integration. A transparent benchmarking procedure is introduced to quantitatively compare the cancer gene prioritization performance of the alternative methods. The benchmarking algorithms and data sets are available at http://intcomp.r-forge.r-project.org

Submitted 20 November, 2011; originally announced November 2011.

Comments: PDF file including supplementary material. 9 pages. Preprint

arXiv:1109.4928

RPA: Probabilistic analysis of probe performance and robust summarization

Authors: Leo Lahti, Laura L. Elo, Tero Aittokallio, Samuel Kaski

Abstract: Probe-level models have led to improved performance in microarray studies but the various sources of probe-level contamination are still poorly understood. Data-driven analysis of probe performance can be used to quantify the uncertainty in individual probes and to highlight the relative contribution of different noise sources. Improved understanding of the probe-level effects can lead to improved preprocessing techniques and microarray design. We have implemented probabilistic tools for probe performance analysis and summarization on short oligonucleotide arrays. In contrast to standard preprocessing approaches, the methods provide quantitative estimates of probe-specific noise and affinity terms and tools to investigate these parameters. Tools to incorporate prior information of the probes in the analysis are provided as well. Comparisons to known probe-level error sources and spike-in data sets validate the approach. Implementation is freely available in R/BioConductor: http://www.bioconductor.org/packages/release/bioc/html/RPA.html

Submitted 6 April, 2013; v1 submitted 22 September, 2011; originally announced September 2011.

Comments: Replaced by extended work which forms an independent publication

arXiv:1109.4919 [pdf, ps, other]

A brief overview on the BioPAX and SBML standards for formal presentation of complex biological knowledge

Authors: Leo Lahti

Abstract: A brief informal overview on the BioPAX and SBML standards for formal presentation of complex biological knowledge.

Submitted 22 September, 2011; originally announced September 2011.

Comments: 14 pages, 2 figures, 1 appendix

arXiv:1102.5509 [pdf, ps, other]

Probabilistic analysis of the human transcriptome with side information

Authors: Leo Lahti

Abstract: Understanding functional organization of genetic information is a major challenge in modern biology. Following the initial publication of the human genome sequence in 2001, advances in high-throughput measurement technologies and efficient sharing of research material through community databases have opened up new views to the study of living organisms and the structure of life. In this thesis, novel computational strategies have been developed to investigate a key functional layer of genetic information, the human transcriptome, which regulates the function of living cells through protein synthesis. The key contributions of the thesis are general exploratory tools for high-throughput data analysis that have provided new insights to cell-biological networks, cancer mechanisms and other aspects of genome function. A central challenge in functional genomics is that high-dimensional genomic observations are associated with high levels of complex and largely unknown sources of variation. By combining statistical evidence across multiple measurement sources and the wealth of background information in genomic data repositories it has been possible to solve some of the uncertainties associated with individual observations and to identify functional mechanisms that could not be detected based on individual measurement sources. Statistical learning and probabilistic models provide a natural framework for such modeling tasks.
Open source implementations of the key methodological contributions have been released to facilitate further adoption of the developed methods by the research community.

Submitted 27 February, 2011; originally announced February 2011.

Comments: Doctoral thesis. 103 pages, 11 figures

Report number: TKK-ICS-D19

ACM Class: G.3; I.5.3; J.3; K.8.1

Journal ref: TKK Dissertations in Information and Computer Science TKK-ICS-D19. Aalto University School of Science and Technology, Department of Information and Computer Science, Espoo, Finland, 2010

Showing 1–9 of 9 results for author: Lahti, L