Search | arXiv e-print repository

Towards Precision Healthcare: Robust Fusion of Time Series and Image Data

Authors: Ali Rasekh, Reza Heidari, Amir Hosein Haji Mohammad Rezaie, Parsa Sharifi Sedeh, Zahra Ahmadi, Prasenjit Mitra, Wolfgang Nejdl

Abstract: With the increasing availability of diverse data types, particularly images and time series data from medical experiments, there is a growing demand for techniques designed to combine various modalities of data effectively. Our motivation comes from the important areas of predicting mortality and phenoty** where using different modalities of data could significantly improve our ability to predic… ▽ More With the increasing availability of diverse data types, particularly images and time series data from medical experiments, there is a growing demand for techniques designed to combine various modalities of data effectively. Our motivation comes from the important areas of predicting mortality and phenoty** where using different modalities of data could significantly improve our ability to predict. To tackle this challenge, we introduce a new method that uses two separate encoders, one for each type of data, allowing the model to understand complex patterns in both visual and time-based information. Apart from the technical challenges, our goal is to make the predictive model more robust in noisy conditions and perform better than current methods. We also deal with imbalanced datasets and use an uncertainty loss function, yielding improved results while simultaneously providing a principled means of modeling uncertainty. Additionally, we include attention mechanisms to fuse different modalities, allowing the model to focus on what's important for each task. We tested our approach using the comprehensive multimodal MIMIC dataset, combining MIMIC-IV and MIMIC-CXR datasets. Our experiments show that our method is effective in improving multimodal deep learning for clinical applications. The code will be made available online. △ Less

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2404.13091 [pdf, other]

A process mining-based error correction approach to improve data quality of an IoT-sourced event log

Authors: Mohsen Shirali, Zahra Ahmadi, Carlos Fernández-Llatas, Jose-Luis Bayo-Monton, Gemma Di Federico

Abstract: Internet of Things (IoT) systems are vulnerable to data collection errors and these errors can significantly degrade the quality of collected data, impact data analysis and lead to inaccurate or distorted results. This article emphasizes the importance of evaluating data quality and errors before proceeding with analysis and considering the effectiveness of error correction methods for a smart hom… ▽ More Internet of Things (IoT) systems are vulnerable to data collection errors and these errors can significantly degrade the quality of collected data, impact data analysis and lead to inaccurate or distorted results. This article emphasizes the importance of evaluating data quality and errors before proceeding with analysis and considering the effectiveness of error correction methods for a smart home use case. △ Less

Submitted 18 April, 2024; originally announced April 2024.

Comments: 10 pages

ACM Class: E.0; H.4.0; J.3; J.0

arXiv:2403.14707 [pdf, other]

Information Fusion in Multimodal IoT Systems for physical activity level monitoring

Authors: Mohsen Shirali, Zahra Ahmadi, Carlos Fernández-Llatas, Jose-Luis Bayo-Monton

Abstract: This study exploits information fusion in IoT systems and uses a clustering method to identify similarities in behaviours and key characteristics within each cluster. This approach facilitates early detection of behaviour changes and provides a more in-depth understanding of behaviour routines for continuous health monitoring. This study exploits information fusion in IoT systems and uses a clustering method to identify similarities in behaviours and key characteristics within each cluster. This approach facilitates early detection of behaviour changes and provides a more in-depth understanding of behaviour routines for continuous health monitoring. △ Less

Submitted 18 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

MSC Class: 68U35 ACM Class: H.4.0; J.3; I.2.1

arXiv:2306.03834 [pdf, other]

MTS2Graph: Interpretable Multivariate Time Series Classification with Temporal Evolving Graphs

Authors: Raneen Younis, Abdul Hakmeh, Zahra Ahmadi

Abstract: Conventional time series classification approaches based on bags of patterns or shapelets face significant challenges in dealing with a vast amount of feature candidates from high-dimensional multivariate data. In contrast, deep neural networks can learn low-dimensional features efficiently, and in particular, Convolutional Neural Networks (CNN) have shown promising results in classifying Multivar… ▽ More Conventional time series classification approaches based on bags of patterns or shapelets face significant challenges in dealing with a vast amount of feature candidates from high-dimensional multivariate data. In contrast, deep neural networks can learn low-dimensional features efficiently, and in particular, Convolutional Neural Networks (CNN) have shown promising results in classifying Multivariate Time Series (MTS) data. A key factor in the success of deep neural networks is this astonishing expressive power. However, this power comes at the cost of complex, black-boxed models, conflicting with the goals of building reliable and human-understandable models. An essential criterion in understanding such predictive deep models involves quantifying the contribution of time-varying input variables to the classification. Hence, in this work, we introduce a new framework for interpreting multivariate time series data by extracting and clustering the input representative patterns that highly activate CNN neurons. This way, we identify each signal's role and dependencies, considering all possible combinations of signals in the MTS input. Then, we construct a graph that captures the temporal relationship between the extracted patterns for each layer. An effective graph merging strategy finds the connection of each node to the previous layer's nodes. Finally, a graph embedding algorithm generates new representations of the created interpretable time-series features. To evaluate the performance of our proposed framework, we run extensive experiments on eight datasets of the UCR/UEA archive, along with HAR and PAM datasets. The experiments indicate the benefit of our time-aware graph-based representation in MTS classification while enriching them with more interpretability. △ Less

Submitted 6 June, 2023; originally announced June 2023.

arXiv:2208.13252 [pdf, other]

MANDO: Multi-Level Heterogeneous Graph Embeddings for Fine-Grained Detection of Smart Contract Vulnerabilities

Authors: Hoang H. Nguyen, Nhat-Minh Nguyen, Chunyao Xie, Zahra Ahmadi, Daniel Kudendo, Thanh-Nam Doan, Lingxiao Jiang

Abstract: Learning heterogeneous graphs consisting of different types of nodes and edges enhances the results of homogeneous graph techniques. An interesting example of such graphs is control-flow graphs representing possible software code execution flows. As such graphs represent more semantic information of code, develo** techniques and tools for such graphs can be highly beneficial for detecting vulner… ▽ More Learning heterogeneous graphs consisting of different types of nodes and edges enhances the results of homogeneous graph techniques. An interesting example of such graphs is control-flow graphs representing possible software code execution flows. As such graphs represent more semantic information of code, develo** techniques and tools for such graphs can be highly beneficial for detecting vulnerabilities in software for its reliability. However, existing heterogeneous graph techniques are still insufficient in handling complex graphs where the number of different types of nodes and edges is large and variable. This paper concentrates on the Ethereum smart contracts as a sample of software codes represented by heterogeneous contract graphs built upon both control-flow graphs and call graphs containing different types of nodes and links. We propose MANDO, a new heterogeneous graph representation to learn such heterogeneous contract graphs' structures. MANDO extracts customized metapaths, which compose relational connections between different types of nodes and their neighbors. Moreover, it develops a multi-metapath heterogeneous graph attention network to learn multi-level embeddings of different types of nodes and their metapaths in the heterogeneous contract graphs, which can capture the code semantics of smart contracts more accurately and facilitate both fine-grained line-level and coarse-grained contract-level vulnerability detection. Our extensive evaluation of large smart contract datasets shows that MANDO improves the vulnerability detection results of other techniques at the coarse-grained contract level. More importantly, it is the first learning-based approach capable of identifying vulnerabilities at the fine-grained line-level, and significantly improves the traditional code analysis-based vulnerability detection approaches by 11.35% to 70.81% in terms of F1-score. △ Less

Submitted 7 September, 2022; v1 submitted 28 August, 2022; originally announced August 2022.

Comments: Accepted at the 9th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2022 - Research Track)

ACM Class: I.2.5; D.2.4

arXiv:2102.02086 [pdf, other]

Focusing Knowledge-based Graph Argument Mining via Topic Modeling

Authors: Patrick Abels, Zahra Ahmadi, Sophie Burkhardt, Benjamin Schiller, Iryna Gurevych, Stefan Kramer

Abstract: Decision-making usually takes five steps: identifying the problem, collecting data, extracting evidence, identifying pro and con arguments, and making decisions. Focusing on extracting evidence, this paper presents a hybrid model that combines latent Dirichlet allocation and word embeddings to obtain external knowledge from structured and unstructured data. We study the task of sentence-level argu… ▽ More Decision-making usually takes five steps: identifying the problem, collecting data, extracting evidence, identifying pro and con arguments, and making decisions. Focusing on extracting evidence, this paper presents a hybrid model that combines latent Dirichlet allocation and word embeddings to obtain external knowledge from structured and unstructured data. We study the task of sentence-level argument mining, as arguments mostly require some degree of world knowledge to be identified and understood. Given a topic and a sentence, the goal is to classify whether a sentence represents an argument in regard to the topic. We use a topic model to extract topic- and sentence-specific evidence from the structured knowledge base Wikidata, building a graph based on the cosine similarity between the entity word vectors of Wikidata and the vector of the given sentence. Also, we build a second graph based on topic-specific articles found via Google to tackle the general incompleteness of structured knowledge bases. Combining these graphs, we obtain a graph-based model which, as our evaluation shows, successfully capitalizes on both structured and unstructured data. △ Less

Submitted 3 February, 2021; originally announced February 2021.

arXiv:2012.08459 [pdf, other]

Rule Extraction from Binary Neural Networks with Convolutional Rules for Model Validation

Authors: Sophie Burkhardt, Jannis Brugger, Nicolas Wagner, Zahra Ahmadi, Kristian Kersting, Stefan Kramer

Abstract: Most deep neural networks are considered to be black boxes, meaning their output is hard to interpret. In contrast, logical expressions are considered to be more comprehensible since they use symbols that are semantically close to natural language instead of distributed representations. However, for high-dimensional input data such as images, the individual symbols, i.e. pixels, are not easily int… ▽ More Most deep neural networks are considered to be black boxes, meaning their output is hard to interpret. In contrast, logical expressions are considered to be more comprehensible since they use symbols that are semantically close to natural language instead of distributed representations. However, for high-dimensional input data such as images, the individual symbols, i.e. pixels, are not easily interpretable. We introduce the concept of first-order convolutional rules, which are logical rules that can be extracted using a convolutional neural network (CNN), and whose complexity depends on the size of the convolutional filter and not on the dimensionality of the input. Our approach is based on rule extraction from binary neural networks with stochastic local search. We show how to extract rules that are not necessarily short, but characteristic of the input, and easy to visualize. Our experiments show that the proposed approach is able to model the functionality of the neural network while at the same time producing interpretable logical rules. △ Less

Submitted 15 December, 2020; originally announced December 2020.

arXiv:1804.01491 [pdf, other]

Online Multi-Label Classification: A Label Compression Method

Authors: Zahra Ahmadi, Stefan Kramer

Abstract: Many modern applications deal with multi-label data, such as functional categorizations of genes, image labeling and text categorization. Classification of such data with a large number of labels and latent dependencies among them is a challenging task, and it becomes even more challenging when the data is received online and in chunks. Many of the current multi-label classification methods requir… ▽ More Many modern applications deal with multi-label data, such as functional categorizations of genes, image labeling and text categorization. Classification of such data with a large number of labels and latent dependencies among them is a challenging task, and it becomes even more challenging when the data is received online and in chunks. Many of the current multi-label classification methods require a lot of time and memory, which make them infeasible for practical real-world applications. In this paper, we propose a fast linear label space dimension reduction method that transforms the labels into a reduced encoded space and trains models on the obtained pseudo labels. Additionally, it provides an analytical method to update the decoding matrix which maps the labels into the original space and is used during the test phase. Experimental results show the effectiveness of this approach in terms of running times and the prediction performance over different measures. △ Less

Submitted 4 April, 2018; originally announced April 2018.

Showing 1–8 of 8 results for author: Ahmadi, Z