Search | arXiv e-print repository

C2P-GCN: Cell-to-Patch Graph Convolutional Network for Colorectal Cancer Grading

Authors: Sudipta Paul, Bulent Yener, Amanda W. Lund

Abstract: Graph-based learning approaches, due to their ability to encode tissue/organ structure information, are increasingly favored for grading colorectal cancer histology images. Recent graph-based techniques involve dividing whole slide images (WSIs) into smaller or medium-sized patches, and then building graphs on each patch for direct use in training. This method, however, fails to capture the tissue… ▽ More Graph-based learning approaches, due to their ability to encode tissue/organ structure information, are increasingly favored for grading colorectal cancer histology images. Recent graph-based techniques involve dividing whole slide images (WSIs) into smaller or medium-sized patches, and then building graphs on each patch for direct use in training. This method, however, fails to capture the tissue structure information present in an entire WSI and relies on training from a significantly large dataset of image patches. In this paper, we propose a novel cell-to-patch graph convolutional network (C2P-GCN), which is a two-stage graph formation-based approach. In the first stage, it forms a patch-level graph based on the cell organization on each patch of a WSI. In the second stage, it forms an image-level graph based on a similarity measure between patches of a WSI considering each patch as a node of a graph. This graph representation is then fed into a multi-layer GCN-based classification network. Our approach, through its dual-phase graph construction, effectively gathers local structural details from individual patches and establishes a meaningful connection among all patches across a WSI. As C2P-GCN integrates the structural data of an entire WSI into a single graph, it allows our model to work with significantly fewer training data compared to the latest models for colorectal cancer. Experimental validation of C2P-GCN on two distinct colorectal cancer datasets demonstrates the effectiveness of our method. △ Less

Submitted 13 May, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

Comments: Accepted at IEEE EMBC 2024

arXiv:2308.15027 [pdf, ps, other]

Improving Neural Ranking Models with Traditional IR Methods

Authors: Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener

Abstract: Neural ranking methods based on large transformer models have recently gained significant attention in the information retrieval community, and have been adopted by major commercial solutions. Nevertheless, they are computationally expensive to create, and require a great deal of labeled data for specialized corpora. In this paper, we explore a low resource alternative which is a bag-of-embedding… ▽ More Neural ranking methods based on large transformer models have recently gained significant attention in the information retrieval community, and have been adopted by major commercial solutions. Nevertheless, they are computationally expensive to create, and require a great deal of labeled data for specialized corpora. In this paper, we explore a low resource alternative which is a bag-of-embedding model for document retrieval and find that it is competitive with large transformer models fine tuned on information retrieval tasks. Our results show that a simple combination of TF-IDF, a traditional keyword matching method, with a shallow embedding model provides a low cost path to compete well with the performance of complex neural ranking models on 3 datasets. Furthermore, adding TF-IDF measures improves the performance of large-scale fine tuned models on these tasks. △ Less

Submitted 29 August, 2023; originally announced August 2023.

Comments: Short paper, 4 pages

arXiv:2308.03891 [pdf, other]

A Cross-Domain Evaluation of Approaches for Causal Knowledge Extraction

Authors: Anik Saha, Oktie Hassanzadeh, Alex Gittens, Jian Ni, Kavitha Srinivas, Bulent Yener

Abstract: Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation. Although this task is important for language understanding and knowledge discovery, recent works in this domain have largely focused on binary classification of a text segment as causal or non-causal. In this regard, we perform a thorough analysis of three sequence tagging… ▽ More Causal knowledge extraction is the task of extracting relevant causes and effects from text by detecting the causal relation. Although this task is important for language understanding and knowledge discovery, recent works in this domain have largely focused on binary classification of a text segment as causal or non-causal. In this regard, we perform a thorough analysis of three sequence tagging models for causal knowledge extraction and compare it with a span based approach to causality extraction. Our experiments show that embeddings from pre-trained language models (e.g. BERT) provide a significant performance boost on this task compared to previous state-of-the-art models with complex architectures. We observe that span based models perform better than simple sequence tagging models based on BERT across all 4 data sets from diverse domains with different types of cause-effect phrases. △ Less

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2305.20043 [pdf, ps, other]

Deception by Omission: Using Adversarial Missingness to Poison Causal Structure Learning

Authors: Deniz Koyuncu, Alex Gittens, Bülent Yener, Moti Yung

Abstract: Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed. Prior work has demonstrated that adversarial perturbations of completely observed training data may be used to force the learning of inaccurate causal structural models (SCMs). However, when the data can be audited for correctness (e.g., it is c… ▽ More Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed. Prior work has demonstrated that adversarial perturbations of completely observed training data may be used to force the learning of inaccurate causal structural models (SCMs). However, when the data can be audited for correctness (e.g., it is crytographically signed by its source), this adversarial mechanism is invalidated. This work introduces a novel attack methodology wherein the adversary deceptively omits a portion of the true training data to bias the learned causal structures in a desired manner. Theoretically sound attack mechanisms are derived for the case of arbitrary SCMs, and a sample-efficient learning-based heuristic is given for Gaussian SCMs. Experimental validation of these approaches on real and synthetic data sets demonstrates the effectiveness of adversarial missingness attacks at deceiving popular causal structure learning algorithms. △ Less

Submitted 31 May, 2023; originally announced May 2023.

arXiv:2304.10642 [pdf, other]

Word Sense Induction with Knowledge Distillation from BERT

Authors: Anik Saha, Alex Gittens, Bulent Yener

Abstract: Pre-trained contextual language models are ubiquitously employed for language understanding tasks, but are unsuitable for resource-constrained systems. Noncontextual word embeddings are an efficient alternative in these settings. Such methods typically use one vector to encode multiple different meanings of a word, and incur errors due to polysemy. This paper proposes a two-stage method to distill… ▽ More Pre-trained contextual language models are ubiquitously employed for language understanding tasks, but are unsuitable for resource-constrained systems. Noncontextual word embeddings are an efficient alternative in these settings. Such methods typically use one vector to encode multiple different meanings of a word, and incur errors due to polysemy. This paper proposes a two-stage method to distill multiple word senses from a pre-trained language model (BERT) by using attention over the senses of a word in a context and transferring this sense information to fit multi-sense embeddings in a skip-gram-like framework. We demonstrate an effective approach to training the sense disambiguation mechanism in our model with a distribution over word senses extracted from the output layer embeddings of BERT. Experiments on the contextual word similarity and sense induction tasks show that this method is superior to or competitive with state-of-the-art multi-sense embeddings on multiple benchmark data sets, and experiments with an embedding-based topic model (ETM) demonstrates the benefits of using this multi-sense embedding in a downstream application. △ Less

Submitted 20 April, 2023; originally announced April 2023.

arXiv:2202.13520 [pdf, other]

Anti-Malware Sandbox Games

Authors: Sujoy Sikdar, Sikai Ruan, Qishen Han, Paween Pitimanaaree, Jeremy Blackthorne, Bulent Yener, Lirong Xia

Abstract: We develop a game theoretic model of malware protection using the state-of-the-art sandbox method, to characterize and compute optimal defense strategies for anti-malware. We model the strategic interaction between developers of malware (M) and anti-malware (AM) as a two player game, where AM commits to a strategy of generating sandbox environments, and M responds by choosing to either attack or h… ▽ More We develop a game theoretic model of malware protection using the state-of-the-art sandbox method, to characterize and compute optimal defense strategies for anti-malware. We model the strategic interaction between developers of malware (M) and anti-malware (AM) as a two player game, where AM commits to a strategy of generating sandbox environments, and M responds by choosing to either attack or hide malicious activity based on the environment it senses. We characterize the condition for AM to protect all its machines, and identify conditions under which an optimal AM strategy can be computed efficiently. For other cases, we provide a quadratically constrained quadratic program (QCQP)-based optimization framework to compute the optimal AM strategy. In addition, we identify a natural and easy to compute strategy for AM, which as we show empirically, achieves AM utility that is close to the optimal AM utility, in equilibrium. △ Less

Submitted 27 February, 2022; originally announced February 2022.

arXiv:2202.13054 [pdf, other]

Missing Value Knockoffs

Authors: Deniz Koyuncu, Bülent Yener

Abstract: One limitation of the most statistical/machine learning-based variable selection approaches is their inability to control the false selections. A recently introduced framework, model-x knockoffs, provides that to a wide range of models but lacks support for datasets with missing values. In this work, we discuss ways of preserving the theoretical guarantees of the model-x framework in the missing d… ▽ More One limitation of the most statistical/machine learning-based variable selection approaches is their inability to control the false selections. A recently introduced framework, model-x knockoffs, provides that to a wide range of models but lacks support for datasets with missing values. In this work, we discuss ways of preserving the theoretical guarantees of the model-x framework in the missing data setting. First, we prove that posterior sampled imputation allows reusing existing knockoff samplers in the presence of missing values. Second, we show that sampling knockoffs only for the observed variables and applying univariate imputation also preserves the false selection guarantees. Third, for the special case of latent variable models, we demonstrate how jointly imputing and sampling knockoffs can reduce the computational complexity. We have verified the theoretical findings with two different exploratory variable distributions and investigated how the missing data pattern, amount of correlation, the number of observations, and missing values affected the statistical power. △ Less

Submitted 25 February, 2022; originally announced February 2022.

Comments: 11 pages, 23 pages with supplementary material, 8 figures

arXiv:2107.03806 [pdf, other]

Output Randomization: A Novel Defense for both White-box and Black-box Adversarial Models

Authors: Daniel Park, Haidar Khan, Azer Khan, Alex Gittens, Bülent Yener

Abstract: Adversarial examples pose a threat to deep neural network models in a variety of scenarios, from settings where the adversary has complete knowledge of the model in a "white box" setting and to the opposite in a "black box" setting. In this paper, we explore the use of output randomization as a defense against attacks in both the black box and white box models and propose two defenses. In the firs… ▽ More Adversarial examples pose a threat to deep neural network models in a variety of scenarios, from settings where the adversary has complete knowledge of the model in a "white box" setting and to the opposite in a "black box" setting. In this paper, we explore the use of output randomization as a defense against attacks in both the black box and white box models and propose two defenses. In the first defense, we propose output randomization at test time to thwart finite difference attacks in black box settings. Since this type of attack relies on repeated queries to the model to estimate gradients, we investigate the use of randomization to thwart such adversaries from successfully creating adversarial examples. We empirically show that this defense can limit the success rate of a black box adversary using the Zeroth Order Optimization attack to 0%. Secondly, we propose output randomization training as a defense against white box adversaries. Unlike prior approaches that use randomization, our defense does not require its use at test time, eliminating the Backward Pass Differentiable Approximation attack, which was shown to be effective against other randomization defenses. Additionally, this defense has low overhead and is easily implemented, allowing it to be used together with other defenses across various model architectures. We evaluate output randomization training against the Projected Gradient Descent attacker and show that the defense can reduce the PGD attack's success rate down to 12% when using cross-entropy loss. △ Less

Submitted 8 July, 2021; originally announced July 2021.

Comments: This is a substantially changed version of an earlier preprint (arXiv:1905.09871)

arXiv:2011.08982 [pdf, other]

Patient-Specific Seizure Prediction Using Single Seizure Electroencephalography Recording

Authors: Zaid Bin Tariq, Arun Iyengar, Lara Marcuse, Hui Su, Bülent Yener

Abstract: Electroencephalogram (EEG) is a prominent way to measure the brain activity for studying epilepsy, thereby hel** in predicting seizures. Seizure prediction is an active research area with many deep learning based approaches dominating the recent literature for solving this problem. But these models require a considerable number of patient-specific seizures to be recorded for extracting the preic… ▽ More Electroencephalogram (EEG) is a prominent way to measure the brain activity for studying epilepsy, thereby hel** in predicting seizures. Seizure prediction is an active research area with many deep learning based approaches dominating the recent literature for solving this problem. But these models require a considerable number of patient-specific seizures to be recorded for extracting the preictal and interictal EEG data for training a classifier. The increase in sensitivity and specificity for seizure prediction using the machine learning models is noteworthy. However, the need for a significant number of patient-specific seizures and periodic retraining of the model because of non-stationary EEG creates difficulties for designing practical device for a patient. To mitigate this process, we propose a Siamese neural network based seizure prediction method that takes a wavelet transformed EEG tensor as an input with convolutional neural network (CNN) as the base network for detecting change-points in EEG. Compared to the solutions in the literature, which utilize days of EEG recordings, our method only needs one seizure for training which translates to less than ten minutes of preictal and interictal data while still getting comparable results to models which utilize multiple seizures for seizure prediction. △ Less

Submitted 13 November, 2020; originally announced November 2020.

Comments: 8 pages

arXiv:2011.05973 [pdf, ps, other]

doi 10.1145/3433667.3433670

A survey on practical adversarial examples for malware classifiers

Authors: Daniel Park, Bülent Yener

Abstract: Machine learning based solutions have been very helpful in solving problems that deal with immense amounts of data, such as malware detection and classification. However, deep neural networks have been found to be vulnerable to adversarial examples, or inputs that have been purposefully perturbed to result in an incorrect label. Researchers have shown that this vulnerability can be exploited to cr… ▽ More Machine learning based solutions have been very helpful in solving problems that deal with immense amounts of data, such as malware detection and classification. However, deep neural networks have been found to be vulnerable to adversarial examples, or inputs that have been purposefully perturbed to result in an incorrect label. Researchers have shown that this vulnerability can be exploited to create evasive malware samples. However, many proposed attacks do not generate an executable and instead generate a feature vector. To fully understand the impact of adversarial examples on malware detection, we review practical attacks against malware classifiers that generate executable adversarial malware examples. We also discuss current challenges in this area of research, as well as suggestions for improvement and future research directions. △ Less

Submitted 6 November, 2020; originally announced November 2020.

Comments: preprint. to appear in the Reversing and Offensive-oriented Trends Symposium(ROOTS) 2020

arXiv:2011.03476 [pdf, other]

Towards Obfuscated Malware Detection for Low Powered IoT Devices

Authors: Daniel Park, Hannah Powers, Benji Prashker, Leland Liu, Bülent Yener

Abstract: With the increased deployment of IoT and edge devices into commercial and user networks, these devices have become a new threat vector for malware authors. It is imperative to protect these devices as they become more prevalent in commercial and personal networks. However, due to their limited computational power and storage space, especially in the case of battery-powered devices, it is infeasibl… ▽ More With the increased deployment of IoT and edge devices into commercial and user networks, these devices have become a new threat vector for malware authors. It is imperative to protect these devices as they become more prevalent in commercial and personal networks. However, due to their limited computational power and storage space, especially in the case of battery-powered devices, it is infeasible to deploy state-of-the-art malware detectors onto these systems. In this work, we propose using and extracting features from Markov matrices constructed from opcode traces as a low cost feature for unobfuscated and obfuscated malware detection. We empirically show that our approach maintains a high detection rate while consuming less power than similar work. △ Less

Submitted 6 November, 2020; originally announced November 2020.

Comments: preprint. to appear at the International Conference on Machine Learning Applications (ICMLA) 2020

arXiv:2007.13417 [pdf, other]

doi 10.1063/5.0013720

Image-driven discriminative and generative machine learning algorithms for establishing microstructure-processing relationships

Authors: Wufei Ma, Elizabeth Kautz, Arun Baskaran, Aritra Chowdhury, Vineet Joshi, Bülent Yener, Daniel Lewis

Abstract: We investigate methods of microstructure representation for the purpose of predicting processing condition from microstructure image data. A binary alloy (uranium-molybdenum) that is currently under development as a nuclear fuel was studied for the purpose of develo** an improved machine learning approach to image recognition, characterization, and building predictive capabilities linking micros… ▽ More We investigate methods of microstructure representation for the purpose of predicting processing condition from microstructure image data. A binary alloy (uranium-molybdenum) that is currently under development as a nuclear fuel was studied for the purpose of develo** an improved machine learning approach to image recognition, characterization, and building predictive capabilities linking microstructure to processing conditions. Here, we test different microstructure representations and evaluate model performance based on the F1 score. A F1 score of 95.1% was achieved for distinguishing between micrographs corresponding to ten different thermo-mechanical material processing conditions. We find that our newly developed microstructure representation describes image data well, and the traditional approach of utilizing area fractions of different phases is insufficient for distinguishing between multiple classes using a relatively small, imbalanced original data set of 272 images. To explore the applicability of generative methods for supplementing such limited data sets, generative adversarial networks were trained to generate artificial microstructure images. Two different generative networks were trained and tested to assess performance. Challenges and best practices associated with applying machine learning to limited microstructure image data sets is also discussed. Our work has implications for quantitative microstructure analysis, and development of microstructure-processing relationships in limited data sets typical of metallurgical process design studies. △ Less

Submitted 27 July, 2020; originally announced July 2020.

Comments: 14 pages, 15 figures

arXiv:1906.05496 [pdf, other]

An image-driven machine learning approach to kinetic modeling of a discontinuous precipitation reaction

Authors: Elizabeth Kautz, Wufei Ma, Saumyadeep Jana, Arun Devaraj, Vineet Joshi, Bülent Yener, Daniel Lewis

Abstract: Micrograph quantification is an essential component of several materials science studies. Machine learning methods, in particular convolutional neural networks, have previously demonstrated performance in image recognition tasks across several disciplines (e.g. materials science, medical imaging, facial recognition). Here, we apply these well-established methods to develop an approach to microstru… ▽ More Micrograph quantification is an essential component of several materials science studies. Machine learning methods, in particular convolutional neural networks, have previously demonstrated performance in image recognition tasks across several disciplines (e.g. materials science, medical imaging, facial recognition). Here, we apply these well-established methods to develop an approach to microstructure quantification for kinetic modeling of a discontinuous precipitation reaction in a case study on the uranium-molybdenum system. Prediction of material processing history based on image data (classification), calculation of area fraction of phases present in the micrographs (segmentation), and kinetic modeling from segmentation results were performed. Results indicate that convolutional neural networks represent microstructure image data well, and segmentation using the k-means clustering algorithm yields results that agree well with manually annotated images. Classification accuracies of original and segmented images are both 94\% for a 5-class classification problem. Kinetic modeling results agree well with previously reported data using manual thresholding. The image quantification and kinetic modeling approach developed and presented here aims to reduce researcher bias introduced into the characterization process, and allows for leveraging information in limited image data sets. △ Less

Submitted 13 June, 2019; originally announced June 2019.

Comments: 30 pages, 8 figures

arXiv:1905.09876 [pdf, other]

Deep density ratio estimation for change point detection

Authors: Haidar Khan, Lara Marcuse, Bülent Yener

Abstract: In this work, we propose new objective functions to train deep neural network based density ratio estimators and apply it to a change point detection problem. Existing methods use linear combinations of kernels to approximate the density ratio function by solving a convex constrained minimization problem. Approximating the density ratio function using a deep neural network requires defining a suit… ▽ More In this work, we propose new objective functions to train deep neural network based density ratio estimators and apply it to a change point detection problem. Existing methods use linear combinations of kernels to approximate the density ratio function by solving a convex constrained minimization problem. Approximating the density ratio function using a deep neural network requires defining a suitable objective function to optimize. We formulate and compare objective functions that can be minimized using gradient descent and show that the network can effectively learn to approximate the density ratio function. Using our deep density ratio estimation objective function results in better performance on a seizure detection task than other (kernel and neural network based) density ratio estimation methods and other window-based change point detection algorithms. We also show that the method can still support other neural network architectures, such as convolutional networks. △ Less

Submitted 23 May, 2019; originally announced May 2019.

arXiv:1905.09871 [pdf, other]

Thwarting finite difference adversarial attacks with output randomization

Authors: Haidar Khan, Daniel Park, Azer Khan, Bülent Yener

Abstract: Adversarial examples pose a threat to deep neural network models in a variety of scenarios, from settings where the adversary has complete knowledge of the model and to the opposite "black box" setting. Black box attacks are particularly threatening as the adversary only needs access to the input and output of the model. Defending against black box adversarial example generation attacks is paramou… ▽ More Adversarial examples pose a threat to deep neural network models in a variety of scenarios, from settings where the adversary has complete knowledge of the model and to the opposite "black box" setting. Black box attacks are particularly threatening as the adversary only needs access to the input and output of the model. Defending against black box adversarial example generation attacks is paramount as currently proposed defenses are not effective. Since these types of attacks rely on repeated queries to the model to estimate gradients over input dimensions, we investigate the use of randomization to thwart such adversaries from successfully creating adversarial examples. Randomization applied to the output of the deep neural network model has the potential to confuse potential attackers, however this introduces a tradeoff between accuracy and robustness. We show that for certain types of randomization, we can bound the probability of introducing errors by carefully setting distributional parameters. For the particular case of finite difference black box attacks, we quantify the error introduced by the defense in the finite difference estimate of the gradient. Lastly, we show empirically that the defense can thwart two adaptive black box adversarial attack algorithms. △ Less

Submitted 23 May, 2019; originally announced May 2019.

arXiv:1904.04802 [pdf, other]

doi 10.1109/ICMLA.2019.00210

Generation & Evaluation of Adversarial Examples for Malware Obfuscation

Authors: Daniel Park, Haidar Khan, Bülent Yener

Abstract: There has been an increased interest in the application of convolutional neural networks for image based malware classification, but the susceptibility of neural networks to adversarial examples allows malicious actors to evade classifiers. Adversarial examples are usually generated by adding small perturbations to the input that are unrecognizable to humans, but the same approach is not effective… ▽ More There has been an increased interest in the application of convolutional neural networks for image based malware classification, but the susceptibility of neural networks to adversarial examples allows malicious actors to evade classifiers. Adversarial examples are usually generated by adding small perturbations to the input that are unrecognizable to humans, but the same approach is not effective with malware. In general, these perturbations cause changes in the byte sequences that change the initial functionality or result in un-executable binaries. We present a generative model for executable adversarial malware examples using obfuscation that achieves a high misclassification rate, up to 100% and 98% in white-box and black-box settings respectively, and demonstrates transferability. We further evaluate the effectiveness of the proposed method by reporting insignificant change in the evasion rate of our adversarial examples against popular defense strategies. △ Less

Submitted 15 October, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

Comments: Preprint

arXiv:1903.02521 [pdf, other]

Quantifying error contributions of computational steps, algorithms and hyperparameter choices in image classification pipelines

Authors: Aritra Chowdhury, Malik Magdin-Ismail, Bulent Yener

Abstract: Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of several hyperparameters. Algorithms and hyperparameters must be optimized as a whole to produce the best performance. Typical machine learning pipelines typically co… ▽ More Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of several hyperparameters. Algorithms and hyperparameters must be optimized as a whole to produce the best performance. Typical machine learning pipelines typically consist of complex algorithms in each of the steps. Not only is the selection process combinatorial, but it is also important to interpret and understand the pipelines. We propose a method to quantify the importance of different layers in the pipeline, by computing an error contribution relative to an agnostic choice of algorithms in that layer. We demonstrate our methodology on image classification pipelines. The agnostic methodology quantifies the error contributions from the computational steps, algorithms and hyperparameters in the image classification pipeline. We show that algorithm selection and hyper-parameter optimization methods can be used to quantify the error contribution and that random search is able to quantify the contribution more accurately than Bayesian optimization. This methodology can be used by domain experts to understand machine learning and data analysis pipelines in terms of their individual components, which can help in prioritizing different components of the pipeline. △ Less

Submitted 25 February, 2019; originally announced March 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1903.00405

arXiv:1903.00405 [pdf, other]

Quantifying contribution and propagation of error from computational steps, algorithms and hyperparameter choices in image classification pipelines

Authors: Aritra Chowdhury, Malik Magdon-Ismail, Bulent Yener

Abstract: Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of several hyperparameters. Algorithms and hyperparameters must be optimized as a whole to produce the best performance. Typical machine learning pipelines consist of c… ▽ More Data science relies on pipelines that are organized in the form of interdependent computational steps. Each step consists of various candidate algorithms that maybe used for performing a particular function. Each algorithm consists of several hyperparameters. Algorithms and hyperparameters must be optimized as a whole to produce the best performance. Typical machine learning pipelines consist of complex algorithms in each of the steps. Not only is the selection process combinatorial, but it is also important to interpret and understand the pipelines. We propose a method to quantify the importance of different components in the pipeline, by computing an error contribution relative to an agnostic choice of computational steps, algorithms and hyperparameters. We also propose a methodology to quantify the propagation of error from individual components of the pipeline with the help of a naive set of benchmark algorithms not involved in the pipeline. We demonstrate our methodology on image classification pipelines. The agnostic and naive methodologies quantify the error contribution and propagation respectively from the computational steps, algorithms and hyperparameters in the image classification pipeline. We show that algorithm selection and hyperparameter optimization methods like grid search, random search and Bayesian optimization can be used to quantify the error contribution and propagation, and that random search is able to quantify them more accurately than Bayesian optimization. This methodology can be used by domain experts to understand machine learning and data analysis pipelines in terms of their individual components, which can help in prioritizing different components of the pipeline. △ Less

Submitted 21 February, 2019; originally announced March 2019.

arXiv:1805.11576 [pdf, other]

doi 10.1109/TBME.2017.2785401

Focal onset seizure prediction using convolutional networks

Authors: Haidar Khan, Lara Marcuse, Madeline Fields, Kalina Swann, Bülent Yener

Abstract: Objective: This work investigates the hypothesis that focal seizures can be predicted using scalp electroencephalogram (EEG) data. Our first aim is to learn features that distinguish between the interictal and preictal regions. The second aim is to define a prediction horizon in which the prediction is as accurate and as early as possible, clearly two competing objectives. Methods: Convolutional f… ▽ More Objective: This work investigates the hypothesis that focal seizures can be predicted using scalp electroencephalogram (EEG) data. Our first aim is to learn features that distinguish between the interictal and preictal regions. The second aim is to define a prediction horizon in which the prediction is as accurate and as early as possible, clearly two competing objectives. Methods: Convolutional filters on the wavelet transformation of the EEG signal are used to define and learn quantitative signatures for each period: interictal, preictal, and ictal. The optimal seizure prediction horizon is also learned from the data as opposed to making an a priori assumption. Results: Computational solutions to the optimization problem indicate a ten-minute seizure prediction horizon. This result is verified by measuring Kullback-Leibler divergence on the distributions of the automatically extracted features. Conclusion: The results on the EEG database of 204 recordings demonstrate that (i) the preictal phase transition occurs approximately ten minutes before seizure onset, and (ii) the prediction results on the test set are promising, with a sensitivity of 87.8% and a low false prediction rate of 0.142 FP/h. Our results significantly outperform a random predictor and other seizure prediction algorithms. Significance: We demonstrate that a robust set of features can be learned from scalp EEG that characterize the preictal state of focal seizures. △ Less

Submitted 29 May, 2018; originally announced May 2018.

Comments: 8 pages, 6 figures, 3 tables

Journal ref: Khan, H., Marcuse, L., Fields, M., Swann, K., & Yener, B. (2017). Focal onset seizure prediction using convolutional networks. IEEE Transactions on Biomedical Engineering

arXiv:1311.4591 [pdf, ps, other]

On the Security of Key Extraction from Measuring Physical Quantities

Authors: Matt Edman, Aggelos Kiayias, Qiang Tang, Bulent Yener

Abstract: Key extraction via measuring a physical quantity is a class of information theoretic key exchange protocols that rely on the physical characteristics of the communication channel to enable the computation of a shared key by two (or more) parties that share no prior secret information. The key is supposed to be information theoretically hidden to an eavesdropper. Despite the recent surge of researc… ▽ More Key extraction via measuring a physical quantity is a class of information theoretic key exchange protocols that rely on the physical characteristics of the communication channel to enable the computation of a shared key by two (or more) parties that share no prior secret information. The key is supposed to be information theoretically hidden to an eavesdropper. Despite the recent surge of research activity in the area, concrete claims about the security of the protocols typically rely on channel abstractions that are not fully experimentally substantiated. In this work, we propose a novel methodology for the {\em experimental} security analysis of these protocols. The crux of our methodology is a falsifiable channel abstraction that is accompanied by an efficient experimental approximation algorithm of the {\em conditional min-entropy} available to the two parties given the view of the eavesdropper. We focus on the signal strength between two wirelessly communicating transceivers as the measured quantity and we use an experimental setup to compute the conditional min-entropy of the channel given the view of the attacker which we find to be linearly increasing. Armed with this understanding of the channel, we showcase the methodology by providing a general protocol for key extraction in this setting that is shown to be secure for a concrete parameter selection. In this way we provide a first comprehensively analyzed wireless key extraction protocol that is demonstrably secure against passive adversaries. Our methodology uses hidden Markov models as the channel model and a dynamic programming approach to approximate conditional min-entropy but other possible instantiations of the methodology can be motivated by our work. △ Less

Submitted 16 December, 2015; v1 submitted 18 November, 2013; originally announced November 2013.

Showing 1–20 of 20 results for author: Yener, B