-
Imaging magnetic spiral phases, skyrmion clusters, and skyrmion displacements at the surface of bulk Cu$_2$OSeO$_3$
Authors:
E. Marchiori,
G. Romagnoli,
L. Schneider,
B. Gross,
P. Sahafi,
A. Jordan,
R. Budakian,
P. R. Baral,
A. Magrez,
J. S. White,
M. Poggio
Abstract:
Surfaces -- by breaking bulk symmetries, introducing roughness, or hosting defects -- can significantly influence magnetic order in magnetic materials. Determining their effect on the complex nanometer-scale phases present in certain non-centrosymmetric magnets is an outstanding problem requiring high-resolution magnetic microscopy. Here, we use scanning SQUID-on-tip microscopy to image the surface of bulk Cu$_2$OSeO$_3$ at low temperature and in a magnetic field applied along $\left\langle100\right\rangle$. Real-space maps measured as a function of applied field reveal the microscopic structure of the magnetic phases and their transitions. In low applied field, we observe a magnetic texture consistent with an in-plane stripe phase, pointing to the existence of a distinct surface state. In the low-temperature skyrmion phase, the surface is populated by clusters of disordered skyrmions, which emerge from rupturing domains of the tilted spiral phase. Furthermore, we displace individual skyrmions from their pinning sites by applying an electric potential to the scanning probe, thereby demonstrating local skyrmion control at the surface of a magnetoelectric insulator.
Submitted 6 July, 2024;
originally announced July 2024.
-
Fabrication of Nb and MoGe SQUID-on-tip probes by magnetron sputtering
Authors:
G. Romagnoli,
E. Marchiori,
K. Bagani,
M. Poggio
Abstract:
We demonstrate the fabrication of scanning superconducting quantum interference devices (SQUIDs) on the apex of sharp quartz scanning probes -- known as SQUID-on-tip probes -- using conventional magnetron sputtering. We produce and characterize SQUID-on-tips made of both Nb and MoGe with effective diameters ranging from 50 to 80 nm, magnetic flux noise down to 300 $\text{n}\Phi_0/\sqrt{\text{Hz}}$, and operating fields as high as 2.5 T. Compared to the SQUID-on-tip fabrication techniques used until now, including thermal evaporation and collimated sputtering, this simplified method facilitates experimentation with different materials, potentially expanding the functionality and operating conditions of these sensitive nanometer-scale scanning probes.
Submitted 13 March, 2023;
originally announced March 2023.
-
Android Private Compute Core Architecture
Authors:
Eugenio Marchiori,
Sarah de Haas,
Sergey Volnov,
Ronnie Falcon,
Roxanne Pinto,
Marco Zamarato
Abstract:
Android's Private Compute Core (PCC) is a secure, isolated environment within the operating system, that maintains separation from apps while enabling users and developers to maintain control over their data. It is backed by open-source code in the Android Framework introduced in Android 12. PCC allows features to communicate with a server to receive model updates and contribute to global model training through Private Compute Services (PCS), the core of which has been open sourced. PCC is part of the OS, and by virtue of being isolated, constrained, and trusted, it can host sophisticated ML features. The hosted features themselves, running inside PCC, can be closed source and updatable. In this way, PCC enables machine learning features to process ambient and OS-level data and improve over time, while restricting the availability of information about individual users to servers or apps.
Submitted 22 September, 2022; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Magnetic imaging of superconducting qubit devices with scanning SQUID-on-tip
Authors:
E. Marchiori,
L. Ceccarelli,
N. Rossi,
G. Romagnoli,
J. Herrmann,
J. -C. Besse,
S. Krinner,
A. Wallraff,
M. Poggio
Abstract:
We use a scanning superconducting quantum interference device (SQUID) to image the magnetic flux produced by a superconducting device designed for quantum computing. The nanometer-scale SQUID-on-tip probe reveals the flow of superconducting current through the circuit as well as the locations of trapped magnetic flux. In particular, maps of current flowing out of a flux-control line in the vicinity of a qubit show how these elements are coupled, providing insight on how to optimize qubit control.
Submitted 16 June, 2022;
originally announced June 2022.
-
Magnetic, thermal, and topographic imaging with a nanometer-scale SQUID-on-cantilever scanning probe
Authors:
M. Wyss,
K. Bagani,
D. Jetter,
E. Marchiori,
A. Vervelaki,
B. Gross,
J. Ridderbos,
S. Gliga,
C. Schönenberger,
M. Poggio
Abstract:
Scanning superconducting quantum interference device (SQUID) microscopy is a magnetic imaging technique combining high-field sensitivity with nanometer-scale spatial resolution. State-of-the-art SQUID-on-tip probes are now playing an important role in mapping correlation phenomena, such as superconductivity and magnetism, which have recently been observed in two-dimensional van der Waals materials. Here, we demonstrate a scanning probe that combines the magnetic and thermal imaging provided by an on-tip SQUID with the tip-sample distance control and topographic contrast of a non-contact atomic force microscope (AFM). We pattern the nanometer-scale SQUID, including its weak-link Josephson junctions, via focused ion beam milling at the apex of a cantilever coated with Nb, yielding a sensor with an effective diameter of 365 nm, field sensitivity of 9.5 $\text{nT}/\sqrt{\text{Hz}}$ and thermal sensitivity of 620 $\text{nK}/\sqrt{\text{Hz}}$, operating in magnetic fields up to 1.0 T. The resulting SQUID-on-lever is a robust AFM-like scanning probe that expands the reach of sensitive nanometer-scale magnetic and thermal imaging beyond what is currently possible.
Submitted 14 September, 2021;
originally announced September 2021.
-
Multi-view analysis of unregistered medical images using cross-view transformers
Authors:
Gijs van Tulder,
Yao Tong,
Elena Marchiori
Abstract:
Multi-view medical image analysis often depends on the combination of information from multiple views. However, differences in perspective or other forms of misalignment can make it difficult to combine views effectively, as registration is not always possible. Without registration, views can only be combined at a global feature level, by joining feature vectors after global pooling. We present a novel cross-view transformer method to transfer information between unregistered views at the level of spatial feature maps. We demonstrate this method on multi-view mammography and chest X-ray datasets. On both datasets, we find that a cross-view transformer that links spatial feature maps can outperform a baseline model that joins feature vectors after global pooling.
Submitted 23 September, 2021; v1 submitted 21 March, 2021;
originally announced March 2021.
-
Technical Review: Imaging weak magnetic field patterns on the nanometer-scale and its application to 2D materials
Authors:
Estefani Marchiori,
Lorenzo Ceccarelli,
Nicola Rossi,
Luca Lorenzelli,
Christian L. Degen,
Martino Poggio
Abstract:
Nanometer-scale imaging of magnetization and current density is the key to deciphering the mechanisms behind a variety of new and poorly understood condensed matter phenomena. The recently discovered correlated states hosted in atomically layered materials such as twisted bilayer graphene or van der Waals heterostructures are noteworthy examples. Manifestations of these states range from superconductivity, to highly insulating states, to magnetism. Their fragility and susceptibility to spatial inhomogeneities limits their macroscopic manifestation and complicates conventional transport or magnetization measurements, which integrate over an entire sample. In contrast, techniques for imaging weak magnetic field patterns with high spatial resolution overcome inhomogeneity by measuring the local fields produced by magnetization and current density. Already, such imaging techniques have shown the vulnerability of correlated states in twisted bilayer graphene to twist-angle disorder and revealed the complex current flows in quantum Hall edge states. Here, we review the state-of-the-art techniques most amenable to the investigation of such systems, because they combine the highest magnetic field sensitivity with the highest spatial resolution and are minimally invasive: magnetic force microscopy, scanning superconducting quantum interference device microscopy, and scanning nitrogen-vacancy center microscopy. We compare the capabilities of these techniques, their required operating conditions, and assess their suitability to different types of source contrast, in particular magnetization and current density. Finally, we focus on the prospects for improving each technique and speculate on its potential impact, especially in the rapidly growing field of two-dimensional (2D) materials.
Submitted 18 March, 2021;
originally announced March 2021.
-
Gaussian Processes with Skewed Laplace Spectral Mixture Kernels for Long-term Forecasting
Authors:
Kai Chen,
Twan van Laarhoven,
Elena Marchiori
Abstract:
Long-term forecasting involves predicting a horizon that is far ahead of the last observation. It is a problem of high practical relevance, for instance for companies in order to decide upon expensive long-term investments. Despite the recent progress and success of Gaussian processes (GPs) based on spectral mixture kernels, long-term forecasting remains a challenging problem for these kernels because they decay exponentially at large horizons. This is mainly due to their use of a mixture of Gaussians to model spectral densities. Characteristics of the signal important for long-term forecasting can be unravelled by investigating the distribution of the Fourier coefficients of (the training part of) the signal, which is non-smooth, heavy-tailed, sparse, and skewed. The heavy tail and skewness characteristics of such distributions in the spectral domain allow to capture long-range covariance of the signal in the time domain. Motivated by these observations, we propose to model spectral densities using a skewed Laplace spectral mixture (SLSM) due to the skewness of its peaks, sparsity, non-smoothness, and heavy tail characteristics. By applying the inverse Fourier Transform to this spectral density we obtain a new GP kernel for long-term forecasting. In addition, we adapt the lottery ticket method, originally developed to prune weights of a neural network, to GPs in order to automatically select the number of kernel components. Results of extensive experiments, including a multivariate time series, show the beneficial effect of the proposed SLSM kernel for long-term extrapolation and robustness to the choice of the number of mixture components.
Submitted 2 October, 2021; v1 submitted 8 November, 2020;
originally announced November 2020.
-
Unsupervised Domain Adaptation using Graph Transduction Games
Authors:
Sebastiano Vascon,
Sinem Aslan,
Alessandro Torcinovich,
Twan van Laarhoven,
Elena Marchiori,
Marcello Pelillo
Abstract:
Unsupervised domain adaptation (UDA) amounts to assigning class labels to the unlabeled instances of a dataset from a target domain, using labeled instances of a dataset from a related source domain. In this paper, we propose to cast this problem in a game-theoretic setting as a non-cooperative game and introduce a fully automatized iterative algorithm for UDA based on graph transduction games (GTG). The main advantages of this approach are its principled foundation, guaranteed termination of the iterative algorithms to a Nash equilibrium (which corresponds to a consistent labeling condition) and soft labels quantifying the uncertainty of the label assignment process. We also investigate the beneficial effect of using pseudo-labels from linear classifiers to initialize the iterative process. The performance of the resulting methods is assessed on publicly available object recognition benchmark datasets involving both shallow and deep features. Results of experiments demonstrate the suitability of the proposed game-theoretic approach for solving UDA tasks.
Submitted 6 May, 2019;
originally announced May 2019.
-
Vendor-independent soft tissue lesion detection using weakly supervised and unsupervised adversarial domain adaptation
Authors:
Joris van Vugt,
Elena Marchiori,
Ritse Mann,
Albert Gubern-Mérida,
Nikita Moriakov,
Jonas Teuwen
Abstract:
Computer-aided detection aims to improve breast cancer screening programs by helping radiologists to evaluate digital mammography (DM) exams. DM exams are generated by devices from different vendors, with diverse characteristics between and even within vendors. Physical properties of these devices and postprocessing of the images can greatly influence the resulting mammogram. This results in the fact that a deep learning model trained on data from one vendor cannot readily be applied to data from another vendor. This paper investigates the use of tailored transfer learning methods based on adversarial learning to tackle this problem. We consider a database of DM exams (mostly bilateral and two views) generated by Hologic and Siemens vendors. We analyze two transfer learning settings: 1) unsupervised transfer, where Hologic data with soft lesion annotation at pixel level and Siemens unlabelled data are used to annotate images in the latter data; 2) weak supervised transfer, where exam level labels for images from the Siemens mammograph are available. We propose tailored variants of recent state-of-the-art methods for transfer learning which take into account the class imbalance and incorporate knowledge provided by the annotations at exam level. Results of experiments indicate the beneficial effect of transfer learning in both transfer settings. Notably, at 0.02 false positives per image, we achieve a sensitivity of 0.37, compared to 0.30 of a baseline with no transfer. Results indicate that using exam level annotations gives an additional increase in sensitivity.
Submitted 14 August, 2018;
originally announced August 2018.
-
Multi-Output Convolution Spectral Mixture for Gaussian Processes
Authors:
Kai Chen,
Twan van Laarhoven,
Perry Groot,
Jinsong Chen,
Elena Marchiori
Abstract:
Multi-output Gaussian processes (MOGPs) are an extension of Gaussian Processes (GPs) for predicting multiple output variables (also called channels, tasks) simultaneously. In this paper we use the convolution theorem to design a new kernel for MOGPs, by modeling cross channel dependencies through cross convolution of time and phase delayed components in the spectral domain. The resulting kernel is called Multi-Output Convolution Spectral Mixture (MOCSM) kernel. Results of extensive experiments on synthetic and real-life datasets demonstrate the advantages of the proposed kernel and its state of the art performance. MOCSM enjoys the desirable property to reduce to the well known Spectral Mixture (SM) kernel when a single-channel is considered. A comparison with the recently introduced Multi-Output Spectral Mixture kernel reveals that this is not the case for the latter kernel, which contains quadratic terms that generate undesirable scale effects when the spectral densities of different channels are either very close or very far from each other in the frequency domain.
Submitted 7 October, 2021; v1 submitted 7 August, 2018;
originally announced August 2018.
-
Multitask Gaussian Process with Hierarchical Latent Interactions
Authors:
Kai Chen,
Twan van Laarhoven,
Elena Marchiori,
Feng Yin,
Shuguang Cui
Abstract:
Multitask Gaussian process (MTGP) is powerful for joint learning of multiple tasks with complicated correlation patterns. However, due to the assembling of additive independent latent functions, all current MTGPs including the salient linear model of coregionalization (LMC) and convolution frameworks cannot effectively represent and learn the hierarchical latent interactions between its latent functions. In this paper, we further investigate the interactions in LMC of MTGP and then propose a novel kernel representation of the hierarchical interactions, which ameliorates both the expressiveness and the interpretability of MTGP. Specifically, we express the interaction as a product of function interaction and coefficient interaction. The function interaction is modeled by using cross convolution of latent functions. The coefficient interaction between the LMCs is described as a cross coregionalization term. We validate that considering the interactions can promote knowledge transferring in MTGP and compare our approach with some state-of-the-art MTGPs on both synthetic- and real-world datasets.
Submitted 2 October, 2021; v1 submitted 3 August, 2018;
originally announced August 2018.
-
Compressible Spectral Mixture Kernels with Sparse Dependency Structures for Gaussian Processes
Authors:
Kai Chen,
Yijue Dai,
Feng Yin,
Elena Marchiori,
Sergios Theodoridis
Abstract:
Spectral mixture (SM) kernels comprise a powerful class of generalized kernels for Gaussian processes (GPs) to describe complex patterns. This paper introduces model compression and time- and phase-modulated (TP) dependency structures to the original SM kernel for improved generalization of GPs. Specifically, by adopting Bienaymé's identity, we generalize the dependency structure through cross-covariance between the SM components. Then, we propose a novel SM kernel with a dependency structure (SMD) by using cross-convolution between the SM components. Furthermore, we ameliorate the expressiveness of the dependency structure by parameterizing it with time and phase delays. The dependency structure has clear interpretations in terms of spectral density, covariance behavior, and sampling path. To enrich the SMD with effective hyperparameter initialization, compressible SM kernel components, and sparse dependency structures, we finally introduce a novel structure adaptation (SA) algorithm. A thorough comparative analysis of the SMD on both synthetic and real-life applications corroborates its efficacy.
Submitted 26 July, 2023; v1 submitted 1 August, 2018;
originally announced August 2018.
-
Adversarial Alignment of Class Prediction Uncertainties for Domain Adaptation
Authors:
Jeroen Manders,
Twan van Laarhoven,
Elena Marchiori
Abstract:
We consider unsupervised domain adaptation: given labelled examples from a source domain and unlabelled examples from a related target domain, the goal is to infer the labels of target examples. Under the assumption that features from pre-trained deep neural networks are transferable across related domains, domain adaptation reduces to aligning source and target domain at class prediction uncertainty level. We tackle this problem by introducing a method based on adversarial learning which forces the label uncertainty predictions on the target domain to be indistinguishable from those on the source domain. Pre-trained deep neural networks are used to generate deep features having high transferability across related domains. We perform an extensive experimental analysis of the proposed method over a wide set of publicly available pre-trained deep neural networks. Results of our experiments on domain adaptation tasks for image classification show that class prediction uncertainty alignment with features extracted from pre-trained deep neural networks provides an efficient, robust and effective method for domain adaptation.
Submitted 4 January, 2019; v1 submitted 12 April, 2018;
originally announced April 2018.
-
Domain Adaptation with Randomized Expectation Maximization
Authors:
Twan van Laarhoven,
Elena Marchiori
Abstract:
Domain adaptation (DA) is the task of classifying an unlabeled dataset (target) using a labeled dataset (source) from a related domain. The majority of successful DA methods try to directly match the distributions of the source and target data by transforming the feature space. Despite their success, state of the art methods based on this approach are either involved or unable to directly scale to data with many features. This article shows that domain adaptation can be successfully performed by using a very simple randomized expectation maximization (EM) method. We consider two instances of the method, which involve logistic regression and support vector machine, respectively. The underlying assumption of the proposed method is the existence of a good single linear classifier for both source and target domain. The potential limitations of this assumption are alleviated by the flexibility of the method, which can directly incorporate deep features extracted from a pre-trained deep neural network. The resulting algorithm is strikingly easy to implement and apply. We test its performance on 36 real-life adaptation tasks over text and image data with diverse characteristics. The method achieves state-of-the-art results, competitive with those of involved end-to-end deep transfer-learning methods.
Submitted 20 March, 2018;
originally announced March 2018.
-
Spectral-spatial classification of hyperspectral images: three tricks and a new supervised learning setting
Authors:
Jacopo Acquarelli,
Elena Marchiori,
Lutgarde M. C. Buydens,
Thanh Tran,
Twan van Laarhoven
Abstract:
Spectral-spatial classification of hyperspectral images has been the subject of many studies in recent years. In the presence of only very few labeled pixels, this task becomes challenging. In this paper we address the following two research questions: 1) Can a simple neural network with just a single hidden layer achieve state of the art performance in the presence of few labeled pixels? 2) How is the performance of hyperspectral image classification methods affected when using disjoint train and test sets? We give a positive answer to the first question by using three tricks within a very basic shallow Convolutional Neural Network (CNN) architecture: a tailored loss function, and smooth- and label-based data augmentation. The tailored loss function enforces that neighborhood wavelengths have similar contributions to the features generated during training. A new label-based technique here proposed favors selection of pixels in smaller classes, which is beneficial in the presence of very few labeled pixels and skewed class distributions. To address the second question, we introduce a new sampling procedure to generate disjoint train and test set. Then the train set is used to obtain the CNN model, which is then applied to pixels in the test set to estimate their labels. We assess the efficacy of the simple neural network method on five publicly available hyperspectral images. On these images our method significantly outperforms considered baselines. Notably, with just 1% of labeled pixels per class, on these datasets our method achieves an accuracy that goes from 86.42% (challenging dataset) to 99.52% (easy dataset). Furthermore we show that the simple neural network method improves over other baselines in the new challenging supervised setting. Our analysis substantiates the highly beneficial effect of using the entire image (so train and test data) for constructing a model.
Submitted 23 July, 2018; v1 submitted 15 November, 2017;
originally announced November 2017.
-
Deep Learning for Automatic Stereotypical Motor Movement Detection using Wearable Sensors in Autism Spectrum Disorders
Authors:
Nastaran Mohammadian Rad,
Seyed Mostafa Kia,
Calogero Zarbo,
Twan van Laarhoven,
Giuseppe Jurman,
Paola Venuti,
Elena Marchiori,
Cesare Furlanello
Abstract:
Autism Spectrum Disorders are associated with atypical movements, of which stereotypical motor movements (SMMs) interfere with learning and social interaction. Automatic SMM detection using inertial measurement units (IMUs) remains complex due to the strong intra- and inter-subject variability, especially when handcrafted features are extracted from the signal. We propose a new application of deep learning to facilitate automatic SMM detection using multi-axis IMUs. We use a convolutional neural network (CNN) to learn a discriminative feature space from raw data. We show how the CNN can be used for parameter transfer learning to enhance the detection rate on longitudinal data. We also combine long short-term memory (LSTM) with the CNN to model the temporal patterns in a sequence of multi-axis signals. Further, we employ ensemble learning to combine multiple LSTM learners into a more robust SMM detector. Our results show that: 1) feature learning outperforms handcrafted features; 2) parameter transfer learning is beneficial in longitudinal settings; 3) using LSTM to learn the temporal dynamics of signals enhances the detection rate, especially for skewed training data; 4) an ensemble of LSTMs provides more accurate and stable detectors. These findings provide a significant step toward accurate SMM detection in real-time scenarios.
Submitted 14 September, 2017;
originally announced September 2017.
-
Unsupervised Domain Adaptation with Random Walks on Target Labelings
Authors:
Twan van Laarhoven,
Elena Marchiori
Abstract:
Unsupervised Domain Adaptation (DA) is used to automate the task of labeling data: an unlabeled dataset (target) is annotated using a labeled dataset (source) from a related domain. We cast domain adaptation as the problem of finding stable labels for target examples. A new definition of label stability is proposed, motivated by a generalization error bound for large margin linear classifiers: a target labeling is stable when, with high probability, a classifier trained on a random subsample of the target with that labeling yields the same labeling. We find stable labelings using a random walk on a directed graph with transition probabilities based on labeling stability. The majority vote of those labelings visited by the walk yields a stable label for each target example. The resulting domain adaptation algorithm is strikingly easy to implement and apply: it does not rely on data transformations, which are in general computationally prohibitive in the presence of many input features, and does not need to access the source data, which is advantageous when data sharing is restricted. By acting on the original feature space, our method is able to take full advantage of deep features from external pre-trained neural networks, as demonstrated by the results of our experiments.
Submitted 20 March, 2018; v1 submitted 16 June, 2017;
originally announced June 2017.
-
Transfer Learning for Domain Adaptation in MRI: Application in Brain Lesion Segmentation
Authors:
Mohsen Ghafoorian,
Alireza Mehrtash,
Tina Kapur,
Nico Karssemeijer,
Elena Marchiori,
Mehran Pesteie,
Charles R. G. Guttmann,
Frank-Erik de Leeuw,
Clare M. Tempany,
Bram van Ginneken,
Andriy Fedorov,
Purang Abolmaesumi,
Bram Platel,
William M. Wells III
Abstract:
Magnetic Resonance Imaging (MRI) is widely used in routine clinical diagnosis and treatment. However, variations in MRI acquisition protocols result in different appearances of normal and diseased tissue in the images. Convolutional neural networks (CNNs), which have been shown to be successful in many medical image analysis tasks, are typically sensitive to the variations in imaging protocols. Therefore, in many cases, networks trained on data acquired with one MRI protocol do not perform satisfactorily on data acquired with different protocols. This limits the use of models trained with large annotated legacy datasets on a new dataset with a different domain, which is a recurring situation in clinical settings. In this study, we aim to answer the following central questions regarding domain adaptation in medical image analysis: Given a fitted legacy model, 1) How much data from the new domain is required for a decent adaptation of the original network?; and, 2) What portion of the pre-trained model parameters should be retrained given a certain number of the new domain training samples? To address these questions, we conducted extensive experiments on a white matter hyperintensity segmentation task. We trained a CNN on legacy MR images of the brain and evaluated the performance of the domain-adapted network on the same task with images from a different domain. We then compared the performance of the model to the surrogate scenarios where either the same trained network is used or a new network is trained from scratch on the new dataset. The domain-adapted network tuned only by two training examples achieved a Dice score of 0.63, substantially outperforming a similar network trained on the same set of examples from scratch.
Submitted 25 February, 2017;
originally announced February 2017.
-
Deep Multi-scale Location-aware 3D Convolutional Neural Networks for Automated Detection of Lacunes of Presumed Vascular Origin
Authors:
Mohsen Ghafoorian,
Nico Karssemeijer,
Tom Heskes,
Mayra Bergkamp,
Joost Wissink,
Jiri Obels,
Karlijn Keizer,
Frank-Erik de Leeuw,
Bram van Ginneken,
Elena Marchiori,
Bram Platel
Abstract:
Lacunes of presumed vascular origin (lacunes) are associated with an increased risk of stroke, gait impairment, and dementia and are a primary imaging feature of small vessel disease. Quantification of lacunes may be of great importance to elucidate the mechanisms behind neuro-degenerative disorders and is recommended as part of study standards for small vessel disease research. However, due to the different appearance of lacunes in various brain regions and the existence of other similar-looking structures, such as perivascular spaces, manual annotation is a difficult, laborious and subjective task, which can potentially be greatly improved by reliable and consistent computer-aided detection (CAD) routines.
In this paper, we propose an automated two-stage method using deep convolutional neural networks (CNN). We show that this method has good performance and can considerably benefit readers. We first use a fully convolutional neural network to detect initial candidates. In the second step, we employ a 3D CNN as a false positive reduction tool. As the location information is important to the analysis of candidate structures, we further equip the network with contextual information using multi-scale analysis and integration of explicit location features. We trained, validated and tested our networks on a large dataset of 1075 cases obtained from two different studies. Subsequently, we conducted an observer study with four trained observers and compared our method with them using a free-response operating characteristic analysis. On a test set of 111 cases, the resulting CAD system exhibits performance similar to the trained human observers and achieves a sensitivity of 0.974 with 0.13 false positives per slice. A feasibility study also showed that a trained human observer would considerably benefit once aided by the CAD system.
Submitted 29 October, 2016; v1 submitted 24 October, 2016;
originally announced October 2016.
-
Location Sensitive Deep Convolutional Neural Networks for Segmentation of White Matter Hyperintensities
Authors:
Mohsen Ghafoorian,
Nico Karssemeijer,
Tom Heskes,
Inge van Uden,
Clara Sanchez,
Geert Litjens,
Frank-Erik de Leeuw,
Bram van Ginneken,
Elena Marchiori,
Bram Platel
Abstract:
The anatomical location of imaging features is of crucial importance for accurate diagnosis in many medical tasks. Convolutional neural networks (CNN) have had huge successes in computer vision, but they lack the natural ability to incorporate the anatomical location in their decision making process, hindering success in some medical image analysis tasks.
In this paper, to integrate the anatomical location information into the network, we propose several deep CNN architectures that consider multi-scale patches or take explicit location features while training. We apply and compare the proposed architectures for segmentation of white matter hyperintensities in brain MR images on a large dataset. As a result, we observe that the CNNs that incorporate location information substantially outperform a conventional segmentation method with hand-crafted features as well as CNNs that do not integrate location information. On a test set of 46 scans, the best configuration of our networks obtained a Dice score of 0.791, compared to 0.797 for an independent human observer. Performance levels of the machine and the independent human observer were not statistically significantly different (p-value=0.17).
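As a rough illustration of the Dice score quoted in this abstract, the overlap between a predicted and a reference binary segmentation can be computed as twice the intersection divided by the sum of the two mask sizes. The sketch below is not the authors' evaluation code; the function name `dice_score` and the flat-list mask representation are assumptions made for the example.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat sequences.

    Hypothetical helper for illustration only: 2 * |A ∩ B| / (|A| + |B|).
    """
    pred = [bool(p) for p in pred]
    truth = [bool(t) for t in truth]
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Toy masks: two predicted lesion voxels, one of which is a true lesion voxel.
example = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1 means the masks coincide and 0 means they do not overlap at all, which is why values such as 0.791 and 0.797 can be compared directly.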
Submitted 29 October, 2016; v1 submitted 16 October, 2016;
originally announced October 2016.
-
Local Network Community Detection with Continuous Optimization of Conductance and Weighted Kernel K-Means
Authors:
Twan van Laarhoven,
Elena Marchiori
Abstract:
Local network community detection is the task of finding a single community of nodes concentrated around a few given seed nodes in a localized way. Conductance is a popular objective function used in many algorithms for local community detection. This paper studies a continuous relaxation of conductance. We show that continuous optimization of this objective still leads to discrete communities. We investigate the relation of conductance with weighted kernel k-means for a single community, which leads to the introduction of a new objective function, $σ$-conductance. Conductance is obtained by setting $σ$ to $0$. Two algorithms, EMc and PGDc, are proposed to locally optimize $σ$-conductance and automatically tune the parameter $σ$. They are based on expectation maximization and projected gradient descent, respectively. We prove locality and give performance guarantees for EMc and PGDc for a class of dense and well-separated communities centered around the seeds. Experiments are conducted on networks with ground-truth communities, comparing to state-of-the-art graph diffusion algorithms for conductance optimization. On large graphs, results indicate that EMc and PGDc stay localized and produce communities most similar to the ground truth, while graph diffusion algorithms generate large communities of lower quality.
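For readers unfamiliar with the objective named in this abstract, the sketch below computes the conductance of a discrete node set using the common textbook definition, cut weight divided by the smaller of the two volumes. This is an assumption: the paper may normalize differently, and the continuous relaxation, $σ$-conductance, EMc, and PGDc are not reproduced here. The helper names `make_graph` and `conductance` are hypothetical.

```python
def make_graph(edges):
    """Build a symmetric adjacency dict {node: {neighbor: weight}} from unit-weight edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, {})[v] = 1.0
        adj.setdefault(v, {})[u] = 1.0
    return adj


def conductance(adj, community):
    """Textbook conductance: cut(S, V minus S) / min(vol(S), vol(V minus S))."""
    inside = set(community)
    cut = vol_in = vol_total = 0.0
    for u, nbrs in adj.items():
        for v, w in nbrs.items():
            vol_total += w          # every edge contributes to both endpoints' degree
            if u in inside:
                vol_in += w
                if v not in inside:
                    cut += w        # edge leaving the community, counted once from inside
    denom = min(vol_in, vol_total - vol_in)
    if denom == 0:
        raise ValueError("conductance is undefined for an empty or all-covering set")
    return cut / denom


# Two triangles joined by a single bridge edge (2, 3).
graph = make_graph([(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])
phi = conductance(graph, {0, 1, 2})
```

Lower values indicate a community that is well separated from the rest of the graph; in the toy graph, one triangle has a single outgoing edge against a volume of seven.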
Submitted 17 August, 2016; v1 submitted 21 January, 2016;
originally announced January 2016.
-
Resolution-limit-free and local Non-negative Matrix Factorization quality functions for graph clustering
Authors:
Twan van Laarhoven,
Elena Marchiori
Abstract:
Many graph clustering quality functions suffer from a resolution limit, the inability to find small clusters in large graphs. So-called resolution-limit-free quality functions do not have this limit. This property was previously introduced for hard clustering, that is, graph partitioning.
We investigate the resolution-limit-free property in the context of Non-negative Matrix Factorization (NMF) for hard and soft graph clustering. To use NMF in the hard clustering setting, a common approach is to assign each node to its highest membership cluster. We show that in this case symmetric NMF is not resolution-limit-free, but that it becomes so when hardness constraints are used as part of the optimization. The resulting function is strongly linked to the Constant Potts Model. In soft clustering, nodes can belong to more than one cluster, with varying degrees of membership. In this setting resolution-limit-free turns out to be too strong a property. Therefore we introduce locality, which roughly states that changing one part of the graph does not affect the clustering of other parts of the graph. We argue that this is a desirable property, provide conditions under which NMF quality functions are local, and propose a novel class of local probabilistic NMF quality functions for soft graph clustering.
Submitted 22 July, 2014;
originally announced July 2014.
-
Axioms for graph clustering quality functions
Authors:
Twan van Laarhoven,
Elena Marchiori
Abstract:
We investigate properties that intuitively ought to be satisfied by graph clustering quality functions, that is, functions that assign a score to a clustering of a graph. Graph clustering, also known as network community detection, is often performed by optimizing such a function. Two axioms tailored for graph clustering quality functions are introduced, and the four axioms introduced in previous work on distance based clustering are reformulated and generalized for the graph setting. We show that modularity, a standard quality function for graph clustering, does not satisfy all of these six properties. This motivates the derivation of a new family of quality functions, adaptive scale modularity, which does satisfy the proposed axioms. Adaptive scale modularity has two parameters, which give greater flexibility in the kinds of clusterings that can be found. Standard graph clustering quality functions, such as normalized cut and unnormalized cut, are obtained as special cases of adaptive scale modularity.
In general, the results of our investigation indicate that the considered axiomatic framework covers existing 'good' quality functions for graph clustering, and can be used to derive an interesting new family of quality functions.
Submitted 22 July, 2014; v1 submitted 15 August, 2013;
originally announced August 2013.
-
Practical Methods for Proving Termination of General Logic Programs
Authors:
E. Marchiori
Abstract:
Termination of logic programs with negated body atoms (here called general logic programs) is an important topic. One reason is that many computational mechanisms used to process negated atoms, like Clark's negation as failure and Chan's constructive negation, are based on termination conditions. This paper introduces a methodology for proving termination of general logic programs with respect to the Prolog selection rule. The idea is to distinguish parts of the program depending on whether or not their termination depends on the selection rule. To this end, the notions of low-, weakly up-, and up-acceptable programs are introduced. We use these notions to develop a methodology for proving termination of general logic programs, and show how interesting problems in non-monotonic reasoning can be formalized and implemented by means of terminating general logic programs.
Submitted 31 March, 1996;
originally announced April 1996.