-
A simple, strong baseline for building damage detection on the xBD dataset
Authors:
Sebastian Gerard,
Paul Borne-Pons,
Josephine Sullivan
Abstract:
We construct a strong baseline method for building damage detection by starting with the highly-engineered winning solution of the xView2 competition, and gradually strip** away components. This way, we obtain a much simpler method, while retaining adequate performance. We expect the simplified solution to be more widely and easily applicable. This expectation is based on the reduced complexity,…
▽ More
We construct a strong baseline method for building damage detection by starting with the highly-engineered winning solution of the xView2 competition, and gradually strip** away components. This way, we obtain a much simpler method, while retaining adequate performance. We expect the simplified solution to be more widely and easily applicable. This expectation is based on the reduced complexity, as well as the fact that we choose hyperparameters based on simple heuristics, that transfer to other datasets. We then re-arrange the xView2 dataset splits such that the test locations are not seen during training, contrary to the competition setup. In this setting, we find that both the complex and the simplified model fail to generalize to unseen locations. Analyzing the dataset indicates that this failure to generalize is not only a model-based problem, but that the difficulty might also be influenced by the unequal class distributions between events.
Code, including the baseline model, is available under https://github.com/PaulBorneP/Xview2_Strong_Baseline
△ Less
Submitted 30 January, 2024;
originally announced January 2024.
-
A Novel ML-driven Test Case Selection Approach for Enhancing the Performance of Grammatical Evolution
Authors:
Krishn Kumar Gupt,
Meghana Kshirsagar,
Douglas Mota Dias,
Joseph P. Sullivan,
Conor Ryan
Abstract:
Computational cost in metaheuristics such as Evolutionary Algorithms (EAs) is often a major concern, particularly with their ability to scale. In data-based training, traditional EAs typically use a significant portion, if not all, of the dataset for model training and fitness evaluation in each generation. This makes EAs suffer from high computational costs incurred during the fitness evaluation…
▽ More
Computational cost in metaheuristics such as Evolutionary Algorithms (EAs) is often a major concern, particularly with their ability to scale. In data-based training, traditional EAs typically use a significant portion, if not all, of the dataset for model training and fitness evaluation in each generation. This makes EAs suffer from high computational costs incurred during the fitness evaluation of the population, particularly when working with large datasets. To mitigate this issue, we propose a Machine Learning (ML)-driven Distance-based Selection (DBS) algorithm that reduces the fitness evaluation time by optimizing test cases. We test our algorithm by applying it to 24 benchmark problems from Symbolic Regression (SR) and digital circuit domains and then using Grammatical Evolution (GE) to train models using the reduced dataset. We use GE to test DBS on SR and produce a system flexible enough to test it on digital circuit problems further. The quality of the solutions is tested and compared against the conventional training method to measure the coverage of training data selected using DBS, i.e., how well the subset matches the statistical properties of the entire dataset. Moreover, the effect of optimized training data on run time and the effective size of the evolved solutions is analyzed. Experimental and statistical evaluations of the results show our method empowered GE to yield superior or comparable solutions to the baseline (using the full datasets) with smaller sizes and demonstrates computational efficiency in terms of speed.
△ Less
Submitted 21 December, 2023;
originally announced December 2023.
-
Deterministic Langevin Unconstrained Optimization with Normalizing Flows
Authors:
James M. Sullivan,
Uros Seljak
Abstract:
We introduce a global, gradient-free surrogate optimization strategy for expensive black-box functions inspired by the Fokker-Planck and Langevin equations. These can be written as an optimization problem where the objective is the target function to maximize minus the logarithm of the current density of evaluated samples. This objective balances exploitation of the target objective with explorati…
▽ More
We introduce a global, gradient-free surrogate optimization strategy for expensive black-box functions inspired by the Fokker-Planck and Langevin equations. These can be written as an optimization problem where the objective is the target function to maximize minus the logarithm of the current density of evaluated samples. This objective balances exploitation of the target objective with exploration of low-density regions. The method, Deterministic Langevin Optimization (DLO), relies on a Normalizing Flow density estimate to perform active learning and select proposal points for evaluation. This strategy differs qualitatively from the widely-used acquisition functions employed by Bayesian Optimization methods, and can accommodate a range of surrogate choices. We demonstrate superior or competitive progress toward objective optima on standard synthetic test functions, as well as on non-convex and multi-modal posteriors of moderate dimension. On real-world objectives, such as scientific and neural network hyperparameter optimization, DLO is competitive with state-of-the-art baselines.
△ Less
Submitted 1 October, 2023;
originally announced October 2023.
-
It's more than just money: The real-world harms from ransomware attacks
Authors:
Nandita Pattnaik,
Jason R. C. Nurse,
Sarah Turner,
Gareth Mott,
Jamie MacColl,
Pia Huesch,
James Sullivan
Abstract:
As cyber-attacks continue to increase in frequency and sophistication, organisations must be better prepared to face the reality of an incident. Any organisational plan that intends to be successful at managing security risks must clearly understand the harm (i.e., negative impact) and the various parties affected in the aftermath of an attack. To this end, this article conducts a novel exploratio…
▽ More
As cyber-attacks continue to increase in frequency and sophistication, organisations must be better prepared to face the reality of an incident. Any organisational plan that intends to be successful at managing security risks must clearly understand the harm (i.e., negative impact) and the various parties affected in the aftermath of an attack. To this end, this article conducts a novel exploration into the multitude of real-world harms that can arise from cyber-attacks, with a particular focus on ransomware incidents given their current prominence. This exploration also leads to the proposal of a new, robust methodology for modelling harms from such incidents. We draw on publicly-available case data on high-profile ransomware incidents to examine the types of harm that emerge at various stages after a ransomware attack and how harms (e.g., an offline enterprise server) may trigger other negative, potentially more substantial impacts for stakeholders (e.g., the inability for a customer to access their social welfare benefits or bank account). Prominent findings from our analysis include the identification of a notable set of social/human harms beyond the business itself (and beyond the financial payment of a ransom) and a complex web of harms that emerge after attacks regardless of the industry sector. We also observed that deciphering the full extent and sequence of harms can be a challenging undertaking because of the lack of complete data available. This paper consequently argues for more transparency on ransomware harms, as it would lead to a better understanding of the realities of these incidents to the benefit of organisations and society more generally.
△ Less
Submitted 6 July, 2023;
originally announced July 2023.
-
Probabilistic 3d regression with projected huber distribution
Authors:
David Mohlin,
Josephine Sullivan
Abstract:
Estimating probability distributions which describe where an object is likely to be from camera data is a task with many applications. In this work we describe properties which we argue such methods should conform to. We also design a method which conform to these properties. In our experiments we show that our method produces uncertainties which correlate well with empirical errors. We also show…
▽ More
Estimating probability distributions which describe where an object is likely to be from camera data is a task with many applications. In this work we describe properties which we argue such methods should conform to. We also design a method which conform to these properties. In our experiments we show that our method produces uncertainties which correlate well with empirical errors. We also show that the mode of the predicted distribution outperform our regression baselines. The code for our implementation is available online.
△ Less
Submitted 9 March, 2023;
originally announced March 2023.
-
Contrastive pretraining for semantic segmentation is robust to noisy positive pairs
Authors:
Sebastian Gerard,
Josephine Sullivan
Abstract:
Domain-specific variants of contrastive learning can construct positive pairs from two distinct in-domain images, while traditional methods just augment the same image twice. For example, we can form a positive pair from two satellite images showing the same location at different times. Ideally, this teaches the model to ignore changes caused by seasons, weather conditions or image acquisition art…
▽ More
Domain-specific variants of contrastive learning can construct positive pairs from two distinct in-domain images, while traditional methods just augment the same image twice. For example, we can form a positive pair from two satellite images showing the same location at different times. Ideally, this teaches the model to ignore changes caused by seasons, weather conditions or image acquisition artifacts. However, unlike in traditional contrastive methods, this can result in undesired positive pairs, since we form them without human supervision. For example, a positive pair might consist of one image before a disaster and one after. This could teach the model to ignore the differences between intact and damaged buildings, which might be what we want to detect in the downstream task. Similar to false negative pairs, this could impede model performance. Crucially, in this setting only parts of the images differ in relevant ways, while other parts remain similar. Surprisingly, we find that downstream semantic segmentation is either robust to such badly matched pairs or even benefits from them. The experiments are conducted on the remote sensing dataset xBD, and a synthetic segmentation dataset for which we have full control over the pairing conditions. As a result, practitioners can use these domain-specific contrastive methods without having to filter their positive pairs beforehand, or might even be encouraged to purposefully include such pairs in their pretraining dataset.
△ Less
Submitted 23 January, 2023; v1 submitted 24 November, 2022;
originally announced November 2022.
-
PhysioGait: Context-Aware Physiological Context Modeling for Person Re-identification Attack on Wearable Sensing
Authors:
James O Sullivan,
Mohammad Arif Ul Alam
Abstract:
Person re-identification is a critical privacy breach in publicly shared healthcare data. We investigate the possibility of a new type of privacy threat on publicly shared privacy insensitive large scale wearable sensing data. In this paper, we investigate user specific biometric signatures in terms of two contextual biometric traits, physiological (photoplethysmography and electrodermal activity)…
▽ More
Person re-identification is a critical privacy breach in publicly shared healthcare data. We investigate the possibility of a new type of privacy threat on publicly shared privacy insensitive large scale wearable sensing data. In this paper, we investigate user specific biometric signatures in terms of two contextual biometric traits, physiological (photoplethysmography and electrodermal activity) and physical (accelerometer) contexts. In this regard, we propose PhysioGait, a context-aware physiological signal model that consists of a Multi-Modal Siamese Convolutional Neural Network (mmSNN) which learns the spatial and temporal information individually and performs sensor fusion in a Siamese cost with the objective of predicting a person's identity. We evaluated PhysioGait attack model using 4 real-time collected datasets (3-data under IRB #HP-00064387 and one publicly available data) and two combined datasets achieving 89% - 93% accuracy of re-identifying persons.
△ Less
Submitted 29 October, 2022;
originally announced November 2022.
-
Are All Linear Regions Created Equal?
Authors:
Matteo Gamba,
Adrian Chmielewski-Anders,
Josephine Sullivan,
Hossein Azizpour,
Mårten Björkman
Abstract:
The number of linear regions has been studied as a proxy of complexity for ReLU networks. However, the empirical success of network compression techniques like pruning and knowledge distillation, suggest that in the overparameterized setting, linear regions density might fail to capture the effective nonlinearity. In this work, we propose an efficient algorithm for discovering linear regions and u…
▽ More
The number of linear regions has been studied as a proxy of complexity for ReLU networks. However, the empirical success of network compression techniques like pruning and knowledge distillation, suggest that in the overparameterized setting, linear regions density might fail to capture the effective nonlinearity. In this work, we propose an efficient algorithm for discovering linear regions and use it to investigate the effectiveness of density in capturing the nonlinearity of trained VGGs and ResNets on CIFAR-10 and CIFAR-100. We contrast the results with a more principled nonlinearity measure based on function variation, highlighting the shortcomings of linear regions density. Furthermore, interestingly, our measure of nonlinearity clearly correlates with model-wise deep double descent, connecting reduced test error with reduced nonlinearity, and increased local similarity of linear regions.
△ Less
Submitted 23 February, 2022;
originally announced February 2022.
-
Validation and Transparency in AI systems for pharmacovigilance: a case study applied to the medical literature monitoring of adverse events
Authors:
Bruno Ohana,
Jack Sullivan,
Nicole Baker
Abstract:
Recent advances in artificial intelligence applied to biomedical text are opening exciting opportunities for improving pharmacovigilance activities currently burdened by the ever growing volumes of real world data. To fully realize these opportunities, existing regulatory guidance and industry best practices should be taken into consideration in order to increase the overall trustworthiness of the…
▽ More
Recent advances in artificial intelligence applied to biomedical text are opening exciting opportunities for improving pharmacovigilance activities currently burdened by the ever growing volumes of real world data. To fully realize these opportunities, existing regulatory guidance and industry best practices should be taken into consideration in order to increase the overall trustworthiness of the system and enable broader adoption. In this paper we present a case study on how to operationalize existing guidance for validated AI systems in pharmacovigilance focusing on the specific task of medical literature monitoring (MLM) of adverse events from the scientific literature. We describe an AI system designed with the goal of reducing effort in MLM activities built in close collaboration with subject matter experts and considering guidance for validated systems in pharmacovigilance and AI transparency. In particular we make use of public disclosures as a useful risk control measure to mitigate system misuse and earn user trust. In addition we present experimental results showing the system can significantly remove screening effort while maintaining high levels of recall (filtering 55% of irrelevant articles on average, for a target recall of 0.99 on suspected adverse articles) and provide a robust method for tuning the desired recall to suit a particular risk profile.
△ Less
Submitted 21 December, 2021;
originally announced January 2022.
-
Probabilistic Regression with Huber Distributions
Authors:
David Mohlin,
Gerald Bianchi,
Josephine Sullivan
Abstract:
In this paper we describe a probabilistic method for estimating the position of an object along with its covariance matrix using neural networks. Our method is designed to be robust to outliers, have bounded gradients with respect to the network outputs, among other desirable properties. To achieve this we introduce a novel probability distribution inspired by the Huber loss. We also introduce a n…
▽ More
In this paper we describe a probabilistic method for estimating the position of an object along with its covariance matrix using neural networks. Our method is designed to be robust to outliers, have bounded gradients with respect to the network outputs, among other desirable properties. To achieve this we introduce a novel probability distribution inspired by the Huber loss. We also introduce a new way to parameterize positive definite matrices to ensure invariance to the choice of orientation for the coordinate system we regress over. We evaluate our method on popular body pose and facial landmark datasets and get performance on par or exceeding the performance of non-heatmap methods. Our code is available at github.com/Davmo049/Public_prob_regression_with_huber_distributions
△ Less
Submitted 19 November, 2021;
originally announced November 2021.
-
Large-Scale Zero-Shot Image Classification from Rich and Diverse Textual Descriptions
Authors:
Sebastian Bujwid,
Josephine Sullivan
Abstract:
We study the impact of using rich and diverse textual descriptions of classes for zero-shot learning (ZSL) on ImageNet. We create a new dataset ImageNet-Wiki that matches each ImageNet class to its corresponding Wikipedia article. We show that merely employing these Wikipedia articles as class descriptions yields much higher ZSL performance than prior works. Even a simple model using this type of…
▽ More
We study the impact of using rich and diverse textual descriptions of classes for zero-shot learning (ZSL) on ImageNet. We create a new dataset ImageNet-Wiki that matches each ImageNet class to its corresponding Wikipedia article. We show that merely employing these Wikipedia articles as class descriptions yields much higher ZSL performance than prior works. Even a simple model using this type of auxiliary data outperforms state-of-the-art models that rely on standard features of word embedding encodings of class names. These results highlight the usefulness and importance of textual descriptions for ZSL, as well as the relative importance of auxiliary data type compared to algorithmic progress. Our experimental results also show that standard zero-shot learning approaches generalize poorly across categories of classes.
△ Less
Submitted 17 March, 2021;
originally announced March 2021.
-
Probabilistic orientation estimation with matrix Fisher distributions
Authors:
D. Mohlin,
G. Bianchi,
J. Sullivan
Abstract:
This paper focuses on estimating probability distributions over the set of 3D rotations ($SO(3)$) using deep neural networks. Learning to regress models to the set of rotations is inherently difficult due to differences in topology between $\mathbb{R}^N$ and $SO(3)$. We overcome this issue by using a neural network to output the parameters for a matrix Fisher distribution since these parameters ar…
▽ More
This paper focuses on estimating probability distributions over the set of 3D rotations ($SO(3)$) using deep neural networks. Learning to regress models to the set of rotations is inherently difficult due to differences in topology between $\mathbb{R}^N$ and $SO(3)$. We overcome this issue by using a neural network to output the parameters for a matrix Fisher distribution since these parameters are homeomorphic to $\mathbb{R}^9$. By using a negative log likelihood loss for this distribution we get a loss which is convex with respect to the network outputs. By optimizing this loss we improve state-of-the-art on several challenging applicable datasets, namely Pascal3D+, ModelNet10-$SO(3)$ and UPNA head pose.
△ Less
Submitted 17 June, 2020;
originally announced June 2020.
-
Explanation-based Weakly-supervised Learning of Visual Relations with Graph Networks
Authors:
Federico Baldassarre,
Kevin Smith,
Josephine Sullivan,
Hossein Azizpour
Abstract:
Visual relationship detection is fundamental for holistic image understanding. However, the localization and classification of (subject, predicate, object) triplets remain challenging tasks, due to the combinatorial explosion of possible relationships, their long-tailed distribution in natural images, and an expensive annotation process. This paper introduces a novel weakly-supervised method for v…
▽ More
Visual relationship detection is fundamental for holistic image understanding. However, the localization and classification of (subject, predicate, object) triplets remain challenging tasks, due to the combinatorial explosion of possible relationships, their long-tailed distribution in natural images, and an expensive annotation process. This paper introduces a novel weakly-supervised method for visual relationship detection that relies on minimal image-level predicate labels. A graph neural network is trained to classify predicates in images from a graph representation of detected objects, implicitly encoding an inductive bias for pairwise relations. We then frame relationship detection as the explanation of such a predicate classifier, i.e. we obtain a complete relation by recovering the subject and object of a predicted predicate. We present results comparable to recent fully- and weakly-supervised methods on three diverse and challenging datasets: HICO-DET for human-object interaction, Visual Relationship Detection for generic object-to-object relations, and UnRel for unusual triplets; demonstrating robustness to non-comprehensive annotations and good few-shot generalization.
△ Less
Submitted 17 July, 2020; v1 submitted 16 June, 2020;
originally announced June 2020.
-
Compiling a Calculus for Relaxed Memory: Practical constraint-based low-level concurrency
Authors:
Michael J. Sullivan,
Karl Crary,
Salil Joshi
Abstract:
Crary and Sullivan's Relaxed Memory Calculus (RMC) proposed a new declarative approach for writing low-level shared memory concurrent programs in the presence of modern relaxed-memory multi-processor architectures and optimizing compilers. In RMC, the programmer explicitly specifies constraints on the order of execution of operations and on the visibility of memory writes. These constraints are th…
▽ More
Crary and Sullivan's Relaxed Memory Calculus (RMC) proposed a new declarative approach for writing low-level shared memory concurrent programs in the presence of modern relaxed-memory multi-processor architectures and optimizing compilers. In RMC, the programmer explicitly specifies constraints on the order of execution of operations and on the visibility of memory writes. These constraints are then enforced by the compiler, which has a wide degree of latitude in how to accomplish its goals.
We present rmc-compiler, a Clang and LLVM-based compiler for RMC-extended C and C++. In addition to using barriers to enforce ordering, rmc-compiler can take advantage of control and data dependencies, something that is beyond the abilities of current C/C++ compilers. In rmc-compiler, RMC compilation is modeled as an SMT problem with a cost term; the solution with the minimum cost determines the compilation strategy. In testing on ARM and POWER devices, RMC performs quite well, with modest performance improvements relative to C++11 on most of our data structure benchmarks and (on some architectures) dramatic improvements on a read-mostly list test that heavily benefits from use of data dependencies for ordering.
△ Less
Submitted 10 April, 2019;
originally announced April 2019.
-
Student Cluster Competition 2017, Team University ofTexas at Austin/Texas State University: Reproducing Vectorization of the Tersoff Multi-Body Potential on the Intel Skylake and NVIDIA V100 Architectures
Authors:
James Sullivan,
Collin Weir,
Austin Reichert,
R. Todd Evans,
W. Cyrus Proctor,
Nicolas Thorne
Abstract:
This paper satisfies the reproducibility challenge of the Student Cluster Competition at Supercomputing 2017. We attempted to reproduce the results of Höhnerbach et al. (2016) for an implementation of a vectorized code for the Tersoff multi-body potential kernel of the molecular dynamics code Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). We investigated accuracy, optimization…
▽ More
This paper satisfies the reproducibility challenge of the Student Cluster Competition at Supercomputing 2017. We attempted to reproduce the results of Höhnerbach et al. (2016) for an implementation of a vectorized code for the Tersoff multi-body potential kernel of the molecular dynamics code Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS). We investigated accuracy, optimization performance, and scaling with our Intel CPU and NVIDIA GPU based cluster.
△ Less
Submitted 21 August, 2018;
originally announced August 2018.
-
Convergence Rates of Gaussian ODE Filters
Authors:
Hans Kersting,
T. J. Sullivan,
Philipp Hennig
Abstract:
A recently-introduced class of probabilistic (uncertainty-aware) solvers for ordinary differential equations (ODEs) applies Gaussian (Kalman) filtering to initial value problems. These methods model the true solution $x$ and its first $q$ derivatives \emph{a priori} as a Gauss--Markov process $\boldsymbol{X}$, which is then iteratively conditioned on information about $\dot{x}$. This article estab…
▽ More
A recently-introduced class of probabilistic (uncertainty-aware) solvers for ordinary differential equations (ODEs) applies Gaussian (Kalman) filtering to initial value problems. These methods model the true solution $x$ and its first $q$ derivatives \emph{a priori} as a Gauss--Markov process $\boldsymbol{X}$, which is then iteratively conditioned on information about $\dot{x}$. This article establishes worst-case local convergence rates of order $q+1$ for a wide range of versions of this Gaussian ODE filter, as well as global convergence rates of order $q$ in the case of $q=1$ and an integrated Brownian motion prior, and analyses how inaccurate information on $\dot{x}$ coming from approximate evaluations of $f$ affects these rates. Moreover, we show that, in the globally convergent case, the posterior credible intervals are well calibrated in the sense that they globally contract at the same rate as the truncation error. We illustrate these theoretical results by numerical experiments which might indicate their generalizability to $q \in \{2,3,\dots\}$.
△ Less
Submitted 17 July, 2020; v1 submitted 25 July, 2018;
originally announced July 2018.
-
Compression, inversion, and approximate PCA of dense kernel matrices at near-linear computational complexity
Authors:
Florian Schäfer,
T. J. Sullivan,
Houman Owhadi
Abstract:
Dense kernel matrices $Θ\in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{ x_{i} \}_{1 \leq i \leq N} \subset \mathbb{R}^{d}$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously-distributed sampling points, we show how to ident…
▽ More
Dense kernel matrices $Θ\in \mathbb{R}^{N \times N}$ obtained from point evaluations of a covariance function $G$ at locations $\{ x_{i} \}_{1 \leq i \leq N} \subset \mathbb{R}^{d}$ arise in statistics, machine learning, and numerical analysis. For covariance functions that are Green's functions of elliptic boundary value problems and homogeneously-distributed sampling points, we show how to identify a subset $S \subset \{ 1 , \dots , N \}^2$, with $\# S = O ( N \log (N) \log^{d} ( N /ε) )$, such that the zero fill-in incomplete Cholesky factorisation of the sparse matrix $Θ_{ij} 1_{( i, j ) \in S}$ is an $ε$-approximation of $Θ$. This factorisation can provably be obtained in complexity $O ( N \log( N ) \log^{d}( N /ε) )$ in space and $O ( N \log^{2}( N ) \log^{2d}( N /ε) )$ in time, improving upon the state of the art for general elliptic operators; we further present numerical evidence that $d$ can be taken to be the intrinsic dimension of the data set rather than that of the ambient space. The algorithm only needs to know the spatial configuration of the $x_{i}$ and does not require an analytic representation of $G$. Furthermore, this factorization straightforwardly provides an approximate sparse PCA with optimal rate of convergence in the operator norm. Hence, by using only subsampling and the incomplete Cholesky factorization, we obtain, at nearly linear complexity, the compression, inversion and approximate PCA of a large class of covariance matrices. By inverting the order of the Cholesky factorization we also obtain a solver for elliptic PDE with complexity $O ( N \log^{d}( N /ε) )$ in space and $O ( N \log^{2d}( N /ε) )$ in time, improving upon the state of the art for general elliptic operators.
△ Less
Submitted 30 October, 2020; v1 submitted 7 June, 2017;
originally announced June 2017.
-
Face Attribute Prediction Using Off-the-Shelf CNN Features
Authors:
Yang Zhong,
Josephine Sullivan,
Haibo Li
Abstract:
Predicting attributes from face images in the wild is a challenging computer vision problem. To automatically describe face attributes from face containing images, traditionally one needs to cascade three technical blocks --- face localization, facial descriptor construction, and attribute classification --- in a pipeline. As a typical classification problem, face attribute prediction has been add…
▽ More
Predicting attributes from face images in the wild is a challenging computer vision problem. To automatically describe face attributes from face containing images, traditionally one needs to cascade three technical blocks --- face localization, facial descriptor construction, and attribute classification --- in a pipeline. As a typical classification problem, face attribute prediction has been addressed using deep learning. Current state-of-the-art performance was achieved by using two cascaded Convolutional Neural Networks (CNNs), which were specifically trained to learn face localization and attribute description. In this paper, we experiment with an alternative way of employing the power of deep representations from CNNs. Combining with conventional face localization techniques, we use off-the-shelf architectures trained for face recognition to build facial descriptors. Recognizing that the describable face attributes are diverse, our face descriptors are constructed from different levels of the CNNs for different attributes to best facilitate face attribute prediction. Experiments on two large datasets, LFWA and CelebA, show that our approach is entirely comparable to the state-of-the-art. Our findings not only demonstrate an efficient face attribute prediction approach, but also raise an important question: how to leverage the power of off-the-shelf CNN representations for novel tasks.
△ Less
Submitted 21 June, 2016; v1 submitted 11 February, 2016;
originally announced February 2016.
-
Leveraging Mid-Level Deep Representations For Predicting Face Attributes in the Wild
Authors:
Yang Zhong,
Josephine Sullivan,
Haibo Li
Abstract:
Predicting facial attributes from faces in the wild is very challenging due to pose and lighting variations in the real world. The key to this problem is to build proper feature representations to cope with these unfavourable conditions. Given the success of Convolutional Neural Network (CNN) in image classification, the high-level CNN feature, as an intuitive and reasonable choice, has been widel…
▽ More
Predicting facial attributes from faces in the wild is very challenging due to pose and lighting variations in the real world. The key to this problem is to build proper feature representations to cope with these unfavourable conditions. Given the success of Convolutional Neural Network (CNN) in image classification, the high-level CNN feature, as an intuitive and reasonable choice, has been widely utilized for this problem. In this paper, however, we consider the mid-level CNN features as an alternative to the high-level ones for attribute prediction. This is based on the observation that face attributes are different: some of them are locally oriented while others are globally defined. Our investigations reveal that the mid-level deep representations outperform the prediction accuracy achieved by the (fine-tuned) high-level abstractions. We empirically demonstrate that the midlevel representations achieve state-of-the-art prediction performance on CelebA and LFWA datasets. Our investigations also show that by utilizing the mid-level representations one can employ a single deep network to achieve both face recognition and attribute prediction.
△ Less
Submitted 21 June, 2016; v1 submitted 4 February, 2016;
originally announced February 2016.
-
Visual Instance Retrieval with Deep Convolutional Networks
Authors:
Ali Sharif Razavian,
Josephine Sullivan,
Stefan Carlsson,
Atsuto Maki
Abstract:
This paper provides an extensive study on the availability of image representations based on convolutional networks (ConvNets) for the task of visual instance retrieval. Besides the choice of convolutional layers, we present an efficient pipeline exploiting multi-scale schemes to extract local features, in particular, by taking geometric invariance into explicit account, i.e. positions, scales and…
▽ More
This paper provides an extensive study on the availability of image representations based on convolutional networks (ConvNets) for the task of visual instance retrieval. Besides the choice of convolutional layers, we present an efficient pipeline exploiting multi-scale schemes to extract local features, in particular, by taking geometric invariance into explicit account, i.e. positions, scales and spatial consistency. In our experiments using five standard image retrieval datasets, we demonstrate that generic ConvNet image representations can outperform other state-of-the-art methods if they are extracted appropriately.
△ Less
Submitted 9 May, 2016; v1 submitted 19 December, 2014;
originally announced December 2014.
-
Persistent Evidence of Local Image Properties in Generic ConvNets
Authors:
Ali Sharif Razavian,
Hossein Azizpour,
Atsuto Maki,
Josephine Sullivan,
Carl Henrik Ek,
Stefan Carlsson
Abstract:
Supervised training of a convolutional network for object classification should make explicit any information related to the class of objects and disregard any auxiliary information associated with the capture of the image or the variation within the object class. Does this happen in practice? Although this seems to pertain to the very final layers in the network, if we look at earlier layers we f…
▽ More
Supervised training of a convolutional network for object classification should make explicit any information related to the class of objects and disregard any auxiliary information associated with the capture of the image or the variation within the object class. Does this happen in practice? Although this seems to pertain to the very final layers in the network, if we look at earlier layers we find that this is not the case. Surprisingly, strong spatial information is implicit. This paper addresses this, in particular, exploiting the image representation at the first fully connected layer, i.e. the global image descriptor which has been recently shown to be most effective in a range of visual recognition tasks. We empirically demonstrate evidences for the finding in the contexts of four different tasks: 2d landmark detection, 2d object keypoints prediction, estimation of the RGB values of input image, and recovery of semantic label of each pixel. We base our investigation on a simple framework with ridge rigression commonly across these tasks, and show results which all support our insight. Such spatial information can be used for computing correspondence of landmarks to a good accuracy, but should potentially be useful for improving the training of the convolutional nets for classification purposes.
△ Less
Submitted 24 November, 2014;
originally announced November 2014.
-
Factors of Transferability for a Generic ConvNet Representation
Authors:
Hossein Azizpour,
Ali Sharif Razavian,
Josephine Sullivan,
Atsuto Maki,
Stefan Carlsson
Abstract:
Evidence is mounting that Convolutional Networks (ConvNets) are the most effective representation learning method for visual recognition tasks. In the common scenario, a ConvNet is trained on a large labeled dataset (source) and the feed-forward units activation of the trained network, at a certain layer of the network, is used as a generic representation of an input image for a task with relative…
▽ More
Evidence is mounting that Convolutional Networks (ConvNets) are the most effective representation learning method for visual recognition tasks. In the common scenario, a ConvNet is trained on a large labeled dataset (source) and the feed-forward units activation of the trained network, at a certain layer of the network, is used as a generic representation of an input image for a task with relatively smaller training set (target). Recent studies have shown this form of representation transfer to be suitable for a wide range of target visual recognition tasks. This paper introduces and investigates several factors affecting the transferability of such representations. It includes parameters for training of the source ConvNet such as its architecture, distribution of the training data, etc. and also the parameters of feature extraction such as layer of the trained ConvNet, dimensionality reduction, etc. Then, by optimizing these factors, we show that significant improvements can be achieved on various (17) visual recognition tasks. We further show that these visual recognition tasks can be categorically ordered based on their distance from the source task such that a correlation between the performance of tasks and their distance from the source task w.r.t. the proposed factors is observed.
△ Less
Submitted 15 July, 2015; v1 submitted 22 June, 2014;
originally announced June 2014.
-
CNN Features off-the-shelf: an Astounding Baseline for Recognition
Authors:
Ali Sharif Razavian,
Hossein Azizpour,
Josephine Sullivan,
Stefan Carlsson
Abstract:
Recent results indicate that the generic descriptors extracted from the convolutional neural networks are very powerful. This paper adds to the mounting evidence that this is indeed the case. We report on a series of experiments conducted for different recognition tasks using the publicly available code and model of the \overfeat network which was trained to perform object classification on ILSVRC…
▽ More
Recent results indicate that the generic descriptors extracted from the convolutional neural networks are very powerful. This paper adds to the mounting evidence that this is indeed the case. We report on a series of experiments conducted for different recognition tasks using the publicly available code and model of the \overfeat network which was trained to perform object classification on ILSVRC13. We use features extracted from the \overfeat network as a generic image representation to tackle the diverse range of recognition tasks of object image classification, scene recognition, fine grained recognition, attribute detection and image retrieval applied to a diverse set of datasets. We selected these tasks and datasets as they gradually move further away from the original task and data the \overfeat network was trained to solve. Astonishingly, we report consistent superior results compared to the highly tuned state-of-the-art systems in all the visual classification tasks on various datasets. For instance retrieval it consistently outperforms low memory footprint methods except for sculptures dataset. The results are achieved using a linear SVM classifier (or $L2$ distance in case of retrieval) applied to a feature representation of size 4096 extracted from a layer in the net. The representations are further modified using simple augmentation techniques e.g. jittering. The results strongly suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual recognition tasks.
△ Less
Submitted 12 May, 2014; v1 submitted 23 March, 2014;
originally announced March 2014.
-
The Optimal Uncertainty Algorithm in the Mystic Framework
Authors:
M. McKerns,
H. Owhadi,
C. Scovel,
T. J. Sullivan,
M. Ortiz
Abstract:
We have recently proposed a rigorous framework for Uncertainty Quantification (UQ) in which UQ objectives and assumption/information set are brought into the forefront, providing a framework for the communication and comparison of UQ results. In particular, this framework does not implicitly impose inappropriate assumptions nor does it repudiate relevant information. This framework, which we call…
▽ More
We have recently proposed a rigorous framework for Uncertainty Quantification (UQ) in which UQ objectives and assumption/information set are brought into the forefront, providing a framework for the communication and comparison of UQ results. In particular, this framework does not implicitly impose inappropriate assumptions nor does it repudiate relevant information. This framework, which we call Optimal Uncertainty Quantification (OUQ), is based on the observation that given a set of assumptions and information, there exist bounds on uncertainties obtained as values of optimization problems and that these bounds are optimal. It provides a uniform environment for the optimal solution of the problems of validation, certification, experimental design, reduced order modeling, prediction, extrapolation, all under aleatoric and epistemic uncertainties. OUQ optimization problems are extremely large, and even though under general conditions they have finite-dimensional reductions, they must often be solved numerically. This general algorithmic framework for OUQ has been implemented in the mystic optimization framework. We describe this implementation, and demonstrate its use in the context of the Caltech surrogate model for hypervelocity impact.
△ Less
Submitted 6 February, 2012;
originally announced February 2012.
-
Optimal Uncertainty Quantification
Authors:
Houman Owhadi,
Clint Scovel,
Timothy John Sullivan,
Mike McKerns,
Michael Ortiz
Abstract:
We propose a rigorous framework for Uncertainty Quantification (UQ) in which the UQ objectives and the assumptions/information set are brought to the forefront. This framework, which we call \emph{Optimal Uncertainty Quantification} (OUQ), is based on the observation that, given a set of assumptions and information about the problem, there exist optimal bounds on uncertainties: these are obtained…
▽ More
We propose a rigorous framework for Uncertainty Quantification (UQ) in which the UQ objectives and the assumptions/information set are brought to the forefront. This framework, which we call \emph{Optimal Uncertainty Quantification} (OUQ), is based on the observation that, given a set of assumptions and information about the problem, there exist optimal bounds on uncertainties: these are obtained as values of well-defined optimization problems corresponding to extremizing probabilities of failure, or of deviations, subject to the constraints imposed by the scenarios compatible with the assumptions and information. In particular, this framework does not implicitly impose inappropriate assumptions, nor does it repudiate relevant information. Although OUQ optimization problems are extremely large, we show that under general conditions they have finite-dimensional reductions. As an application, we develop \emph{Optimal Concentration Inequalities} (OCI) of Hoeffding and McDiarmid type. Surprisingly, these results show that uncertainties in input parameters, which propagate to output uncertainties in the classical sensitivity analysis paradigm, may fail to do so if the transfer functions (or probability distributions) are imperfectly known. We show how, for hierarchical structures, this phenomenon may lead to the non-propagation of uncertainties or information across scales. In addition, a general algorithmic framework is developed for OUQ and is tested on the Caltech surrogate model for hypervelocity impact and on the seismic safety assessment of truss structures, suggesting the feasibility of the framework for important complex systems. The introduction of this paper provides both an overview of the paper and a self-contained mini-tutorial about basic concepts and issues of UQ.
△ Less
Submitted 23 May, 2012; v1 submitted 2 September, 2010;
originally announced September 2010.
-
Tiling space and slabs with acute tetrahedra
Authors:
David Eppstein,
John M. Sullivan,
Alper Ungor
Abstract:
We show it is possible to tile three-dimensional space using only tetrahedra with acute dihedral angles. We present several constructions to achieve this, including one in which all dihedral angles are less than $77.08^\circ$, and another which tiles a slab in space.
We show it is possible to tile three-dimensional space using only tetrahedra with acute dihedral angles. We present several constructions to achieve this, including one in which all dihedral angles are less than $77.08^\circ$, and another which tiles a slab in space.
△ Less
Submitted 19 February, 2003;
originally announced February 2003.
-
Building Space-Time Meshes over Arbitrary Spatial Domains
Authors:
Jeff Erickson,
Damrong Guoy,
John M. Sullivan,
Alper Üngör
Abstract:
We present an algorithm to construct meshes suitable for space-time discontinuous Galerkin finite-element methods. Our method generalizes and improves the `Tent Pitcher' algorithm of Üngör and Sheffer. Given an arbitrary simplicially meshed domain M of any dimension and a time interval [0,T], our algorithm builds a simplicial mesh of the space-time domain Mx[0,T], in constant time per element. O…
▽ More
We present an algorithm to construct meshes suitable for space-time discontinuous Galerkin finite-element methods. Our method generalizes and improves the `Tent Pitcher' algorithm of Üngör and Sheffer. Given an arbitrary simplicially meshed domain M of any dimension and a time interval [0,T], our algorithm builds a simplicial mesh of the space-time domain Mx[0,T], in constant time per element. Our algorithm avoids the limitations of previous methods by carefully adapting the durations of space-time elements to the local quality and feature size of the underlying space mesh.
△ Less
Submitted 1 June, 2002;
originally announced June 2002.