-
EdgeSphere: A Three-Tier Architecture for Cognitive Edge Computing
Authors:
Christian Makaya,
Keith Grueneberg,
Bongjun Ko,
David Wood,
Nirmit Desai,
Xiping Wang
Abstract:
Computing at the edge is increasingly important as Internet of Things (IoT) devices at the edge generate massive amounts of data and pose challenges in transporting all that data to the Cloud, where it can be analyzed. On the other hand, harnessing the edge data is essential for offering cognitive applications, if challenges such as device capabilities, connectivity, and heterogeneity can be overcome. This paper proposes a novel three-tier architecture, called EdgeSphere, which harnesses the resources of edge devices to analyze the data in situ at the edge. In contrast to state-of-the-art cloud and mobile applications, EdgeSphere applications span cloud, edge gateways, and edge devices. At its core, EdgeSphere builds on Apache Mesos to optimize resource usage and scheduling. EdgeSphere has been applied to practical scenarios, and this paper describes the engineering challenges faced as well as innovative solutions.
Submitted 26 May, 2024;
originally announced May 2024.
-
Artificial intelligence for abnormality detection in high volume neuroimaging: a systematic review and meta-analysis
Authors:
Siddharth Agarwal,
David A. Wood,
Mariusz Grzeda,
Chandhini Suresh,
Munaib Din,
James Cole,
Marc Modat,
Thomas C Booth
Abstract:
Purpose: Most studies evaluating artificial intelligence (AI) models that detect abnormalities in neuroimaging are either tested on unrepresentative patient cohorts or are insufficiently well-validated, leading to poor generalisability to real-world tasks. The aim was to determine the diagnostic test accuracy and summarise the evidence supporting the use of AI models performing first-line, high-volume neuroimaging tasks.
Methods: Medline, Embase, Cochrane library and Web of Science were searched until September 2021 for studies that temporally or externally validated AI capable of detecting abnormalities in first-line CT or MR neuroimaging. A bivariate random-effects model was used for meta-analysis where appropriate. PROSPERO: CRD42021269563.
Results: Only 16 studies were eligible for inclusion. Included studies were not compromised by unrepresentative datasets or inadequate validation methodology. Direct comparison with radiologists was available in 4/16 studies. 15/16 had a high risk of bias. Meta-analysis was only suitable for intracranial haemorrhage detection in CT imaging (10/16 studies), where AI systems had a pooled sensitivity and specificity of 0.90 (95% CI 0.85 - 0.94) and 0.90 (95% CI 0.83 - 0.95) respectively. Other AI studies using CT and MRI detected target conditions other than haemorrhage (2/16), or multiple target conditions (4/16). Only 3/16 studies implemented AI in clinical pathways, either for pre-read triage or as post-read discrepancy identifiers.
Conclusion: The paucity of eligible studies reflects that most abnormality detection AI studies were not adequately validated in representative clinical cohorts. The few studies describing how abnormality detection AI could impact patients and clinicians did not explore the full ramifications of clinical implementation.
Submitted 9 May, 2024;
originally announced May 2024.
-
Letter to the Editor: What are the legal and ethical considerations of submitting radiology reports to ChatGPT?
Authors:
Siddharth Agarwal,
David Wood,
Robin Carpenter,
Yiran Wei,
Marc Modat,
Thomas C Booth
Abstract:
This letter critically examines the recent article by Infante et al. assessing the utility of large language models (LLMs) like GPT-4, Perplexity, and Bard in identifying urgent findings in emergency radiology reports. While acknowledging the potential of LLMs in generating labels for computer vision, concerns are raised about the ethical implications of using patient data without explicit approval, highlighting the necessity of stringent data protection measures under GDPR.
Submitted 9 May, 2024;
originally announced May 2024.
-
A self-supervised text-vision framework for automated brain abnormality detection
Authors:
David A. Wood,
Emily Guilhem,
Sina Kafiabadi,
Ayisha Al Busaidi,
Kishan Dissanayake,
Ahmed Hammam,
Nina Mansoor,
Matthew Townend,
Siddharth Agarwal,
Yiran Wei,
Asif Mazumder,
Gareth J. Barker,
Peter Sasieni,
Sebastien Ourselin,
James H. Cole,
Thomas C. Booth
Abstract:
Artificial neural networks trained on large, expert-labelled datasets are considered state-of-the-art for a range of medical image recognition tasks. However, categorically labelled datasets are time-consuming to generate and constrain classification to a pre-defined, fixed set of classes. For neuroradiological applications in particular, this represents a barrier to clinical adoption. To address these challenges, we present a self-supervised text-vision framework that learns to detect clinically relevant abnormalities in brain MRI scans by directly leveraging the rich information contained in accompanying free-text neuroradiology reports. Our training approach consisted of two steps. First, a dedicated neuroradiological language model - NeuroBERT - was trained to generate fixed-dimensional vector representations of neuroradiology reports (N = 50,523) via domain-specific self-supervised learning tasks. Next, convolutional neural networks (one per MRI sequence) learnt to map individual brain scans to their corresponding text vector representations by optimising a mean square error loss. Once trained, our text-vision framework can be used to detect abnormalities in unreported brain MRI examinations by scoring scans against suitable query sentences (e.g., 'there is an acute stroke', 'there is hydrocephalus' etc.), enabling a range of classification-based applications including automated triage. Potentially, our framework could also serve as a clinical decision support tool, not only by suggesting findings to radiologists and detecting errors in provisional reports, but also by retrieving and displaying examples of pathologies from historical examinations that could be relevant to the current case based on textual descriptors.
Submitted 11 June, 2024; v1 submitted 4 May, 2024;
originally announced May 2024.
-
Grid Minors and Products
Authors:
Vida Dujmović,
Pat Morin,
David R. Wood,
David Worley
Abstract:
Motivated by recent developments regarding the product structure of planar graphs, we study relationships between treewidth, grid minors, and graph products. We show that the Cartesian product of any two connected $n$-vertex graphs contains an $Ω(\sqrt{n})\timesΩ(\sqrt{n})$ grid minor. This result is tight: The lexicographic product (which includes the Cartesian product as a subgraph) of a star and any $n$-vertex tree has no $ω(\sqrt{n})\timesω(\sqrt{n})$ grid minor.
Submitted 27 February, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
On mission Twitter Profiles: A Study of Selective Toxic Behavior
Authors:
Hina Qayyum,
Muhammad Ikram,
Benjamin Zi Hao Zhao,
Ian D. Wood,
Nicolas Kourtellis,
Mohamed Ali Kaafar
Abstract:
The argument for persistent social media influence campaigns, often funded by malicious entities, is gaining traction. These entities utilize instrumented profiles to disseminate divisive content and disinformation, shaping public perception. Despite ample evidence of these instrumented profiles, few identification methods exist to locate them in the wild. To evade detection and appear genuine, small clusters of instrumented profiles engage in unrelated discussions, diverting attention from their true goals. This strategic thematic diversity conceals their selective polarity towards certain topics and fosters public trust.
This study aims to characterize profiles potentially used for influence operations, termed 'on-mission profiles,' relying solely on thematic content diversity within unlabeled data. Distinguishing this work is its focus on content volume and toxicity towards specific themes. Longitudinal data from 138K Twitter (or X) profiles and 293M tweets enables profiling based on theme diversity. High thematic diversity groups predominantly produce toxic content concerning specific themes, like politics, health, and news, classifying them as 'on-mission' profiles.
Using the identified 'on-mission' profiles, we design a classifier for unseen, unlabeled data. Employing a linear SVM model, we train and test it on an 80/20 split of the most diverse profiles. The classifier achieves a flawless 100% accuracy, facilitating the discovery of previously unknown 'on-mission' profiles in the wild.
Submitted 25 January, 2024;
originally announced January 2024.
-
Exploring the Distinctive Tweeting Patterns of Toxic Twitter Users
Authors:
Hina Qayyum,
Muhammad Ikram,
Benjamin Zi Hao Zhao,
Ian D. Wood,
Nicolas Kourtellis,
Mohamed Ali Kaafar
Abstract:
In the pursuit of bolstering user safety, social media platforms deploy active moderation strategies, including content removal and user suspension. These measures target users engaged in discussions marked by hate speech or toxicity, often linked to specific keywords or hashtags. Nonetheless, the increasing prevalence of toxicity indicates that certain users adeptly circumvent these measures. This study examines consistently toxic users on Twitter (rebranded as X). Rather than relying on traditional methods based on specific topics or hashtags, we employ a novel approach based on patterns of toxic tweets, yielding deeper insights into their behavior. We analyzed 38 million tweets from the timelines of 12,148 Twitter users and identified the top 1,457 users who consistently exhibit toxic behavior, relying on metrics like the Gini index and Toxicity score. By comparing their posting patterns to those of non-consistently toxic users, we have uncovered distinctive temporal patterns, including contiguous activity spans, inter-tweet intervals (referred to as 'Burstiness'), and churn analysis. These findings provide strong evidence for the existence of a unique tweeting pattern associated with toxic behavior on Twitter. Crucially, our methodology transcends Twitter and can be adapted to various social media platforms, facilitating the identification of consistently toxic users based on their posting behavior. This research contributes to ongoing efforts to combat online toxicity and offers insights for refining moderation strategies in the digital realm. We are committed to open research and will provide our code and data to the research community.
Submitted 25 January, 2024;
originally announced January 2024.
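The abstract above names the Gini index as one of the metrics used to pick out consistently toxic users. As a rough illustration only, the sketch below computes a standard Gini coefficient over a list of per-tweet toxicity scores. Applying it to per-tweet scores, and the names `gini`, `steady`, and `bursty`, are assumptions made for this example, not details taken from the paper.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means the total is spread perfectly evenly; values close to 1.0
    mean the total is concentrated in a few items.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula on ascending-sorted data (ranks start at 1).
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


# Hypothetical per-tweet toxicity scores for two users.
steady = [0.8, 0.8, 0.8, 0.8]   # toxicity spread evenly over all tweets
bursty = [0.0, 0.0, 0.0, 0.9]   # toxicity concentrated in a single tweet

steady_gini = gini(steady)
bursty_gini = gini(bursty)
```

Under this reading, a low coefficient combined with a high average score would suggest a user who is toxic consistently rather than in isolated outbursts.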
-
NEURO HAND: A weakly supervised Hierarchical Attention Network for interpretable neuroimaging abnormality Detection
Authors:
David A. Wood
Abstract:
Clinical neuroimaging data is naturally hierarchical. Different magnetic resonance imaging (MRI) sequences within a series, different slices covering the head, and different regions within each slice all confer different information. In this work we present a hierarchical attention network for abnormality detection using MRI scans obtained in a clinical hospital setting. The proposed network is suitable for non-volumetric data (i.e. stacks of high-resolution MRI slices), and can be trained from binary examination-level labels. We show that this hierarchical approach leads to improved classification, while providing interpretability through either coarse inter- and intra-slice abnormality localisation, or giving importance scores for different slices and sequences, making our model suitable for use as an automated triaging system in radiology departments.
Submitted 16 January, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Model-agnostic variable importance for predictive uncertainty: an entropy-based approach
Authors:
Danny Wood,
Theodore Papamarkou,
Matt Benatan,
Richard Allmendinger
Abstract:
In order to trust the predictions of a machine learning algorithm, it is necessary to understand the factors that contribute to those predictions. In the case of probabilistic and uncertainty-aware models, it is necessary to understand not only the reasons for the predictions themselves, but also the reasons for the model's level of confidence in those predictions. In this paper, we show how existing methods in explainability can be extended to uncertainty-aware models and how such extensions can be used to understand the sources of uncertainty in a model's predictive distribution. In particular, by adapting permutation feature importance, partial dependence plots, and individual conditional expectation plots, we demonstrate that novel insights into model behaviour may be obtained and that these methods can be used to measure the impact of features on both the entropy of the predictive distribution and the log-likelihood of the ground truth labels under that distribution. With experiments using both synthetic and real-world data, we demonstrate the utility of these approaches to understand both the sources of uncertainty and their impact on model performance.
Submitted 28 May, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
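The abstract above describes adapting permutation feature importance so that it measures a feature's effect on the entropy of the predictive distribution. A minimal sketch of one possible variant follows: it shuffles one feature column and reports the mean absolute change in per-sample predictive entropy. The toy classifier `toy_predict_proba`, the function names, and the choice of mean absolute change are all assumptions made for illustration; the paper's exact definition may differ.

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of a discrete predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def toy_predict_proba(row):
    # Hypothetical binary classifier that only looks at feature 0.
    p1 = 1.0 / (1.0 + math.exp(-3.0 * row[0]))
    return [1.0 - p1, p1]


def entropy_permutation_importance(predict_proba, rows, feature, seed=0):
    """Mean absolute change in predictive entropy after shuffling one feature."""
    rng = random.Random(seed)
    baseline = [entropy(predict_proba(r)) for r in rows]
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    changes = []
    for row, value, h0 in zip(rows, column, baseline):
        permuted = list(row)
        permuted[feature] = value
        changes.append(abs(entropy(predict_proba(permuted)) - h0))
    return sum(changes) / len(changes)


# Synthetic two-feature data; feature 1 is ignored by the toy model.
data_rng = random.Random(1)
rows = [[data_rng.uniform(-2, 2), data_rng.uniform(-2, 2)] for _ in range(200)]
imp0 = entropy_permutation_importance(toy_predict_proba, rows, feature=0)
imp1 = entropy_permutation_importance(toy_predict_proba, rows, feature=1)
```

The absolute value is used because, for a model that depends on a single feature, shuffling that feature only reorders the predictions and leaves the mean entropy unchanged. In this toy setup `imp1` is zero because the model never reads feature 1, while `imp0` is positive.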
-
A max-affine spline approximation of neural networks using the Legendre transform of a convex-concave representation
Authors:
Adam Perrett,
Danny Wood,
Gavin Brown
Abstract:
This work presents a novel algorithm for transforming a neural network into a spline representation. Unlike previous work that required convex and piecewise-affine network operators to create a max-affine spline alternate form, this work relaxes this constraint. The only constraint is that the function be bounded and possess a well-defined second derivative, although this was shown experimentally to not be strictly necessary. It can also be performed over the whole network rather than on each layer independently. As in previous work, this bridges the gap between neural networks and approximation theory but also enables the visualisation of network feature maps. Mathematical proof and experimental investigation of the technique are performed, with approximation error and feature maps being extracted from a range of architectures, including convolutional neural networks.
Submitted 16 July, 2023;
originally announced July 2023.
-
The grid-minor theorem revisited
Authors:
Vida Dujmović,
Robert Hickingbotham,
Jędrzej Hodor,
Gwenaël Joret,
Hoang La,
Piotr Micek,
Pat Morin,
Clément Rambaud,
David R. Wood
Abstract:
We prove that for every planar graph $X$ of treedepth $h$, there exists a positive integer $c$ such that for every $X$-minor-free graph $G$, there exists a graph $H$ of treewidth at most $f(h)$ such that $G$ is isomorphic to a subgraph of $H\boxtimes K_c$. This is a qualitative strengthening of the Grid-Minor Theorem of Robertson and Seymour (JCTB 1986), and treedepth is the optimal parameter in such a result. As an example application, we use this result to improve the upper bound for weak coloring numbers of graphs excluding a fixed graph as a minor.
Submitted 6 July, 2023;
originally announced July 2023.
-
Proof of the Clustered Hadwiger Conjecture
Authors:
Vida Dujmović,
Louis Esperet,
Pat Morin,
David R. Wood
Abstract:
Hadwiger's Conjecture asserts that every $K_h$-minor-free graph is properly $(h-1)$-colourable. We prove the following improper analogue of Hadwiger's Conjecture: for fixed $h$, every $K_h$-minor-free graph is $(h-1)$-colourable with monochromatic components of bounded size. The number of colours is best possible regardless of the size of monochromatic components. It solves an open problem of Edwards, Kang, Kim, Oum and Seymour [\emph{SIAM J. Disc. Math.} 2015], and concludes a line of research initiated in 2007. Similarly, for fixed $t\geq s$, we show that every $K_{s,t}$-minor-free graph is $(s+1)$-colourable with monochromatic components of bounded size. The number of colours is best possible, solving an open problem of van den Heuvel and Wood [\emph{J.~London Math.\ Soc.} 2018]. We actually prove a single theorem from which both of the above results are immediate corollaries. For an excluded apex minor, we strengthen the result as follows: for fixed $t\geq s\geq 3$, and for any fixed apex graph $X$, every $K_{s,t}$-subgraph-free $X$-minor-free graph is $(s+1)$-colourable with monochromatic components of bounded size. The number of colours is again best possible.
Submitted 9 June, 2023;
originally announced June 2023.
-
The Excluded Tree Minor Theorem Revisited
Authors:
Vida Dujmović,
Robert Hickingbotham,
Gwenaël Joret,
Piotr Micek,
Pat Morin,
David R. Wood
Abstract:
We prove that for every tree $T$ of radius $h$, there is an integer $c$ such that every $T$-minor-free graph is contained in $H\boxtimes K_c$ for some graph $H$ with pathwidth at most $2h-1$. This is a qualitative strengthening of the Excluded Tree Minor Theorem of Robertson and Seymour (GM I). We show that radius is the right parameter to consider in this setting, and $2h-1$ is the best possible bound.
Submitted 27 March, 2023;
originally announced March 2023.
-
A longitudinal study of the top 1% toxic Twitter profiles
Authors:
Hina Qayyum,
Benjamin Zi Hao Zhao,
Ian D. Wood,
Muhammad Ikram,
Mohamed Ali Kaafar,
Nicolas Kourtellis
Abstract:
Toxicity is endemic to online social networks including Twitter. It follows a Pareto-like distribution where most of the toxicity is generated by a very small number of profiles and, as such, analyzing and characterizing these toxic profiles is critical. Prior research has largely focused on sporadic, event-centric toxic content to characterize toxicity on the platform. Instead, we approach the problem of characterizing toxic content from a profile-centric point of view. We study 143K Twitter profiles and focus on the behavior of the top 1 percent of producers of toxic content on Twitter, based on toxicity scores of their tweets provided by the Perspective API. With a total of 293M tweets, spanning 16 years of activity, the longitudinal data allow us to reconstruct the timelines of all profiles involved. We use these timelines to gauge the behavior of the most toxic Twitter profiles compared to the rest of the Twitter population. We study the pattern of tweet posting from highly toxic accounts, based on the frequency and how prolific they are, the nature of hashtags and URLs, profile metadata, and Botometer scores. We find that the highly toxic profiles post coherent and well-articulated content; their tweets keep to a narrow theme with lower diversity in hashtags, URLs, and domains; they are thematically similar to each other; and they have a high likelihood of bot-like behavior and are likely to have progenitors with intentions to influence, based on a high fake followers score. Our work contributes insight into the top 1 percent of toxic profiles on Twitter and establishes that the profile-centric approach to investigating toxicity on Twitter is beneficial.
Submitted 25 March, 2023;
originally announced March 2023.
-
A Unified Theory of Diversity in Ensemble Learning
Authors:
Danny Wood,
Tingting Mu,
Andrew Webb,
Henry Reeve,
Mikel Luján,
Gavin Brown
Abstract:
We present a theory of ensemble diversity, explaining the nature of diversity for a wide range of supervised learning scenarios. This challenge has been referred to as the holy grail of ensemble learning, an open research issue for over 30 years. Our framework reveals that diversity is in fact a hidden dimension in the bias-variance decomposition of the ensemble loss. We prove a family of exact bias-variance-diversity decompositions, for a wide range of losses in both regression and classification, e.g., squared, cross-entropy, and Poisson losses. For losses where an additive bias-variance decomposition is not available (e.g., 0/1 loss) we present an alternative approach: quantifying the effects of diversity, which turn out to be dependent on the label distribution. Overall, we argue that diversity is a measure of model fit, in precisely the same sense as bias and variance, but accounting for statistical dependencies between ensemble members. Thus, we should not be maximising diversity as so many works aim to do -- instead, we have a bias/variance/diversity trade-off to manage.
Submitted 7 February, 2024; v1 submitted 10 January, 2023;
originally announced January 2023.
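The abstract above treats diversity as a term in a decomposition of the ensemble loss. For squared loss and an ensemble that averages its members, the classical per-example identity is that the ensemble loss equals the average member loss minus the spread of the members around the ensemble prediction. The sketch below checks that identity numerically. It is only the well-known squared-loss special case, not the paper's general bias-variance-diversity result, and the function and variable names are illustrative.

```python
def squared_loss_terms(predictions, target):
    """Return (ensemble loss, average member loss, diversity) for squared loss.

    The ensemble prediction is the arithmetic mean of the member predictions.
    """
    m = len(predictions)
    ensemble = sum(predictions) / m
    ensemble_loss = (ensemble - target) ** 2
    average_member_loss = sum((f - target) ** 2 for f in predictions) / m
    # Diversity: mean squared deviation of members from the ensemble prediction.
    diversity = sum((f - ensemble) ** 2 for f in predictions) / m
    return ensemble_loss, average_member_loss, diversity


# Three hypothetical member predictions for a single example with target 3.0.
ens, avg, div = squared_loss_terms([1.0, 2.0, 4.0], target=3.0)
# Identity being illustrated: ens == avg - div (up to floating-point rounding).
```

Because the diversity term is subtracted, raising it only helps if the average member loss does not rise by as much, which matches the abstract's point that diversity is traded off rather than simply maximised.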
-
Unintended Memorization and Timing Attacks in Named Entity Recognition Models
Authors:
Rana Salal Ali,
Benjamin Zi Hao Zhao,
Hassan Jameel Asghar,
Tham Nguyen,
Ian David Wood,
Dali Kaafar
Abstract:
Named entity recognition (NER) models are widely used for identifying named entities (e.g., individuals, locations, and other information) in text documents. Machine learning based NER models are increasingly being applied in privacy-sensitive applications that need automatic and scalable identification of sensitive information to redact text for data sharing. In this paper, we study the setting when NER models are available as a black-box service for identifying sensitive information in user documents and show that these models are vulnerable to membership inference on their training datasets. With updated pre-trained NER models from spaCy, we demonstrate two distinct membership attacks on these models. Our first attack capitalizes on unintended memorization in the NER's underlying neural network, a phenomenon NNs are known to be vulnerable to. Our second attack leverages a timing side-channel to target NER models that maintain vocabularies constructed from the training data. We show that different functional paths of words within the training dataset in contrast to words not previously seen have measurable differences in execution time. Revealing membership status of training samples has clear privacy implications, e.g., in text redaction, sensitive words or phrases to be found and removed are at risk of being detected in the training dataset. Our experimental evaluation includes the redaction of both password and health data, presenting both security risks and privacy/regulatory issues. This is exacerbated by results that show memorization with only a single phrase. We achieved 70% AUC in our first attack on a text redaction use-case. We also show overwhelming success in the timing attack with 99.23% AUC. Finally, we discuss potential mitigation approaches to realize the safe use of NER models in light of the privacy and security implications of membership inference attacks.
Submitted 3 November, 2022;
originally announced November 2022.
-
Product structure of graph classes with strongly sublinear separators
Authors:
Zdeněk Dvořák,
David R. Wood
Abstract:
We investigate the product structure of hereditary graph classes admitting strongly sublinear separators. We characterise such classes as subgraphs of the strong product of a star and a complete graph of strongly sublinear size. In a more precise result, we show that if any hereditary graph class $\mathcal{G}$ admits $O(n^{1-ε})$ separators, then for any fixed $δ\in(0,ε)$ every $n$-vertex graph in $\mathcal{G}$ is a subgraph of the strong product of a graph $H$ with bounded tree-depth and a complete graph of size $O(n^{1-ε+δ})$. This result holds with $δ=0$ if we allow $H$ to have tree-depth $O(\log\log n)$. Moreover, using extensions of classical isoperimetric inequalities for grid graphs, we show the dependence on $δ$ in our results and the above $\text{td}(H)\in O(\log\log n)$ bound are both best possible. We prove that $n$-vertex graphs of bounded treewidth are subgraphs of the product of a graph with tree-depth $t$ and a complete graph of size $O(n^{1/t})$, which is best possible. Finally, we investigate the conjecture that for any hereditary graph class $\mathcal{G}$ that admits $O(n^{1-ε})$ separators, every $n$-vertex graph in $\mathcal{G}$ is a subgraph of the strong product of a graph $H$ with bounded tree-width and a complete graph of size $O(n^{1-ε})$. We prove this for various classes $\mathcal{G}$ of interest.
Submitted 27 September, 2023; v1 submitted 22 August, 2022;
originally announced August 2022.
-
Task Oriented Video Coding: A Survey
Authors:
Daniel Wood
Abstract:
Video coding technology has been continuously improved for higher compression ratio with higher resolution. However, the state-of-the-art video coding standards, such as H.265/HEVC and Versatile Video Coding, are still designed with the assumption that the compressed video will be watched by humans. With the tremendous advance and maturation of deep neural networks in solving computer vision tasks, more and more videos are directly analyzed by deep neural networks without human involvement. Such a conventional design for video coding standards is not optimal when the compressed video is used by computer vision applications. While the human visual system is consistently sensitive to content with high contrast, the impact of pixels on computer vision algorithms is driven by specific computer vision tasks. In this paper, we explore and summarize recent progress on computer vision task oriented video coding and the emerging video coding standard, Video Coding for Machines.
Submitted 20 November, 2022; v1 submitted 15 August, 2022;
originally announced August 2022.
-
2-Layer Graph Drawings with Bounded Pathwidth
Authors:
David R. Wood
Abstract:
We determine which properties of 2-layer drawings characterise bipartite graphs of bounded pathwidth.
Submitted 14 July, 2022;
originally announced July 2022.
-
Product structure of graph classes with bounded treewidth
Authors:
Rutger Campbell,
Katie Clinch,
Marc Distel,
J. Pascal Gollin,
Kevin Hendrey,
Robert Hickingbotham,
Tony Huynh,
Freddie Illingworth,
Youri Tamitegama,
Jane Tan,
David R. Wood
Abstract:
We show that many graphs with bounded treewidth can be described as subgraphs of the strong product of a graph with smaller treewidth and a bounded-size complete graph. To this end, define the "underlying treewidth" of a graph class $\mathcal{G}$ to be the minimum non-negative integer $c$ such that, for some function $f$, for every graph ${G \in \mathcal{G}}$ there is a graph $H$ with ${\text{tw}(H) \leq c}$ such that $G$ is isomorphic to a subgraph of ${H \boxtimes K_{f(\text{tw}(G))}}$. We introduce disjointed coverings of graphs and show they determine the underlying treewidth of any graph class. Using this result, we prove that the class of planar graphs has underlying treewidth 3; the class of $K_{s,t}$-minor-free graphs has underlying treewidth $s$ (for ${t \geq \max\{s,3\}}$); and the class of $K_t$-minor-free graphs has underlying treewidth ${t-2}$. In general, we prove that a monotone class has bounded underlying treewidth if and only if it excludes some fixed topological minor. We also study the underlying treewidth of graph classes defined by an excluded subgraph or excluded induced subgraph. We show that the class of graphs with no $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a subdivided star, and that the class of graphs with no induced $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a star.
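As an illustration of the strong product $\boxtimes$ that this abstract relies on, the following minimal Python sketch builds $G \boxtimes H$ from vertex and edge lists. Two distinct vertices $(a,b)$ and $(c,d)$ are adjacent when $a=c$ or $ac \in E(G)$, and $b=d$ or $bd \in E(H)$. The helper names `strong_product` and `path` are hypothetical and introduced here only for illustration; they are not taken from the paper.

```python
from itertools import combinations, product


def strong_product(vertices_g, edges_g, vertices_h, edges_h):
    """Return (nodes, edges) of the strong product of graphs G and H.

    Graphs are given as a vertex list plus a list of 2-tuples (undirected edges).
    """
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    nodes = list(product(vertices_g, vertices_h))
    edges = set()
    for (a, b), (c, d) in combinations(nodes, 2):
        # Each coordinate must be equal or adjacent in its factor graph.
        ok_g = a == c or frozenset((a, c)) in eg
        ok_h = b == d or frozenset((b, d)) in eh
        if ok_g and ok_h:
            edges.add(frozenset(((a, b), (c, d))))
    return nodes, edges


def path(n):
    """Vertex and edge lists of the path on n vertices 0, 1, ..., n-1."""
    return list(range(n)), [(i, i + 1) for i in range(n - 1)]
```

For example, `strong_product(*path(2), *path(2))` yields the complete graph on four vertices, and taking the product with a single-vertex graph returns a copy of the other factor.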
Submitted 17 July, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
A Bottom-Up End-User Intelligent Assistant Approach to Empower Gig Workers against AI Inequality
Authors:
Toby Jia-Jun Li,
Yuwen Lu,
Jaylexia Clark,
Meng Chen,
Victor Cox,
Meng Jiang,
Yang Yang,
Tamara Kay,
Danielle Wood,
Jay Brockman
Abstract:
The growing inequality in gig work between workers and platforms has become a critical social issue as gig work plays an increasingly prominent role in the future of work. The AI inequality, caused by (1) the technology divide in who has access to AI technologies in gig work; and (2) the data divide in who owns the data in gig work, leads to unfair working conditions, a growing pay gap, neglect of workers' diverse preferences, and workers' lack of trust in the platforms. In this position paper, we argue that a bottom-up approach that empowers individual workers to access AI-enabled work planning support and share data among a group of workers through a network of end-user-programmable intelligent assistants is a practical way to bridge AI inequality in gig work under the current paradigm of privately owned platforms. This position paper articulates a set of research challenges, potential approaches, and community engagement opportunities, seeking to start a dialogue on this important research topic in the interdisciplinary CHIWORK community.
Submitted 28 April, 2022;
originally announced April 2022.
-
Bias-Variance Decompositions for Margin Losses
Authors:
Danny Wood,
Tingting Mu,
Gavin Brown
Abstract:
We introduce a novel bias-variance decomposition for a range of strictly convex margin losses, including the logistic loss (minimized by the classic LogitBoost algorithm), as well as the squared margin loss and canonical boosting loss. Furthermore, we show that, for all strictly convex margin losses, the expected risk decomposes into the risk of a "central" model and a term quantifying variation in the functional margin with respect to variations in the training data. These decompositions provide a diagnostic tool for practitioners to understand model overfitting/underfitting, and have implications for additive ensemble models -- for example, when our bias-variance decomposition holds, there is a corresponding "ambiguity" decomposition, which can be used to quantify model diversity.
Submitted 26 April, 2022;
originally announced April 2022.
-
Reduced bandwidth: a qualitative strengthening of twin-width in minor-closed classes (and beyond)
Authors:
Édouard Bonnet,
O-joung Kwon,
David R. Wood
Abstract:
In a reduction sequence of a graph, vertices are successively identified until the graph has one vertex. At each step, when identifying $u$ and $v$, each edge incident to exactly one of $u$ and $v$ is coloured red. Bonnet, Kim, Thomassé and Watrigant [J. ACM 2022] defined the twin-width of a graph $G$ to be the minimum integer $k$ such that there is a reduction sequence of $G$ in which every red graph has maximum degree at most $k$. For any graph parameter $f$, we define the reduced $f$ of a graph $G$ to be the minimum integer $k$ such that there is a reduction sequence of $G$ in which every red graph has $f$ at most $k$. Our focus is on graph classes with bounded reduced bandwidth, which implies and is stronger than bounded twin-width (reduced maximum degree). We show that every proper minor-closed class has bounded reduced bandwidth, which is qualitatively stronger than an analogous result of Bonnet et al.\ for bounded twin-width. In many instances, we also make quantitative improvements. For example, all previous upper bounds on the twin-width of planar graphs were at least $2^{1000}$. We show that planar graphs have reduced bandwidth at most $466$ and twin-width at most $583$. Our bounds for graphs of Euler genus $γ$ are $O(γ)$. We also show that fixed powers of graphs in a proper minor-closed class have bounded reduced bandwidth (irrespective of the degree of the vertices). In particular, we show that map graphs of Euler genus $γ$ have reduced bandwidth $O(γ^4)$. Lastly, we separate twin-width and reduced bandwidth by showing that any infinite class of expanders excluding a fixed complete bipartite subgraph has unbounded reduced bandwidth, while there are bounded-degree expanders with twin-width at most 6.
Submitted 23 February, 2022;
originally announced February 2022.
-
A deep dive into the consistently toxic 1% of Twitter
Authors:
Hina Qayyum,
Benjamin Zi Hao Zhao,
Ian D. Wood,
Muhammad Ikram,
Mohamed Ali Kaafar,
Nicolas Kourtellis
Abstract:
Misbehavior in online social networks (OSN) is an ever-growing phenomenon. The research to date tends to focus on the deployment of machine learning to identify and classify types of misbehavior such as bullying, aggression, and racism to name a few. The main goal of identification is to curb natural and mechanical misconduct and make OSNs a safer place for social discourse. Going beyond past works, we perform a longitudinal study of a large selection of Twitter profiles, which enables us to characterize profiles in terms of how consistently they post highly toxic content. Our data spans 14 years of tweets from 122K Twitter profiles and more than 293M tweets. From this data, we selected the most extreme profiles in terms of consistency of toxic content and examined their tweet texts, and the domains, hashtags, and URLs they shared. We found that these selected profiles keep to a narrow theme with lower diversity in hashtags, URLs, and domains, they are thematically similar to each other (in a coordinated manner, if not through intent), and have a high likelihood of bot-like behavior (likely to have progenitors with intentions to influence). Our work contributes a substantial and longitudinal online misbehavior dataset to the research community and establishes the consistency of a profile's toxic behavior as a useful factor when exploring misbehavior as potential accessories to influence operations on OSNs.
Submitted 15 February, 2022;
originally announced February 2022.
-
Three-dimensional graph products with unbounded stack-number
Authors:
David Eppstein,
Robert Hickingbotham,
Laura Merker,
Sergey Norin,
Michał T. Seweryn,
David R. Wood
Abstract:
We prove that the stack-number of the strong product of three $n$-vertex paths is $Θ(n^{1/3})$. The best previously known upper bound was $O(n)$. No non-trivial lower bound was known. This is the first explicit example of a graph family with bounded maximum degree and unbounded stack-number.
The main tool used in our proof of the lower bound is the topological overlap theorem of Gromov. We actually prove a stronger result in terms of so-called triangulations of Cartesian products. We conclude that triangulations of three-dimensional Cartesian products of any sufficiently large connected graphs have large stack-number.
The upper bound is a special case of a more general construction based on families of permutations derived from Hadamard matrices.
The strong product of three paths is also the first example of a bounded degree graph with bounded queue-number and unbounded stack-number. A natural question that follows from our result is to determine the smallest $Δ_0$ such that there exists a graph family with unbounded stack-number, bounded queue-number and maximum degree $Δ_0$. We show that $Δ_0\in \{6,7\}$.
Submitted 10 February, 2022;
originally announced February 2022.
-
An improved planar graph product structure theorem
Authors:
Torsten Ueckerdt,
David R. Wood,
Wendy Yi
Abstract:
Dujmović, Joret, Micek, Morin, Ueckerdt and Wood [J. ACM 2020] proved that for every planar graph $G$ there is a graph $H$ with treewidth at most 8 and a path $P$ such that $G\subseteq H\boxtimes P$. We improve this result by replacing "treewidth at most 8" by "simple treewidth at most 6".
Submitted 31 July, 2021;
originally announced August 2021.
-
Smaller extended formulations for spanning tree polytopes in minor-closed classes and beyond
Authors:
Manuel Aprile,
Samuel Fiorini,
Tony Huynh,
Gwenaël Joret,
David R. Wood
Abstract:
Let $G$ be a connected $n$-vertex graph in a proper minor-closed class $\mathcal G$. We prove that the extension complexity of the spanning tree polytope of $G$ is $O(n^{3/2})$. This improves on the $O(n^2)$ bounds following from the work of Wong (1980) and Martin (1991). It also extends a result of Fiorini, Huynh, Joret, and Pashkovich (2017), who obtained a $O(n^{3/2})$ bound for graphs embedded in a fixed surface. Our proof works more generally for all graph classes admitting strongly sublinear balanced separators: We prove that for every constant $β$ with $0<β<1$, if $\mathcal G$ is a graph class closed under induced subgraphs such that all $n$-vertex graphs in $\mathcal G$ have balanced separators of size $O(n^β)$, then the extension complexity of the spanning tree polytope of every connected $n$-vertex graph in $\mathcal{G}$ is $O(n^{1+β})$. We in fact give two proofs of this result, one is a direct construction of the extended formulation, the other is via communication protocols. Using the latter approach we also give a short proof of the $O(n)$ bound for planar graphs due to Williams (2002).
△ Less
Submitted 2 December, 2021; v1 submitted 22 June, 2021;
originally announced June 2021.
-
Ookami: Deployment and Initial Experiences
Authors:
Andrew Burford,
Alan C. Calder,
David Carlson,
Barbara Chapman,
Firat Coşkun,
Tony Curtis,
Catherine Feldman,
Robert J. Harrison,
Yan Kang,
Benjamin Michalowicz,
Eric Raut,
Eva Siegmann,
Daniel G. Wood,
Robert L. Deleon,
Mathew Jones,
Nikolay A. Simakov,
Joseph P. White,
Dossay Oryspayev
Abstract:
Ookami is a computer technology testbed supported by the United States National Science Foundation. It provides researchers with access to the A64FX processor developed by Fujitsu in collaboration with RIKEN for the Japanese path to exascale computing, as deployed in Fugaku, the fastest computer in the world. By focusing on crucial architectural details, the ARM-based, multi-core, 512-bit SIMD-vector processor with ultrahigh-bandwidth memory promises to retain familiar and successful programming models while achieving very high performance for a wide range of applications. We review relevant technology and system details, and the main body of the paper focuses on initial experiences with the hardware and software ecosystem for micro-benchmarks, mini-apps, and full applications, and starts to answer questions about where such technologies fit into the NSF ecosystem.
Submitted 16 June, 2021;
originally announced June 2021.
-
Automated triaging of head MRI examinations using convolutional neural networks
Authors:
David A. Wood,
Sina Kafiabadi,
Ayisha Al Busaidi,
Emily Guilhem,
Antanas Montvila,
Siddharth Agarwal,
Jeremy Lynch,
Matthew Townend,
Gareth Barker,
Sebastien Ourselin,
James H. Cole,
Thomas C. Booth
Abstract:
The growing demand for head magnetic resonance imaging (MRI) examinations, along with a global shortage of radiologists, has led to an increase in the time taken to report head MRI scans around the world. For many neurological conditions, this delay can result in increased morbidity and mortality. An automated triaging tool could reduce reporting times for abnormal examinations by identifying abnormalities at the time of imaging and prioritizing the reporting of these scans. In this work, we present a convolutional neural network for detecting clinically-relevant abnormalities in $\text{T}_2$-weighted head MRI scans. Using a validated neuroradiology report classifier, we generated a labelled dataset of 43,754 scans from two large UK hospitals for model training, and demonstrate accurate classification (area under the receiver operating curve (AUC) = 0.943) on a test set of 800 scans labelled by a team of neuroradiologists. Importantly, when trained on scans from only a single hospital the model generalized to scans from the other hospital ($Δ$AUC $\leq$ 0.02). A simulation study demonstrated that our model would reduce the mean reporting time for abnormal examinations from 28 days to 14 days and from 9 days to 5 days at the two hospitals, demonstrating feasibility for use in a clinical triage environment.
Submitted 28 June, 2022; v1 submitted 15 June, 2021;
originally announced June 2021.
-
Separating layered treewidth and row treewidth
Authors:
Prosenjit Bose,
Vida Dujmović,
Mehrnoosh Javarsineh,
Pat Morin,
David R. Wood
Abstract:
Layered treewidth and row treewidth are recently introduced graph parameters that have been key ingredients in the solution of several well-known open problems. It follows from the definitions that the layered treewidth of a graph is at most its row treewidth plus 1. Moreover, a minor-closed class has bounded layered treewidth if and only if it has bounded row treewidth. However, it has been open whether row treewidth is bounded by a function of layered treewidth. This paper answers this question in the negative. In particular, for every integer $k$ we describe a graph with layered treewidth 1 and row treewidth $k$. We also prove an analogous result for layered pathwidth and row pathwidth.
Submitted 5 May, 2022; v1 submitted 3 May, 2021;
originally announced May 2021.
-
AutoAI-TS: AutoAI for Time Series Forecasting
Authors:
Syed Yousaf Shah,
Dhaval Patel,
Long Vu,
Xuan-Hong Dang,
Bei Chen,
Peter Kirchner,
Horst Samulowitz,
David Wood,
Gregory Bramble,
Wesley M. Gifford,
Giridhar Ganapavarapu,
Roman Vaculin,
Petros Zerfos
Abstract:
A large number of time series forecasting models, including traditional statistical models, machine learning models and more recently deep learning models, have been proposed in the literature. However, choosing the right model along with good parameter values that perform well on a given dataset is still challenging. Automatically providing a good set of models to users for a given dataset saves both time and effort compared with trial-and-error approaches over a wide variety of available models along with parameter optimization. We present AutoAI for Time Series Forecasting (AutoAI-TS), which provides users with a zero configuration (zero-conf) system to efficiently train, optimize and choose the best forecasting model among various classes of models for the given dataset. With its flexible zero-conf design, AutoAI-TS automatically performs all the data preparation, model creation, parameter optimization, training and model selection for users and provides a trained model that is ready to use. For given data, AutoAI-TS utilizes a wide variety of models including classical statistical models, Machine Learning (ML) models, statistical-ML hybrid models and deep learning models along with various transformations to create forecasting pipelines. It then evaluates and ranks pipelines using the proposed T-Daub mechanism to choose the best pipeline. The paper describes in detail all the technical aspects of AutoAI-TS along with extensive benchmarking on a variety of real world data sets for various use-cases. Benchmark results show that AutoAI-TS, with no manual configuration from the user, automatically trains and selects pipelines that on average outperform existing state-of-the-art time series forecasting toolkits.
Submitted 8 March, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
ECOL-R: Encouraging Copying in Novel Object Captioning with Reinforcement Learning
Authors:
Yufei Wang,
Ian D. Wood,
Stephen Wan,
Mark Johnson
Abstract:
Novel Object Captioning is a zero-shot Image Captioning task requiring describing objects not seen in the training captions, but for which information is available from external object detectors. The key challenge is to select and describe all salient detected novel objects in the input images. In this paper, we focus on this challenge and propose the ECOL-R model (Encouraging Copying of Object Labels with Reinforced Learning), a copy-augmented transformer model that is encouraged to accurately describe the novel object labels. This is achieved via a specialised reward function in the SCST reinforcement learning framework (Rennie et al., 2017) that encourages novel object mentions while maintaining the caption quality. We further restrict the SCST training to the images where detected objects are mentioned in reference captions to train the ECOL-R model. We additionally improve our copy mechanism via Abstract Labels, which transfer knowledge from known to novel object types, and a Morphological Selector, which determines the appropriate inflected forms of novel object labels. The resulting model sets new state-of-the-art on the nocaps (Agrawal et al., 2019) and held-out COCO (Hendricks et al., 2016) benchmarks.
Submitted 24 January, 2021;
originally announced January 2021.
-
Stack-number is not bounded by queue-number
Authors:
Vida Dujmović,
David Eppstein,
Robert Hickingbotham,
Pat Morin,
David R. Wood
Abstract:
We describe a family of graphs with queue-number at most 4 but unbounded stack-number. This resolves open problems of Heath, Leighton and Rosenberg (1992) and Blankenship and Oporowski (1999).
Submitted 23 March, 2021; v1 submitted 9 November, 2020;
originally announced November 2020.
-
Nonrepetitive graph colouring
Authors:
David R. Wood
Abstract:
A vertex colouring of a graph $G$ is "nonrepetitive" if $G$ contains no path for which the first half of the path is assigned the same sequence of colours as the second half. Thue's famous theorem says that every path is nonrepetitively 3-colourable. This paper surveys results about nonrepetitive colourings of graphs. The goal is to give a unified and comprehensive presentation of the major results and proof methods, as well as to highlight numerous open problems.
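To make the definition concrete for the simplest case, here is a minimal Python sketch for paths. On a path, every sub-path is a contiguous run of vertices, so a colouring is nonrepetitive exactly when the colour sequence contains no block of the form $xx$. The function names `has_square_suffix`, `is_nonrepetitive` and `colour_path` are hypothetical helpers for illustration, and the backtracking search is only a simple demonstration, not a method from the survey.

```python
def has_square_suffix(seq):
    """Return True if seq ends with a repetition xx for some non-empty block x."""
    n = len(seq)
    for half in range(1, n // 2 + 1):
        if seq[n - 2 * half:n - half] == seq[n - half:]:
            return True
    return False


def is_nonrepetitive(seq):
    """Check that no contiguous block of seq has the form xx."""
    return not any(has_square_suffix(seq[:i]) for i in range(2, len(seq) + 1))


def colour_path(n, colours=3):
    """Backtracking search for a nonrepetitive colouring of the n-vertex path.

    Returns a list of colours, or None if no such colouring exists.
    """
    seq = []

    def extend():
        if len(seq) == n:
            return True
        for c in range(colours):
            seq.append(c)
            # Only repetitions ending at the new vertex need checking.
            if not has_square_suffix(seq) and extend():
                return True
            seq.pop()
        return False

    return seq if extend() else None
```

With three colours the search succeeds for every path length, consistent with Thue's theorem, whereas with two colours it fails once the path has four vertices.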
Submitted 4 September, 2020;
originally announced September 2020.
-
Labelling imaging datasets on the basis of neuroradiology reports: a validation study
Authors:
David A. Wood,
Sina Kafiabadi,
Aisha Al Busaidi,
Emily Guilhem,
Jeremy Lynch,
Matthew Townend,
Antanas Montvila,
Juveria Siddiqui,
Naveen Gadapa,
Matthew Benger,
Gareth Barker,
Sebastian Ourselin,
James H. Cole,
Thomas C. Booth
Abstract:
Natural language processing (NLP) shows promise as a means to automate the labelling of hospital-scale neuroradiology magnetic resonance imaging (MRI) datasets for computer vision applications. To date, however, there has been no thorough investigation into the validity of this approach, including determining the accuracy of report labels compared to image labels as well as examining the performance of non-specialist labellers. In this work, we draw on the experience of a team of neuroradiologists who labelled over 5000 MRI neuroradiology reports as part of a project to build a dedicated deep learning-based neuroradiology report classifier. We show that, in our experience, assigning binary labels (i.e. normal vs abnormal) to images from reports alone is highly accurate. In contrast to the binary labels, however, the accuracy of more granular labelling is dependent on the category, and we highlight reasons for this discrepancy. We also show that downstream model performance is reduced when labelling of training reports is performed by a non-specialist. To allow other researchers to accelerate their research, we make our refined abnormality definitions and labelling rules available, as well as our easy-to-use radiology report labelling app which helps streamline this process.
Submitted 8 March, 2021; v1 submitted 8 July, 2020;
originally announced July 2020.
-
The gem5 Simulator: Version 20.0+
Authors:
Jason Lowe-Power,
Abdul Mutaal Ahmad,
Ayaz Akram,
Mohammad Alian,
Rico Amslinger,
Matteo Andreozzi,
Adrià Armejach,
Nils Asmussen,
Brad Beckmann,
Srikant Bharadwaj,
Gabe Black,
Gedare Bloom,
Bobby R. Bruce,
Daniel Rodrigues Carvalho,
Jeronimo Castrillon,
Lizhong Chen,
Nicolas Derumigny,
Stephan Diestelhorst,
Wendy Elsasser,
Carlos Escuin,
Marjan Fariborz,
Amin Farmahini-Farahani,
Pouya Fotouhi,
Ryan Gambord,
Jayneel Gandhi
, et al. (53 additional authors not shown)
Abstract:
The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern computer hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, Arm, and RISC-V. The gem5 simulator has been under active development over the last nine years since the original gem5 release. In this time, there have been over 7500 commits to the codebase from over 250 unique contributors which have improved the simulator by adding new features, fixing bugs, and increasing the code quality. In this paper, we give an overview of gem5's usage and features, describe the current state of the gem5 simulator, and enumerate the major changes since the initial release of gem5. We also discuss how the gem5 simulator has transitioned to a formal governance model to enable continued improvement and community support for the next 20 years of computer architecture research.
Submitted 29 September, 2020; v1 submitted 6 July, 2020;
originally announced July 2020.
-
Subgraph densities in a surface
Authors:
Tony Huynh,
Gwenaël Joret,
David R. Wood
Abstract:
Given a fixed graph $H$ that embeds in a surface $Σ$, what is the maximum number of copies of $H$ in an $n$-vertex graph $G$ that embeds in $Σ$? We show that the answer is $Θ(n^{f(H)})$, where $f(H)$ is a graph invariant called the "flap-number" of $H$, which is independent of $Σ$. This simultaneously answers two open problems posed by Eppstein (1993). When $H$ is a complete graph we give more precise answers.
Submitted 2 December, 2021; v1 submitted 30 March, 2020;
originally announced March 2020.
-
Automated Labelling using an Attention model for Radiology reports of MRI scans (ALARM)
Authors:
David A. Wood,
Jeremy Lynch,
Sina Kafiabadi,
Emily Guilhem,
Aisha Al Busaidi,
Antanas Montvila,
Thomas Varsavsky,
Juveria Siddiqui,
Naveen Gadapa,
Matthew Townend,
Martin Kiik,
Keena Patel,
Gareth Barker,
Sebastian Ourselin,
James H. Cole,
Thomas C. Booth
Abstract:
Labelling large datasets for training high-capacity neural networks is a major obstacle to the development of deep learning-based medical imaging applications. Here we present a transformer-based network for magnetic resonance imaging (MRI) radiology report classification which automates this task by assigning image labels on the basis of free-text expert radiology reports. Our model's performance is comparable to that of an expert radiologist, and better than that of an expert physician, demonstrating the feasibility of this approach. We make code available online for researchers to label their own MRI datasets for medical imaging applications.
Submitted 16 February, 2020;
originally announced February 2020.
-
Notes on Tree- and Path-chromatic Number
Authors:
Tony Huynh,
Bruce Reed,
David R. Wood,
Liana Yepremyan
Abstract:
Tree-chromatic number is a chromatic version of treewidth, where the cost of a bag in a tree-decomposition is measured by its chromatic number rather than its size. Path-chromatic number is defined analogously. These parameters were introduced by Seymour (JCTB 2016). In this paper, we survey all the known results on tree- and path-chromatic number and then present some new results and conjectures. In particular, we propose a version of Hadwiger's Conjecture for tree-chromatic number. As evidence that our conjecture may be more tractable than Hadwiger's Conjecture, we give a short proof that every $K_5$-minor-free graph has tree-chromatic number at most $4$, which avoids the Four Colour Theorem. We also present some hardness results and conjectures for computing tree- and path-chromatic number.
Submitted 9 July, 2020; v1 submitted 13 February, 2020;
originally announced February 2020.
-
Notes on Graph Product Structure Theory
Authors:
Zdeněk Dvořák,
Tony Huynh,
Gwenaël Joret,
Chun-Hung Liu,
David R. Wood
Abstract:
It was recently proved that every planar graph is a subgraph of the strong product of a path and a graph with bounded treewidth. This paper surveys generalisations of this result for graphs on surfaces, minor-closed classes, various non-minor-closed classes, and graph classes with polynomial growth. We then explore how graph product structure might be applicable to more broadly defined graph classes. In particular, we characterise when a graph class defined by a cartesian or strong product has bounded or polynomial expansion. We then explore graph product structure theorems for various geometrically defined graph classes, and present several open problems.
Submitted 2 July, 2020; v1 submitted 23 January, 2020;
originally announced January 2020.
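The strong product mentioned in the abstract above can be illustrated with a short sketch. It is only an illustration of the standard definition, not code from the paper: the function names `strong_product`, `path`, and `edge_count` and the adjacency-dictionary representation are assumptions made here. Two distinct vertices (a, x) and (b, y) are adjacent in the strong product when a equals or is adjacent to b in G, and x equals or is adjacent to y in H.

```python
from itertools import product


def strong_product(adj_g, adj_h):
    """Return adjacency sets of the strong product of G and H.

    adj_g and adj_h map each vertex to the set of its neighbours
    (hypothetical representation chosen for this sketch).
    """
    vertices = list(product(adj_g, adj_h))
    adj = {v: set() for v in vertices}
    for (a, x), (b, y) in product(vertices, repeat=2):
        if (a, x) == (b, y):
            continue  # no loops
        g_ok = a == b or b in adj_g[a]  # equal or adjacent in G
        h_ok = x == y or y in adj_h[x]  # equal or adjacent in H
        if g_ok and h_ok:
            adj[(a, x)].add((b, y))
    return adj


def path(n):
    """Adjacency sets of the path on vertices 0..n-1."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def edge_count(adj):
    """Number of edges of an undirected graph given by adjacency sets."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2
```

For example, `strong_product(path(2), path(2))` is the complete graph on four vertices, and in `strong_product(path(3), path(3))` the central vertex `(1, 1)` is adjacent to all eight other vertices.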
-
NEURO-DRAM: a 3D recurrent visual attention model for interpretable neuroimaging classification
Authors:
David Wood,
James Cole,
Thomas Booth
Abstract:
Deep learning is attracting significant interest in the neuroimaging community as a means to diagnose psychiatric and neurological disorders from structural magnetic resonance images. However, there is a tendency amongst researchers to adopt architectures optimized for traditional computer vision tasks, rather than design networks customized for neuroimaging data. We address this by introducing NEURO-DRAM, a 3D recurrent visual attention model tailored for neuroimaging classification. The model comprises an agent which, trained by reinforcement learning, learns to navigate through volumetric images, selectively attending to the most informative regions for a given task. When applied to Alzheimer's disease prediction, NEURO-DRAM achieves state-of-the-art classification accuracy on an out-of-sample dataset, significantly outperforming a baseline convolutional neural network. When further applied to the task of predicting which patients with mild cognitive impairment will be diagnosed with Alzheimer's disease within two years, the model achieves state-of-the-art accuracy with no additional training. Encouragingly, the agent learns, without explicit instruction, a search policy in agreement with standardized radiological hallmarks of Alzheimer's disease, suggesting a route to automated biomarker discovery for more poorly understood disorders.
Submitted 18 October, 2019; v1 submitted 10 October, 2019;
originally announced October 2019.
-
Clustered Variants of Hajós' Conjecture
Authors:
Chun-Hung Liu,
David R. Wood
Abstract:
Hajós conjectured that every graph containing no subdivision of the complete graph $K_{s+1}$ is properly $s$-colorable. This conjecture was disproved by Catlin. Indeed, the maximum chromatic number of such graphs is $Ω(s^2/\log s)$. We prove that $O(s)$ colors are enough for a weakening of this conjecture that only requires every monochromatic component to have bounded size (so-called clustered coloring). Our approach leads to more results. Say that a graph is an almost $(\leq 1)$-subdivision of a graph $H$ if it can be obtained from $H$ by subdividing edges, where at most one edge is subdivided more than once. Note that every graph with no $H$-subdivision does not contain an almost $(\leq 1)$-subdivision of $H$. We prove the following (where $s \geq 2$):
(1) Graphs of bounded treewidth and with no almost $(\leq 1)$-subdivision of $K_{s+1}$ are $s$-choosable with bounded clustering.
(2) For every graph $H$, graphs with no $H$-minor and no almost $(\leq 1)$-subdivision of $K_{s+1}$ are $(s+1)$-colorable with bounded clustering.
(3) For every graph $H$ of maximum degree at most $d$, graphs with no $H$-subdivision and no almost $(\leq 1)$-subdivision of $K_{s+1}$ are $\max\{s+3d-5,2\}$-colorable with bounded clustering.
(4) For every graph $H$ of maximum degree $d$, graphs with no $K_{s,t}$ subgraph and no $H$-subdivision are $\max\{s+3d-4,2\}$-colorable with bounded clustering.
(5) Graphs with no $K_{s+1}$-subdivision are $(4s-5)$-colorable with bounded clustering.
The first result shows that the weakening of Hajós' conjecture is true for graphs of bounded treewidth in a stronger sense; the final result is the first $O(s)$ bound on the clustered chromatic number of graphs with no $K_{s+1}$-subdivision.
Submitted 6 September, 2021; v1 submitted 14 August, 2019;
originally announced August 2019.
-
Graph product structure for non-minor-closed classes
Authors:
Vida Dujmović,
Pat Morin,
David R. Wood
Abstract:
Dujmović et al. [\emph{J.~ACM}~'20] recently proved that every planar graph is isomorphic to a subgraph of the strong product of a bounded treewidth graph and a path. Analogous results were obtained for graphs of bounded Euler genus or apex-minor-free graphs. These tools have been used to solve longstanding problems on queue layouts, non-repetitive colouring, $p$-centered colouring, and adjacency labelling. This paper proves analogous product structure theorems for various non-minor-closed classes. One notable example is $k$-planar graphs (those with a drawing in the plane in which each edge is involved in at most $k$ crossings). We prove that every $k$-planar graph is isomorphic to a subgraph of the strong product of a graph of treewidth $O(k^5)$ and a path. This is the first result of this type for a non-minor-closed class of graphs. It implies, amongst other results, that $k$-planar graphs have non-repetitive chromatic number upper-bounded by a function of $k$. All these results generalise for drawings of graphs on arbitrary surfaces. In fact, we work in a more general setting based on so-called shortcut systems, which are of independent interest. This leads to analogous results for certain types of map graphs, string graphs, graph powers, and nearest neighbour graphs.
Submitted 18 November, 2022; v1 submitted 11 July, 2019;
originally announced July 2019.
-
The size Ramsey number of graphs with bounded treewidth
Authors:
Nina Kamcev,
Anita Liebenau,
David R. Wood,
Liana Yepremyan
Abstract:
A graph $G$ is Ramsey for a graph $H$ if every 2-colouring of the edges of $G$ contains a monochromatic copy of $H$. We consider the following question: if $H$ has bounded treewidth, is there a `sparse' graph $G$ that is Ramsey for $H$? Two notions of sparsity are considered. Firstly, we show that if the maximum degree and treewidth of $H$ are bounded, then there is a graph $G$ with $O(|V(H)|)$ edges that is Ramsey for $H$. This was previously only known for the smaller class of graphs $H$ with bounded bandwidth. On the other hand, we prove that the treewidth of a graph $G$ that is Ramsey for $H$ cannot be bounded in terms of the treewidth of $H$ alone. In fact, the latter statement is true even if the treewidth is replaced by the degeneracy and $H$ is a tree.
Submitted 28 July, 2019; v1 submitted 21 June, 2019;
originally announced June 2019.
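The definition of "Ramsey for" in the abstract above can be checked by brute force on very small cases. The sketch below is a toy illustration only, restricted to the assumed special case where $G$ is the complete graph $K_n$ and $H$ is a triangle; the function name `is_ramsey_for_triangle` is hypothetical and nothing here reflects the methods of the paper. It tries every 2-colouring of the edges and looks for one with no monochromatic triangle.

```python
from itertools import combinations, product


def is_ramsey_for_triangle(n):
    """True if every 2-colouring of E(K_n) has a monochromatic triangle."""
    edges = list(combinations(range(n), 2))
    index = {e: i for i, e in enumerate(edges)}
    # Each triangle is stored as the indices of its three edges.
    triangles = [
        (index[(a, b)], index[(a, c)], index[(b, c)])
        for a, b, c in combinations(range(n), 3)
    ]
    for colouring in product((0, 1), repeat=len(edges)):
        has_mono = any(
            colouring[i] == colouring[j] == colouring[k]
            for i, j, k in triangles
        )
        if not has_mono:
            return False  # found a colouring avoiding monochromatic triangles
    return True
```

The search is exponential in the number of edges, so it is only practical for a handful of vertices; it confirms the classical fact that $K_6$ is Ramsey for the triangle while $K_5$ is not.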
-
Planar graphs have bounded nonrepetitive chromatic number
Authors:
Vida Dujmović,
Louis Esperet,
Gwenaël Joret,
Bartosz Walczak,
David R. Wood
Abstract:
A colouring of a graph is "nonrepetitive" if for every path of even order, the sequence of colours on the first half of the path is different from the sequence of colours on the second half. We show that planar graphs have nonrepetitive colourings with a bounded number of colours, thus proving a conjecture of Alon, Grytczuk, Haluszczak and Riordan (2002). We also generalise this result for graphs of bounded Euler genus, graphs excluding a fixed minor, and graphs excluding a fixed topological minor.
Submitted 20 January, 2022; v1 submitted 10 April, 2019;
originally announced April 2019.
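The definition of a nonrepetitive colouring given in the abstract above can be tested directly on small graphs. The following is a minimal brute-force sketch, assuming a graph given as an adjacency dictionary and a colouring given as a dictionary or list indexed by vertex; the name `is_nonrepetitive` is hypothetical. It enumerates every simple path and compares the colour sequences of the two halves whenever the path has an even number of vertices.

```python
def is_nonrepetitive(adj, colour):
    """True if no path of even order has identical colour sequences on its halves."""

    def extend(path, on_path):
        k = len(path)
        if k % 2 == 0:
            half = k // 2
            first = [colour[v] for v in path[:half]]
            second = [colour[v] for v in path[half:]]
            if first == second:
                return False  # repetitively coloured path found
        # Try to grow the current simple path by one vertex.
        for w in adj[path[-1]]:
            if w not in on_path:
                path.append(w)
                on_path.add(w)
                ok = extend(path, on_path)
                path.pop()
                on_path.remove(w)
                if not ok:
                    return False
        return True

    return all(extend([v], {v}) for v in adj)
```

On the path with vertices 0, 1, 2, 3 the colouring 0, 1, 0, 1 is repetitive, whereas 0, 1, 2, 1 is nonrepetitive. Since an edge is a path of order 2, every nonrepetitive colouring is in particular a proper colouring. The enumeration takes exponential time in general, so this is suitable only for small examples.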
-
Planar graphs have bounded queue-number
Authors:
Vida Dujmović,
Gwenaël Joret,
Piotr Micek,
Pat Morin,
Torsten Ueckerdt,
David R. Wood
Abstract:
We show that planar graphs have bounded queue-number, thus proving a conjecture of Heath, Leighton and Rosenberg from 1992. The key to the proof is a new structural tool called layered partitions, and the result that every planar graph has a vertex-partition and a layering, such that each part has a bounded number of vertices in each layer, and the quotient graph has bounded treewidth. This result generalises for graphs of bounded Euler genus. Moreover, we prove that every graph in a minor-closed class has such a layered partition if and only if the class excludes some apex graph. Building on this work and using the graph minor structure theorem, we prove that every proper minor-closed class of graphs has bounded queue-number.
Layered partitions have strong connections to other topics, including the following two examples. First, they can be interpreted in terms of strong products. We show that every planar graph is a subgraph of the strong product of a path with some graph of bounded treewidth. Similar statements hold for all proper minor-closed classes. Second, we give a simple proof of the result by DeVos et al. (2004) that graphs in a proper minor-closed class have low treewidth colourings.
Submitted 29 April, 2020; v1 submitted 9 April, 2019;
originally announced April 2019.
-
Queue Layouts of Graphs with Bounded Degree and Bounded Genus
Authors:
Vida Dujmović,
Pat Morin,
David R. Wood
Abstract:
Motivated by the question of whether planar graphs have bounded queue-number, we prove that planar graphs with maximum degree $Δ$ have queue-number $O(Δ^{2})$, which improves upon the best previous bound of $O(Δ^6)$. More generally, we prove that graphs with bounded degree and bounded Euler genus have bounded queue-number. In particular graphs with Euler genus $g$ and maximum degree $Δ$ have queue-number $O(g+Δ^{2})$. As a byproduct we prove that if planar graphs have bounded queue-number, then graphs of Euler genus $g$ have queue-number $O(g)$.
Submitted 17 March, 2019; v1 submitted 16 January, 2019;
originally announced January 2019.
-
Minor-closed graph classes with bounded layered pathwidth
Authors:
Vida Dujmović,
David Eppstein,
Gwenaël Joret,
Pat Morin,
David R. Wood
Abstract:
We prove that a minor-closed class of graphs has bounded layered pathwidth if and only if some apex-forest is not in the class. This generalises a theorem of Robertson and Seymour, which says that a minor-closed class of graphs has bounded pathwidth if and only if some forest is not in the class.
Submitted 3 June, 2020; v1 submitted 18 October, 2018;
originally announced October 2018.
-
Tight Upper Bounds on the Crossing Number in a Minor-Closed Class
Authors:
Vida Dujmović,
Ken-ichi Kawarabayashi,
Bojan Mohar,
David R. Wood
Abstract:
The crossing number of a graph is the minimum number of crossings in a drawing of the graph in the plane. Our main result is that every graph $G$ that does not contain a fixed graph as a minor has crossing number $O(Δn)$, where $G$ has $n$ vertices and maximum degree $Δ$. This dependence on $n$ and $Δ$ is best possible. This result answers an open question of Wood and Telle [New York J. Mathematics, 2007], who proved the best previous bound of $O(Δ^2 n)$. We also study the convex and rectilinear crossing numbers, and prove an $O(Δn)$ bound for the convex crossing number of bounded pathwidth graphs, and a $\sum_v \deg(v)^2$ bound for the rectilinear crossing number of $K_{3,3}$-minor-free graphs.
Submitted 30 July, 2018;
originally announced July 2018.
-
Defective and Clustered Choosability of Sparse Graphs
Authors:
Kevin Hendrey,
David R. Wood
Abstract:
An (improper) graph colouring has "defect" $d$ if each monochromatic subgraph has maximum degree at most $d$, and has "clustering" $c$ if each monochromatic component has at most $c$ vertices. This paper studies defective and clustered list-colourings for graphs with given maximum average degree. We prove that every graph with maximum average degree less than $\frac{2d+2}{d+2} k$ is $k$-choosable with defect $d$. This improves upon a similar result by Havet and Sereni [J. Graph Theory, 2006]. For clustered choosability of graphs with maximum average degree $m$, no $(1-ε)m$ bound on the number of colours was previously known. The above result with $d=1$ solves this problem. It implies that every graph with maximum average degree $m$ is $\lfloor{\frac{3}{4}m+1}\rfloor$-choosable with clustering 2. This extends a result of Kopreski and Yu [Discrete Math., 2017] to the setting of choosability. We then prove two results about clustered choosability that explore the trade-off between the number of colours and the clustering. In particular, we prove that every graph with maximum average degree $m$ is $\lfloor{\frac{7}{10}m+1}\rfloor$-choosable with clustering $9$, and is $\lfloor{\frac{2}{3}m+1}\rfloor$-choosable with clustering $O(m)$. As an example, the latter result implies that every biplanar graph is 8-choosable with bounded clustering. This is the best known result for the clustered version of the earth-moon problem. The results extend to the setting where we only consider the maximum average degree of subgraphs with at least some number of vertices. Several applications are presented.
Submitted 1 February, 2019; v1 submitted 19 June, 2018;
originally announced June 2018.
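The two parameters defined at the start of the abstract above, defect and clustering, can be computed for a given colouring with a short sketch. This is only an illustration of the definitions, assuming a non-empty graph given as an adjacency dictionary and a colouring given as a dictionary keyed by vertex; the name `defect_and_clustering` is hypothetical and the code says nothing about the list-colouring results themselves.

```python
def defect_and_clustering(adj, colour):
    """Return (defect, clustering) of a vertex colouring of a non-empty graph."""
    # Defect: largest number of same-coloured neighbours of any vertex.
    defect = max(
        sum(1 for w in adj[v] if colour[w] == colour[v]) for v in adj
    )
    # Clustering: largest number of vertices in a monochromatic component,
    # found by a depth-first search that only follows monochromatic edges.
    seen = set()
    clustering = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen and colour[w] == colour[v]:
                    seen.add(w)
                    stack.append(w)
        clustering = max(clustering, size)
    return defect, clustering
```

On a 4-cycle, a proper 2-colouring has defect 0 and clustering 1, colouring two adjacent vertices with each colour gives defect 1 and clustering 2, and using a single colour gives defect 2 and clustering 4.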