Showing 1–36 of 36 results for author: Cole, J

Searching in archive cs.
  1. arXiv:2406.13121  [pdf, other]

    cs.CL cs.AI cs.IR

    Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?

    Authors: Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu

    Abstract: Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases. Leveraging LCLMs' ability to natively ingest and process entire corpora of information offers numerous advantages. It enhances user-friendliness by eliminating the need for specialized knowledge of tools, provides robust end-to-…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 29 pages. Dataset available at https://github.com/google-deepmind/loft

  2. arXiv:2405.05658  [pdf]

    eess.IV cs.CV

    Artificial intelligence for abnormality detection in high volume neuroimaging: a systematic review and meta-analysis

    Authors: Siddharth Agarwal, David A. Wood, Mariusz Grzeda, Chandhini Suresh, Munaib Din, James Cole, Marc Modat, Thomas C Booth

    Abstract: Purpose: Most studies evaluating artificial intelligence (AI) models that detect abnormalities in neuroimaging are either tested on unrepresentative patient cohorts or are insufficiently well-validated, leading to poor generalisability to real-world tasks. The aim was to determine the diagnostic test accuracy and summarise the evidence supporting the use of AI models performing first-line, high-vo…

    Submitted 9 May, 2024; originally announced May 2024.

  3. arXiv:2405.02782  [pdf]

    cs.CV

    A self-supervised text-vision framework for automated brain abnormality detection

    Authors: David A. Wood, Emily Guilhem, Sina Kafiabadi, Ayisha Al Busaidi, Kishan Dissanayake, Ahmed Hammam, Nina Mansoor, Matthew Townend, Siddharth Agarwal, Yiran Wei, Asif Mazumder, Gareth J. Barker, Peter Sasieni, Sebastien Ourselin, James H. Cole, Thomas C. Booth

    Abstract: Artificial neural networks trained on large, expert-labelled datasets are considered state-of-the-art for a range of medical image recognition tasks. However, categorically labelled datasets are time-consuming to generate and constrain classification to a pre-defined, fixed set of classes. For neuroradiological applications in particular, this represents a barrier to clinical adoption. To address…

    Submitted 11 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

    Comments: Under Review

  4. arXiv:2403.20327  [pdf, other]

    cs.CL cs.AI

    Gecko: Versatile Text Embeddings Distilled from Large Language Models

    Authors: Jinhyuk Lee, Zhuyun Dai, Xiaoqi Ren, Blair Chen, Daniel Cer, Jeremy R. Cole, Kai Hui, Michael Boratko, Rajvi Kapadia, Wen Ding, Yi Luan, Sai Meher Karthik Duddu, Gustavo Hernandez Abrego, Weiqiang Shi, Nithi Gupta, Aditya Kusupati, Prateek Jain, Siddhartha Reddy Jonnalagadda, Ming-Wei Chang, Iftekhar Naim

    Abstract: We present Gecko, a compact and versatile text embedding model. Gecko achieves strong retrieval performance by leveraging a key idea: distilling knowledge from large language models (LLMs) into a retriever. Our two-step distillation process begins with generating diverse, synthetic paired data using an LLM. Next, we further refine the data quality by retrieving a set of candidate passages for each…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 18 pages

  5. arXiv:2402.09137  [pdf, other]

    eess.IV cs.CV

    Semi-Supervised Diffusion Model for Brain Age Prediction

    Authors: Ayodeji Ijishakin, Sophie Martin, Florence Townend, Federica Agosta, Edoardo Gioele Spinelli, Silvia Basaia, Paride Schito, Yuri Falzone, Massimo Filippi, James Cole, Andrea Malaspina

    Abstract: Brain age prediction models have succeeded in predicting clinical outcomes in neurodegenerative diseases, but can struggle with tasks involving faster progressing diseases and low quality data. To enhance their performance, we employ a semi-supervised diffusion model, obtaining a 0.83 (p<0.01) correlation between chronological and predicted age on low quality T1w MR images. This was competitive wit…

    Submitted 14 February, 2024; originally announced February 2024.

    Journal ref: Deep Generative Models for Health Workshop, NeurIPS 2023

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2310.08464  [pdf, other]

    eess.AS cs.SD

    Crowdsourced and Automatic Speech Prominence Estimation

    Authors: Max Morrison, Pranav Pawar, Nathan Pruyne, Jennifer Cole, Bryan Pardo

    Abstract: The prominence of a spoken word is the degree to which an average native listener perceives the word as salient or emphasized relative to its context. Speech prominence estimation is the process of assigning a numeric value to the prominence of each word in an utterance. These prominence labels are useful for linguistic analysis, as well as training automated systems to perform emphasis-controlled…

    Submitted 22 December, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Published as a conference paper at ICASSP 2024

  8. arXiv:2307.07072  [pdf, other]

    cs.LG cs.CE eess.IV q-bio.QM stat.ML

    Rician likelihood loss for quantitative MRI using self-supervised deep learning

    Authors: Christopher S. Parker, Anna Schroder, Sean C. Epstein, James Cole, Daniel C. Alexander, Hui Zhang

    Abstract: Purpose: Previous quantitative MR imaging studies using self-supervised deep learning have reported biased parameter estimates at low SNR. Such systematic errors arise from the choice of Mean Squared Error (MSE) loss function for network training, which is incompatible with Rician-distributed MR magnitude signals. To address this issue, we introduce the negative log Rician likelihood (NLR) loss. M…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: 16 pages, 6 figures

  9. arXiv:2306.03022  [pdf, other]

    cs.CV cs.LG

    Interpretable Alzheimer's Disease Classification Via a Contrastive Diffusion Autoencoder

    Authors: Ayodeji Ijishakin, Ahmed Abdulaal, Adamos Hadjivasiliou, Sophie Martin, James Cole

    Abstract: In visual object classification, humans often justify their choices by comparing objects to prototypical examples within that class. We may therefore increase the interpretability of deep learning models by imbuing them with a similar style of reasoning. In this work, we apply this principle by classifying Alzheimer's Disease based on the similarity of images to training examples within the latent…

    Submitted 25 October, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Journal ref: ICML (2023), 3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH)

  10. arXiv:2305.14613  [pdf, other]

    cs.CL cs.AI

    Selectively Answering Ambiguous Questions

    Authors: Jeremy R. Cole, Michael J. Q. Zhang, Daniel Gillick, Julian Martin Eisenschlos, Bhuwan Dhingra, Jacob Eisenstein

    Abstract: Trustworthy language models should abstain from answering questions when they do not know the answer. However, the answer to a question can be unknown for a variety of reasons. Prior research has focused on the case in which the question is clear and the answer is unambiguous but possibly unknown, but the answer to a question can also be unclear due to uncertainty of the questioner's intent or con…

    Submitted 14 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: To appear in EMNLP 2023. 9 pages, 5 figures, 2 pages of appendix

  11. arXiv:2305.14499  [pdf, other]

    cs.CL cs.IR

    NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders

    Authors: Livio Baldini Soares, Daniel Gillick, Jeremy R. Cole, Tom Kwiatkowski

    Abstract: Neural document rerankers are extremely effective in terms of accuracy. However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that only requires $10^{-6}$% of the Transformer's FLOPs…

    Submitted 23 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: To appear at EMNLP 2023

  12. arXiv:2303.12860  [pdf, other]

    cs.CL cs.AI

    Salient Span Masking for Temporal Understanding

    Authors: Jeremy R. Cole, Aditi Chaudhary, Bhuwan Dhingra, Partha Talukdar

    Abstract: Salient Span Masking (SSM) has shown itself to be an effective strategy to improve closed-book question answering performance. SSM extends general masked language model pretraining by creating additional unsupervised training sentences that mask a single entity or date span, thus oversampling factual information. Despite the success of this paradigm, the span types and sampling strategies are rela…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 5 pages 1 figure, to appear in EACL 2023

  13. arXiv:2303.00242  [pdf, other]

    cs.CL

    DIFFQG: Generating Questions to Summarize Factual Changes

    Authors: Jeremy R. Cole, Palak Jain, Julian Martin Eisenschlos, Michael J. Q. Zhang, Eunsol Choi, Bhuwan Dhingra

    Abstract: Identifying the difference between two versions of the same article is useful to update knowledge bases and to understand how articles evolve. Paired texts occur naturally in diverse situations: reporters write similar news stories and maintainers of authoritative websites must keep their information up to date. We propose representing factual changes between paired documents as question-answer pa…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 14 pages. Accepted at EACL 2023 (main, long)

  14. arXiv:2301.00504  [pdf]

    eess.IV cs.AI cs.CV eess.SP

    Spectral Bandwidth Recovery of Optical Coherence Tomography Images using Deep Learning

    Authors: Timothy T. Yu, Da Ma, Jayden Cole, Myeong Jin Ju, Mirza F. Beg, Marinko V. Sarunic

    Abstract: Optical coherence tomography (OCT) captures cross-sectional data and is used for the screening, monitoring, and treatment planning of retinal diseases. Technological developments to increase the speed of acquisition often result in systems with a narrower spectral bandwidth, and hence a lower axial resolution. Traditionally, image-processing-based techniques have been utilized to reconstruct subs…

    Submitted 1 January, 2023; originally announced January 2023.

  15. arXiv:2209.12786  [pdf, other]

    cs.CL cs.AI

    Do ever larger octopi still amplify reporting biases? Evidence from judgments of typical colour

    Authors: Fangyu Liu, Julian Martin Eisenschlos, Jeremy R. Cole, Nigel Collier

    Abstract: Language models (LMs) trained on raw texts have no direct access to the physical world. Gordon and Van Durme (2013) point out that LMs can thus suffer from reporting bias: texts rarely report on common facts, instead focusing on the unusual aspects of a situation. If LMs are only trained on text corpora and naively memorise local co-occurrence statistics, they thus naturally would learn a biased v…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: AACL 2022

  16. arXiv:2209.12153  [pdf, other]

    cs.CL cs.AI

    WinoDict: Probing language models for in-context word acquisition

    Authors: Julian Martin Eisenschlos, Jeremy R. Cole, Fangyu Liu, William W. Cohen

    Abstract: We introduce a new in-context learning paradigm to measure Large Language Models' (LLMs) ability to learn novel words during inference. In particular, we rewrite Winograd-style co-reference resolution problems by replacing the key concept word with a synthetic but plausible word that the model must understand to complete the task. Solving this task requires the model to make use of the dictionary…

    Submitted 25 September, 2022; originally announced September 2022.

  17. arXiv:2206.13346  [pdf, other]

    cs.CV cs.LG stat.ML

    Distributional Gaussian Processes Layers for Out-of-Distribution Detection

    Authors: Sebastian G. Popescu, David J. Sharp, James H. Cole, Konstantinos Kamnitsas, Ben Glocker

    Abstract: Machine learning models deployed on medical imaging tasks must be equipped with out-of-distribution detection capabilities in order to avoid erroneous predictions. It is unsure whether out-of-distribution detection models reliant on deep neural networks are suitable for detecting domain shifts in medical imaging. Gaussian Processes can reliably separate in-distribution data points from out-of-dist…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: Published in Journal of Machine Learning for Biomedical Imaging: Special Issue: Information Processing in Medical Imaging (IPMI) 2021

  18. arXiv:2203.17019  [pdf, other]

    eess.AS cs.LG cs.SD

    DeepFry: Identifying Vocal Fry Using Deep Neural Networks

    Authors: Bronya R. Chernyak, Talia Ben Simon, Yael Segal, Jeremy Steffman, Eleanor Chodroff, Jennifer S. Cole, Joseph Keshet

    Abstract: Vocal fry or creaky voice refers to a voice quality characterized by irregular glottal opening and low pitch. It occurs in diverse languages and is prevalent in American English, where it is used not only to mark phrase finality, but also sociolinguistic factors and affect. Due to its irregular periodicity, creaky voice challenges automatic speech processing and recognition systems, particularly f…

    Submitted 26 June, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted to Interspeech 2022

  19. arXiv:2201.10553  [pdf]

    cs.CY

    Online discussion forums for monitoring the need for targeted psychological health support: an observational case study of r/COVID19_support

    Authors: Fathima Rushda Balabaskaran, Annabel Jones-Gammon, Rebecca How, Jennifer Cole

    Abstract: The COVID-19 pandemic has placed a severe mental strain on people in general, and on young people in particular. Online support forums offer opportunities for peer-to-peer health support, which can ease pressure on professional and established volunteer services when demand is high. Such forums can also be used to monitor at-risk communities to identify concerns and causes of psychological stress.…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: 27 pages, 1 table, 5 figures

    Journal ref: Global Journal of Medicine and Public Health Vol 10 Issue 5 2021

  20. arXiv:2111.14671  [pdf, other]

    cs.LG physics.ao-ph stat.ML

    ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models

    Authors: Salva Rühling Cachay, Venkatesh Ramesh, Jason N. S. Cole, Howard Barker, David Rolnick

    Abstract: Numerical simulations of Earth's weather and climate require substantial amounts of computation. This has led to a growing interest in replacing subroutines that explicitly compute physical processes with approximate machine learning (ML) methods that are fast at inference time. Within weather and climate models, atmospheric radiative transfer (RT) calculations are especially expensive. This has m…

    Submitted 29 November, 2021; originally announced November 2021.

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks

  21. arXiv:2109.04587  [pdf, other]

    cs.CL cs.AI

    Graph-Based Decoding for Task Oriented Semantic Parsing

    Authors: Jeremy R. Cole, Nanjiang Jiang, Panupong Pasupat, Luheng He, Peter Shaw

    Abstract: The dominant paradigm for semantic parsing in recent years is to formulate parsing as a sequence-to-sequence task, generating predictions with auto-regressive sequence decoders. In this work, we explore an alternative paradigm. We formulate semantic parsing as a dependency parsing task, applying graph-based decoding techniques developed for syntactic parsing. We compare various decoding techniques…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: To appear in EMNLP. 5 pages, 4 figures

  22. arXiv:2107.07977  [pdf, other]

    cs.LG q-bio.PE

    An Uncertainty-Aware, Shareable and Transparent Neural Network Architecture for Brain-Age Modeling

    Authors: Tim Hahn, Jan Ernsting, Nils R. Winter, Vincent Holstein, Ramona Leenings, Marie Beisemann, Lukas Fisch, Kelvin Sarink, Daniel Emden, Nils Opel, Ronny Redlich, Jonathan Repple, Dominik Grotegerd, Susanne Meinert, Jochen G. Hirsch, Thoralf Niendorf, Beate Endemann, Fabian Bamberg, Thomas Kröncke, Robin Bülow, Henry Völzke, Oyunbileg von Stackelberg, Ramona Felizitas Sowade, Lale Umutlu, Börge Schmidt, et al. (9 additional authors not shown)

    Abstract: The deviation between chronological age and age predicted from neuroimaging data has been identified as a sensitive risk-marker of cross-disorder brain changes, growing into a cornerstone of biological age-research. However, Machine Learning models underlying the field do not consider uncertainty, thereby confounding results with training data density and variability. Also, existing models are com…

    Submitted 16 July, 2021; originally announced July 2021.

  23. Time-Aware Language Models as Temporal Knowledge Bases

    Authors: Bhuwan Dhingra, Jeremy R. Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, William W. Cohen

    Abstract: Many facts come with an expiration date, from the name of the President to the basketball team Lebron James plays for. But language models (LMs) are trained on snapshots of data collected at a specific moment in time, and this can limit their utility, especially in the closed-book setting where the pretraining corpus must contain the facts the model should memorize. We introduce a diagnostic datas…

    Submitted 23 April, 2022; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: Version accepted to TACL

    Journal ref: Transactions of the Association for Computational Linguistics 2022; 10 257-273

  24. arXiv:2106.08176  [pdf, other]

    eess.IV cs.CV

    Automated triaging of head MRI examinations using convolutional neural networks

    Authors: David A. Wood, Sina Kafiabadi, Ayisha Al Busaidi, Emily Guilhem, Antanas Montvila, Siddharth Agarwal, Jeremy Lynch, Matthew Townend, Gareth Barker, Sebastien Ourselin, James H. Cole, Thomas C. Booth

    Abstract: The growing demand for head magnetic resonance imaging (MRI) examinations, along with a global shortage of radiologists, has led to an increase in the time taken to report head MRI scans around the world. For many neurological conditions, this delay can result in increased morbidity and mortality. An automated triaging tool could reduce reporting times for abnormal examinations by identifying abno…

    Submitted 28 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: Accepted as an oral presentation at Medical Imaging with Deep Learning (MIDL) 2021

  25. arXiv:2105.12235  [pdf]

    cs.SI

    Acquisition and analysis of crowd-sourced traffic data

    Authors: Markus Hilpert, Jenni A. Shearston, Jemaleddin Cole, Steven N. Chillrud, Micaela E. Martinez

    Abstract: Crowd-sourced traffic data offer great promise in environmental modeling. However, archives of such traffic data are typically not made available for research; instead, the data must be acquired in real time. The objective of this paper is to present methods we developed for acquiring and analyzing time series of real-time crowd-sourced traffic data. We present scripts, which can be run in Unix/Li…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: 19 pages, 8 figures, 1 table

    ACM Class: D.1; J.2; J.3

  26. arXiv:2104.13756  [pdf, other]

    stat.ML cs.LG

    Distributional Gaussian Process Layers for Outlier Detection in Image Segmentation

    Authors: Sebastian G. Popescu, David J. Sharp, James H. Cole, Konstantinos Kamnitsas, Ben Glocker

    Abstract: We propose a parameter efficient Bayesian layer for hierarchical convolutional Gaussian Processes that incorporates Gaussian Processes operating in Wasserstein-2 space to reliably propagate uncertainty. This directly replaces convolving Gaussian Processes with a distance-preserving affine operator on distributions. Our experiments on brain tissue-segmentation show that the resulting architecture a…

    Submitted 28 April, 2021; originally announced April 2021.

  27. arXiv:2011.08787  [pdf]

    cs.CY

    The COVID19 infodemic. The role and place of academics in science communication

    Authors: Jennifer Cole

    Abstract: As the COVID19 pandemic has spread across the world, a concurrent pandemic of information has spread with it. Deemed an infodemic by the World Health Organization, and described as an overabundance of information, some accurate, some not, that occurs during an epidemic, this proliferation of data, research and opinions provides both opportunities and challenges for academics. Academics and scienti…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: 17 Pages

    Journal ref: Global Journal of Medicine and Public Health Vol 9 Issue 2 2020

  28. arXiv:2010.14877  [pdf, other]

    stat.ML cs.LG

    Hierarchical Gaussian Processes with Wasserstein-2 Kernels

    Authors: Sebastian Popescu, David Sharp, James Cole, Ben Glocker

    Abstract: Stacking Gaussian Processes severely diminishes the model's ability to detect outliers, which when combined with non-zero mean functions, further extrapolates low non-parametric variance to low training data density regions. We propose a hybrid kernel inspired from Varifold theory, operating in both Euclidean and Wasserstein space. We posit that directly taking into account the variance in the com…

    Submitted 1 February, 2022; v1 submitted 28 October, 2020; originally announced October 2020.

  29. arXiv:2007.04226  [pdf, other]

    eess.IV cs.CV

    Labelling imaging datasets on the basis of neuroradiology reports: a validation study

    Authors: David A. Wood, Sina Kafiabadi, Aisha Al Busaidi, Emily Guilhem, Jeremy Lynch, Matthew Townend, Antanas Montvila, Juveria Siddiqui, Naveen Gadapa, Matthew Benger, Gareth Barker, Sebastian Ourselin, James H. Cole, Thomas C. Booth

    Abstract: Natural language processing (NLP) shows promise as a means to automate the labelling of hospital-scale neuroradiology magnetic resonance imaging (MRI) datasets for computer vision applications. To date, however, there has been no thorough investigation into the validity of this approach, including determining the accuracy of report labels compared to image labels as well as examining the performan…

    Submitted 8 March, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

  30. arXiv:2004.11123  [pdf]

    cs.LG stat.AP stat.ML

    Imputation of missing sub-hourly precipitation data in a large sensor network: a machine learning approach

    Authors: Benedict Delahaye Chivers, John Wallbank, Steven J. Cole, Ondrej Sebek, Simon Stanley, Matthew Fry, Georgios Leontidis

    Abstract: Precipitation data collected at sub-hourly resolution represents specific challenges for missing data recovery by being largely stochastic in nature and highly unbalanced in the duration of rain vs non-rain. Here we present a two-step analysis utilising current machine learning techniques for imputing precipitation data sampled at 30-minute intervals by devolving the task into (a) the classificati…

    Submitted 2 May, 2020; v1 submitted 30 March, 2020; originally announced April 2020.

    Comments: 24 pages, 7 figures, 5 tables

    Journal ref: Journal of Hydrology 2020

  31. arXiv:2002.06588  [pdf, other]

    cs.CV

    Automated Labelling using an Attention model for Radiology reports of MRI scans (ALARM)

    Authors: David A. Wood, Jeremy Lynch, Sina Kafiabadi, Emily Guilhem, Aisha Al Busaidi, Antanas Montvila, Thomas Varsavsky, Juveria Siddiqui, Naveen Gadapa, Matthew Townend, Martin Kiik, Keena Patel, Gareth Barker, Sebastian Ourselin, James H. Cole, Thomas C. Booth

    Abstract: Labelling large datasets for training high-capacity neural networks is a major obstacle to the development of deep learning-based medical imaging applications. Here we present a transformer-based network for magnetic resonance imaging (MRI) radiology report classification which automates this task by assigning image labels on the basis of free-text expert radiology reports. Our model's performance…

    Submitted 16 February, 2020; originally announced February 2020.

  32. arXiv:1910.04721  [pdf, other]

    cs.LG stat.ML

    NEURO-DRAM: a 3D recurrent visual attention model for interpretable neuroimaging classification

    Authors: David Wood, James Cole, Thomas Booth

    Abstract: Deep learning is attracting significant interest in the neuroimaging community as a means to diagnose psychiatric and neurological disorders from structural magnetic resonance images. However, there is a tendency amongst researchers to adopt architectures optimized for traditional computer vision tasks, rather than design networks customized for neuroimaging data. We address this by introducing NE…

    Submitted 18 October, 2019; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: Improved network figure

  33. arXiv:1901.08180  [pdf]

    cs.RO

    Vision-based Obstacle Removal System for Autonomous Ground Vehicles Using a Robotic Arm

    Authors: Khashayar Asadi, Rahul Jain, Ziqian Qin, Mingda Sun, Mojtaba Noghabaei, Jeremy Cole, Kevin Han, Edgar Lobaton

    Abstract: Over the past few years, the use of camera-equipped robotic platforms for data collection and visually monitoring applications has exponentially grown. Cluttered construction sites with many objects (e.g., bricks, pipes, etc.) on the ground are challenging environments for a mobile unmanned ground vehicle (UGV) to navigate. To address this issue, this study presents a mobile UGV equipped with a st…

    Submitted 23 January, 2019; originally announced January 2019.

    Comments: The 2019 ASCE International Conference on Computing in Civil Engineering

  34. arXiv:1810.12646  [pdf, other]

    cs.CL

    Prosodic entrainment in dialog acts

    Authors: Uwe D. Reichel, Katalin Mády, Jennifer Cole

    Abstract: We examined prosodic entrainment in spoken dialogs separately for several dialog acts in cooperative and competitive games. Entrainment was measured for intonation features derived from a superpositional intonation stylization as well as for rhythm features. The found differences can be related to the cooperative or competitive nature of the game, as well as to dialog act properties as its intrins…

    Submitted 30 October, 2018; originally announced October 2018.

    Comments: This manuscript is under revision. Please contact the authors for information about updates

  35. arXiv:1705.10312  [pdf]

    cs.LG cs.CE stat.AP

    Classification of Major Depressive Disorder via Multi-Site Weighted LASSO Model

    Authors: Dajiang Zhu, Brandalyn C. Riedel, Neda Jahanshad, Nynke A. Groenewold, Dan J. Stein, Ian H. Gotlib, Matthew D. Sacchet, Danai Dima, James H. Cole, Cynthia H. Y. Fu, Henrik Walter, Ilya M. Veer, Thomas Frodl, Lianne Schmaal, Dick J. Veltman, Paul M. Thompson

    Abstract: Large-scale collaborative analysis of brain imaging data, in psychiatry and neurology, offers a new source of statistical power to discover features that boost accuracy in disease classification, differential diagnosis, and outcome prediction. However, due to data privacy regulations or limited accessibility to large datasets across the world, it is challenging to efficiently integrate distribut…

    Submitted 3 June, 2017; v1 submitted 26 May, 2017; originally announced May 2017.

    Comments: Accepted by MICCAI 2017

  36. arXiv:1612.02572  [pdf]

    stat.ML cs.CV cs.LG q-bio.NC

    Predicting brain age with deep learning from raw imaging data results in a reliable and heritable biomarker

    Authors: James H Cole, Rudra PK Poudel, Dimosthenis Tsagkrasoulis, Matthan WA Caan, Claire Steves, Tim D Spector, Giovanni Montana

    Abstract: Machine learning analysis of neuroimaging data can accurately predict chronological age in healthy people and deviations from healthy brain ageing have been associated with cognitive impairment and disease. Here we sought to further establish the credentials of "brain-predicted age" as a biomarker of individual differences in the brain ageing process, using a predictive modelling approach based on…

    Submitted 8 December, 2016; originally announced December 2016.