Search | arXiv e-print repository

Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation

Authors: Sidra Aleem, Fangyijie Wang, Mayug Maniparambil, Eric Arazo, Julia Dietlmeier, Guenole Silvestre, Kathleen Curran, Noel E. O'Connor, Suzanne Little

Abstract: The Segment Anything Model (SAM) and CLIP are remarkable vision foundation models (VFMs). SAM, a prompt-driven segmentation model, excels in segmentation tasks across diverse domains, while CLIP is renowned for its zero-shot recognition capabilities. However, their unified potential has not yet been explored in medical image segmentation. To adapt SAM to medical imaging, existing methods primarily rely on tuning strategies that require extensive data or prior prompts tailored to the specific task, making it particularly challenging when only a limited number of data samples are available. This work presents an in-depth exploration of integrating SAM and CLIP into a unified framework for medical image segmentation. Specifically, we propose a simple unified framework, SaLIP, for organ segmentation. Initially, SAM is used for part-based segmentation within the image, followed by CLIP to retrieve the mask corresponding to the region of interest (ROI) from the pool of SAM-generated masks. Finally, SAM is prompted by the retrieved ROI to segment a specific organ. Thus, SaLIP is training- and fine-tuning-free and does not rely on domain expertise or labeled data for prompt engineering. Our method shows substantial enhancements in zero-shot segmentation, showcasing notable improvements in DICE scores across diverse segmentation tasks like brain (63.46%), lung (50.11%), and fetal head (30.82%), when compared to unprompted SAM. Code and text prompts are available at: https://github.com/aleemsidra/SaLIP.

Submitted 30 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.
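
The SaLIP abstract above reports its gains as DICE scores. As a reminder of what that metric measures, here is a minimal sketch of the standard Dice coefficient for two binary masks. The function name and the flat-list mask representation are illustrative assumptions and are not taken from the SaLIP code.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 values (an illustrative representation).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p & t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks empty: treat as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / total


# Prediction covers two pixels, ground truth covers one of them.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

The score ranges from 0 (no overlap) to 1 (identical masks); the percentages quoted in the abstract are improvements over unprompted SAM, not absolute scores.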

arXiv:2402.04964 [pdf, other]

ConvLoRA and AdaBN based Domain Adaptation via Self-Training

Authors: Sidra Aleem, Julia Dietlmeier, Eric Arazo, Suzanne Little

Abstract: Existing domain adaptation (DA) methods often involve pre-training on the source domain and fine-tuning on the target domain. For multi-target domain adaptation, having a dedicated, separate fine-tuned network for each target domain that retains all the pre-trained model parameters is prohibitively expensive. To address this limitation, we propose Convolutional Low-Rank Adaptation (ConvLoRA). ConvLoRA freezes pre-trained model weights, adds trainable low-rank decomposition matrices to convolutional layers, and backpropagates the gradient through these matrices, thus greatly reducing the number of trainable parameters. To further boost adaptation, we utilize Adaptive Batch Normalization (AdaBN), which computes target-specific running statistics, and use it along with ConvLoRA. Our method has fewer trainable parameters and performs better than or on par with large independent fine-tuned networks (with less than 0.9% trainable parameters of the total base model) when tested on the segmentation of the Calgary-Campinas dataset containing brain MRI images. Our approach is simple yet effective and can be applied to any deep learning-based architecture which uses convolutional and batch normalization layers. Code is available at: https://github.com/aleemsidra/ConvLoRA.

Submitted 7 February, 2024; originally announced February 2024.
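
To make the parameter saving described in the ConvLoRA abstract concrete, the sketch below counts trainable parameters for one convolutional layer with and without a low-rank update. It assumes one possible factorization, in which the kernel is viewed as an (out_ch·k) × (in_ch·k) matrix and the update is a product of two rank-r factors; the paper's exact decomposition may differ, and the function name and layer sizes are hypothetical.

```python
def conv_lora_param_counts(in_ch, out_ch, k, rank):
    """Return (full, low_rank) trainable-parameter counts for one conv layer.

    Assumption: the k x k kernel tensor is reshaped to an
    (out_ch * k) x (in_ch * k) matrix W, and the trainable update is
    delta_W = B @ A with B of shape (out_ch * k, rank) and A of shape
    (rank, in_ch * k). The frozen weight W contributes no trainable
    parameters in the low-rank case.
    """
    full = out_ch * in_ch * k * k                 # fine-tuning every weight
    low_rank = rank * (out_ch * k + in_ch * k)    # only A and B are trained
    return full, low_rank


# Hypothetical 3x3 layer with 64 input and 64 output channels, rank 2.
full, low_rank = conv_lora_param_counts(in_ch=64, out_ch=64, k=3, rank=2)
fraction = low_rank / full  # share of the layer's weights that is trainable
```

For this hypothetical layer the low-rank update trains 768 parameters instead of 36,864, which illustrates how a network-wide trainable fraction well below the full model becomes possible.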

arXiv:2309.08760 [pdf, other]

Biased Attention: Do Vision Transformers Amplify Gender Bias More than Convolutional Neural Networks?

Authors: Abhishek Mandal, Susan Leavy, Suzanne Little

Abstract: Deep neural networks used in computer vision have been shown to exhibit many social biases such as gender bias. Vision Transformers (ViTs) have become increasingly popular in computer vision applications, outperforming Convolutional Neural Networks (CNNs) in many tasks such as image classification. However, given that research on mitigating bias in computer vision has primarily focused on CNNs, it is important to evaluate the effect of a different network architecture on the potential for bias amplification. In this paper we therefore introduce a novel metric to measure bias in architectures, Accuracy Difference. We examine bias amplification when models belonging to these two architectures are used as a part of large multimodal models, evaluating the different image encoders of Contrastive Language Image Pretraining, which is an important model used in many generative models such as DALL-E and Stable Diffusion. Our experiments demonstrate that architecture can play a role in amplifying social biases due to the different techniques employed by the models for feature extraction and embedding as well as their different learning properties. This research found that ViTs amplified gender bias to a greater extent than CNNs.

Submitted 15 September, 2023; originally announced September 2023.

arXiv:2309.04997 [pdf, other]

Gender Bias in Multimodal Models: A Transnational Feminist Approach Considering Geographical Region and Culture

Authors: Abhishek Mandal, Suzanne Little, Susan Leavy

Abstract: Deep learning based visual-linguistic multimodal models such as Contrastive Language Image Pre-training (CLIP) have become increasingly popular recently and are used within text-to-image generative models such as DALL-E and Stable Diffusion. However, gender and other social biases have been uncovered in these models, and this has the potential to be amplified and perpetuated through AI systems. In this paper, we present a methodology for auditing multimodal models that consider gender, informed by concepts from transnational feminism, including regional and cultural dimensions. Focusing on CLIP, we found evidence of significant gender bias with varying patterns across global regions. Harmful stereotypical associations were also uncovered related to visual cultural cues and labels such as terrorism. Levels of gender bias uncovered within CLIP for different regions aligned with global indices of societal gender equality, with those from the Global South reflecting the highest levels of gender bias.

Submitted 10 September, 2023; originally announced September 2023.

Comments: Selected for publication at the Aequitas 2023: Workshop on Fairness and Bias in AI | co-located with ECAI 2023, Kraków, Poland

arXiv:2305.18566 [pdf]

doi 10.1142/S2251171723400068

The Scientific Investigation of Unidentified Aerial Phenomena (UAP) Using Multimodal Ground-Based Observatories

Authors: Wesley Andrés Watters, Abraham Loeb, Frank Laukien, Richard Cloete, Alex Delacroix, Sergei Dobroshinsky, Benjamin Horvath, Ezra Kelderman, Sarah Little, Eric Masson, Andrew Mead, Mitch Randall, Forrest Schultz, Matthew Szenher, Foteini Vervelidou, Abigail White, Angelique Ahlström, Carol Cleland, Spencer Dockal, Natasha Donahue, Mark Elowitz, Carson Ezell, Alex Gersznowicz, Nicholas Gold, Michael G. Hercz, et al. (13 additional authors not shown)

Abstract: (Abridged) Unidentified Aerial Phenomena (UAP) have resisted explanation and have received little formal scientific attention for 75 years. A primary objective of the Galileo Project is to build an integrated software and instrumentation system designed to conduct a multimodal census of aerial phenomena and to recognize anomalies. Here we present key motivations for the study of UAP and address historical objections to this research. We describe an approach for highlighting outlier events in the high-dimensional parameter space of our census measurements. We provide a detailed roadmap for deciding measurement requirements, as well as a science traceability matrix (STM) for connecting sought-after physical parameters to observables and instrument requirements. We also discuss potential strategies for deciding where to locate instruments for development, testing, and final deployment.
Our instrument package is multimodal and multispectral, consisting of (1) wide-field cameras in multiple bands for targeting and tracking of aerial objects and deriving their positions and kinematics using triangulation; (2) narrow-field instruments including cameras for characterizing morphology, spectra, polarimetry, and photometry; (3) passive multistatic arrays of antennas and receivers for radar-derived range and kinematics; (4) radio spectrum analyzers to measure radio and microwave emissions; (5) microphones for sampling acoustic emissions in the infrasonic through ultrasonic frequency bands; and (6) environmental sensors for characterizing ambient conditions (temperature, pressure, humidity, and wind velocity), as well as quasistatic electric and magnetic fields, and energetic particles. The use of multispectral instruments and multiple sensor modalities will help to ensure that artifacts are recognized and that true detections are corroborated and verifiable.

Submitted 31 May, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: This paper is published in the Journal of Astronomical Instrumentation, 12(1), 2340006 (2023) https://doi.org/10.1142/S2251171723400068

Journal ref: Journal of Astronomical Instrumentation, 12(1), 2340006 (2023)

arXiv:2305.18562 [pdf]

doi 10.1142/S2251171723400044

SkyWatch: A Passive Multistatic Radar Network for the Measurement of Object Position and Velocity

Authors: Mitch Randall, Alex Delacroix, Carson Ezell, Ezra Kelderman, Sarah Little, Abraham Loeb, Eric Masson, Wesley Andrés Watters, Richard Cloete, Abigail White

Abstract: (Abridged) Quantitative three-dimensional (3D) position and velocity estimates obtained by passive radar will assist the Galileo Project in the detection and classification of aerial objects by providing critical measurements of range, location, and kinematics. These parameters will be combined with those derived from the Project's suite of electromagnetic sensors and used to separate known aerial objects from those exhibiting anomalous kinematics. SkyWatch, a passive multistatic radar system based on commercial broadcast FM radio transmitters of opportunity, is a network of receivers spaced at geographical scales that enables estimation of the 3D position and velocity time series of objects at altitudes up to 80 km, horizontal distances up to 150 km, and at velocities to ±2 km/s (±6 Mach). The receivers are designed to collect useful data in a variety of environments varying by terrain, transmitter power, relative transmitter distance, adjacent channel strength, etc. In some cases, the direct signal from the transmitter may be large enough to be used as the reference with which the echoes are correlated. In other cases, the direct signal may be weak or absent, in which case a reference is communicated to the receiver from another network node via the internet for echo correlation. Various techniques are discussed specific to the two modes of operation and a hybrid mode.
Delay and Doppler data are sent via internet to a central server where triangulation is used to deduce time series of 3D positions and velocities. A multiple receiver (multistatic) radar experiment is undergoing Phase 1 testing, with several receivers placed at various distances around the Harvard–Smithsonian Center for Astrophysics (CfA), to validate full 3D position and velocity recovery.

Submitted 31 May, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

Comments: This paper is published in the Journal of Astronomical Instrumentation, 12(1) (2023), doi 10.1142/S2251171723400044. The abstract has been updated.

Journal ref: Journal of Astronomical Instrumentation, 12(1) (2023)

arXiv:2305.18555 [pdf]

doi 10.1142/S2251171723400020

A Hardware and Software Platform for Aerial Object Localization

Authors: Matthew Szenher, Alex Delacroix, Eric Keto, Sarah Little, Mitch Randall, Wesley Andrés Watters, Eric Masson, Richard Cloete

Abstract: To date, there is little reliable data on the position, velocity and acceleration characteristics of Unidentified Aerial Phenomena (UAP). The dual hardware and software system described in this document provides a means to address this gap. We describe a weatherized multi-camera system which can capture images in the visible, infrared and near infrared wavelengths. We then describe the software we will use to calibrate the cameras and to robustly localize objects-of-interest in three dimensions. We show how object localizations captured over time will be used to compute the velocity and acceleration of airborne objects.

Submitted 29 May, 2023; originally announced May 2023.

Journal ref: Journal of Astronomical Instrumentation, 12(1), 2340002 (2023)
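
The abstract above mentions computing velocity and acceleration from object localizations captured over time. The listing does not say how the paper does this, so the following is only a minimal sketch using plain finite differences over timestamped 3D positions; the function name and the sample track are hypothetical.

```python
def finite_difference_kinematics(times, positions):
    """Estimate velocities and accelerations from timestamped 3D positions.

    Velocities are first differences between consecutive positions and are
    associated with the midpoint of each time interval. Accelerations are
    first differences of those velocities over the midpoint spacing.
    This is a generic numerical scheme, not the paper's stated method.
    """
    velocities, midpoints = [], []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        velocities.append(
            tuple((b - a) / dt for a, b in zip(positions[i - 1], positions[i]))
        )
        midpoints.append(0.5 * (times[i] + times[i - 1]))

    accelerations = []
    for i in range(1, len(velocities)):
        dt = midpoints[i] - midpoints[i - 1]
        accelerations.append(
            tuple((b - a) / dt for a, b in zip(velocities[i - 1], velocities[i]))
        )
    return velocities, accelerations


# Hypothetical track: x = t^2 metres, y = z = 0, sampled once per second.
times = [0.0, 1.0, 2.0, 3.0]
positions = [(t * t, 0.0, 0.0) for t in times]
velocities, accelerations = finite_difference_kinematics(times, positions)
```

For the quadratic track above, the estimated acceleration along x is a constant 2 m/s², as expected. With real localizations, differencing amplifies measurement noise, so some smoothing or filtering step would normally precede it.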

arXiv:2305.18551 [pdf]

doi 10.1142/S2251171723400056

Multi-Band Acoustic Monitoring of Aerial Signatures

Authors: Andrew Mead, Sarah Little, Paul Sail, Michelle Tu, Wesley Andrés Watters, Abigail White, Richard Cloete

Abstract: The Galileo Project's acoustic monitoring, omni-directional system (AMOS) aids in the detection and characterization of aerial phenomena. It uses a multi-band microphone suite spanning infrasonic to ultrasonic frequencies, providing an independent signal modality for validation and characterization of detected objects. The system utilizes infrasonic, audible, and ultrasonic systems to cover a wide range of sounds produced by both natural and man-made aerial phenomena. Sound signals from aerial objects can be captured given certain conditions, such as when the sound level is above ambient noise and isn't excessively distorted by its transmission path. Findings suggest that audible sources can be detected up to 1 km away, infrasonic sources can be detected over much longer distances, and ultrasonic at shorter ones. Initial data collected from aircraft recordings with spectral analysis will help develop algorithms and software for quick identification of known aircraft. Future work will involve multi-sensor arrays for sound localization, larger data sets analysis, and incorporation of machine learning and AI for detection and identification of more types of phenomena in all frequency bands.

Submitted 29 May, 2023; originally announced May 2023.

Journal ref: Journal of Astronomical Instrumentation, 12(1), 2340005 (2023)

arXiv:2305.10115 [pdf, other]

An Ensemble Deep Learning Approach for COVID-19 Severity Prediction Using Chest CT Scans

Authors: Sidra Aleem, Mayug Maniparambil, Suzanne Little, Noel O'Connor, Kevin McGuinness

Abstract: Chest X-rays have been widely used for COVID-19 screening; however, 3D computed tomography (CT) is a more effective modality. We present our findings on COVID-19 severity prediction from chest CT scans using the STOIC dataset. We developed an ensemble deep learning based model that incorporates multiple neural networks to improve predictions. To address data imbalance, we used slicing functions and data augmentation. We further improved performance using test time data augmentation. Our approach, which employs a simple yet effective ensemble of deep learning-based models with strong test time augmentations, achieved results comparable to more complex methods and secured the fourth position in the STOIC2021 COVID-19 AI Challenge. Our code is available online at: https://github.com/aleemsidra/stoic2021-baseline-finalphase-main.

Submitted 17 May, 2023; originally announced May 2023.

arXiv:2304.13855 [pdf, other]

Multimodal Composite Association Score: Measuring Gender Bias in Generative Multimodal Models

Authors: Abhishek Mandal, Susan Leavy, Suzanne Little

Abstract: Generative multimodal models based on diffusion models have seen tremendous growth and advances in recent years. Models such as DALL-E and Stable Diffusion have become increasingly popular and successful at creating images from texts, often combining abstract ideas. However, like other deep learning models, they also reflect social biases they inherit from their training data, which is often crawled from the internet. Manually auditing models for biases can be very time- and resource-consuming and is further complicated by the unbounded and unconstrained nature of inputs these models can take. Research into bias measurement and quantification has generally focused on small single-stage models working on a single modality. Thus the emergence of multistage multimodal models requires a different approach. In this paper, we propose Multimodal Composite Association Score (MCAS) as a new method of measuring gender bias in multimodal generative models. Evaluating both DALL-E 2 and Stable Diffusion using this approach uncovered the presence of gendered associations of concepts embedded within the models. We propose MCAS as an accessible and scalable method of quantifying potential bias for models with different modalities and a range of potential biases.

Submitted 26 April, 2023; originally announced April 2023.

Comments: This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this contribution has been accepted at the Fourth International Workshop on Algorithmic Bias in Search and Recommendation held as a part of the 45th European Conference on Information Retrieval (ECIR 2023) and will be published soon

arXiv:2304.11438 [pdf, other]

doi 10.1109/ACCESS.2023.3274113

Constructing a meta-learner for unsupervised anomaly detection

Authors: Małgorzata Gutowska, Suzanne Little, Andrew McCarren

Abstract: Unsupervised anomaly detection (AD) is critical for a wide range of practical applications, from network security to health and medical tools. Due to the diversity of problems, no single algorithm has been found to be superior for all AD tasks. Choosing an algorithm, otherwise known as the Algorithm Selection Problem (ASP), has been extensively examined in supervised classification problems through the use of meta-learning and AutoML; however, it has received little attention in unsupervised AD tasks. This research proposes a new meta-learning approach that identifies an appropriate unsupervised AD algorithm given a set of meta-features generated from the unlabelled input dataset. The performance of the proposed meta-learner is superior to the current state-of-the-art solution. In addition, a mixed model statistical analysis has been conducted to examine the impact of the meta-learner components (the meta-model, meta-features, and the base set of AD algorithms) on the overall performance of the meta-learner. The analysis was conducted using more than 10,000 datasets, which is significantly larger than previous studies. Results indicate that a relatively small number of meta-features can be used to identify an appropriate AD algorithm, but the choice of a meta-model in the meta-learner has a considerable impact.

Submitted 22 April, 2023; originally announced April 2023.

Comments: 16 pages, 4 figures

ACM Class: I.2.m

Journal ref: IEEE Access, vol. 11, pp. 45815-45825, 2023

arXiv:2210.00824 [pdf, other]

Random Data Augmentation based Enhancement: A Generalized Enhancement Approach for Medical Datasets

Authors: Sidra Aleem, Teerath Kumar, Suzanne Little, Malika Bendechache, Rob Brennan, Kevin McGuinness

Abstract: Over the years, the paradigm of medical image analysis has shifted from manual expertise to automated systems, often using deep learning (DL) systems. The performance of deep learning algorithms is highly dependent on data quality. This is particularly important in the medical domain, as medical data is very sensitive to quality and poor quality can lead to misdiagnosis. To improve diagnostic performance, research has been done both on complex DL architectures and on improving data quality using dataset-dependent static hyperparameters. However, performance is still constrained by data quality and by overfitting of hyperparameters to a specific dataset. To overcome these issues, this paper proposes random data augmentation based enhancement. The main objective is to develop a generalized, data-independent and computationally efficient enhancement approach to improve medical data quality for DL. The quality is enhanced by improving the brightness and contrast of images. In contrast to existing methods, our method generates enhancement hyperparameters randomly within a defined range, which makes it robust and prevents overfitting to a specific dataset. To evaluate the generalization of the proposed method, we use four medical datasets and compare its performance with state-of-the-art methods for both classification and segmentation tasks. For grayscale imagery, experiments were performed with the COVID-19 chest X-ray and KiTS19 datasets, and for RGB imagery with the LC25000 dataset.
Experimental results demonstrate that with the proposed enhancement methodology, DL architectures outperform other existing methods. Our code is publicly available at: https://github.com/aleemsidra/Augmentation-Based-Generalized-Enhancement

Submitted 3 October, 2022; originally announced October 2022.

Comments: Accepted at the 24th Irish Machine Vision and Image Processing (IMVIP) Conference, Belfast. The paper received the BCS NI Best Poster Presentation Award; a copy of the proceedings is at https://imvipconference.github.io/IMVIP2022_Proceedings.pdf
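
The abstract above describes drawing brightness and contrast hyperparameters at random from a defined range rather than fixing them per dataset. A minimal sketch of that idea follows, assuming a linear transform `alpha * pixel + beta` on 8-bit intensities. The transform, the function name, and both default ranges are illustrative assumptions and are not taken from the paper or its repository.

```python
import random


def random_enhance(pixels, contrast_range=(0.8, 1.2),
                   brightness_range=(-20.0, 20.0), rng=None):
    """Apply a randomly drawn contrast gain and brightness offset.

    pixels: flat list of 8-bit intensities (0..255).
    contrast_range / brightness_range: hypothetical sampling ranges.
    Returns the enhanced pixels plus the sampled (alpha, beta).
    """
    rng = rng or random.Random()
    alpha = rng.uniform(*contrast_range)    # contrast gain
    beta = rng.uniform(*brightness_range)   # brightness offset
    enhanced = [min(255, max(0, round(alpha * p + beta))) for p in pixels]
    return enhanced, alpha, beta


# Seeded generator so the example is repeatable.
enhanced, alpha, beta = random_enhance([0, 64, 128, 255], rng=random.Random(0))
```

Because a fresh pair of hyperparameters is sampled on every call, no single setting can be tuned to one dataset, which is the property the abstract credits for better generalization.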

arXiv:2201.04231 [pdf]

Modeling Homophily in Dynamic Networks with Application to HIV Molecular Surveillance

Authors: V. DeGruttola, M. Nakazawa, J. Liu, X. Tu, S. Little, S. Mehta

Abstract: This paper describes a novel approach to modeling homophily, i.e. the tendency of nodes that share (or differ in) certain attributes to be linked; we consider dynamic networks in which nodes can be added over time but not removed. Our application is to HIV genetic linkage analysis that has been used to investigate HIV transmission dynamics. In this setting, two HIV sequences from different persons with HIV (PWH) are said to be linked if the genetic distance between these sequences is less than a given threshold. Such linkage suggests that the nodes representing the two infected PWH are close to each other in a transmission network; such proximity would imply that either one of the infected people directly transmitted the virus to the other or indirectly transmitted it through a small number of intermediaries. These viral genetic linkage networks are dynamic in the sense that, over time, a group or cluster of genetically linked viral sequences may increase in size as new people are infected by those in the cluster either directly or through intermediaries. Our approach makes use of a logistic model to describe homophily with regard to demographic and behavioral characteristics; that is, we investigate whether similarities (or differences) between PWH in these characteristics impact the probability that their sequences are linked. Such analyses provide information about HIV transmission dynamics within a population.

Submitted 29 December, 2021; originally announced January 2022.

arXiv:2109.03571 [pdf, other]

TrollsWithOpinion: A Dataset for Predicting Domain-specific Opinion Manipulation in Troll Memes

Authors: Shardul Suryawanshi, Bharathi Raja Chakravarthi, Mihael Arcan, Suzanne Little, Paul Buitelaar

Abstract: Research into the classification of Image with Text (IWT) troll memes has recently become popular. Since the online community utilizes the refuge of memes to express themselves, there is an abundance of data in the form of memes. These memes have the potential to demean, harass, or bully targeted individuals. Moreover, the targeted individual could fall prey to opinion manipulation. To comprehend the use of memes in opinion manipulation, we define three specific domains (product, political or others) which we classify into troll or not-troll, with or without opinion manipulation. To enable this analysis, we enhanced an existing dataset by annotating the data with our defined classes, resulting in a dataset of 8,881 IWT or multimodal memes in the English language (TrollsWithOpinion dataset). We perform baseline experiments on the annotated dataset, and our results show that existing state-of-the-art techniques could only reach a weighted-average F1-score of 0.37. This shows the need for the development of a specific technique to deal with multimodal troll memes.

Submitted 10 May, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

arXiv:2008.00106 [pdf, other]

Utilising Visual Attention Cues for Vehicle Detection and Tracking

Authors: Feiyan Hu, Venkatesh G M, Noel E. O'Connor, Alan F. Smeaton, Suzanne Little

Abstract: Advanced Driver-Assistance Systems (ADAS) have been attracting attention from many researchers. Vision-based sensors are the closest way to emulate human driver visual behavior while driving. In this paper, we explore possible ways to use visual attention (saliency) for object detection and tracking. We investigate: 1) How a visual attention map such as a subjectness attention or saliency map and an objectness attention map can facilitate region proposal generation in a 2-stage object detector; 2) How a visual attention map can be used for tracking multiple objects. We propose a neural network that can simultaneously detect objects and generate objectness and subjectness maps to save computational power. We further exploit the visual attention map during tracking using a sequential Monte Carlo probability hypothesis density (PHD) filter. The experiments are conducted on KITTI and DETRAC datasets. The use of visual attention and hierarchical features has shown a considerable improvement of ≈8% in object detection, which effectively increased tracking performance by ≈4% on the KITTI dataset.

Submitted 31 July, 2020; originally announced August 2020.

Comments: Accepted in ICPR2020

arXiv:2005.06748 [pdf, other]

ECIR 2020 Workshops: Assessing the Impact of Going Online

Authors: Sérgio Nunes, Suzanne Little, Sumit Bhatia, Ludovico Boratto, Guillaume Cabanac, Ricardo Campos, Francisco M. Couto, Stefano Faralli, Ingo Frommholz, Adam Jatowt, Alípio Jorge, Mirko Marras, Philipp Mayr, Giovanni Stilo

Abstract: ECIR 2020 https://ecir2020.org/ was one of the many conferences affected by the COVID-19 pandemic. The Conference Chairs decided to keep the initially planned dates (April 14-17, 2020) and move to a fully online event. In this report, we describe the experience of organizing the ECIR 2020 Workshops in this scenario from two perspectives: the workshop organizers and the workshop participants. We provide a report on the organizational aspect of these events and the consequences for participants. Covering the scientific dimension of each workshop is outside the scope of this article.

Submitted 14 May, 2020; originally announced May 2020.

Comments: 10 pages, 3 figures, submitted to ACM SIGIR Forum

arXiv:1910.11603 [pdf, other]

MediaEval 2019: Concealed FGSM Perturbations for Privacy Preservation

Authors: Panagiotis Linardos, Suzanne Little, Kevin McGuinness

Abstract: This work tackles the Pixel Privacy task put forth by MediaEval 2019. Our goal is to manipulate images in a way that conceals them from automatic scene classifiers while preserving the original image quality. We use the fast gradient sign method, which normally has a corrupting influence on image appeal, and devise two methods to minimize the damage. The first approach uses a map of pixel locations that are either salient or flat, and directs perturbations away from them. The second approach subtracts the gradient of an aesthetics evaluation model from the gradient of the attack model to guide the perturbations towards a direction that preserves appeal. We make our code available at: https://git.io/JesXr.

Submitted 25 October, 2019; originally announced October 2019.

Comments: MediaEval 2019 - Pixel Privacy

arXiv:1809.02575 [pdf, other]

Differentially Private Continual Release of Graph Statistics

Authors: Shuang Song, Susan Little, Sanjay Mehta, Staal Vinterbo, Kamalika Chaudhuri

Abstract: Motivated by understanding the dynamics of sensitive social networks over time, we consider the problem of continual release of statistics in a network that arrives online, while preserving privacy of its participants. For our privacy notion, we use differential privacy -- the gold standard in privacy for statistical data analysis. The main challenge in this problem is maintaining a good privacy-utility tradeoff; naive solutions that compose across time, as well as solutions suited to tabular data either lead to poor utility or do not directly apply. In this work, we show that if there is a publicly known upper bound on the maximum degree of any node in the entire network sequence, then we can release many common graph statistics such as degree distributions and subgraph counts continually with a better privacy-accuracy tradeoff. Code available at https://bitbucket.org/shs037/graphprivacycode

Submitted 18 September, 2018; v1 submitted 7 September, 2018; originally announced September 2018.

arXiv:1801.07660 [pdf, other]

Bayesian reconstruction of HIV transmission trees from viral sequences and uncertain infection times

Authors: Hesam Montazeri, Susan Little, Niko Beerenwinkel, Victor DeGruttola

Abstract: Genetic sequence data of pathogens are increasingly used to investigate transmission dynamics in both endemic diseases and disease outbreaks; such research can aid in development of appropriate interventions and in design of studies to evaluate them. Several methods have been proposed to infer transmission chains from sequence data; however, existing methods do not generally reliably reconstruct transmission trees because genetic sequence data or inferred phylogenetic trees from such data are insufficient for accurate inference regarding transmission chains. In this paper, we demonstrate the lack of a one-to-one relationship between phylogenies and transmission trees, and also show that information regarding infection times together with genetic sequences permits accurate reconstruction of transmission trees. We propose a Bayesian inference method for this purpose and demonstrate that precision of inference regarding these transmission trees depends on precision of the estimated times of infection. We also illustrate the use of these methods to study features of epidemic dynamics, such as the relationship between characteristics of nodes and average number of outbound edges or inbound edges, signifying possible transmission events from and to nodes. We study the performance of the proposed method in simulation experiments and demonstrate its superiority in comparison to an alternative method. We apply them to a transmission cluster in San Diego and investigate the impact of biological, behavioral, and demographic factors.

Submitted 23 January, 2018; originally announced January 2018.

arXiv:1712.08215 [pdf, other]

Diverse spatial expression patterns emerge from common transcription bursting kinetics

Authors: Benjamin Zoller, Shawn C. Little, Thomas Gregor

Abstract: In early development, regulation of transcription results in precisely positioned and highly reproducible expression patterns that specify cellular identities. How transcription, a fundamentally noisy molecular process, is regulated to achieve reliable embryonic patterning remains unclear. In particular, it is unknown how gene-specific regulation mechanisms affect kinetic rates of transcription, and whether there are common, global features that govern these rates across a genetic network. Here, we measure nascent transcriptional activity in the gap gene network of early Drosophila embryos and characterize the variability in absolute activity levels across expression boundaries. We demonstrate that boundary formation follows a common transcriptional principle: a single control parameter determines the distribution of transcriptional activity, regardless of gene identity, boundary position, or enhancer-promoter architecture. By employing a minimalist model of transcription, we infer kinetic rates of transcriptional bursting for these patterning genes; we find that the key regulatory parameter is the fraction of time a gene is in an actively transcribing state, while the rate of Pol II loading appears globally conserved. These results point to a universal simplicity underlying the apparently complex transcriptional processes responsible for early embryonic patterning and indicate a path to general rules in transcriptional regulation.

Submitted 21 December, 2017; originally announced December 2017.

Comments: Main text: 12 pages, 3 figures Supplement: 19 pages, 8 figures

arXiv:1711.05586 [pdf, other]

People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting

Authors: Mark Marsden, Kevin McGuinness, Suzanne Little, Ciara E. Keogh, Noel E. O'Connor

Abstract: In this paper we propose a technique to adapt a convolutional neural network (CNN) based object counter to additional visual domains and object types while still preserving the original counting function. Domain-specific normalisation and scaling operators are trained to allow the model to adjust to the statistical distributions of the various visual domains. The developed adaptation technique is used to produce a singular patch-based counting regressor capable of counting various object types including people, vehicles, cell nuclei and wildlife. As part of this study a challenging new cell counting dataset in the context of tissue culture and patient diagnosis is constructed. This new collection, referred to as the Dublin Cell Counting (DCC) dataset, is the first of its kind to be made available to the wider computer vision community. State-of-the-art object counting performance is achieved in both the Shanghaitech (parts A and B) and Penguins datasets while competitive performance is observed on the TRANCOS and Modified Bone Marrow (MBM) datasets, all using a shared counting model.

Submitted 15 November, 2017; originally announced November 2017.

Comments: 10 pages

arXiv:1705.10698 [pdf, other]

ResnetCrowd: A Residual Deep Learning Architecture for Crowd Counting, Violent Behaviour Detection and Crowd Density Level Classification

Authors: Mark Marsden, Kevin McGuinness, Suzanne Little, Noel E. O'Connor

Abstract: In this paper we propose ResnetCrowd, a deep residual architecture for simultaneous crowd counting, violent behaviour detection and crowd density level classification. To train and evaluate the proposed multi-objective technique, a new 100 image dataset referred to as Multi Task Crowd is constructed. This new dataset is the first computer vision dataset fully annotated for crowd counting, violent behaviour detection and density level classification. Our experiments show that a multi-task approach boosts individual task performance for all tasks and most notably for violent behaviour detection which receives a 9% boost in ROC curve AUC (Area under the curve). The trained ResnetCrowd model is also evaluated on several additional benchmarks highlighting the superior generalisation of crowd analysis models trained for multiple objectives.

Submitted 30 May, 2017; originally announced May 2017.

Comments: 7 Pages, AVSS 2017

arXiv:1612.00220 [pdf, other]

Fully Convolutional Crowd Counting On Highly Congested Scenes

Authors: Mark Marsden, Kevin McGuinness, Suzanne Little, Noel E. O'Connor

Abstract: In this paper we advance the state-of-the-art for crowd counting in high density scenes by further exploring the idea of a fully convolutional crowd counting model introduced by (Zhang et al., 2016). Producing an accurate and robust crowd count estimator using computer vision techniques has attracted significant research interest in recent years. Applications for crowd counting systems exist in many diverse areas including city planning, retail, and of course general public safety. Developing a highly generalised counting model that can be deployed in any surveillance scenario with any camera perspective is the key objective for research in this area. Techniques developed in the past have generally performed poorly in highly congested scenes with several thousands of people in frame (Rodriguez et al., 2011). Our approach, influenced by the work of (Zhang et al., 2016), consists of the following contributions: (1) A training set augmentation scheme that minimises redundancy among training samples to improve model generalisation and overall counting performance; (2) a deep, single column, fully convolutional network (FCN) architecture; (3) a multi-scale averaging step during inference. The developed technique can analyse images of any resolution or aspect ratio and achieves state-of-the-art counting performance on the Shanghaitech Part B and UCF CC 50 datasets as well as competitive performance on Shanghaitech Part A.

Submitted 17 January, 2017; v1 submitted 1 December, 2016; originally announced December 2016.

Comments: 7 pages, VISAPP 2017

arXiv:1606.05310 [pdf, other]

Holistic Features For Real-Time Crowd Behaviour Anomaly Detection

Authors: M. Marsden, K. McGuinness, S. Little, N. E. O'Connor

Abstract: This paper presents a new approach to crowd behaviour anomaly detection that uses a set of efficiently computed, easily interpretable, scene-level holistic features. This low-dimensional descriptor combines two features from the literature: crowd collectiveness [1] and crowd conflict [2], with two newly developed crowd features: mean motion speed and a new formulation of crowd density. Two different anomaly detection approaches are investigated using these features. When only normal training data is available we use a Gaussian Mixture Model (GMM) for outlier detection. When both normal and abnormal training data is available we use a Support Vector Machine (SVM) for binary classification. We evaluate on two crowd behaviour anomaly detection datasets, achieving both state-of-the-art classification performance on the violent-flows dataset [3] and better than real-time processing performance (40 frames per second).

Submitted 16 June, 2016; originally announced June 2016.

Comments: 4 pages

arXiv:1501.07342 [pdf, other]

doi 10.1098/rsos.150486

Only accessible information is useful: insights from gradient-mediated patterning

Authors: Mikhail Tikhonov, Shawn C. Little, Thomas Gregor

Abstract: Information theory is gaining popularity as a tool to characterize performance of biological systems. However, information is commonly quantified without reference to whether or how a system could extract and use it; as a result, information-theoretic quantities are easily misinterpreted. Here we take the example of pattern-forming developmental systems which are commonly structured as cascades of sequential gene expression steps. Such a multi-tiered structure appears to constitute sub-optimal use of the positional information provided by the input morphogen because noise is added at each tier. However, the conventional theory fails to distinguish between the total information in a morphogen and information that can be usefully extracted and interpreted by downstream elements. We demonstrate that quantifying the information that is _accessible_ to the system naturally explains the prevalence of multi-tiered network architectures as a consequence of the noise inherent to the control of gene expression. We support our argument with empirical observations from patterning along the major body axis of the fruit fly embryo. Our results exhibit the limitations of the standard information-theoretic characterization of biological signaling and illustrate how they can be resolved.

Submitted 7 May, 2015; v1 submitted 28 January, 2015; originally announced January 2015.

Comments: Updated and refocused; 9 pages, 4 figures + supplement

Journal ref: Open Science 2(11): 150486, 2015

arXiv:1012.5264 [pdf]

Engineering the Zero-Point Field and Polarizable Vacuum For Interstellar Flight

Authors: H. E. Puthoff, S. R. Little

Abstract: A theme that has come to the fore in advanced planning for long-range space exploration is the concept of "propellantless propulsion" or "field propulsion." One version of this concept involves the projected possibility that empty space itself (the quantum vacuum, or space-time metric) might be manipulated so as to provide energy/thrust for future space vehicles. Although such a proposal has a certain science-fiction quality about it, modern theory describes the vacuum as a polarizable medium that sustains energetic quantum fluctuations. Thus the possibility that matter/vacuum interactions might be engineered for space-flight applications is not a priori ruled out, although certain constraints need to be acknowledged. The structure and implications of such a far-reaching hypothesis are considered herein.

Submitted 23 December, 2010; originally announced December 2010.

Comments: 12 pages, 1 figure, 2 tables, 35 references

Journal ref: J.Br.Interplanet.Soc.55:137-144,2002

arXiv:gr-qc/0608009 [pdf, ps, other]

The Projection Operator Method and the Ashtekar-Horowitz-Boulware Model

Authors: J. Scott Little

Abstract: Motivated by the recent work of Louko and Molgado, we consider the Ashtekar-Horowitz-Boulware model using the projection operator formalism. This paper uses the techniques developed in a recent paper of Klauder and Little to overcome the potential difficulties of this particular model. We also extend the model by including a larger class of functions than previously considered and evaluate the classical limit of the model.

Submitted 1 August, 2006; originally announced August 2006.

arXiv:gr-qc/0602114 [pdf, ps, other]

doi 10.1088/0264-9381/23/10/025

Highly Irregular Quantum Constraints

Authors: John R. Klauder, J. Scott Little

Abstract: Motivated by a recent paper of Louko and Molgado, we consider a simple system with a single classical constraint R(q)=0. If q_l denotes a generic solution to R(q)=0, our examples include cases where R'(q_l) ≠ 0 (regular constraint) and R'(q_l)=0 (irregular constraint) of varying order as well as the case where R(q)=0 for an interval, such as a ≤ q ≤ b. Quantization of irregular constraints is normally not considered; however, using the projection operator formalism we provide a satisfactory quantization which reduces to the constrained classical system when ħ → 0. It is noteworthy that irregular constraints change the observable aspects of a theory as compared to strictly regular constraints.

Submitted 27 February, 2006; originally announced February 2006.

Comments: 19 pages, latex

Journal ref: Class.Quant.Grav. 23 (2006) 3641

arXiv:gr-qc/0502045 [pdf, ps, other]

doi 10.1103/PhysRevD.71.085014

Elementary Model of Constraint Quantization with an Anomaly

Authors: J. Scott Little, John R. Klauder

Abstract: Quantum gravity is made more difficult in part by its constraint structure. The constraints are classically first-class; however, upon quantization they become partially second-class. To study such behavior, we focus on a simple problem with finitely many degrees of freedom and demonstrate how the projection operator formalism for dealing with quantum constraints is well suited to this type of example.

Submitted 10 February, 2005; originally announced February 2005.

Comments: 14 pages, 2 figures

Journal ref: Phys.Rev. D71 (2005) 085014

arXiv:astro-ph/0107316

Engineering the Zero-Point Field and Polarizable Vacuum For Interstellar Flight

Authors: H. E. Puthoff, S. R. Little, M. Ibison

Abstract: A theme that has come to the fore in advanced planning for long-range space exploration is the concept of "propellantless propulsion" or "field propulsion." One version of this concept involves the projected possibility that empty space itself (the quantum vacuum, or space-time metric) might be manipulated so as to provide energy/thrust for future space vehicles. Although such a proposal has a certain science-fiction quality about it, modern theory describes the vacuum as a polarizable medium that sustains energetic quantum fluctuations. Thus the possibility that matter/vacuum interactions might be engineered for space-flight applications is not a priori ruled out, although certain constraints need to be acknowledged. The structure and implications of such a far-reaching hypothesis are considered herein.

Submitted 26 October, 2010; v1 submitted 17 July, 2001; originally announced July 2001.

Comments: Paper withdrawn. Author Ibison does not subscribe to some of the speculations in this document

arXiv:physics/9910050 [pdf]

The speed of gravity revisited

Authors: Michael Ibison, Harold E. Puthoff, Scott R. Little

Abstract: Recently Van Flandern concluded from astrophysical data that gravity propagates faster than light. We demonstrate that the data can be explained by current theory that does not permit superluminal speeds. We explain the origin of apparently instantaneous connections, first within EM, and then within strong-field GR.

Submitted 29 October, 1999; originally announced October 1999.

Comments: 10 pages, 4 figs, submitted to Physics Letters A on September 10, 1999

Showing 1–31 of 31 results for author: Little, S