Probabilistic Image-Driven Traffic Modeling via Remote Sensing

Abstract: This work addresses the task of modeling spatiotemporal traffic patterns directly from overhead imagery, which we refer to as image-driven traffic modeling. We extend this line of work and introduce a multi-modal, multi-task transformer-based segmentation architecture that can be used to create dense city-scale traffic models. Our approach includes a geo-temporal positional encoding module for integrating geo-temporal context and a probabilistic objective function for estimating traffic speeds that naturally models temporal variations. We evaluate our method extensively using the Dynamic Traffic Speeds (DTS) benchmark dataset and significantly improve the state-of-the-art. Finally, we introduce the DTS++ dataset to support mobility-related location adaptation experiments.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2312.12189

Teeth Localization and Lesion Segmentation in CBCT Images using SpatialConfiguration-Net and U-Net

Authors: Arnela Hadzic, Barbara Kirnbauer, Darko Stern, Martin Urschler

Abstract: The localization of teeth and segmentation of periapical lesions in cone-beam computed tomography (CBCT) images are crucial tasks for clinical diagnosis and treatment planning, which are often time-consuming and require a high level of expertise. However, automating these tasks is challenging due to variations in shape, size, and orientation of lesions, as well as similar topologies among teeth. Moreover, the small volumes occupied by lesions in CBCT images pose a class imbalance problem that needs to be addressed. In this study, we propose a deep learning-based method utilizing two convolutional neural networks: the SpatialConfiguration-Net (SCN) and a modified version of the U-Net. The SCN accurately predicts the coordinates of all teeth present in an image, enabling precise cropping of teeth volumes that are then fed into the U-Net which detects lesions via segmentation. To address class imbalance, we compare the performance of three reweighting loss functions. After evaluation on 144 CBCT images, our method achieves a 97.3% accuracy for teeth localization, along with a promising sensitivity and specificity of 0.97 and 0.88, respectively, for subsequent lesion detection.

Submitted 19 December, 2023; originally announced December 2023.

Comments: Accepted for VISIGRAPP 2024 (Track: VISAPP), 8 pages

arXiv:2211.15790

Handling Image and Label Resolution Mismatch in Remote Sensing

Authors: Scott Workman, Armin Hadzic, M. Usman Rafique

Abstract: Though semantic segmentation has been heavily explored in vision literature, unique challenges remain in the remote sensing domain. One such challenge is how to handle resolution mismatch between overhead imagery and ground-truth label sources, due to differences in ground sample distance. To illustrate this problem, we introduce a new dataset and use it to showcase weaknesses inherent in existing strategies that naively upsample the target label to match the image resolution. Instead, we present a method that is supervised using low-resolution labels (without upsampling), but takes advantage of an exemplar set of high-resolution labels to guide the learning process. Our method incorporates region aggregation, adversarial learning, and self-supervised pretraining to generate fine-grained predictions, without requiring high-resolution annotations. Extensive experiments demonstrate the real-world applicability of our approach.

Submitted 28 November, 2022; originally announced November 2022.

Comments: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

arXiv:2202.13883

EdgeMixup: Improving Fairness for Skin Disease Classification and Segmentation

Authors: Haolin Yuan, Armin Hadzic, William Paul, Daniella Villegas de Flores, Philip Mathew, John Aucott, Yinzhi Cao, Philippe Burlina

Abstract: Skin lesions can be an early indicator of a wide range of infectious and other diseases. The use of deep learning (DL) models to diagnose skin lesions has great potential in assisting clinicians with prescreening patients. However, these models often learn biases inherent in training data, which can lead to a performance gap in the diagnosis of people with light and/or dark skin tones. To the best of our knowledge, limited work has been done on identifying, let alone reducing, model bias in skin disease classification and segmentation. In this paper, we examine DL fairness and demonstrate the existence of bias in classification and segmentation models for subpopulations with darker skin tones compared to individuals with lighter skin tones, for specific diseases including Lyme, Tinea Corporis and Herpes Zoster. Then, we propose a novel preprocessing, data alteration method, called EdgeMixup, to improve model fairness with a linear combination of an input skin lesion image and a corresponding predicted edge detection mask combined with color saturation alteration. For the task of skin disease classification, EdgeMixup outperforms much more complex competing methods such as adversarial approaches, achieving a 10.99% reduction in accuracy gap between light and dark skin tone samples, and resulting in 8.4% improved performance for an underrepresented subpopulation.

Submitted 28 February, 2022; originally announced February 2022.

arXiv:2111.00599

Bayesian optimization of distributed neurodynamical controller models for spatial navigation

Authors: Armin Hadzic, Grace M. Hwang, Kechen Zhang, Kevin M. Schultz, Joseph D. Monaco

Abstract: Dynamical systems models for controlling multi-agent swarms have demonstrated advances toward resilient, decentralized navigation algorithms. We previously introduced the NeuroSwarms controller, in which agent-based interactions were modeled by analogy to neuronal network interactions, including attractor dynamics and phase synchrony, that have been theorized to operate within hippocampal place-cell circuits in navigating rodents. This complexity precludes linear analyses of stability, controllability, and performance typically used to study conventional swarm models. Further, tuning dynamical controllers by hand or grid search is often inadequate due to the complexity of objectives, dimensionality of model parameters, and computational costs of simulation-based sampling. Here, we present a framework for tuning dynamical controller models of autonomous multi-agent systems based on Bayesian Optimization (BayesOpt). Our approach utilizes a task-dependent objective function to train Gaussian Processes (GPs) as surrogate models to achieve adaptive and efficient exploration of a dynamical controller model's parameter space. We demonstrate this approach by studying an objective function selecting for NeuroSwarms behaviors that cooperatively localize and capture spatially distributed rewards under time pressure. We generalized task performance across environments by combining scores for simulations in distinct geometries. To validate search performance, we compared high-dimensional clustering for high- vs. low-likelihood parameter points by visualizing sample trajectories in Uniform Manifold Approximation and Projection (UMAP) embeddings. Our findings show that adaptive, sample-efficient evaluation of the self-organizing behavioral capacities of complex systems, including dynamical swarm controllers, can accelerate the translation of neuroscientific theory to applied domains.

Submitted 31 October, 2021; originally announced November 2021.

Comments: 29 pages, 10 figures

arXiv:2103.08829

Towards Indirect Top-Down Road Transport Emissions Estimation

Authors: Ryan Mukherjee, Derek Rollend, Gordon Christie, Armin Hadzic, Sally Matson, Anshu Saksena, Marisa Hughes

Abstract: Road transportation is one of the largest sectors of greenhouse gas (GHG) emissions affecting climate change. Tackling climate change as a global community will require new capabilities to measure and inventory road transport emissions. However, the large scale and distributed nature of vehicle emissions make this sector especially challenging for existing inventory methods. In this work, we develop machine learning models that use satellite imagery to perform indirect top-down estimation of road transport emissions. Our initial experiments focus on the United States, where a bottom-up inventory was available for training our models. We achieved a mean absolute error (MAE) of 39.5 kg CO$_{2}$ of annual road transport emissions, calculated on a pixel-by-pixel (100 m$^{2}$) basis in Sentinel-2 imagery. We also discuss key model assumptions and challenges that need to be addressed to develop models capable of generalizing to global geography. We believe this work is the first published approach for automated indirect top-down estimation of road transport sector emissions using visual imagery and represents a critical step towards scalable, global, near-real-time road transportation emissions inventories that are measured both independently and objectively.

Submitted 15 March, 2021; originally announced March 2021.

arXiv:2012.06387

TARA: Training and Representation Alteration for AI Fairness and Domain Generalization

Authors: William Paul, Armin Hadzic, Neil Joshi, Fady Alajaji, Phil Burlina

Abstract: We propose a novel method for enforcing AI fairness with respect to protected or sensitive factors. This method uses a dual strategy performing training and representation alteration (TARA) for the mitigation of prominent causes of AI bias by including: a) the use of representation learning alteration via adversarial independence to suppress the bias-inducing dependence of the data representation from protected factors; and b) training set alteration via intelligent augmentation to address bias-causing data imbalance, by using generative models that allow the fine control of sensitive factors related to underrepresented populations via domain adaptation and latent space manipulation. When testing our methods on image analytics, experiments demonstrate that TARA significantly or fully debiases baseline models while outperforming competing debiasing methods that have the same amount of information, e.g., with (% overall accuracy, % accuracy gap) = (78.8, 0.5) vs. the baseline method's score of (71.8, 10.5) for EyePACS, and (73.7, 11.8) vs. (69.1, 21.7) for CelebA. Furthermore, recognizing certain limitations in current metrics used for assessing debiasing performance, we propose novel conjunctive debiasing metrics. Our experiments also demonstrate the ability of these novel metrics in assessing the Pareto efficiency of the proposed methods.

Submitted 20 August, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

Comments: Accepted for publication in MIT Neural Computation

arXiv:2006.14547

Estimating Displaced Populations from Overhead

Authors: Armin Hadzic, Gordon Christie, Jeffrey Freeman, Amber Dismer, Stevan Bullard, Ashley Greiner, Nathan Jacobs, Ryan Mukherjee

Abstract: We introduce a deep learning approach to perform fine-grained population estimation for displacement camps using high-resolution overhead imagery. We train and evaluate our approach on drone imagery cross-referenced with population data for refugee camps in Cox's Bazar, Bangladesh in 2018 and 2019. Our proposed approach achieves 7.02% mean absolute percent error on sequestered camp imagery. We believe our experiments with real-world displacement camp data constitute an important step towards the development of tools that enable the humanitarian community to effectively and rapidly respond to the global displacement crisis.

Submitted 21 December, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

Comments: Fixed typo in abstract

arXiv:2006.08021

RasterNet: Modeling Free-Flow Speed using LiDAR and Overhead Imagery

Authors: Armin Hadzic, Hunter Blanton, Weilian Song, Mei Chen, Scott Workman, Nathan Jacobs

Abstract: Roadway free-flow speed captures the typical vehicle speed in low traffic conditions. Modeling free-flow speed is an important problem in transportation engineering with applications to a variety of design, operation, planning, and policy decisions of highway systems. Unfortunately, collecting large-scale historical traffic speed data is expensive and time consuming. Traditional approaches for estimating free-flow speed use geometric properties of the underlying road segment, such as grade, curvature, lane width, lateral clearance and access point density, but for many roads such features are unavailable. We propose a fully automated approach, RasterNet, for estimating free-flow speed without the need for explicit geometric features. RasterNet is a neural network that fuses large-scale overhead imagery and aerial LiDAR point clouds using a geospatially consistent raster structure. To support training and evaluation, we introduce a novel dataset combining free-flow speeds of road segments, overhead imagery, and LiDAR point clouds across the state of Kentucky. Our method achieves state-of-the-art results on a benchmark dataset.

Submitted 14 June, 2020; originally announced June 2020.

arXiv:1901.06013

FARSA: Fully Automated Roadway Safety Assessment

Authors: Weilian Song, Scott Workman, Armin Hadzic, Xu Zhang, Eric Green, Mei Chen, Reginald Souleyrette, Nathan Jacobs

Abstract: This paper addresses the task of road safety assessment. An emerging approach for conducting such assessments in the United States is through the US Road Assessment Program (usRAP), which rates roads from highest risk (1 star) to lowest (5 stars). Obtaining these ratings requires manual, fine-grained labeling of roadway features in street-level panoramas, a slow and costly process. We propose to automate this process using a deep convolutional neural network that directly estimates the star rating from a street-level panorama, requiring milliseconds per image at test time. Our network also estimates many other road-level attributes, including curvature, roadside hazards, and the type of median. To support this, we incorporate task-specific attention layers so the network can focus on the panorama regions that are most useful for a particular task. We evaluated our approach on a large dataset of real-world images from two US states. We found that incorporating additional tasks, and using a semi-supervised training approach, significantly reduced overfitting problems, allowed us to optimize more layers of the network, and resulted in higher accuracy.

Submitted 17 January, 2019; originally announced January 2019.

Comments: 9 pages, 8 figures, WACV 2018

Showing 1–10 of 10 results for author: Hadzic, A