Search | arXiv e-print repository

doi 10.1016/j.futures.2023.103182

Examining the Differential Risk from High-level Artificial Intelligence and the Question of Control

Authors: Kyle A. Kilian, Christopher J. Ventura, Mark M. Bailey

Abstract: Artificial Intelligence (AI) is one of the most transformative technologies of the 21st century. The extent and scope of future AI capabilities remain a key uncertainty, with widespread disagreement on timelines and potential impacts. As nations and technology companies race toward greater complexity and autonomy in AI systems, there are concerns over the extent of integration and oversight of opa… ▽ More Artificial Intelligence (AI) is one of the most transformative technologies of the 21st century. The extent and scope of future AI capabilities remain a key uncertainty, with widespread disagreement on timelines and potential impacts. As nations and technology companies race toward greater complexity and autonomy in AI systems, there are concerns over the extent of integration and oversight of opaque AI decision processes. This is especially true in the subfield of machine learning (ML), where systems learn to optimize objectives without human assistance. Objectives can be imperfectly specified or executed in an unexpected or potentially harmful way. This becomes more concerning as systems increase in power and autonomy, where an abrupt capability jump could result in unexpected shifts in power dynamics or even catastrophic failures. This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis. Survey data were collected from domain experts in the public and private sectors to classify AI impact and likelihood. The results show increased uncertainty over the powerful AI agent scenario, confidence in multiagent environments, and increased concern over AI alignment failures and influence-seeking behavior. △ Less

Submitted 24 November, 2023; v1 submitted 6 November, 2022; originally announced November 2022.

Comments: 44 pages

arXiv:2208.10607 [pdf, other]

doi 10.1016/j.jag.2024.103848

Individual Tree Detection in Large-Scale Urban Environments using High-Resolution Multispectral Imagery

Authors: Jonathan Ventura, Camille Pawlak, Milo Honsberger, Cameron Gonsalves, Julian Rice, Natalie L. R. Love, Skyler Han, Viet Nguyen, Keilana Sugano, Jacqueline Doremus, G. Andrew Fricker, Jenn Yost, Matt Ritter

Abstract: We introduce a novel deep learning method for detection of individual trees in urban environments using high-resolution multispectral aerial imagery. We use a convolutional neural network to regress a confidence map indicating the locations of individual trees, which are localized using a peak finding algorithm. Our method provides complete spatial coverage by detecting trees in both public and pr… ▽ More We introduce a novel deep learning method for detection of individual trees in urban environments using high-resolution multispectral aerial imagery. We use a convolutional neural network to regress a confidence map indicating the locations of individual trees, which are localized using a peak finding algorithm. Our method provides complete spatial coverage by detecting trees in both public and private spaces, and can scale to very large areas. We performed a thorough evaluation of our method, supported by a new dataset of over 1,500 images and almost 100,000 tree annotations, covering eight cities, six climate zones, and three image capture years. We trained our model on data from Southern California, and achieved a precision of 73.6% and recall of 73.3% using test data from this region. We generally observed similar precision and slightly lower recall when extrapolating to other California climate zones and image capture dates. We used our method to produce a map of trees in the entire urban forest of California, and estimated the total number of urban trees in California to be about 43.5 million. Our study indicates the potential for deep learning methods to support future urban forestry studies at unprecedented scales. △ Less

Submitted 3 July, 2024; v1 submitted 22 August, 2022; originally announced August 2022.

Journal ref: International Journal of Applied Earth Observation and Geoinformation 130 (2024): 103848

arXiv:2104.04893 [pdf, other]

The Atari Data Scraper

Authors: Brittany Davis Pierson, Justine Ventura, Matthew E. Taylor

Abstract: Reinforcement learning has made great strides in recent years due to the success of methods using deep neural networks. However, such neural networks act as a black box, obscuring the inner workings. While reinforcement learning has the potential to solve unique problems, a lack of trust and understanding of reinforcement learning algorithms could prevent their widespread adoption. Here, we presen… ▽ More Reinforcement learning has made great strides in recent years due to the success of methods using deep neural networks. However, such neural networks act as a black box, obscuring the inner workings. While reinforcement learning has the potential to solve unique problems, a lack of trust and understanding of reinforcement learning algorithms could prevent their widespread adoption. Here, we present a library that attaches a "data scraper" to deep reinforcement learning agents, acting as an observer, and then show how the data collected by the Atari Data Scraper can be used to understand and interpret deep reinforcement learning agents. The code for the Atari Data Scraper can be found here: https://github.com/IRLL/Atari-Data-Scraper △ Less

Submitted 10 April, 2021; originally announced April 2021.

Comments: 3 authors, nine pages, 6 figures, papers with code

arXiv:2011.08790 [pdf, other]

P1AC: Revisiting Absolute Pose From a Single Affine Correspondence

Authors: Jonathan Ventura, Zuzana Kukelova, Torsten Sattler, Dániel Baráth

Abstract: Affine correspondences have traditionally been used to improve feature matching over wide baselines. While recent work has successfully used affine correspondences to solve various relative camera pose estimation problems, less attention has been given to their use in absolute pose estimation. We introduce the first general solution to the problem of estimating the pose of a calibrated camera give… ▽ More Affine correspondences have traditionally been used to improve feature matching over wide baselines. While recent work has successfully used affine correspondences to solve various relative camera pose estimation problems, less attention has been given to their use in absolute pose estimation. We introduce the first general solution to the problem of estimating the pose of a calibrated camera given a single observation of an oriented point and an affine correspondence. The advantage of our approach (P1AC) is that it requires only a single correspondence, in comparison to the traditional point-based approach (P3P), significantly reducing the combinatorics in robust estimation. P1AC provides a general solution that removes restrictive assumptions made in prior work and is applicable to large-scale image-based localization. We propose a minimal solution to the P1AC problem and evaluate our novel solver on synthetic data, showing its numerical stability and performance under various types of noise. On standard image-based localization benchmarks we show that P1AC achieves more accurate results than the widely used P3P algorithm. Code for our method is available at https://github.com/jonathanventura/P1AC/ . △ Less

Submitted 29 June, 2024; v1 submitted 17 November, 2020; originally announced November 2020.

Comments: ICCV 2023 (with corrections in Eqs. 6 and 13 and Fig. 4)

arXiv:2010.08262 [pdf, other]

Local plasticity rules can learn deep representations using self-supervised contrastive predictions

Authors: Bernd Illing, Jean Ventura, Guillaume Bellec, Wulfram Gerstner

Abstract: Learning in the brain is poorly understood and learning rules that respect biological constraints, yet yield deep hierarchical representations, are still unknown. Here, we propose a learning rule that takes inspiration from neuroscience and recent advances in self-supervised deep learning. Learning minimizes a simple layer-specific loss function and does not need to back-propagate error signals wi… ▽ More Learning in the brain is poorly understood and learning rules that respect biological constraints, yet yield deep hierarchical representations, are still unknown. Here, we propose a learning rule that takes inspiration from neuroscience and recent advances in self-supervised deep learning. Learning minimizes a simple layer-specific loss function and does not need to back-propagate error signals within or between layers. Instead, weight updates follow a local, Hebbian, learning rule that only depends on pre- and post-synaptic neuronal activity, predictive dendritic input and widely broadcasted modulation factors which are identical for large groups of neurons. The learning rule applies contrastive predictive learning to a causal, biological setting using saccades (i.e. rapid shifts in gaze direction). We find that networks trained with this self-supervised and local rule build deep hierarchical representations of images, speech and video. △ Less

Submitted 25 October, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

arXiv:2010.07704 [pdf, other]

doi 10.1142/S1793351X20400139

Unsupervised Learning of Depth and Ego-Motion from Cylindrical Panoramic Video with Applications for Virtual Reality

Authors: Alisha Sharma, Ryan Nett, Jonathan Ventura

Abstract: We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video. Panoramic depth estimation is an important technology for applications such as virtual reality, 3D modeling, and autonomous robotic navigation. In contrast to previous approaches for applying convolutional neural networks to panoramic imagery, we use the cylindrical… ▽ More We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video. Panoramic depth estimation is an important technology for applications such as virtual reality, 3D modeling, and autonomous robotic navigation. In contrast to previous approaches for applying convolutional neural networks to panoramic imagery, we use the cylindrical panoramic projection which allows for the use of the traditional CNN layers such as convolutional filters and max pooling without modification. Our evaluation of synthetic and real data shows that unsupervised learning of depth and ego-motion on cylindrical panoramic images can produce high-quality depth maps and that an increased field-of-view improves ego-motion estimation accuracy. We create two new datasets to evaluate our approach: a synthetic dataset created using the CARLA simulator, and Headcam, a novel dataset of panoramic video collected from a helmet-mounted camera while biking in an urban setting. We also apply our network to the problem of converting monocular panoramas to stereo panoramas. △ Less

Submitted 9 November, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: Expansion on arXiv:1901.00979 for IJSC SI; correct table 1 and 3 headings, reduce file size

Journal ref: Int.J.Semantic Computing 14(3) (2020) 315-322

arXiv:2002.09558 [pdf, other]

Self-Supervised Poisson-Gaussian Denoising

Authors: Wesley Khademi, Sonia Rao, Clare Minnerath, Guy Hagen, Jonathan Ventura

Abstract: We extend the blindspot model for self-supervised denoising to handle Poisson-Gaussian noise and introduce an improved training scheme that avoids hyperparameters and adapts the denoiser to the test data. Self-supervised models for denoising learn to denoise from only noisy data and do not require corresponding clean images, which are difficult or impossible to acquire in some application areas of… ▽ More We extend the blindspot model for self-supervised denoising to handle Poisson-Gaussian noise and introduce an improved training scheme that avoids hyperparameters and adapts the denoiser to the test data. Self-supervised models for denoising learn to denoise from only noisy data and do not require corresponding clean images, which are difficult or impossible to acquire in some application areas of interest such as low-light microscopy. We introduce a new training strategy to handle Poisson-Gaussian noise which is the standard noise model for microscope images. Our new strategy eliminates hyperparameters from the loss function, which is important in a self-supervised regime where no ground truth data is available to guide hyperparameter tuning. We show how our denoiser can be adapted to the test data to improve performance. Our evaluations on microscope image denoising benchmarks validate our approach. △ Less

Submitted 18 November, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

Comments: to appear in IEEE WACV 2021

arXiv:1907.11561 [pdf, other]

Deep Learning for Classification and Severity Estimation of Coffee Leaf Biotic Stress

Authors: J. G. M. Esgario, R. A. Krohling, J. A. Ventura

Abstract: Biotic stress consists of damage to plants through other living organisms. Efficient control of biotic agents such as pests and pathogens (viruses, fungi, bacteria, etc.) is closely related to the concept of agricultural sustainability. Agricultural sustainability promotes the development of new technologies that allow the reduction of environmental impacts, greater accessibility to farmers and, c… ▽ More Biotic stress consists of damage to plants through other living organisms. Efficient control of biotic agents such as pests and pathogens (viruses, fungi, bacteria, etc.) is closely related to the concept of agricultural sustainability. Agricultural sustainability promotes the development of new technologies that allow the reduction of environmental impacts, greater accessibility to farmers and, consequently, increase on productivity. The use of computer vision with deep learning methods allows the early and correct identification of the stress-causing agent. So, corrective measures can be applied as soon as possible to mitigate the problem. The objective of this work is to design an effective and practical system capable of identifying and estimating the stress severity caused by biotic agents on coffee leaves. The proposed approach consists of a multi-task system based on convolutional neural networks. In addition, we have explored the use of data augmentation techniques to make the system more robust and accurate. The experimental results obtained for classification as well as for severity estimation indicate that the proposed system might be a suitable tool to assist both experts and farmers in the identification and quantification of biotic stresses in coffee plantations. △ Less

Submitted 26 July, 2019; originally announced July 2019.

arXiv:1904.00742 [pdf, other]

A smartphone application to detection and classification of coffee leaf miner and coffee leaf rust

Authors: Giuliano L. Manso, Helder Knidel, Renato A. Krohling, Jose A. Ventura

Abstract: Generally, the identification and classification of plant diseases and/or pests are performed by an expert . One of the problems facing coffee farmers in Brazil is crop infestation, particularly by leaf rust Hemileia vastatrix and leaf miner Leucoptera coffeella. The progression of the diseases and or pests occurs spatially and temporarily. So, it is very important to automatically identify the de… ▽ More Generally, the identification and classification of plant diseases and/or pests are performed by an expert . One of the problems facing coffee farmers in Brazil is crop infestation, particularly by leaf rust Hemileia vastatrix and leaf miner Leucoptera coffeella. The progression of the diseases and or pests occurs spatially and temporarily. So, it is very important to automatically identify the degree of severity. The main goal of this article consists on the development of a method and its i implementation as an App that allow the detection of the foliar damages from images of coffee leaf that are captured using a smartphone, and identify whether it is rust or leaf miner, and in turn the calculation of its severity degree. The method consists of identifying a leaf from the image and separates it from the background with the use of a segmentation algorithm. In the segmentation process, various types of backgrounds for the image using the HSV and YCbCr color spaces are tested. In the segmentation of foliar damages, the Otsu algorithm and the iterative threshold algorithm, in the YCgCr color space, have been used and compared to k-means. Next, features of the segmented foliar damages are calculated. For the classification, artificial neural network trained with extreme learning machine have been used. The results obtained shows the feasibility and effectiveness of the approach to identify and classify foliar damages, and the automatic calculation of the severity. The results obtained are very promising according to experts. △ Less

Submitted 19 March, 2019; originally announced April 2019.

arXiv:1901.00979 [pdf, other]

doi 10.1109/aivr46125.2019.00018

Unsupervised Learning of Depth and Ego-Motion from Cylindrical Panoramic Video

Authors: Alisha Sharma, Jonathan Ventura

Abstract: We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video. Panoramic depth estimation is an important technology for applications such as virtual reality, 3D modeling, and autonomous robotic navigation. In contrast to previous approaches for applying convolutional neural networks to panoramic imagery, we use the cylindrical… ▽ More We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video. Panoramic depth estimation is an important technology for applications such as virtual reality, 3D modeling, and autonomous robotic navigation. In contrast to previous approaches for applying convolutional neural networks to panoramic imagery, we use the cylindrical panoramic projection which allows for the use of the traditional CNN layers such as convolutional filters and max pooling without modification. Our evaluation of synthetic and real data shows that unsupervised learning of depth and ego-motion on cylindrical panoramic images can produce high-quality depth maps and that an increased field-of-view improves ego-motion estimation accuracy. We also introduce Headcam, a novel dataset of panoramic video collected from a helmet-mounted camera while biking in an urban setting. △ Less

Submitted 29 October, 2019; v1 submitted 3 January, 2019; originally announced January 2019.

Comments: Accepted to IEEE AIVR 2019

arXiv:1811.02629 [pdf, other]

Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge

Authors: Spyridon Bakas, Mauricio Reyes, Andras Jakab, Stefan Bauer, Markus Rempfler, Alessandro Crimi, Russell Takeshi Shinohara, Christoph Berger, Sung Min Ha, Martin Rozycki, Marcel Prastawa, Esther Alberts, Jana Lipkova, John Freymann, Justin Kirby, Michel Bilello, Hassan Fathallah-Shaykh, Roland Wiest, Jan Kirschke, Benedikt Wiestler, Rivka Colen, Aikaterini Kotrotsou, Pamela Lamontagne, Daniel Marcus, Mikhail Milchenko , et al. (402 additional authors not shown)

Abstract: Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles dissem… ▽ More Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrotic core, active and non-enhancing core. This intrinsic heterogeneity is also portrayed in their radio-phenotype, as their sub-regions are depicted by varying intensity profiles disseminated across multi-parametric magnetic resonance imaging (mpMRI) scans, reflecting varying biological properties. Their heterogeneous shape, extent, and location are some of the factors that make these tumors difficult to resect, and in some cases inoperable. The amount of resected tumor is a factor also considered in longitudinal scans, when evaluating the apparent tumor for potential diagnosis of progression. Furthermore, there is mounting evidence that accurate segmentation of the various tumor sub-regions can offer the basis for quantitative image analysis towards prediction of patient overall survival. This study assesses the state-of-the-art machine learning (ML) methods used for brain tumor image analysis in mpMRI scans, during the last seven instances of the International Brain Tumor Segmentation (BraTS) challenge, i.e., 2012-2018. Specifically, we focus on i) evaluating segmentations of the various glioma sub-regions in pre-operative mpMRI scans, ii) assessing potential tumor progression by virtue of longitudinal growth of tumor sub-regions, beyond use of the RECIST/RANO criteria, and iii) predicting the overall survival from pre-operative mpMRI scans of patients that underwent gross total resection. Finally, we investigate the challenge of identifying the best ML algorithms for each of these tasks, considering that apart from being diverse on each instance of the challenge, the multi-institutional mpMRI BraTS dataset has also been a continuously evolving/growing dataset. △ Less

Submitted 23 April, 2019; v1 submitted 5 November, 2018; originally announced November 2018.

Comments: The International Multimodal Brain Tumor Segmentation (BraTS) Challenge

arXiv:1804.07821 [pdf, other]

An Aggregated Multicolumn Dilated Convolution Network for Perspective-Free Counting

Authors: Diptodip Deb, Jonathan Ventura

Abstract: We propose the use of dilated filters to construct an aggregation module in a multicolumn convolutional neural network for perspective-free counting. Counting is a common problem in computer vision (e.g. traffic on the street or pedestrians in a crowd). Modern approaches to the counting problem involve the production of a density map via regression whose integral is equal to the number of objects… ▽ More We propose the use of dilated filters to construct an aggregation module in a multicolumn convolutional neural network for perspective-free counting. Counting is a common problem in computer vision (e.g. traffic on the street or pedestrians in a crowd). Modern approaches to the counting problem involve the production of a density map via regression whose integral is equal to the number of objects in the image. However, objects in the image can occur at different scales (e.g. due to perspective effects) which can make it difficult for a learning agent to learn the proper density map. While the use of multiple columns to extract multiscale information from images has been shown before, our approach aggregates the multiscale information gathered by the multicolumn convolutional neural network to improve performance. Our experiments show that our proposed network outperforms the state-of-the-art on many benchmark datasets, and also that using our aggregation module in combination with a higher number of columns is beneficial for multiscale counting. △ Less

Submitted 20 April, 2018; originally announced April 2018.

Comments: CVPR 2018 Workshop On Visual Understanding of Humans in Crowd Scene

arXiv:1604.00409 [pdf, other]

Structure from Motion on a Sphere

Authors: Jonathan Ventura

Abstract: We describe a special case of structure from motion where the camera rotates on a sphere. The camera's optical axis lies perpendicular to the sphere's surface. In this case, the camera's pose is minimally represented by three rotation parameters. From analysis of the epipolar geometry we derive a novel and efficient solution for the essential matrix relating two images, requiring only three point… ▽ More We describe a special case of structure from motion where the camera rotates on a sphere. The camera's optical axis lies perpendicular to the sphere's surface. In this case, the camera's pose is minimally represented by three rotation parameters. From analysis of the epipolar geometry we derive a novel and efficient solution for the essential matrix relating two images, requiring only three point correspondences in the minimal case. We apply this solver in a structure-from-motion pipeline that aggregates pairwise relations by rotation averaging followed by bundle adjustment with an inverse depth parameterization. Our methods enable scene modeling with an outward-facing camera and object scanning with an inward-facing camera. △ Less

Submitted 5 September, 2016; v1 submitted 1 April, 2016; originally announced April 2016.

Comments: in European Conference on Computer Vision (ECCV) 2016

arXiv:1503.02675 [pdf]

Global 6DOF Pose Estimation from Untextured 2D City Models

Authors: Clemens Arth, Christian Pirchheim, Jonathan Ventura, Vincent Lepetit

Abstract: We propose a method for estimating the 3D pose for the camera of a mobile device in outdoor conditions, using only an untextured 2D model. Previous methods compute only a relative pose using a SLAM algorithm, or require many registered images, which are cumbersome to acquire. By contrast, our method returns an accurate, absolute camera pose in an absolute referential using simple 2D+height maps, w… ▽ More We propose a method for estimating the 3D pose for the camera of a mobile device in outdoor conditions, using only an untextured 2D model. Previous methods compute only a relative pose using a SLAM algorithm, or require many registered images, which are cumbersome to acquire. By contrast, our method returns an accurate, absolute camera pose in an absolute referential using simple 2D+height maps, which are broadly available, to refine a first estimate of the pose provided by the device's sensors. We show how to first estimate the camera absolute orientation from straight line segments, and then how to estimate the translation by aligning the 2D map with a semantic segmentation of the input image. We demonstrate the robustness and accuracy of our approach on a challenging dataset. △ Less

Submitted 18 March, 2015; v1 submitted 9 March, 2015; originally announced March 2015.

Comments: 9 pages excluding supplementary material

ACM Class: I.2.10; I.3.7; I.4.3

Showing 1–14 of 14 results for author: Ventura, J